dapla_metadata.standards package

Subpackages

dapla_metadata.standards.name_validator module

class NamingStandardReport(validation_results)

Bases: object

Report object for name standard validation.

Parameters:

validation_results (list[ValidationResult])

evaluate_result()

Returns an appropriate message based on the success rate.

Return type:

str

generate_report()

Format the report as a string.

Return type:

str

success_rate()

Calculate the success rate as a percentage.

Returns:

The success rate as a percentage, or None if no files were validated.

Return type:

int | float | None
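
As a rough illustration of the calculation described above, the following standalone sketch computes a percentage from a list of boolean outcomes and returns None when the list is empty. The function name `success_rate_of` and its input are hypothetical. This is not the library's implementation, only one plausible reading of the documented behaviour.

```python
from __future__ import annotations


def success_rate_of(outcomes: list[bool]) -> float | None:
    """Return the share of True outcomes as a percentage, or None if empty."""
    if not outcomes:
        # Nothing was validated, so a percentage would be undefined.
        return None
    # True counts as 1 and False as 0 when summed.
    return 100 * sum(outcomes) / len(outcomes)


print(success_rate_of([True, True, False, True]))  # 75.0
print(success_rate_of([]))  # None
```

Returning None rather than 0 keeps "nothing was checked" distinguishable from "everything failed".
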

class ValidationResult(success, file_path)

Bases: object

Result object for name standard validation.

Parameters:
  • success (bool)

  • file_path (str)

add_message(message)

Add a message to the result's list of messages.

Return type:

None

Parameters:

message (str)

add_violation(violation)

Add a violation to the result's list of violations.

Return type:

None

Parameters:

violation (str)

to_dict()

Return result as a dictionary.

Return type:

dict
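
To make the shape of this interface concrete, here is a minimal stand-in built as a dataclass. The class name `ResultSketch` and the attribute names `messages` and `violations` are assumptions made for illustration. The real ValidationResult may store and serialise its data differently.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field


@dataclass
class ResultSketch:
    """Hypothetical stand-in mirroring the documented ValidationResult methods."""

    success: bool
    file_path: str
    # Assumed attribute names; each instance gets its own empty lists.
    messages: list[str] = field(default_factory=list)
    violations: list[str] = field(default_factory=list)

    def add_message(self, message: str) -> None:
        self.messages.append(message)

    def add_violation(self, violation: str) -> None:
        self.violations.append(violation)

    def to_dict(self) -> dict:
        # asdict converts the dataclass fields into a plain dictionary.
        return asdict(self)


result = ResultSketch(success=False, file_path="/data/example_file.parquet")
result.add_violation("hypothetical violation text")
print(result.to_dict())
```
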

async validate_directory(path)

Validate a file or recursively validate all files in a directory.

Return type:

AsyncGenerator[Union[AsyncGenerator, Task]]

Parameters:

path (ReadablePath | VFSPathLike | PathLike[str] | str)

dapla_metadata.standards.standard_validators module

async check_naming_standard(file_path)

Check whether a given path follows the SSB naming standard.

This function checks whether the provided file_path and subdirectories thereof comply with the naming standard. Currently we only examine ‘.parquet’ files. Other files are ignored.

Parameters:

file_path (Union[str, PathLike[str]]) –

The path to a bucket, directory, or specific file to validate. This can take any of the following forms:
  • A bucket URL in the form ‘gs://ssb-dapla-felles-data-produkt-test’

  • An absolute path to a mounted bucket in the form ‘/buckets/produkt’

  • Any subdirectory or file thereof

We also accept paths which don’t yet exist so that you can test if a path will comply.

Returns:

A list of validation results, including success status, checked file path, messages, and any detected violations.

Return type:

list[ValidationResult]

Examples

>>> import asyncio
>>> asyncio.run(check_naming_standard("/data/example_file.parquet"))[0].success
False
>>> asyncio.run(check_naming_standard("/buckets/produkt/datadoc/utdata/person_data_p2021_v2.parquet"))[0].success
True

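
Because check_naming_standard is declared async, calling it returns a coroutine that has to be awaited or run in an event loop before any results are available. The sketch below shows that calling pattern from synchronous code. So that it runs without the package installed, it uses a stub coroutine, `fake_check`, in place of the library function. The stub, its pass/fail rule, and the helper `run_check` are all hypothetical.

```python
import asyncio
from types import SimpleNamespace


async def fake_check(file_path):
    """Hypothetical stub standing in for check_naming_standard."""
    # Assumption for the demo only: anything under /buckets/ passes.
    ok = str(file_path).startswith("/buckets/")
    return [SimpleNamespace(success=ok, file_path=str(file_path))]


def run_check(check, file_path):
    """Run an async check from synchronous code and return its results."""
    return asyncio.run(check(file_path))


results = run_check(fake_check, "/data/example_file.parquet")
print([r.success for r in results])  # [False]
```

Note that `asyncio.run` raises an error when an event loop is already running, as is typically the case in a Jupyter notebook. There, awaiting the coroutine directly is the usual alternative.
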
async flatten_generator(gen)

Recursively flatten nested async generators.

Return type:

AsyncGenerator[Task, None]

Parameters:

gen (AsyncGenerator)
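
The general technique of recursively flattening nested async generators can be sketched as follows. This is an illustrative version rather than the library's code: the name `flatten` and the demo generators are hypothetical, and the leaves here are plain integers, whereas the documented function yields Task objects.

```python
import asyncio
import inspect


async def flatten(gen):
    """Yield leaf items from an async generator that may yield async generators."""
    async for item in gen:
        if inspect.isasyncgen(item):
            # Recurse into the nested generator and re-yield its items.
            async for leaf in flatten(item):
                yield leaf
        else:
            yield item


async def inner():
    yield 2
    yield 3


async def outer():
    yield 1
    yield inner()  # a nested async generator, not a value
    yield 4


async def collect(gen):
    """Gather all flattened items into a list."""
    return [item async for item in flatten(gen)]


print(asyncio.run(collect(outer())))  # [1, 2, 3, 4]
```
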

generate_validation_report(validation_results)

Generate and print a formatted naming standard validation report.

This function takes a list of ValidationResult objects, creates a NamingStandardReport instance, and prints the generated report.

Parameters:

validation_results (list[ValidationResult]) – A list of ValidationResult objects that contain the outcomes of the name standard checks.

Returns:

An instance of NamingStandardReport containing the validation results.

Return type:

NamingStandardReport