nudb_use.quality.specific_variables package
nudb_use.quality.specific_variables.gro_elevstatus module
Validations for the gro_elevstatus variable.
- check_gro_elevstatus(df, **kwargs)
Run all gro_elevstatus-specific checks on the provided DataFrame.
- Parameters:
df (DataFrame) – DataFrame containing gro_elevstatus and supporting columns.
**kwargs (object) – Additional keyword arguments for future compatibility.
- Returns:
Errors reported by the gro_elevstatus checks.
- Return type:
list[NudbQualityError]
- subcheck_elevstatus_utd_211(utd_utdanningstype, gro_elevstatus)
Ensure gro_elevstatus is ‘E’ whenever utd_utdanningstype equals 211.
- Parameters:
utd_utdanningstype (Series|None) – Series containing the utdanningstype codes.
gro_elevstatus (Series|None) – Series containing gro_elevstatus values.
- Returns:
Error when invalid combinations exist, else None.
- Return type:
NudbQualityError | None
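A minimal sketch of the rule this sub-check enforces, not the package's implementation. It uses plain Python lists in place of pandas Series and assumes the codes are stored as strings; the helper name find_bad_elevstatus_rows is hypothetical.

```python
def find_bad_elevstatus_rows(utd_utdanningstype, gro_elevstatus):
    """Return positions where the type is '211' but the status is not 'E'."""
    pairs = zip(utd_utdanningstype, gro_elevstatus)
    return [i for i, (utd, status) in enumerate(pairs) if utd == "211" and status != "E"]


# Assumption: codes are strings; a real dataset may store them differently.
utd = ["211", "211", "212", "211"]
status = ["E", "A", "A", "E"]
bad_rows = find_bad_elevstatus_rows(utd, status)
print(bad_rows)  # [1]
```

Only the second row is flagged: the third row has status 'A' as well, but its utdanningstype is 212, so the rule does not apply to it.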
nudb_use.quality.specific_variables.grunnskolepoeng module
Validations for the gr_grunnskolepoeng variable.
- check_grunnskolepoeng(df, **kwargs)
Run grunnskolepoeng validations on the provided dataset.
- Parameters:
df (DataFrame) – DataFrame containing the grunnskolepoeng column.
**kwargs (object) – Currently unused keyword arguments. Passed in from parent function.
- Returns:
Errors produced by the sub-checks.
- Return type:
list[NudbQualityError]
- subcheck_grunnskolepoeng_maxval(grunnskolepoeng, max_poeng=70.0)
Verify that grunnskolepoeng values stay within the allowed maximum.
- Parameters:
grunnskolepoeng (Series|None) – Series containing grunnskolepoeng values.
max_poeng (float) – Maximum allowed points.
- Returns:
Error when values exceed the maximum, else None.
- Return type:
NudbQualityError | None
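The maximum-value rule can be illustrated with a short sketch. This is not the package's code: it works on a plain list rather than a Series, the helper name values_above_max is hypothetical, and skipping missing entries is an assumption.

```python
def values_above_max(grunnskolepoeng, max_poeng=70.0):
    """Return the values that exceed max_poeng, skipping missing entries."""
    # Assumption: None marks a missing value and is not treated as an error.
    return [v for v in grunnskolepoeng if v is not None and v > max_poeng]


points = [42.5, 70.0, 70.1, None, 81.0]
too_high = values_above_max(points)
print(too_high)  # [70.1, 81.0]
```

A value equal to max_poeng passes here, since the documented error condition is that values exceed the maximum.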
nudb_use.quality.specific_variables.kommune module
Validations for kommune-coded variables referencing KLASS 131.
- check_kommune(df, **kwargs)
Run kommune-specific validations on all kommune columns in the DataFrame.
- Parameters:
df (DataFrame) – DataFrame that may contain kommune columns.
**kwargs (object) – Placeholder for future configuration. Passed in from parent function.
- Returns:
Errors aggregated from kommune checks.
- Return type:
list[NudbQualityError]
- subcheck_only_single_sentinel_value_9999_allowed(kommune_col)
Ensure kommune codes only use ‘9999’ as the sentinel value.
- Parameters:
kommune_col (Series|None) – Series with kommune codes.
- Returns:
Error when other sentinel values exist, else None.
- Return type:
NudbQualityError | None
- subcheck_single_kommune_oslo_svalbard_utland(kommune_col)
Ensure fylker (counties) that have a single municipality code are mapped correctly.
- Parameters:
kommune_col (Series|None) – Series with kommune codes.
- Returns:
Error when illegal mappings exist, else None.
- Return type:
NudbQualityError | None
nudb_use.quality.specific_variables.land module
Validations for country (land) variables mapping to KLASS 91.
- check_land(df, **kwargs)
Run land-specific validations for all configured columns.
- Parameters:
df (DataFrame) – DataFrame containing potential land variables.
**kwargs (object) – Placeholder for future options. Passed in from parent function.
- Returns:
Errors aggregated from land checks.
- Return type:
list[NudbQualityError]
- subcheck_landkode_000(land_col, col_name)
Ensure the reserved land code 000 is not incorrectly used.
- Parameters:
land_col (Series|None) – Series with land codes.
col_name (str) – Name of the land column for logging context.
- Returns:
Error when invalid codes are present, else None.
- Return type:
NudbQualityError | None
nudb_use.quality.specific_variables.nus2000 module
Validations for the nus2000 classification variable.
- check_nus2000(df, **kwargs)
Run all nus2000-specific validation checks on the provided dataset.
- Parameters:
df (DataFrame) – DataFrame that should contain nus2000-relevant columns.
**kwargs (object) – Additional keyword arguments forwarded to range validation. Passed in from parent function.
- Returns:
Validation errors gathered from all sub-checks, or an empty list when the dataset passes cleanly.
- Return type:
list[NudbQualityError]
- subcheck_nus2000_uh_institusjon_id_against_nus(uh_institusjon_id_col, nus_col, utd_skoleaar_start_col)
Ensure UH institution id is populated when nus2000 starts with 6, 7, or 8.
- Parameters:
uh_institusjon_id_col (Series|None) – Series with UH institution identifiers.
nus_col (Series|None) – Series containing nus2000 codes.
utd_skoleaar_start_col (Series|None) – Series with the school-year start used to determine when the validation applies.
- Returns:
Validation error describing offending combinations, or None when every row satisfies the rule.
- Return type:
NudbQualityError | None
- subcheck_nus2000_valid_nus(col)
Validate that every nus2000 code is six digits and starts with 1-8.
- Parameters:
col (Series|None) – Series containing nus2000 codes, or None if the column is missing.
- Returns:
Validation error describing the invalid codes, or None when all codes satisfy the required format.
- Return type:
NudbQualityError | None
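The format rule (six digits, first digit 1-8) maps naturally onto a regular expression. The sketch below illustrates that rule on a plain list of strings; the names NUS2000_PATTERN and invalid_nus2000_codes are hypothetical, and the package may implement the check differently.

```python
import re

# Six ASCII digits in total: one leading digit from 1 to 8, then five more digits.
NUS2000_PATTERN = re.compile(r"[1-8][0-9]{5}")


def invalid_nus2000_codes(codes):
    """Return the codes that do not match the six-digit, 1-8 prefix format."""
    return [code for code in codes if not NUS2000_PATTERN.fullmatch(code)]


codes = ["611101", "099999", "9123456", "71234", "812345"]
invalid = invalid_nus2000_codes(codes)
print(invalid)  # ['099999', '9123456', '71234']
```

Using fullmatch rather than match matters here: it rejects codes that start correctly but carry extra trailing digits.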
- subcheck_nus2000_valid_range(nus_col, range_valid_nus=None, dataset_name=None, **kwargs)
Check that nus2000 codes stay inside the configured numeric range.
- Parameters:
nus_col (Series|None) – Series containing nus2000 codes to validate.
range_valid_nus (range|None) – Optional range describing the allowed first-digit span. When omitted, dataset-specific or default ranges are used.
dataset_name (str|None) – Dataset name used to look up configuration overrides.
**kwargs (Any) – Keyword arguments that may include range_valid_nus.
- Returns:
Validation error listing codes outside the allowed range, or None when no violations are detected.
- Return type:
NudbQualityError | None
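As a rough illustration of a first-digit range check, the sketch below takes the allowed range as an explicit argument. It does not reproduce the dataset-specific or default ranges, which are not documented here. The helper name is hypothetical, and the input is assumed to be a list of non-empty digit strings.

```python
def codes_outside_first_digit_range(codes, range_valid_nus):
    """Return codes whose first digit is not contained in range_valid_nus."""
    # Assumption: every code is a non-empty string that starts with a digit.
    return [code for code in codes if int(code[0]) not in range_valid_nus]


codes = ["211101", "611101", "799999", "801234"]
outside = codes_outside_first_digit_range(codes, range(6, 9))
print(outside)  # ['211101']
```

Note that range(6, 9) covers the first digits 6, 7 and 8, because the upper bound of a Python range is exclusive.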
nudb_use.quality.specific_variables.run_all module
Entry point for executing all variable-specific validation checks.
- run_all_specific_variable_tests(df, raise_errors=False, **kwargs)
Execute every registered variable-specific validation routine.
- Parameters:
df (DataFrame) – DataFrame that should contain the required variables.
raise_errors (bool) – When True, raise grouped errors if any validations fail.
**kwargs (object) – Extra keyword arguments forwarded to each check.
- Returns:
Errors aggregated from all specific checks, or an empty list when every check passes.
- Return type:
list[NudbQualityError]
nudb_use.quality.specific_variables.skoleaar module
Validations for the skoleaar time variables.
- check_skoleaar(df, **kwargs)
Validate skoleaar columns for invalid year formats.
Scans columns whose name includes “skoleaar” or “skolar” and aggregates any validation errors found.
- Parameters:
df (DataFrame) – Input DataFrame to validate.
**kwargs (object) – Unused extra arguments for compatibility.
- Returns:
Collected validation errors.
- Return type:
list[NudbQualityError]
- check_skoleaar_contains_one_year(col_name, col_series)
Validate that skoleaar values look like a single 4-digit year.
- Parameters:
col_name (str) – Column name for error context.
col_series (Series) – Series of skoleaar values as strings.
- Returns:
Error if invalid values are found, otherwise None.
- Return type:
NudbQualityError | None
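One way to express "looks like a single 4-digit year" is a full-string regular expression match. The sketch below assumes that reading of the rule and uses a plain list of strings; SINGLE_YEAR and not_single_year are hypothetical names, not part of the package.

```python
import re

# Exactly four ASCII digits and nothing else.
SINGLE_YEAR = re.compile(r"[0-9]{4}")


def not_single_year(values):
    """Return the values that are not exactly four digits long."""
    return [value for value in values if not SINGLE_YEAR.fullmatch(value)]


values = ["2020", "20202021", "20-21", "1999"]
rejected = not_single_year(values)
print(rejected)  # ['20202021', '20-21']
```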
- check_skoleaar_contains_sane_years(col_name, col_series)
Validate that skoleaar values fall within a sane year range.
- Parameters:
col_name (str) – Column name for error context.
col_series (Series) – Series of skoleaar values as strings or integers.
- Returns:
Error if out-of-range values are found, otherwise None.
- Return type:
NudbQualityError | None
- check_skoleaar_contains_two_years_one_offset(col_name, col_series)
Validate 8-digit skoleaar values have a +1 year offset.
- Parameters:
col_name (str) – Column name for error context.
col_series (Series) – Series of skoleaar values as strings.
- Returns:
Error if 8-digit values are not consecutive, otherwise None.
- Return type:
NudbQualityError | None
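The +1 offset rule can be sketched by splitting each 8-digit value into two 4-digit years and comparing them. This is an illustration under the assumption that the first four digits are the start year and the last four the end year; the helper name non_consecutive_skoleaar is hypothetical.

```python
import re

EIGHT_DIGITS = re.compile(r"[0-9]{8}")


def non_consecutive_skoleaar(values):
    """Return 8-digit values whose second year is not the first year plus one."""
    bad = []
    for value in values:
        if not EIGHT_DIGITS.fullmatch(value):
            continue  # Values of other shapes are left to other checks.
        start_year, end_year = int(value[:4]), int(value[4:])
        if end_year != start_year + 1:
            bad.append(value)
    return bad


values = ["20202021", "20202022", "19992000", "2020", "20212020"]
bad_values = non_consecutive_skoleaar(values)
print(bad_values)  # ['20202022', '20212020']
```

The 4-digit value "2020" is ignored rather than reported, matching the documented scope of this check (8-digit values only).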
- check_skoleaar_is_string_dtype(col_name, col_series)
Ensure a skoleaar column uses a string dtype before further checks.
- Parameters:
col_name (str) – Column name for error context.
col_series (Series) – Series of skoleaar values to validate.
- Returns:
Error when dtype is not string, otherwise None.
- Return type:
NudbQualityError | None
nudb_use.quality.specific_variables.sn07 module
Validations for the SN07 classification variable.
- check_sn07(df, **kwargs)
Execute SN07-specific checks and return collected errors.
- Parameters:
df (DataFrame) – DataFrame containing SN07 codes.
**kwargs (object) – Placeholder for future configuration. Passed in from parent function.
- Returns:
Errors describing invalid SN07 codes.
- Return type:
list[NudbQualityError]
- subcheck_sn07_bad_value(sn07)
Detect disallowed SN07 codes.
- Parameters:
sn07 (Series|None) – Series containing SN07 codes to inspect.
- Returns:
Error when forbidden codes are present, else None.
- Return type:
NudbQualityError | None
nudb_use.quality.specific_variables.snr_fnr module
Checks for required personal identifier columns.
- check_has_personal_ids(df, **kwargs)
Ensure at least one personal identifier column is populated per row.
- Parameters:
df (DataFrame) – DataFrame containing personal identifier columns.
**kwargs (object) – Placeholder for future options. Passed in from parent function.
- Returns:
Errors for rows missing all identifier values.
- Return type:
list[NudbQualityError]
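A minimal sketch of the "at least one identifier per row" rule, assuming the identifier columns are fnr and snr (as the module name suggests) and that None or an empty string counts as missing. It works on plain lists, the identifier values are placeholders, and rows_missing_all_ids is a hypothetical helper.

```python
def rows_missing_all_ids(fnr, snr):
    """Return row positions where both identifiers are None or empty."""
    # Assumption: None and "" both mean "not populated".
    return [i for i, (f, s) in enumerate(zip(fnr, snr)) if not f and not s]


fnr = ["fnr-a", None, None, ""]
snr = [None, "snr-b", None, None]
missing = rows_missing_all_ids(fnr, snr)
print(missing)  # [2, 3]
```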
nudb_use.quality.specific_variables.unique_per_person module
Validations ensuring certain columns are unique per person.
- check_unique_per_person(df, **kwargs)
Ensure configured columns have at most one value per person.
- Parameters:
df (DataFrame) – DataFrame containing personal identifier and value columns.
**kwargs (object) – Placeholder for future options. Passed in from parent function.
- Returns:
Errors describing rows that violate uniqueness.
- Return type:
list[NudbQualityError]
- subcheck_unique_per_person(fnr, snr, unique_col, unique_col_name)
Check that a single column has unique values per person.
- Parameters:
fnr (Series|None) – Series containing national identifiers.
snr (Series|None) – Series containing snr person identifiers.
unique_col (Series|None) – Column that should hold unique values per person.
unique_col_name (str) – Name of the column for logging context.
- Returns:
Error when multiple values exist per person, else None.
- Return type:
NudbQualityError | None
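The uniqueness rule amounts to grouping values by person and flagging anyone with more than one distinct value. The sketch below assumes a single, already-resolved person identifier per row; how the package combines fnr and snr is not described here. The helper name persons_with_multiple_values is hypothetical.

```python
from collections import defaultdict


def persons_with_multiple_values(person_ids, values):
    """Return sorted person ids that have more than one distinct value."""
    seen = defaultdict(set)
    for person_id, value in zip(person_ids, values):
        # Assumption: rows with a missing id or value are ignored.
        if person_id is not None and value is not None:
            seen[person_id].add(value)
    return sorted(pid for pid, distinct in seen.items() if len(distinct) > 1)


person_ids = ["p1", "p1", "p2", "p2", "p3"]
values = ["A", "A", "B", "C", "D"]
violators = persons_with_multiple_values(person_ids, values)
print(violators)  # ['p2']
```

Person p1 appears twice but always with the same value, so only p2 violates the rule.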
nudb_use.quality.specific_variables.utils module
Shared helpers for the variable-specific validation modules.
- add_err2list(errors, error)
Append a validation error to a list if it is not None.
- Return type:
None
- Parameters:
errors (list[NudbQualityError])
error (None | NudbQualityError)
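The accumulate-if-present pattern described for this helper can be sketched as follows. This is a stand-in written for illustration, not the package's source: add_err2list_sketch is a hypothetical name, and ValueError substitutes for NudbQualityError.

```python
def add_err2list_sketch(errors, error):
    """Append error to errors in place, unless error is None."""
    if error is not None:
        errors.append(error)


errors = []
# Each sub-check is assumed to return either an error object or None.
for result in [None, ValueError("bad kommune code"), None]:
    add_err2list_sketch(errors, result)
print(len(errors))  # 1
```

Because the list is mutated in place and nothing is returned, a caller can chain many sub-checks against one shared list.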
- get_column(df, col)
Return a DataFrame column or None when it is missing.
- Return type:
Series|None
- Parameters:
df (DataFrame)
col (str)
- require_series_present(**series_by_name)
Ensure required pandas Series exist before continuing a validation step.
- Return type:
dict[str, Series] | None
- Parameters:
series_by_name (Series | None)
nudb_use.quality.specific_variables.vg_fullfoertkode_detaljert module
Validations for the vg_fullfoertkode_detaljert variable.
- check_vg_fullfoertkode_detaljert(df, **kwargs)
Run all vg_fullfoertkode_detaljert-specific checks on the provided DataFrame.
- Parameters:
df (DataFrame) – DataFrame containing vg_fullfoertkode_detaljert and supporting columns.
**kwargs (object) – Additional keyword arguments for future compatibility.
- Returns:
Errors reported by the vg_fullfoertkode_detaljert checks.
- Return type:
list[NudbQualityError]
- subcheck_vg_fullfoertkode_detaljert_utd_211(utd_utdanningstype, vg_fullfoertkode_detaljert)
Ensure vg_fullfoertkode_detaljert is filled only for utd_utdanningstype 211, 212, 220 and 610.
- Parameters:
utd_utdanningstype (Series|None) – Series containing the utdanningstype codes.
vg_fullfoertkode_detaljert (Series|None) – Series containing vg_fullfoertkode_detaljert values.
- Returns:
Error when invalid combinations exist, else None.
- Return type:
NudbQualityError | None
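A minimal sketch of the "filled only for certain types" rule, using plain lists and string codes. The names ALLOWED_UTD_TYPES and rows_filled_for_wrong_type are hypothetical, the vg_fullfoertkode_detaljert values are placeholders, and treating None as "not filled" is an assumption.

```python
# The four utd_utdanningstype codes named in the rule above.
ALLOWED_UTD_TYPES = {"211", "212", "220", "610"}


def rows_filled_for_wrong_type(utd_utdanningstype, vg_fullfoertkode_detaljert):
    """Return positions where the code is filled but the type is not allowed."""
    pairs = zip(utd_utdanningstype, vg_fullfoertkode_detaljert)
    return [
        i
        for i, (utd, kode) in enumerate(pairs)
        if kode is not None and utd not in ALLOWED_UTD_TYPES
    ]


utd = ["211", "313", "610", "400"]
kode = ["X1", "X2", None, None]  # Placeholder values, not real codes.
wrong_rows = rows_filled_for_wrong_type(utd, kode)
print(wrong_rows)  # [1]
```

The last row has a disallowed type but no value, so it passes: the rule restricts where the variable may be filled, not where it must be filled.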