ssb_utdanning.orgnrkontroll package
ssb_utdanning.orgnrkontroll.orgnrkontroll module
- get_skolereg(year='latest', sub_category='')
Retrieves the skolereg catalogue.
Retrieves an UtdKatalog instance representing school registration data for a given year and sub-category. If no sub-category is specified, all predefined sub-categories are excluded, i.e. barnehage, vgskoler, grunnskoler and test.
- Parameters:
year (str | int, optional) – The year for which the data is to be retrieved. Can be an integer representing the year or ‘latest’ for the most recent data. Defaults to ‘latest’.
sub_category (str) – A specific sub-category of school data to be retrieved. Options include ‘barnehage’, ‘vgskoler’, ‘test’, ‘grunnskoler’. Defaults to an empty string, which means no specific sub-category.
- Returns:
An instance of UtdKatalog configured with the appropriate file pattern, key columns, and exclusions based on the inputs.
- Return type:
UtdKatalog
- Raises:
ValueError – If the specified sub-category is not recognized.
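The sub-category rule described above can be illustrated with a minimal sketch. It is not the package's implementation: `resolve_exclusions` and `KNOWN_SUB_CATEGORIES` are hypothetical names, and returning an empty exclusion list when a sub-category is given is an assumption made for illustration only.

```python
from __future__ import annotations

# Hypothetical sketch; names and behaviour are assumptions, not the package's code.
KNOWN_SUB_CATEGORIES = ("barnehage", "vgskoler", "grunnskoler", "test")


def resolve_exclusions(sub_category: str = "") -> list[str]:
    """Return the sub-categories to exclude for a given request."""
    if sub_category == "":
        # No sub-category requested: exclude every predefined one.
        return list(KNOWN_SUB_CATEGORIES)
    if sub_category not in KNOWN_SUB_CATEGORIES:
        # Mirrors the documented ValueError for unrecognized values.
        raise ValueError(f"Unrecognized sub_category: {sub_category!r}")
    # Assumption: a specific sub-category needs no exclusions in this sketch.
    return []
```

Calling `resolve_exclusions()` yields all four predefined sub-categories, while an unknown value such as `"universitet"` raises `ValueError`.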
- get_vigo_skole(year='latest')
Retrieves the vigo-skole catalogue.
Retrieves an UtdKatalog instance representing VIGO school data for a specified year. If no year is specified, it fetches data for the most recent year.
- Parameters:
year (str | int, optional) – The year for which the data is to be retrieved. Can be an integer representing the year or ‘latest’ for the most recent data. Defaults to ‘latest’.
- Returns:
An instance of UtdKatalog configured with the appropriate file pattern and key columns for accessing VIGO school data.
- Return type:
UtdKatalog
- orgnrkontroll_func(data, year='latest', skolereg_keep_cols=None, vigo_keep_cols=None, orgnr_col_innfil='orgnr', fskolenr_col_innfil='fskolenr', skolereg_subcategory='')
Validates and merges the input data on orgnr against the skolereg catalogue and on fskolenr against the vigo-skole catalogue.
Performs data validation and merging operations on educational data from different catalogues. Ensures the integrity of organizational number fields, merges additional data from the skolereg and VIGO catalogues, and handles missing data or discrepancies in organizational numbers.
- Parameters:
data (pd.DataFrame | UtdData) – The input dataset, either as a DataFrame or UtdData instance.
year (str | int, optional) – The year of the data to process. Defaults to ‘latest’.
skolereg_keep_cols (set[str] | list[str] | None, optional) – Columns to keep from the skolereg data. Defaults to None.
vigo_keep_cols (set[str] | list[str] | None, optional) – Columns to keep from the VIGO data. Defaults to None.
orgnr_col_innfil (str) – Column name for organizational numbers in the input data. Defaults to “orgnr”.
fskolenr_col_innfil (str) – Column name for school numbers in the input data. Defaults to “fskolenr”.
skolereg_subcategory (str) – Sub-category of school data to filter from the skolereg data. Defaults to an empty string.
- Returns:
The consolidated dataset after validation and merging operations.
- Return type:
pd.DataFrame | UtdData
- Raises:
ValueError – If essential columns are missing or have duplicates in the input data.
TypeError – If ‘skolereg_keep_cols’ or ‘vigo_keep_cols’ is not of an accepted type (set, list, or None).
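The validate-then-merge pattern described for this function can be sketched with plain pandas. This is a hedged illustration, not the package's code: `merge_on_key` is a hypothetical helper, the sample frames and the columns `elever`, `skolenavn` and `kommune` are invented, and treating `keep_cols=None` as "keep every catalogue column" is an assumption.

```python
import pandas as pd


def merge_on_key(data, catalogue, key, keep_cols=None):
    """Left-merge catalogue columns onto data after checking the inputs."""
    if keep_cols is not None and not isinstance(keep_cols, (set, list)):
        raise TypeError("keep_cols must be a set, a list, or None")
    if key not in data.columns:
        raise ValueError(f"Column {key!r} is missing from the input data")
    if data.columns.duplicated().any():
        raise ValueError("Input data has duplicated column names")
    if keep_cols is None:
        # Assumption: None means every catalogue column is kept.
        cols = list(catalogue.columns)
    else:
        cols = [key] + sorted(c for c in keep_cols if c != key)
    # validate="m:1" makes pandas raise MergeError if the key is not unique
    # in the catalogue; indicator=True adds a "_merge" column for diagnostics.
    return data.merge(
        catalogue[cols], on=key, how="left", validate="m:1", indicator=True
    )


# Invented sample data: the third orgnr has no match in the catalogue.
data = pd.DataFrame({"orgnr": ["111", "222", "999"], "elever": [10, 20, 30]})
skolereg = pd.DataFrame(
    {
        "orgnr": ["111", "222"],
        "skolenavn": ["A skole", "B skole"],
        "kommune": ["0301", "4601"],
    }
)

merged = merge_on_key(data, skolereg, "orgnr", keep_cols=["skolenavn"])
# Rows flagged "left_only" are organizational numbers missing from the catalogue.
unmatched = merged.loc[merged["_merge"] == "left_only", "orgnr"].tolist()
```

The same helper could be applied a second time with `key="fskolenr"` against a VIGO-style frame; the `_merge` column from the first pass would need to be dropped or renamed beforehand.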