klass.classes package

klass.classes.classification module

class KlassClassification(classification_id, language='nb', include_future=False)

Bases: object

Classifications are the main level at which people usually think about “things in KLASS”.

  • they represent “groupings of general, official codelists”.

  • can have many Versions. A Version is the classification placed in time: when the classification is updated with new codes, a new version is created.

  • has Codes, which are actually owned by the Versions (placed in time) but are also available directly under the classification by adding time parameters.

  • has Variants, which are differently grouped aggregations of codelists.

  • can Correspond with other Classifications and their codelists.

  • belongs under a Family, a general statistical group, like “Education”.

Print an initialized Classification object to see extensive information.

To see all the Classification’s Variants (different aggregations of codelists), you first need to get the classification at a specific time (a KlassVersion), for example by calling get_version().
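The workflow described above might look roughly like the following sketch. It relies only on the constructor and methods documented in this section (versions_dict(), get_version() and variants_simple()). The helper name overview and its classification_cls parameter are hypothetical; they exist only so the sketch can be tried with a stub instead of the live API. Called without a stub, it needs the klass package installed and would contact the KLASS API.

```python
def overview(classification_id="36", classification_cls=None):
    """Return (versions, variants) for a classification.

    `overview` and `classification_cls` are illustrative names, not part of klass.
    """
    if classification_cls is None:
        # Imported lazily: this path needs the klass package and network access.
        from klass.classes.classification import KlassClassification
        classification_cls = KlassClassification

    classification = classification_cls(classification_id, language="en")
    versions = classification.versions_dict()   # version IDs mapped to version names
    version = classification.get_version()      # no ID given: first entry in .versions
    return versions, version.variants_simple()  # variant IDs mapped to variant names
```

Passing a stub class with the same three methods makes it possible to exercise the calling code offline.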

versions

A list of the data the Classification has on its versions. Versions represent the changes to the classification’s codelists placed in time.

Type:

list

name

The name of the classification.

Type:

str

classification_id

The ID of the classification.

Type:

str

classificationType

The type of the classification.

Type:

str

lastModified

The last time the classification was modified, as an ISO-formatted datetime string.

Type:

str

description

A longer description of the classification.

Type:

str

primaryLanguage

The primary language of the classification. “nb”, “nn” or “en”.

Type:

str

language

The language chosen at initialization of the classification. “nb”, “nn” or “en”.

Type:

str

copyrighted

Whether the classification is copyrighted.

Type:

bool

includeShortName

If true, indicates that classificationItems may have shortnames.

Type:

bool

includeNotes

If true, indicates that classificationItems may have notes.

Type:

bool

contactPerson

A dictionary containing the contact person of the classification.

Type:

dict

owningSection

The section (part of Statistics Norway) that owns the classification.

Type:

str

statisticalUnits

Statistical units assigned to the classification.

Type:

list

include_future

Whether to include future versions of the classification.

Type:

bool

A dictionary containing the links to different possible endpoints using the classification.

Type:

dict

Parameters:
  • classification_id (str) – The classification_id of the classification. For example: ‘36’

  • language (str) – The language of the classification. “nb”, “nn” or “en”.

  • include_future (bool) – Whether to include future versions of the classification.

Raises:

ValueError – If the language is not “nb”, “nn” or “en”, or if include_future is not a bool.

Get the data for the classification from the API.

get_changes(from_date, to_date='', language='nb', include_future=False)

Return a pandas DataFrame of the changes in the classification at a specific time or in a specific time range.

Unlike get_codes(), this method does not return all codes, only what has changed since the last update or within the time range.

Parameters:
  • from_date (str) – The start date of the time period. “YYYY-MM-DD”.

  • to_date (str) – The end date of the time period. “YYYY-MM-DD”.

  • language (str) – The language of the version. “nn”, “nb” or “en”.

  • include_future (bool) – Whether to include future versions of the version.

Returns:

A pandas DataFrame of the changes in the classification at a specific time (from the last time it changed) or within the specific time range.

Return type:

pd.DataFrame

get_codes(from_date='', to_date='', select_codes='', select_level='', presentation_name_pattern='', language='', include_future=None)

Return a KlassCodes object of the classification at a specific time or in a specific time range.

Parameters:
  • from_date (str) – The start date of the time period. “YYYY-MM-DD”.

  • to_date (str) – The end date of the time period. “YYYY-MM-DD”.

  • select_codes (str) – Limit the result to codes matching this pattern. See rules: https://data.ssb.no/api/klass/v1/api-guide.html#_selectcodes.

  • select_level (str) – The level of the version to keep in the data.

  • presentation_name_pattern (str) – Used to build an alternative presentation name for the codes. See rules: https://data.ssb.no/api/klass/v1/api-guide.html#_presentationnamepattern.

  • language (str) – The language of the version. “nn”, “nb” or “en”.

  • include_future (bool) – Whether to include future versions of the version.

Returns:

A KlassCodes object of the classification at a specific time or in a specific time range.

Return type:

KlassCodes

get_correspondence_to(target_classification_id, from_date, to_date='', language='', include_future=None)

Treats the current classification as a source of correspondences, specifying the target’s ID and a date.

Returns a KlassCorrespondence object of the correspondences.

Parameters:
  • target_classification_id (str) – The classification ID of the target classification.

  • from_date (str) – The start date of the time period. “YYYY-MM-DD”.

  • to_date (str) – The end date of the time period. “YYYY-MM-DD”.

  • language (str) – The language of the correspondences. “nn”, “nb” or “en”.

  • include_future (bool) – Whether to include future correspondences.

Returns:

A KlassCorrespondence object of the correspondences between the current classification and the target classification.

Return type:

KlassCorrespondence

get_variant_by_name(name, from_date, to_date='', select_codes='', select_level='', presentation_name_pattern='', language='nb', include_future=False)

Get a KlassVariant by searching for its name under the Variants owned by the current classification.

In Klass, a Variant is a different way of aggregating an existing codelist. It does not have to be extensive (all filled out), but can, for example, redefine upper levels for some lower-level codes.

Parameters:
  • name (str) – The start of the name of the variant.

  • from_date (str) – The start date of the time period. “YYYY-MM-DD”.

  • to_date (str) – The end date of the time period. “YYYY-MM-DD”.

  • select_codes (str) – Limit the result to codes matching this pattern. See rules: https://data.ssb.no/api/klass/v1/api-guide.html#_selectcodes.

  • select_level (str) – The level of the version to keep in the data.

  • presentation_name_pattern (str) – Used to build an alternative presentation name for the codes. See rules: https://data.ssb.no/api/klass/v1/api-guide.html#_presentationnamepattern.

  • language (str) – The language of the version. “nn”, “nb” or “en”.

  • include_future (bool) – Whether to include future versions of the version.

Returns:

A KlassVariantSearchByName object based on the classification’s ID and searching for the name passed in.

Return type:

KlassVariantSearchByName

get_version(version_id=0, select_level=0, language='', include_future=None)

Return a KlassVersion object of the classification based on ID.

A Version in Klass is a Classification placed in time. If no ID is specified, the first version listed under the .versions attribute of this class is returned.

Parameters:
  • version_id (int) – The version ID of the version.

  • select_level (int) – The level of the version to keep in the data.

  • language (str) – The language of the version. “nn”, “nb” or “en”.

  • include_future (bool) – Whether to include future versions of the version.

Returns:

A KlassVersion object of the specified ID.

Return type:

KlassVersion

versions_dict()

Reformats the versions into a simple dict with just the IDs as keys and names as values.

Returns:

Version IDs as keys, and version names as values.

Return type:

dict
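As a rough illustration of the reshaping versions_dict() describes, the sketch below turns a list of version entries into an ID-to-name mapping. The entry keys version_id and name, the helper name versions_to_dict and the sample values are all assumptions made for the example; the real entries in .versions may be shaped differently.

```python
def versions_to_dict(versions):
    """Map version IDs to version names.

    The 'version_id' and 'name' keys are assumed for illustration only.
    """
    return {entry["version_id"]: entry["name"] for entry in versions}


# Made-up sample entries, not real KLASS data.
sample_versions = [
    {"version_id": 2, "name": "Example classification 2023", "validFrom": "2023-01-01"},
    {"version_id": 1, "name": "Example classification 2000", "validFrom": "2000-01-01"},
]
simple = versions_to_dict(sample_versions)
```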

klass.classes.codes module

class KlassCodes(classification_id='', from_date='', to_date='', select_codes='', select_level='', presentation_name_pattern='', language='nb', include_future=False)

Bases: object

Get codes from Klass.

The codelist is owned by the Classification through a Version, and will be valid for a time period.

data

The pandas DataFrame of the codes.

Type:

pd.DataFrame

classification_id

The classification ID.

Type:

str

from_date

The start date of the time period. “YYYY-MM-DD”.

Type:

str

to_date

The end date of the time period. “YYYY-MM-DD”.

Type:

str

Parameters:
  • classification_id (str) – The classification ID.

  • from_date (str) – The start date of the time period. “YYYY-MM-DD”.

  • to_date (str) – The end date of the time period. “YYYY-MM-DD”.

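  • select_codes (str) – Limit the result to codes matching this pattern. See rules: https://data.ssb.no/api/klass/v1/api-guide.html#_selectcodes.

  • select_level (str) – The level of codes to keep in the data.

  • presentation_name_pattern (str) – Used to build an alternative presentation name for the codes. See rules: https://data.ssb.no/api/klass/v1/api-guide.html#_presentationnamepattern.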

  • language (str) – The language of the code names. Defaults to “nb”.

  • include_future (bool) – Whether to include future codes. Defaults to False.

Raises:
  • ValueError – If from_date or to_date is not a valid date or date-string YYYY-MM-DD.

  • ValueError – If select_codes contains anything except numbers and the special characters “*” (star) or “-” (dash).

  • ValueError – If select_level is anything except a whole number.

  • ValueError – If presentation_name_pattern is not a valid pattern.

  • ValueError – If language is not “nb”, “nn” or “en”.

  • ValueError – If include_future is not a bool.
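The checks listed above could be expressed along the lines of the sketch below. This is not the library's own validation code: the function name check_codes_params is hypothetical, empty strings are treated as "not set", and the presentation_name_pattern check is left out because its rules are only described in the linked API guide.

```python
import re
from datetime import date

ALLOWED_LANGUAGES = ("nb", "nn", "en")


def check_codes_params(from_date="", to_date="", select_codes="",
                       select_level="", language="nb", include_future=False):
    """Raise ValueError for inputs the documentation above describes as invalid."""
    for label, value in (("from_date", from_date), ("to_date", to_date)):
        if not value:
            continue  # empty string means "not set" in this sketch
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
            raise ValueError(f"{label} must look like YYYY-MM-DD, got {value!r}")
        date.fromisoformat(value)  # raises ValueError for impossible dates
    if select_codes and not re.fullmatch(r"[0-9*\-]+", select_codes):
        raise ValueError("select_codes may only contain digits, '*' and '-'")
    if select_level != "" and not str(select_level).isdigit():
        raise ValueError("select_level must be a whole number")
    if language not in ALLOWED_LANGUAGES:
        raise ValueError(f"language must be one of {ALLOWED_LANGUAGES}")
    if not isinstance(include_future, bool):
        raise ValueError("include_future must be a bool")
```

Running such checks before building the request keeps malformed input from reaching the API at all.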

Get the data from the KLASS-api belonging to the code-list.

change_dates(from_date='', to_date='', include_future=None)

Change the dates of the codelist and get the data again based on new dates.

Parameters:
  • from_date (str) – The start date of the time period. “YYYY-MM-DD”.

  • to_date (str) – The end date of the time period. “YYYY-MM-DD”.

  • include_future (bool) – Whether to include future codes.

Returns:

Returns self to make the method more easily chainable.

Return type:

self (KlassCodes)

get_codes(raise_on_empty_data=True)

Retrieve codes from the classification specified by self.classification_id at a specific time.

If self.to_date is not None, codes will be retrieved from the date range specified by self.from_date and self.to_date. Otherwise, codes will be retrieved only for the date specified by self.from_date.

Parameters:

raise_on_empty_data (bool) – Whether to raise an error if the returned dataframe is empty. Defaults to True.

Returns:

Returns self to make the method more easily chainable.

Return type:

self (KlassCodes)

Raises:

ValueError – If the returned dataframe is empty, which usually means the parameters are too narrow.

pivot_level(keep=None)

Pivot levels into separate columns, numbering the columns with the level as a suffix.

Child codes are joined onto their parent codes. For example, instead of “code”, you get “code_1”, “code_2”, etc.

First envisioned by @mfmssb

Parameters:

keep (list[str]) – The start of the names of the columns you want to keep when done. Default is [“code”, “name”], but other possibilities are “presentationName”, “level”, “shortName”, “validTo”, “validFrom”, and “notes”.

Returns:

The resulting pandas DataFrame.

Return type:

pd.DataFrame
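To make the idea of pivoting levels concrete, here is a plain-Python sketch that works on a list of dicts instead of a DataFrame. It assumes each row has code, name, level and parentCode keys; the parentCode key, the function name pivot_levels and the toy rows are assumptions for illustration, and the library's own implementation may work quite differently.

```python
def pivot_levels(rows, keep=("code", "name")):
    """Join each lowest-level code with its ancestors, one column set per level."""
    by_code = {row["code"]: row for row in rows}
    parents = {row["parentCode"] for row in rows if row.get("parentCode")}
    wide = []
    for row in rows:
        if row["code"] in parents:
            continue  # start only from codes that have no children
        flat = {}
        current = row
        while current is not None:
            for column in keep:
                flat[f"{column}_{current['level']}"] = current[column]
            # Walk up to the parent; a missing parent ends the loop.
            current = by_code.get(current.get("parentCode"))
        wide.append(dict(sorted(flat.items())))
    return wide


# Toy rows, not real KLASS data.
toy_rows = [
    {"code": "A", "parentCode": None, "level": "1", "name": "Group A"},
    {"code": "A1", "parentCode": "A", "level": "2", "name": "Item A1"},
    {"code": "A2", "parentCode": "A", "level": "2", "name": "Item A2"},
]
pivoted = pivot_levels(toy_rows)
```

Each resulting row holds one lowest-level code together with its ancestors, so "code_1" carries the top-level code and "code_2" the child.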

to_dict(key='code', value='', other='')

Extract two columns from the data, turning them into a dict.

If you specify a value for “other”, returns a defaultdict instead.

Parameters:
  • key (str) – The name of the column with the values you want as keys.

  • value (str) – The name of the column with the values you want as values in your dict.

  • other (str) – If specified, the value returned for keys that are missing from the dict.

Returns:

The extracted columns as a dict or defaultdict.

Return type:

dict | defaultdict
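The difference between the two return types can be sketched with the standard library alone. The helper name columns_to_dict and the toy rows are hypothetical, and a list of dicts stands in for the DataFrame in .data.

```python
from collections import defaultdict


def columns_to_dict(rows, key="code", value="name", other=""):
    """Build a key -> value mapping; fall back to `other` for unknown keys if given."""
    mapping = {row[key]: row[value] for row in rows}
    if other:
        # A defaultdict returns `other` for any key that is not in the data.
        return defaultdict(lambda: other, mapping)
    return mapping


# Toy rows, not real KLASS data.
toy_codes = [
    {"code": "01", "name": "First"},
    {"code": "02", "name": "Second"},
]
strict = columns_to_dict(toy_codes)
lenient = columns_to_dict(toy_codes, other="Unknown")
```

With the plain dict, looking up a code that is not in the data raises KeyError; with the defaultdict, the same lookup returns the fallback value, which is convenient when mapping a whole column that may contain unexpected codes.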

klass.classes.correspondence module

class KlassCorrespondence(correspondence_id='', source_classification_id='', target_classification_id='', from_date='', to_date='', contain_quarter=0, language='nb', include_future=False)

Bases: object

Correspondences in Klass exist between two classifications at a specific time (hence actually between Versions).

They are used to translate data between two classifications. For example, from geographical municipality up to county level.

You can identify a correspondence by its individual ID, or by the source classification ID + the target classification ID + a specific time.

data

The pandas DataFrame of the correspondences.

Type:

pd.DataFrame

correspondence

The list of the correspondences returned by the API.

Type:

list

correspondence_id

The ID of the correspondence.

Type:

str

source_classification_id

The ID of the source classification.

Type:

str

target_classification_id

The ID of the target classification.

Type:

str

from_date

The start date of the correspondence.

Type:

str

to_date

The end date of the correspondence.

Type:

str, optional

contain_quarter

The number of quarters the correspondence should contain. This replaces the to_date during initialization.

Type:

int

language

The language of the correspondence. “nb”, “nn” or “en”.

Type:

str

include_future

If the correspondence should include future correspondences.

Type:

bool

Parameters:
  • correspondence_id (str) – The ID of the correspondence.

  • source_classification_id (str) – The ID of the source classification.

  • target_classification_id (str) – The ID of the target classification.

  • from_date (str) – The start date of the correspondence.

  • to_date (str, optional) – The end date of the correspondence.

  • contain_quarter (int) – The number of quarters the correspondence should contain. This replaces the to_date during initialization.

  • language (str) – The language of the correspondence. “nb”, “nn” or “en”.

  • include_future (bool) – If the correspondence should include future correspondences.

Get the correspondence-data from the API.

get_correspondence()

Run as the last part of initialization; this is what actually sets the data from the API as attributes.

If you change some attributes afterwards, run this again to update the data of the correspondence.

Gets and reshapes correspondences based on the attributes on the object.

Returns:

Returns self to make the method more easily chainable.

Return type:

self (KlassCorrespondence)

Raises:

ValueError – If you fill out the wrong combination of correspondence_id, source_classification_id, target_classification_id and from_date, so that a correct query to the API cannot be made.
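The two ways of identifying a correspondence (by its own ID, or by source ID + target ID + a date) suggest a check like the sketch below. The function name correspondence_query_kind and its return labels are hypothetical, and preferring the ID when everything is filled in is an assumption of this sketch, not documented behaviour.

```python
def correspondence_query_kind(correspondence_id="", source_classification_id="",
                              target_classification_id="", from_date=""):
    """Decide which kind of lookup the given arguments allow, or raise ValueError."""
    if correspondence_id:
        return "by_id"  # assumption: an explicit ID takes precedence
    if source_classification_id and target_classification_id and from_date:
        return "by_source_target_date"
    raise ValueError(
        "Give either correspondence_id, or source_classification_id, "
        "target_classification_id and from_date together."
    )
```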

to_dict(key='sourceCode', value='targetCode', other='')

Extract two columns from the data, turning them into a dict.

If you specify a value for “other”, returns a defaultdict instead.

Columns in the data are ‘sourceCode’, ‘sourceName’, ‘sourceShortName’, ‘targetCode’, ‘targetName’, ‘targetShortName’, ‘validFrom’, ‘validTo’.

Parameters:
  • key (str) – The name of the column with the values you want as keys.

  • value (str) – The name of the column with the values you want as values in your dict.

  • other (str) – The value to use for keys that don’t exist in the data.

Returns:

The dictionary of the correspondence.

Return type:

dict | defaultdict

klass.classes.family module

class KlassFamily(family_id)

Bases: object

Families represent “general statistical areas” like “Education”.

Families in Klass “own” / “have” several classifications. Families are owned by sections (a part of Statistics Norway that is responsible for the family).

classifications

A list of classifications in the family.

Type:

list

family_id

The ID of the family.

Type:

str

name

The name of the family.

Type:

str

A dictionary of API links referencing itself.

Type:

dict

Parameters:

family_id (str) – The ID of the family.

Get the family data from the klass-api, setting it as attributes on the object.

get_classification(classification_id='')

Get a classification from the family.

Parameters:

classification_id (str) – The ID of the classification. If not given, the first classification in the family is returned based on its ID.

Returns:

The classification.

Return type:

KlassClassification

klass.classes.search module

class KlassSearchClassifications(query='', include_codelists=True, ssbsection='', no_dupes=False)

Bases: object

Use to search for classifications.

classifications

A list of KlassClassification objects.

Type:

list

query

The search query.

Type:

str

include_codelists

Whether to include codelists in the search results.

Type:

bool

ssbsection

The SSB section that owns the classification you are searching for.

Type:

str

no_dupes

Whether to remove duplicates from the search results.

Type:

bool

Parameters:
  • query (str) – The search query.

  • include_codelists (bool) – Whether to include codelists in the search results.

  • ssbsection (str) – The SSB section that owns the classification you are searching for.

  • no_dupes (bool) – Whether to remove duplicates from the search results. (Usually caused by languages showing up multiple times)

Get data from the KLASS-api, setting it as attributes on this object.

static get_classification(classification_id, language='nb', include_future=False)

Get a Classification from the search object.

Parameters:
  • classification_id (str) – The classification ID to get.

  • language (str) – The language to get the classification in. “nb” for Norwegian Bokmål (default), “nn” for Nynorsk, “en” for English.

  • include_future (bool) – Whether to include future codelists.

Returns:

The classification object.

Return type:

KlassClassification

Called during init; actually gets the data from the KLASS-API.

Return type:

None

simple_search_result()

Reformat the resulting search into a simple string.

Returns:

The resulting reformatted string from the search results.

Return type:

str

class KlassSearchFamilies(ssbsection='', include_codelists=False, language='nb')

Bases: object

Search for families in the Klass API.

Parameters:
  • ssbsection (str) – The SSB section that owns the family you are searching for.

  • include_codelists (bool) – Whether to include codelists in the search.

  • language (str) – The language to use in the search. “nb” for Norwegian Bokmål (default), “nn” for Nynorsk, “en” for English.

Get data from the KLASS-api, setting it as attributes on this object.

get_family(family_id='0')

Return a KlassFamily object of the family with the given ID.

If no ID is given, chooses the first Family returned by the search.

Parameters:

family_id (str) – The family ID to get.

Returns:

The family object.

Return type:

KlassFamily

Get the search result from the API and reformat it into the .families and .links attributes.

This should be run after any change to the .ssbsection, .include_codelists, or .language attributes.

Returns:

Returns self to make the method more easily chainable.

Return type:

self (KlassSearchFamilies)

simple_search_result()

Reformat the resulting search into a simple string.

Returns:

The resulting reformatted string from the search results.

Return type:

str

klass.classes.variant module

class KlassVariant(variant_id, select_level=0, language='nb')

Bases: object

In Klass a Variant is a different way of aggregating an existing codelist.

It does not have to be extensive (all filled out), but can, for example, redefine upper levels for some lower-level codes.

For example: “Study points for vocational education programmes” is a Variant (ID 1959) of the Classification of Education (NUS, ID 36). It sets a new upper level of codes (amount of study points) for a set of lower-level existing codes (NUS codes, level 5).

data

The classificationItems as a pandas dataframe. Usually what you’re after?

Type:

pd.DataFrame

variant_id

The variant_id of the variant. For example: ‘1959’.

Type:

str

name

The name of the variant.

Type:

str

contactPerson

The contact person of the variant.

Type:

dict

owningSection

The owning section of the variant.

Type:

str

lastModified

Stringified iso-datetime for last modification.

Type:

str

published

Languages that the variant is published in.

Type:

list[str]

validFrom

Date-string from when the variant is valid.

Type:

str

validTo

Date-string to when the variant is valid.

Type:

str, optional

introduction

A longer description of the variant.

Type:

str

correspondenceTables

The correspondence tables of the variant.

Type:

list

changelogs

The changelogs of the variant.

Type:

list

levels

The levels contained in the codelist (items).

Type:

list[dict]

classificationItems

The codelist-elements of the variant.

Type:

list[dict]

select_level

The level of the dataset to keep. For example: 0.

Type:

int

language

The language of the variant to select. For example: ‘nb’.

Type:

str

The links returned from the API.

Type:

dict

Parameters:
  • variant_id (str) – The variant_id of the variant. For example: ‘1959’.

  • select_level (int) – The level of the dataset to keep. For example: 5.

  • language (str) – The language of the variant to select. For example: ‘nb’.

Get the data from the KLASS-api to populate this object’s attributes.

get_classification_codes(select_level=0)

Get the data from the API, setting it as attributes on the object.

The codes are put into the .data attribute. Other keys are added dynamically to the object, like classificationItems.

Parameters:

select_level (int) – The level of the dataset to keep. For example: 0.

Return type:

None

class KlassVariantSearchByName(classification_id, variant_name, from_date, to_date='', select_codes='', select_level='', presentation_name_pattern='', language='nb', include_future=False)

Bases: object

Look up a Variant based on the owning Classification’s ID and the start of the Variant’s name.

The name is put into a URL parameter, so it might be sensitive to special characters. If the name you are trying isn’t working, try keeping less of it, but keep the start of the name.

There might be a bug (2023), where you can get duplicate rows from the API on this, so if you use this class, make sure to check for duplicates before moving on.
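A duplicate check of the kind suggested above could look like the following sketch. It works on a list of dicts as a stand-in for the rows in .data; on the actual pandas DataFrame, the built-in DataFrame.duplicated() and DataFrame.drop_duplicates() methods serve the same purpose. The helper name drop_duplicate_rows and the toy rows are hypothetical.

```python
def drop_duplicate_rows(rows):
    """Return rows with exact duplicates removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        # Sorting by key makes the fingerprint independent of key order.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


# Toy rows, not real KLASS data.
toy_variant_rows = [
    {"code": "A1", "name": "Item A1"},
    {"code": "A1", "name": "Item A1"},
    {"code": "A2", "name": "Item A2"},
]
deduplicated = drop_duplicate_rows(toy_variant_rows)
had_duplicates = len(deduplicated) != len(toy_variant_rows)
```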

In Klass a Variant is a different way of aggregating an existing codelist. It does not have to be extensive (all filled out), but can, for example, redefine upper levels, for some lower-level codes.

data

The codelists from the Variant as a pandas dataframe. Usually what you’re after?

Type:

pd.DataFrame

classification_id

The classification ID.

Type:

str

variant_name

The start of the variant name.

Type:

str

from_date

The start of the date range.

Type:

str

to_date

The end of the date range.

Type:

str

select_codes

Limit the result to codes matching this pattern. See rules: https://data.ssb.no/api/klass/v1/api-guide.html#_selectcodes

Type:

str

select_level

The level of codes to keep in the dataset.

Type:

str

presentation_name_pattern

Used to build an alternative presentation name for the codes. See rules: https://data.ssb.no/api/klass/v1/api-guide.html#_presentationnamepattern

Type:

str

language

Language of the names, select “en”, “nb” or “nn”.

Type:

str

include_future

Whether to include future codes. Defaults to False.

Type:

bool

Parameters:
  • classification_id (str) – The classification ID.

  • variant_name (str) – The start of the variant name.

  • from_date (str) – The start of the date range.

  • to_date (str) – The end of the date range.

  • select_codes (str) – Limit the result to codes matching this pattern. See rules: https://data.ssb.no/api/klass/v1/api-guide.html#_selectcodes

  • select_level (str) – The level of codes to keep in the dataset.

  • presentation_name_pattern (str) – Used to build an alternative presentation name for the codes. See rules: https://data.ssb.no/api/klass/v1/api-guide.html#_presentationnamepattern

  • language (str) – Language of the names, select “en”, “nb” or “nn”.

  • include_future (bool) – Whether to include future codes. Defaults to False.

Get the data from the KLASS-api, setting it as attributes on the object.

get_variant()

Actually get the data from the API, called at the end of init.

Return type:

None

klass.classes.version module

class KlassVersion(version_id, select_level=0, language='nb', include_future=False)

Bases: object

A version of a classification is set in time.

For example, the ID of NUS valid in 2023 is 1954, while the ID of NUS without being time-specific is 36.

data

The codelist of the classification-version as a pandas dataframe.

Type:

pd.DataFrame

name

The name of the version.

Type:

str

validFrom

The date the version is valid from.

Type:

str

validTo

The date the version is valid to (if any).

Type:

str

lastModified

The date the version was last modified.

Type:

str

published

A list of languages that the version is published in.

Type:

list

introduction

A longer description of the version.

Type:

str

contactPerson

A dictionary of the contact person of the version.

Type:

dict

owningSection

The name of the section that owns the version.

Type:

str

legalBase

The basis in law for the classification.

Type:

str

publications

Where the classification is published (URL).

Type:

str

derivedFrom

Notes on where the classification was derived from.

Type:

str

correspondenceTables

A list of correspondence-tables of the version.

Type:

list

Parameters:
  • version_id (str) – The ID of the version.

  • select_level (int, optional) – The level in the codelist-data to keep. Defaults to 0.

  • language (str, optional) – The language of the version. Defaults to “nb”, can be set to “en”, or “nn”.

  • include_future (bool, optional) – If the version should include future versions. Defaults to False.

Set up the object with data from the KLASS-API.

correspondences_simple()

Get a simple dictionary of the correspondences.

With the IDs as keys.

Returns:

A nested dictionary of the available correspondences.

Return type:

dict

get_classification_codes(select_level=0)

Get the codelists of the version. Inserts the result into the KlassVersion’s .data attribute, instead of returning it.

Run as a part of the class initialization.

Parameters:

select_level (int) – The level of the version to keep in the data. Setting to 0 keeps all levels.

Returns:

Returns self to make the method more easily chainable.

Return type:

self (KlassVersion)

static get_correspondence(correspondence_id='', source_classification_id='', target_classification_id='', from_date='', to_date='', contain_quarter=0, language='nb', include_future=False)

Get a specific correspondence.

Parameters:
  • correspondence_id (str) – The ID of the correspondence.

  • source_classification_id (str) – The ID of the source classification.

  • target_classification_id (str) – The ID of the target classification.

  • from_date (str) – The start date of the correspondence.

  • to_date (str) – The end date of the correspondence.

  • contain_quarter (int) – The number of quarters the correspondence should contain.

  • language (str) – The language of the correspondence. “nb”, “nn” or “en”.

  • include_future (bool) – If the correspondence should include future correspondences.

Returns:

A correspondence object with the specified ID, language, and dates.

Return type:

KlassCorrespondence

static get_variant(variant_id, select_level=0, language='nb')

Get a specific variant.

Parameters:
  • variant_id (str) – The ID of the variant.

  • select_level (int) – The level of the variant to keep in the data. Setting to 0 keeps all levels.

  • language (str) – The language of the variant.

Returns:

A variant object with the specified ID and language.

Return type:

KlassVariant

variants_simple()

Get a simplified dictionary of the variants, with IDs as keys and names as values.

Return type:

dict[str, str]