Reference

ssb_poc_statlog_model package

ssb_poc_statlog_model.change_data_log module

class ChangeDataLog(**data)

Bases: StatlogBaseModel

Data model for a data change log in a statistical production process.

Parameters:
  • schema_version (Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')])

  • statistics_name (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')])

  • data_source (Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more input datasets used as data source before changing data.')])

  • data_target (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Target dataset filepath (eg. GCS-path to a parquet file).')])

  • data_period (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Data period for changed data - eg. a year, quarter, month, day (date), ...')])

  • change_event (Annotated[ChangeEvent, FieldInfo(annotation=NoneType, required=True, description='How the event was triggered: Automatically changed (A), Manually changed (M), Manually approved with no change (MNC), Not reviewed (NOT).')])

  • change_event_reason (Annotated[ChangeEventReason | None, FieldInfo(annotation=NoneType, required=True, description='Reason for change or approval: Other source (OTHER_SOURCE), Statistical review (REVIEW), Information from the data provider/registry owner (OWNER), Small/marginal unit (MARGINAL_UNIT), Data duplicate (DUPLICATE), Other reason (OTHER).')])

  • change_datetime (Annotated[AwareDatetime, FieldInfo(annotation=NoneType, required=True, description='Timestamp (date and time, ISO 8601) of an event or change')])

  • changed_by (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='If manually (M): user name of the person who triggered an event; if automatically (A) name of method, function and/or process.')])

  • data_change_type (Annotated[DataChangeType | None, FieldInfo(annotation=NoneType, required=True, description='Data change type: Updated value (UPD), created new unit/row (NEW), or deleted unit/row (DEL).')])

  • change_comment (Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Change comment')])

  • change_details (Annotated[ChangeDetails | ChangeDetails1, FieldInfo(annotation=NoneType, required=True, description='Detailed information about the change. Either a unit-id, old and new value if one row (unit) was affected, or number of rows affected if the process changed multiple rows (units).', discriminator='detail_type')])

change_comment: Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Change comment')]
change_datetime: Annotated[AwareDatetime, FieldInfo(annotation=NoneType, required=True, description='Timestamp (date and time, ISO 8601) of an event or change')]
change_details: Annotated[ChangeDetails | ChangeDetails1, FieldInfo(annotation=NoneType, required=True, description='Detailed information about the change. Either a unit-id, old and new value if one row (unit) was affected, or number of rows affected if the process changed multiple rows (units).', discriminator='detail_type')]
change_event: Annotated[ChangeEvent, FieldInfo(annotation=NoneType, required=True, description='How the event was triggered: Automatically changed (A), Manually changed (M), Manually approved with no change (MNC), Not reviewed (NOT).')]
change_event_reason: Annotated[ChangeEventReason | None, FieldInfo(annotation=NoneType, required=True, description='Reason for change or approval: Other source (OTHER_SOURCE), Statistical review (REVIEW), Information from the data provider/registry owner (OWNER), Small/marginal unit (MARGINAL_UNIT), Data duplicate (DUPLICATE), Other reason (OTHER).')]
changed_by: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='If manually (M): user name of the person who triggered an event; if automatically (A) name of method, function and/or process.')]
data_change_type: Annotated[DataChangeType | None, FieldInfo(annotation=NoneType, required=True, description='Data change type: Updated value (UPD), created new unit/row (NEW), or deleted unit/row (DEL).')]
data_period: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Data period for changed data - eg. a year, quarter, month, day (date), ...')]
data_source: Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more input datasets used as data source before changing data.')]
data_target: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Target dataset filepath (eg. GCS-path to a parquet file).')]
model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

schema_version: Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')]
statistics_name: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')]
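
The field list above can be illustrated with a plain dictionary. This is only a sketch: the statistics name, bucket paths and the `impute_income` process name are hypothetical placeholders, and the helper `build_change_log_payload` is not part of the package. Every documented field is supplied, including the nullable ones, because all of them are marked `required=True`.

```python
import json
from datetime import datetime, timezone


def build_change_log_payload(rows_affected: int, changed_at: datetime) -> dict:
    """Build a dict whose keys mirror the ChangeDataLog fields listed above."""
    return {
        "schema_version": "2.0.0",
        "statistics_name": "example_stat",  # hypothetical shortname
        "data_source": ["gs://example-bucket/inndata/persons_p2024.parquet"],
        "data_target": "gs://example-bucket/klargjorte-data/persons_p2024.parquet",
        "data_period": "2024",
        "change_event": "A",  # automatically changed
        "change_event_reason": "REVIEW",
        # AwareDatetime needs a timezone, so the timestamp carries an offset.
        "change_datetime": changed_at.isoformat(),
        "changed_by": "impute_income",  # hypothetical process name
        "data_change_type": "UPD",
        "change_comment": None,  # nullable, but still has to be present
        "change_details": {
            "detail_type": "rows",  # selects the multi-row variant
            "rows_affected": rows_affected,
            "variable_name": "income",
        },
    }


payload = build_change_log_payload(
    rows_affected=12,
    changed_at=datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
)
payload_json = json.dumps(payload)
```

Since the classes are pydantic models, a dictionary like this could presumably be validated with `ChangeDataLog.model_validate(payload)`.
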
class ChangeDetails(**data)

Bases: StatlogBaseModel

Detailed information about the change. Either a unit-id, old and new value if one row (unit) was affected, or number of rows affected if the process changed multiple rows (units).

Parameters:
  • detail_type (Annotated[Literal['rows'], FieldInfo(annotation=NoneType, required=True, description='Discriminator for change_details variant')])

  • rows_affected (Annotated[int, FieldInfo(annotation=NoneType, required=True, description='Number of rows affected if the process changed multiple rows (units).')])

  • variable_name (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='The variable name (or element-path) that contains data changes.')])

detail_type: Annotated[Literal['rows'], FieldInfo(annotation=NoneType, required=True, description='Discriminator for change_details variant')]
model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

rows_affected: Annotated[int, FieldInfo(annotation=NoneType, required=True, description='Number of rows affected if the process changed multiple rows (units).')]
variable_name: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='The variable name (or element-path) that contains data changes.')]
class ChangeDetails1(**data)

Bases: StatlogBaseModel

Detailed information about the change. Either a unit-id, old and new value if one row (unit) was affected, or number of rows affected if the process changed multiple rows (units).

Parameters:
  • detail_type (Annotated[Literal['unit'], FieldInfo(annotation=NoneType, required=True, description='Discriminator for change_details variant')])

  • unit_id (Annotated[list[UnitIdItem], FieldInfo(annotation=NoneType, required=True, description="One or more unit-identifier variables and values (primary key) if one row (unit) was affected, eg. 'fnr'='311280nnnnn' and 'orgnr'='123456789'.")])

  • old_value (Annotated[list[OldValueItem] | None, FieldInfo(annotation=NoneType, required=True, description="Old value(s). If delete (data_change_type = DEL) - log all deleted variable-values from the deleted data row/record, eg. 'income'='1000', 'address'='Street 123', ...")])

  • new_value (Annotated[list[NewValueItem] | None, FieldInfo(annotation=NoneType, required=True, description="New value(s). If insert (data_change_type = NEW) - log all inserted variable-values in the data row/record, eg. 'income'='1000', 'address'='Street 123', ...")])

detail_type: Annotated[Literal['unit'], FieldInfo(annotation=NoneType, required=True, description='Discriminator for change_details variant')]
model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

new_value: Annotated[list[NewValueItem] | None, FieldInfo(annotation=NoneType, required=True, description="New value(s). If insert (data_change_type = NEW) - log all inserted variable-values in the data row/record, eg. 'income'='1000', 'address'='Street 123', ...")]
old_value: Annotated[list[OldValueItem] | None, FieldInfo(annotation=NoneType, required=True, description="Old value(s). If delete (data_change_type = DEL) - log all deleted variable-values from the deleted data row/record, eg. 'income'='1000', 'address'='Street 123', ...")]
unit_id: Annotated[list[UnitIdItem], FieldInfo(annotation=NoneType, required=True, description="One or more unit-identifier variables and values (primary key) if one row (unit) was affected, eg. 'fnr'='311280nnnnn' and 'orgnr'='123456789'.")]
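
The two variants are told apart by `detail_type` ('rows' for ChangeDetails, 'unit' for ChangeDetails1), and both set `extra` to 'forbid'. The following stand-alone sketch mimics that selection with plain sets. It is an illustration of the documented behaviour, not the way pydantic implements it, and `change_details_variant` is a hypothetical helper.

```python
ROWS_FIELDS = {"detail_type", "rows_affected", "variable_name"}
UNIT_FIELDS = {"detail_type", "unit_id", "old_value", "new_value"}

# Discriminator value -> (documented class name, fields that class declares)
VARIANTS = {
    "rows": ("ChangeDetails", ROWS_FIELDS),
    "unit": ("ChangeDetails1", UNIT_FIELDS),
}


def change_details_variant(details: dict) -> str:
    """Return the class name that the detail_type discriminator would select."""
    tag = details.get("detail_type")
    if tag not in VARIANTS:
        raise ValueError(f"unknown detail_type: {tag!r}")
    class_name, fields = VARIANTS[tag]
    given = set(details)
    missing = fields - given
    unexpected = given - fields  # rejected because extra fields are forbidden
    if missing or unexpected:
        raise ValueError(
            f"{class_name}: missing={sorted(missing)}, unexpected={sorted(unexpected)}"
        )
    return class_name
```
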
class ChangeEvent(*values)

Bases: StrEnum

How the event was triggered: Automatically changed (A), Manually changed (M), Manually approved with no change (MNC), Not reviewed (NOT).

A = 'A'
M = 'M'
MNC = 'MNC'
NOT = 'NOT'
class ChangeEventReason(*values)

Bases: StrEnum

Reason for change or approval: Other source (OTHER_SOURCE), Statistical review (REVIEW), Information from the data provider/registry owner (OWNER), Small/marginal unit (MARGINAL_UNIT), Data duplicate (DUPLICATE), Other reason (OTHER).

DUPLICATE = 'DUPLICATE'
MARGINAL_UNIT = 'MARGINAL_UNIT'
OTHER = 'OTHER'
OTHER_SOURCE = 'OTHER_SOURCE'
OWNER = 'OWNER'
REVIEW = 'REVIEW'
class DataChangeType(*values)

Bases: StrEnum

Data change type: Updated value (UPD), created new unit/row (NEW), or deleted unit/row (DEL).

DEL = 'DEL'
NEW = 'NEW'
UPD = 'UPD'
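
The enums in this module are string enums, so each member compares equal to its code. The sketch below uses stand-in classes with a `str` mixin so that it also runs on interpreters without `enum.StrEnum`. The class names ending in `Sketch` and the helper `parse_code` are hypothetical.

```python
from enum import Enum


class ChangeEventSketch(str, Enum):
    """Stand-in for ChangeEvent with the documented members."""
    A = "A"
    M = "M"
    MNC = "MNC"
    NOT = "NOT"


class DataChangeTypeSketch(str, Enum):
    """Stand-in for DataChangeType with the documented members."""
    DEL = "DEL"
    NEW = "NEW"
    UPD = "UPD"


def parse_code(enum_cls, raw: str):
    """Look up a member by its code; return None when the code is not defined."""
    try:
        return enum_cls(raw)
    except ValueError:
        return None


approved = parse_code(ChangeEventSketch, "MNC")
unknown = parse_code(DataChangeTypeSketch, "INS")  # only DEL, NEW and UPD exist
```
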
class NewValueItem(**data)

Bases: StatlogBaseModel

Parameters:
  • variable_name (str)

  • value (str)

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

value: str
variable_name: str
class OldValueItem(**data)

Bases: StatlogBaseModel

Parameters:
  • variable_name (str)

  • value (str)

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

value: str
variable_name: str
class UnitIdItem(**data)

Bases: StatlogBaseModel

Parameters:
  • unit_id_variable (Annotated[str, FieldInfo(annotation=NoneType, required=True, description="The unit-id variable name, e.g. 'fnr', 'pers_id', 'reg_nr' or 'komm_nr'.")])

  • unit_id_value (Annotated[str, FieldInfo(annotation=NoneType, required=True, description="The unit-id value, e.g. '311280nnnnn' or '123456789'.")])

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

unit_id_value: Annotated[str, FieldInfo(annotation=NoneType, required=True, description="The unit-id value, e.g. '311280nnnnn' or '123456789'.")]
unit_id_variable: Annotated[str, FieldInfo(annotation=NoneType, required=True, description="The unit-id variable name, e.g. 'fnr', 'pers_id', 'reg_nr' or 'komm_nr'.")]
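
As a sketch of how UnitIdItem, OldValueItem and NewValueItem fit together in the 'unit' variant of change_details, the helper below turns simple mappings into the documented list-of-items shape. The function names are hypothetical, and the values are converted to strings because the item fields are typed `str`.

```python
from typing import Optional


def _value_items(values: Optional[dict]) -> Optional[list]:
    """Convert {'income': 1000} into [{'variable_name': 'income', 'value': '1000'}]."""
    if values is None:
        return None
    return [{"variable_name": name, "value": str(value)} for name, value in values.items()]


def unit_change_details(unit_ids: dict, old_values: Optional[dict], new_values: Optional[dict]) -> dict:
    """Build a dict shaped like the 'unit' variant of change_details."""
    return {
        "detail_type": "unit",
        "unit_id": [
            {"unit_id_variable": name, "unit_id_value": str(value)}
            for name, value in unit_ids.items()
        ],
        "old_value": _value_items(old_values),
        "new_value": _value_items(new_values),
    }


# An updated value: both old and new are logged.
updated = unit_change_details({"orgnr": "123456789"}, {"income": 1000}, {"income": 1200})
# A deleted row: the old values are logged and there is no new value.
deleted = unit_change_details({"orgnr": "123456789"}, {"income": 1000}, None)
```
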

ssb_poc_statlog_model.generate_python module

ssb_poc_statlog_model.linage module

class Linage(**data)

Bases: StatlogBaseModel

Data model for data lineage in a statistical production process.

Parameters:
  • schema_version (Annotated[Literal['1.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')])

  • data_source (Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more input datasets used as data source.')])

  • data_target (Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more target datasets (where data is stored after the process has been run).')])

  • step (Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='A string describing at which step in the process the log is for.')])

  • file_hash (Annotated[list[str] | None, FieldInfo(annotation=NoneType, required=True, description='List of SHA-256 hashes, matching the files in the data_source field.')])

data_source: Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more input datasets used as data source.')]
data_target: Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more target datasets (where data is stored after the process has been run).')]
file_hash: Annotated[list[str] | None, FieldInfo(annotation=NoneType, required=True, description='List of SHA-256 hashes, matching the files in the data_source field.')]
model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

schema_version: Annotated[Literal['1.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')]
step: Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='A string describing at which step in the process the log is for.')]
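
The file_hash field holds one SHA-256 hash per source file. A minimal sketch of producing such a list with the standard library follows. It assumes the sources are local files readable with `open`; paths in cloud storage would need a different reader. Both helper names are hypothetical.

```python
import hashlib
from typing import Optional


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_lineage_payload(sources: list, targets: list, step: Optional[str] = None) -> dict:
    """Build a dict whose keys mirror the Linage fields listed above."""
    return {
        "schema_version": "1.0.0",
        "data_source": [str(path) for path in sources],
        "data_target": [str(path) for path in targets],
        "step": step,
        # Same order as data_source, so hash i belongs to source i.
        "file_hash": [sha256_of_file(str(path)) for path in sources],
    }
```
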

ssb_poc_statlog_model.quality_control_description module

class QualityControlDescription(**data)

Bases: StatlogBaseModel

Model describing the quality controls used in statistical production.

Parameters:
  • schema_version (Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')])

  • quality_control_id (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='A unique quality control ID')])

  • quality_control_description (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Quality control description')])

  • quality_control_type (Annotated[QualityControlType, FieldInfo(annotation=NoneType, required=True, description='Quality control type: hard (H), soft (S), informative (I).')])

  • variables (Annotated[list[Variable], FieldInfo(annotation=NoneType, required=True, description='A description of which variables must be included in the quality control.')])

model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

quality_control_description: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Quality control description')]
quality_control_id: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='A unique quality control ID')]
quality_control_type: Annotated[QualityControlType, FieldInfo(annotation=NoneType, required=True, description='Quality control type: hard (H), soft (S), informative (I).')]
schema_version: Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')]
variables: Annotated[list[Variable], FieldInfo(annotation=NoneType, required=True, description='A description of which variables must be included in the quality control.')]
class QualityControlType(*values)

Bases: StrEnum

Quality control type: hard (H), soft (S), informative (I).

H = 'H'
I = 'I'
S = 'S'
class Variable(**data)

Bases: StatlogBaseModel

Parameters:
  • variable_description (str | None)

model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

variable_description: str | None
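
A quality control description combines an ID, a text, one of the type codes H, S or I, and a list of Variable entries. The sketch below builds such a structure as a plain dictionary. The ID `QC-001`, the description text and the helper name are hypothetical.

```python
QUALITY_CONTROL_TYPES = {"H": "hard", "S": "soft", "I": "informative"}


def build_control_description(control_id: str, text: str, control_type: str, variables: list) -> dict:
    """Build a dict whose keys mirror the QualityControlDescription fields."""
    if control_type not in QUALITY_CONTROL_TYPES:
        raise ValueError(f"quality_control_type must be one of {sorted(QUALITY_CONTROL_TYPES)}")
    return {
        "schema_version": "2.0.0",
        "quality_control_id": control_id,
        "quality_control_description": text,
        "quality_control_type": control_type,
        # Each Variable entry carries a single variable_description field.
        "variables": [{"variable_description": description} for description in variables],
    }


description_payload = build_control_description(
    "QC-001", "Income must not be negative.", "H", ["income"]
)
```
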

ssb_poc_statlog_model.quality_control_result module

class QualityControlResult(**data)

Bases: StatlogBaseModel

Schema for statistics quality control result.

Parameters:
  • schema_version (Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')])

  • statistics_name (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')])

  • quality_control_id (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='A reference (or link/uri) to the quality control description')])

  • data_location (Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Controlled dataset reference/filepath (eg. GCS-path to a parquet file) or other dataset reference (eg. ref. to a CloudSQL database table).')])

  • data_period (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Data period controlled - eg. year, date, date-time, ...')])

  • quality_control_datetime (Annotated[AwareDatetime, FieldInfo(annotation=NoneType, required=True, description='Quality control datetime (date and time, ISO 8601)')])

  • quality_control_results (Annotated[QualityControlResults, FieldInfo(annotation=NoneType, required=True, description='Quality control result: quality ok (0), quality issues detected (1), missing value detected (2).')])

  • quality_result_comment (Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Quality control result comment.')])

  • quality_control_run_exception (Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Exception description. An error or warning occurred when executing the quality control routine.')])

data_location: Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Controlled dataset reference/filepath (eg. GCS-path to a parquet file) or other dataset reference (eg. ref. to a CloudSQL database table).')]
data_period: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Data period controlled - eg. year, date, date-time, ...')]
model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

quality_control_datetime: Annotated[AwareDatetime, FieldInfo(annotation=NoneType, required=True, description='Quality control datetime (date and time, ISO 8601)')]
quality_control_id: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='A reference (or link/uri) to the quality control description')]
quality_control_results: Annotated[QualityControlResults, FieldInfo(annotation=NoneType, required=True, description='Quality control result: quality ok (0), quality issues detected (1), missing value detected (2).')]
quality_control_run_exception: Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Exception description. An error or warning occurred when executing the quality control routine.')]
quality_result_comment: Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Quality control result comment.')]
schema_version: Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')]
statistics_name: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')]
class QualityControlResults(*values)

Bases: StrEnum

Quality control result: quality ok (0), quality issues detected (1), missing value detected (2).

field_0 = '0'
field_1 = '1'
field_2 = '2'
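
Note that the result codes are the strings '0', '1' and '2', not integers, and that quality_control_datetime is an AwareDatetime and therefore needs a timezone. The sketch below checks both before building a result dictionary. The statistics name, path and helper name are hypothetical placeholders.

```python
from datetime import datetime, timezone
from typing import Optional

RESULT_LABELS = {
    "0": "quality ok",
    "1": "quality issues detected",
    "2": "missing value detected",
}


def build_result_payload(control_id: str, result_code: str, checked_at: datetime,
                         comment: Optional[str] = None) -> dict:
    """Build a dict whose keys mirror the QualityControlResult fields."""
    if result_code not in RESULT_LABELS:
        raise ValueError("result code must be the string '0', '1' or '2'")
    if checked_at.tzinfo is None or checked_at.utcoffset() is None:
        raise ValueError("quality_control_datetime must be timezone-aware")
    return {
        "schema_version": "2.0.0",
        "statistics_name": "example_stat",  # hypothetical shortname
        "quality_control_id": control_id,
        "data_location": ["gs://example-bucket/inndata/persons_p2024.parquet"],
        "data_period": "2024",
        "quality_control_datetime": checked_at.isoformat(),
        "quality_control_results": result_code,
        "quality_result_comment": comment,
        "quality_control_run_exception": None,
    }


result_payload = build_result_payload(
    "QC-001", "1", datetime(2024, 5, 1, 9, 30, tzinfo=timezone.utc),
    comment="Negative income found",
)
```
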

ssb_poc_statlog_model.release module

class Release(**data)

Bases: StatlogBaseModel

Data model for a release in a statistical production process.

Parameters:
  • schema_version (Annotated[Literal['1.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')])

  • dapla_team (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Name of the dapla team that produced the release.')])

  • statistics_name (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')])

  • git_tag (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Git tag for the release.')])

  • git_commit_hash (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Git commit hash for the release.')])

  • data_source (Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more datasets in the final data state (utdata or klargjorte-data) used as data source for the release.')])

  • daplalab_image (Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='The tag of the Daplalab container image that produced the release.')])

dapla_team: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Name of the dapla team that produced the release.')]
daplalab_image: Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='The tag of the Daplalab container image that produced the release.')]
data_source: Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more datasets in the final data state (utdata or klargjorte-data) used as data source for the release.')]
git_commit_hash: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Git commit hash for the release.')]
git_tag: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Git tag for the release.')]
model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

schema_version: Annotated[Literal['1.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')]
statistics_name: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')]
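
A release record ties a statistics product to a git tag, a commit hash and the datasets it was built from. The sketch below assembles such a record and serialises it as JSON. The team name, tag, commit hash and path are hypothetical placeholders, as is the helper name.

```python
import json
from typing import Optional


def build_release_payload(team: str, statistics_name: str, git_tag: str,
                          git_commit_hash: str, sources: list,
                          image_tag: Optional[str] = None) -> dict:
    """Build a dict whose keys mirror the Release fields listed above."""
    return {
        "schema_version": "1.0.0",
        "dapla_team": team,
        "statistics_name": statistics_name,
        "git_tag": git_tag,
        "git_commit_hash": git_commit_hash,
        "data_source": list(sources),
        "daplalab_image": image_tag,  # nullable, but still present in the record
    }


release_payload = build_release_payload(
    team="example-team",
    statistics_name="example_stat",
    git_tag="v1.2.0",
    git_commit_hash="0123456789abcdef0123456789abcdef01234567",
    sources=["gs://example-bucket/klargjorte-data/persons_p2024.parquet"],
)
release_json = json.dumps(release_payload, sort_keys=True)
```
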

ssb_poc_statlog_model.statlog_base_model module

class StatlogBaseModel(**data)

Bases: BaseModel

Pydantic model that defines configuration which applies to all models in this package.

model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
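
To make the effect of the two shared settings concrete, here is a small sketch using pydantic directly (version 2 is assumed). `SketchBaseModel`, `SketchRecord` and `EventSketch` are hypothetical stand-ins that only copy the configuration shown above. They are not the package's own classes.

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class EventSketch(str, Enum):
    A = "A"
    M = "M"


class SketchBaseModel(BaseModel):
    # Same settings as documented for StatlogBaseModel.
    model_config = ConfigDict(use_enum_values=True, validate_assignment=True)


class SketchRecord(SketchBaseModel):
    event: EventSketch
    comment: Optional[str]  # nullable, yet required because it has no default


record = SketchRecord(event=EventSketch.M, comment=None)
stored_event = record.event  # use_enum_values stores the plain value, not the member

try:
    record.event = "X"  # validate_assignment re-validates on attribute assignment
    assignment_rejected = False
except ValidationError:
    assignment_rejected = True

try:
    SketchRecord(event="A")  # comment is missing
    missing_rejected = False
except ValidationError:
    missing_rejected = True
```

The last case mirrors the many fields in this reference that are typed `... | None` but marked `required=True`: None is an accepted value, but the key cannot be omitted.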