Reference
ssb_poc_statlog_model package
ssb_poc_statlog_model.change_data_log module
- class ChangeDataLog(**data)
Bases: StatlogBaseModel
Data model for data change log in a statistical production process.
- Parameters:
schema_version (Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')])
statistics_name (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')])
data_source (Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more input datasets used as data source before changing data.')])
data_target (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Target dataset filepath (eg. GCS-path to a parquet file).')])
data_period (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Data period for changed data - eg. a year, quarter, month, day (date), ...')])
change_event (Annotated[ChangeEvent, FieldInfo(annotation=NoneType, required=True, description='How the event was triggered: Automatically changed (A), Manually changed (M), Manually approved with no change (MNC), Not reviewed (NOT).')])
change_event_reason (Annotated[ChangeEventReason | None, FieldInfo(annotation=NoneType, required=True, description='Reason for change or approval: Other source (OTHER_SOURCE), Statistical review (REVIEW), Information from the data provider/registry owner (OWNER), Small/marginal unit (MARGINAL_UNIT), Data duplicate (DUPLICATE), Other reason (OTHER).')])
change_datetime (Annotated[AwareDatetime, FieldInfo(annotation=NoneType, required=True, description='Timestamp (date and time, ISO 8601) of an event or change')])
changed_by (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='If manually (M): user name of the person who triggered an event; if automatically (A) name of method, function and/or process.')])
data_change_type (Annotated[DataChangeType | None, FieldInfo(annotation=NoneType, required=True, description='Data change type: Updated value (UPD), created new unit/row (NEW), or deleted unit/row (DEL).')])
change_comment (Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Change comment')])
change_details (Annotated[ChangeDetails | ChangeDetails1, FieldInfo(annotation=NoneType, required=True, description='Detailed information about the change. Either a unit-id, old and new value if one row (unit) was affected, or number of rows affected if the process changed multiple rows (units).', discriminator='detail_type')])
- change_comment: Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Change comment')]
- change_datetime: Annotated[AwareDatetime, FieldInfo(annotation=NoneType, required=True, description='Timestamp (date and time, ISO 8601) of an event or change')]
- change_details: Annotated[ChangeDetails | ChangeDetails1, FieldInfo(annotation=NoneType, required=True, description='Detailed information about the change. Either a unit-id, old and new value if one row (unit) was affected, or number of rows affected if the process changed multiple rows (units).', discriminator='detail_type')]
- change_event: Annotated[ChangeEvent, FieldInfo(annotation=NoneType, required=True, description='How the event was triggered: Automatically changed (A), Manually changed (M), Manually approved with no change (MNC), Not reviewed (NOT).')]
- change_event_reason: Annotated[ChangeEventReason | None, FieldInfo(annotation=NoneType, required=True, description='Reason for change or approval: Other source (OTHER_SOURCE), Statistical review (REVIEW), Information from the data provider/registry owner (OWNER), Small/marginal unit (MARGINAL_UNIT), Data duplicate (DUPLICATE), Other reason (OTHER).')]
- changed_by: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='If manually (M): user name of the person who triggered an event; if automatically (A) name of method, function and/or process.')]
- data_change_type: Annotated[DataChangeType | None, FieldInfo(annotation=NoneType, required=True, description='Data change type: Updated value (UPD), created new unit/row (NEW), or deleted unit/row (DEL).')]
- data_period: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Data period for changed data - eg. a year, quarter, month, day (date), ...')]
- data_source: Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more input datasets used as data source before changing data.')]
- data_target: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Target dataset filepath (eg. GCS-path to a parquet file).')]
- model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- schema_version: Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')]
- statistics_name: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')]
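As a rough illustration of how the fields listed above fit together, the sketch below assembles a plain dictionary with one entry per `ChangeDataLog` field, using only the standard library. The field names, codes and schema version are taken from the parameter list; every value (paths, names, counts) and the helper name `build_change_data_log_payload` are hypothetical placeholders.

```python
import json
from datetime import datetime, timezone

# Field names copied from the ChangeDataLog parameter list above.
REQUIRED_FIELDS = {
    "schema_version", "statistics_name", "data_source", "data_target",
    "data_period", "change_event", "change_event_reason", "change_datetime",
    "changed_by", "data_change_type", "change_comment", "change_details",
}


def build_change_data_log_payload() -> dict:
    """Return a dict with one entry per ChangeDataLog field (placeholder values)."""
    return {
        "schema_version": "2.0.0",
        "statistics_name": "example_stat",
        "data_source": ["gs://example-bucket/source/persons_p2024.parquet"],
        "data_target": "gs://example-bucket/target/persons_p2024.parquet",
        "data_period": "2024",
        "change_event": "A",                # automatically changed
        "change_event_reason": "DUPLICATE",
        # Timezone-aware timestamp, serialized as ISO 8601.
        "change_datetime": datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc).isoformat(),
        "changed_by": "remove_duplicates",  # hypothetical function name
        "data_change_type": "DEL",
        "change_comment": None,             # nullable, but the key is still present
        "change_details": {
            "detail_type": "rows",
            "rows_affected": 42,
            "variable_name": "income",
        },
    }


payload = build_change_data_log_payload()
missing = REQUIRED_FIELDS - set(payload)  # empty when every field is supplied
print(json.dumps(payload, indent=2))
```

Given the `ChangeDataLog(**data)` signature, a dictionary like this could presumably be passed as `ChangeDataLog(**payload)` for validation; that call is not exercised here.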
- class ChangeDetails(**data)
Bases: StatlogBaseModel
Detailed information about the change. Either a unit-id, old and new value if one row (unit) was affected, or number of rows affected if the process changed multiple rows (units).
- Parameters:
detail_type (Annotated[Literal['rows'], FieldInfo(annotation=NoneType, required=True, description='Discriminator for change_details variant')])
rows_affected (Annotated[int, FieldInfo(annotation=NoneType, required=True, description='Number of rows affected if the process changed multiple rows (units).')])
variable_name (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='The variable name (or element-path) that contains data changes.')])
- detail_type: Annotated[Literal['rows'], FieldInfo(annotation=NoneType, required=True, description='Discriminator for change_details variant')]
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- rows_affected: Annotated[int, FieldInfo(annotation=NoneType, required=True, description='Number of rows affected if the process changed multiple rows (units).')]
- variable_name: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='The variable name (or element-path) that contains data changes.')]
- class ChangeDetails1(**data)
Bases: StatlogBaseModel
Detailed information about the change. Either a unit-id, old and new value if one row (unit) was affected, or number of rows affected if the process changed multiple rows (units).
- Parameters:
detail_type (Annotated[Literal['unit'], FieldInfo(annotation=NoneType, required=True, description='Discriminator for change_details variant')])
unit_id (Annotated[list[UnitIdItem], FieldInfo(annotation=NoneType, required=True, description="One or more unit-identifier variables and values (primary key) if one row (unit) was affected, eg. 'fnr'='311280nnnnn' and 'orgnr'='123456789'.")])
old_value (Annotated[list[OldValueItem] | None, FieldInfo(annotation=NoneType, required=True, description="Old value(s). If delete (data_change_type = DEL) - log all deleted variable-values from the deleted data row/record, eg. 'income'='1000', 'address'='Street 123', ...")])
new_value (Annotated[list[NewValueItem] | None, FieldInfo(annotation=NoneType, required=True, description="New value(s). If insert (data_change_type = INS) - log all inserted variable-values in the data row/record, eg. 'income'='1000', 'address'='Street 123', ...")])
- detail_type: Annotated[Literal['unit'], FieldInfo(annotation=NoneType, required=True, description='Discriminator for change_details variant')]
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- new_value: Annotated[list[NewValueItem] | None, FieldInfo(annotation=NoneType, required=True, description="New value(s). If insert (data_change_type = INS) - log all inserted variable-values in the data row/record, eg. 'income'='1000', 'address'='Street 123', ...")]
- old_value: Annotated[list[OldValueItem] | None, FieldInfo(annotation=NoneType, required=True, description="Old value(s). If delete (data_change_type = DEL) - log all deleted variable-values from the deleted data row/record, eg. 'income'='1000', 'address'='Street 123', ...")]
- unit_id: Annotated[list[UnitIdItem], FieldInfo(annotation=NoneType, required=True, description="One or more unit-identifier variables and values (primary key) if one row (unit) was affected, eg. 'fnr'='311280nnnnn' and 'orgnr'='123456789'.")]
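The two `change_details` variants are told apart by `detail_type` (`'rows'` for `ChangeDetails`, `'unit'` for `ChangeDetails1`), and both set `'extra': 'forbid'`. The sketch below mimics that selection with a small standard-library check. It is a stand-in for illustration only, not the package's validator; `VARIANT_KEYS` and `check_change_details` are hypothetical names, and the key sets are copied from the parameter lists above.

```python
# Keys each variant accepts, copied from the parameter lists above.
VARIANT_KEYS = {
    "rows": {"detail_type", "rows_affected", "variable_name"},
    "unit": {"detail_type", "unit_id", "old_value", "new_value"},
}


def check_change_details(details: dict) -> str:
    """Return the variant name, or raise ValueError if the keys do not match it."""
    variant = details.get("detail_type")
    if variant not in VARIANT_KEYS:
        raise ValueError(f"unknown detail_type: {variant!r}")
    expected = VARIANT_KEYS[variant]
    actual = set(details)
    if actual != expected:
        # Every field is required and extra keys are forbidden, so the sets must match.
        raise ValueError(
            f"missing={sorted(expected - actual)}, unexpected={sorted(actual - expected)}"
        )
    return variant


# One changed row: identify the unit and log old and new values (placeholder data).
unit_details = {
    "detail_type": "unit",
    "unit_id": [{"unit_id_variable": "orgnr", "unit_id_value": "123456789"}],
    "old_value": [{"variable_name": "income", "value": "1000"}],
    "new_value": [{"variable_name": "income", "value": "1200"}],
}

# Many changed rows: log only the count and the affected variable.
rows_details = {"detail_type": "rows", "rows_affected": 42, "variable_name": "income"}

print(check_change_details(unit_details), check_change_details(rows_details))
```

Mixing keys from both variants, for example adding `unit_id` to a `'rows'` entry, is rejected by this check, which mirrors what `'extra': 'forbid'` is expected to do in the real models.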
- class ChangeEvent(*values)
Bases: StrEnum
How the event was triggered: Automatically changed (A), Manually changed (M), Manually approved with no change (MNC), Not reviewed (NOT).
- A = 'A'
- M = 'M'
- MNC = 'MNC'
- NOT = 'NOT'
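Since the models set `use_enum_values`, validated instances are expected to hold the plain string codes rather than enum members, so a lookup from code to human-readable label can be handy when reporting. The sketch below uses a local stand-in enum that mirrors the documented `ChangeEvent` members (it does not import the package); the `LABELS` table and `label_for` helper are hypothetical.

```python
from enum import Enum


class ChangeEvent(str, Enum):
    """Local stand-in mirroring the documented ChangeEvent members."""
    A = "A"
    M = "M"
    MNC = "MNC"
    NOT = "NOT"


# Labels taken from the class description above.
LABELS = {
    ChangeEvent.A: "Automatically changed",
    ChangeEvent.M: "Manually changed",
    ChangeEvent.MNC: "Manually approved with no change",
    ChangeEvent.NOT: "Not reviewed",
}


def label_for(code: str) -> str:
    """Translate a stored code into its label; unknown codes raise ValueError."""
    return LABELS[ChangeEvent(code)]


print(label_for("MNC"))
```

The same pattern would apply to `ChangeEventReason` and `DataChangeType`.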
- class ChangeEventReason(*values)
Bases: StrEnum
Reason for change or approval: Other source (OTHER_SOURCE), Statistical review (REVIEW), Information from the data provider/registry owner (OWNER), Small/marginal unit (MARGINAL_UNIT), Data duplicate (DUPLICATE), Other reason (OTHER).
- DUPLICATE = 'DUPLICATE'
- MARGINAL_UNIT = 'MARGINAL_UNIT'
- OTHER = 'OTHER'
- OTHER_SOURCE = 'OTHER_SOURCE'
- OWNER = 'OWNER'
- REVIEW = 'REVIEW'
- class DataChangeType(*values)
Bases: StrEnum
Data change type: Updated value (UPD), created new unit/row (NEW), or deleted unit/row (DEL).
- DEL = 'DEL'
- NEW = 'NEW'
- UPD = 'UPD'
- class NewValueItem(**data)
Bases: StatlogBaseModel
- Parameters:
variable_name (str)
value (str)
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- value: str
- variable_name: str
- class OldValueItem(**data)
Bases: StatlogBaseModel
- Parameters:
variable_name (str)
value (str)
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- value: str
- variable_name: str
- class UnitIdItem(**data)
Bases: StatlogBaseModel
- Parameters:
unit_id_variable (Annotated[str, FieldInfo(annotation=NoneType, required=True, description="The unit-id variable name, e.g. 'fnr', 'pers_id', 'reg_nr' or 'komm_nr'.")])
unit_id_value (Annotated[str, FieldInfo(annotation=NoneType, required=True, description="The unit-id value, e.g. '311280nnnnn' or '123456789'.")])
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- unit_id_value: Annotated[str, FieldInfo(annotation=NoneType, required=True, description="The unit-id value, e.g. '311280nnnnn' or '123456789'.")]
- unit_id_variable: Annotated[str, FieldInfo(annotation=NoneType, required=True, description="The unit-id variable name, e.g. 'fnr', 'pers_id', 'reg_nr' or 'komm_nr'.")]
ssb_poc_statlog_model.generate_python module
ssb_poc_statlog_model.linage module
- class Linage(**data)
Bases: StatlogBaseModel
Data model for data lineage in a statistical production process.
- Parameters:
schema_version (Annotated[Literal['1.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')])
data_source (Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more input datasets used as data source.')])
data_target (Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more target datasets (where data is stored after the process has been run).')])
step (Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='A string describing at which step in the process the log is for.')])
file_hash (Annotated[list[str] | None, FieldInfo(annotation=NoneType, required=True, description='List of SHA-256 hashes, matching the files in the data_sources field.')])
- data_source: Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more input datasets used as data source.')]
- data_target: Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more target datasets (where data is stored after the process has been run).')]
- file_hash: Annotated[list[str] | None, FieldInfo(annotation=NoneType, required=True, description='List of SHA-256 hashes, matching the files in the data_sources field.')]
- model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- schema_version: Annotated[Literal['1.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')]
- step: Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='A string describing at which step in the process the log is for.')]
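The `file_hash` field holds SHA-256 hashes matching the source files. A minimal sketch of producing such a list with `hashlib` follows. It assumes the sources are readable local files, that hashes are hex digests, and that they are listed in the same order as `data_source`; none of these details are stated in the reference. The helper names `sha256_of_file` and `build_linage_payload` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_linage_payload(sources, targets, step=None) -> dict:
    """Return a dict with one entry per Linage field."""
    return {
        "schema_version": "1.0.0",
        "data_source": [str(p) for p in sources],
        "data_target": [str(p) for p in targets],
        "step": step,
        # Assumption: one hex digest per source, in the same order as data_source.
        "file_hash": [sha256_of_file(Path(p)) for p in sources],
    }


# Demonstration with a throwaway file standing in for a real dataset.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "input.parquet"
    source.write_bytes(b"abc")
    linage_payload = build_linage_payload(
        [source], ["gs://example-bucket/output.parquet"], step="edit"
    )

print(linage_payload["file_hash"])
```

Sources stored in a bucket would need to be read through a suitable client instead of `Path.open`; that part is left out here.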
ssb_poc_statlog_model.quality_control_description module
- class QualityControlDescription(**data)
Bases: StatlogBaseModel
Model for description of quality controls used in a statistical production.
- Parameters:
schema_version (Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')])
quality_control_id (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='A unique quality control ID')])
quality_control_description (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Quality control description')])
quality_control_type (Annotated[QualityControlType, FieldInfo(annotation=NoneType, required=True, description='Quality control type: hard (H), soft (S), informative (I).')])
variables (Annotated[list[Variable], FieldInfo(annotation=NoneType, required=True, description='A description of which variables must be included in the quality control.')])
- model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- quality_control_description: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Quality control description')]
- quality_control_id: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='A unique quality control ID')]
- quality_control_type: Annotated[QualityControlType, FieldInfo(annotation=NoneType, required=True, description='Quality control type: hard (H), soft (S), informative (I).')]
- schema_version: Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')]
- class QualityControlType(*values)
Bases: StrEnum
Quality control type: hard (H), soft (S), informative (I).
- H = 'H'
- I = 'I'
- S = 'S'
- class Variable(**data)
Bases: StatlogBaseModel
- Parameters:
variable_description (str | None)
- model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- variable_description: str | None
ssb_poc_statlog_model.quality_control_result module
- class QualityControlResult(**data)
Bases: StatlogBaseModel
Schema for statistics quality control result.
- Parameters:
schema_version (Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')])
statistics_name (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')])
quality_control_id (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='A reference (or link/uri) to the quality control description')])
data_location (Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Controlled dataset reference/filepath (eg. GCS-path to a parquet file) or other dataset reference (eg. ref. to a CloudSQL database table).')])
data_period (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Data period controlled - eg. year, date, date-time, ...')])
quality_control_datetime (Annotated[AwareDatetime, FieldInfo(annotation=NoneType, required=True, description='Quality control datetime (date and time, ISO 8601)')])
quality_control_results (Annotated[QualityControlResults, FieldInfo(annotation=NoneType, required=True, description='Quality control result: quality ok (0), quality issues detected (1), missing value detected (2).')])
quality_result_comment (Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Quality control result comment.')])
quality_control_run_exception (Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Exception description. An error or warning occurred when executing the quality control routine.')])
- data_location: Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Controlled dataset reference/filepath (eg. GCS-path to a parquet file) or other dataset reference (eg. ref. to a CloudSQL database table).')]
- data_period: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Data period controlled - eg. year, date, date-time, ...')]
- model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- quality_control_datetime: Annotated[AwareDatetime, FieldInfo(annotation=NoneType, required=True, description='Quality control datetime (date and time, ISO 8601)')]
- quality_control_id: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='A reference (or link/uri) to the quality control description')]
- quality_control_results: Annotated[QualityControlResults, FieldInfo(annotation=NoneType, required=True, description='Quality control result: quality ok (0), quality issues detected (1), missing value detected (2).')]
- quality_control_run_exception: Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Exception description. An error or warning occurred when executing the quality control routine.')]
- quality_result_comment: Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='Quality control result comment.')]
- schema_version: Annotated[Literal['2.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')]
- statistics_name: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')]
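`quality_control_datetime` is typed as `AwareDatetime`, which in Pydantic means the timestamp must carry timezone information. The sketch below shows one way to guard against naive datetimes before building a result payload, using only the standard library. The helper names and all values are hypothetical, and the string `"0"` for `quality_control_results` is an assumption: the reference lists the codes 0, 1 and 2 but not the members of `QualityControlResults`.

```python
from datetime import datetime, timezone


def require_aware(moment: datetime) -> datetime:
    """Reject naive datetimes, mirroring what an AwareDatetime field expects."""
    if moment.tzinfo is None or moment.utcoffset() is None:
        raise ValueError("datetime must carry timezone information")
    return moment


def build_quality_control_result(control_time: datetime) -> dict:
    """Return a dict with one entry per QualityControlResult field (placeholder values)."""
    return {
        "schema_version": "2.0.0",
        "statistics_name": "example_stat",
        "quality_control_id": "QC-001",
        "data_location": ["gs://example-bucket/target/persons_p2024.parquet"],
        "data_period": "2024",
        "quality_control_datetime": require_aware(control_time).isoformat(),
        "quality_control_results": "0",  # assumed encoding of "quality ok (0)"
        "quality_result_comment": None,
        "quality_control_run_exception": None,
    }


result = build_quality_control_result(datetime(2024, 5, 1, 8, 30, tzinfo=timezone.utc))
print(result["quality_control_datetime"])
```

Calling the builder with `datetime(2024, 5, 1, 8, 30)` (no `tzinfo`) raises `ValueError` in this sketch, which is the failure mode to watch for when timestamps come from `datetime.now()` without a timezone argument.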
ssb_poc_statlog_model.release module
- class Release(**data)
Bases: StatlogBaseModel
Data model for a release in a statistical production process.
- Parameters:
schema_version (Annotated[Literal['1.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')])
dapla_team (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Name of the dapla team that produced the release.')])
statistics_name (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')])
git_tag (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Git tag for the release.')])
git_commit_hash (Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Git commit hash for the release.')])
data_source (Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more datasets in the final data state (utdata or klargjorte-data) used as data source for the release.')])
daplalab_image (Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='The tag of the Daplalab container image that produced the release.')])
- dapla_team: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Name of the dapla team that produced the release.')]
- daplalab_image: Annotated[str | None, FieldInfo(annotation=NoneType, required=True, description='The tag of the Daplalab container image that produced the release.')]
- data_source: Annotated[list[str], FieldInfo(annotation=NoneType, required=True, description='Reference or filepath to one or more datasets in the final data state (utdata or klargjorte-data) used as data source for the release.')]
- git_commit_hash: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Git commit hash for the release.')]
- git_tag: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Git tag for the release.')]
- model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- schema_version: Annotated[Literal['1.0.0'], FieldInfo(annotation=NoneType, required=True, description='Version of this schema.')]
- statistics_name: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='Statistics shortname or statistics product name')]
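Note that `daplalab_image` is typed `str | None` yet marked `required=True`. In Pydantic v2 that combination normally means the key must be supplied explicitly, even when its value is `None`. The sketch below builds a `Release` payload that always emits the key; the helper name `build_release_payload` and all values are hypothetical placeholders.

```python
from typing import List, Optional


def build_release_payload(
    *,
    dapla_team: str,
    statistics_name: str,
    git_tag: str,
    git_commit_hash: str,
    data_source: List[str],
    daplalab_image: Optional[str] = None,
) -> dict:
    """Return a dict with one entry per Release field."""
    return {
        "schema_version": "1.0.0",
        "dapla_team": dapla_team,
        "statistics_name": statistics_name,
        "git_tag": git_tag,
        "git_commit_hash": git_commit_hash,
        "data_source": list(data_source),
        # Always present, so a required-but-nullable field is never omitted.
        "daplalab_image": daplalab_image,
    }


release = build_release_payload(
    dapla_team="example-team",
    statistics_name="example_stat",
    git_tag="v1.0.0",
    git_commit_hash="0123456789abcdef0123456789abcdef01234567",
    data_source=["gs://example-bucket/utdata/example_p2024.parquet"],
)
print(sorted(release))
```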
ssb_poc_statlog_model.statlog_base_model module
- class StatlogBaseModel(**data)
Bases: BaseModel
Pydantic model that defines configuration that applies to all models in this package.
- model_config: ClassVar[ConfigDict] = {'use_enum_values': True, 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.