ssb_timeseries.io.snapshot

Provides a file-based I/O handler for persisting dataset snapshots.

This handler stores data in a versioned directory structure that adheres to the naming conventions of Statistics Norway.

class FileSystem(set_name, bucket, process_stage='statistikk', product='', sharing=None)

Bases: object

A filesystem abstraction for writing dataset snapshots.

Parameters:
  • set_name (str)

  • bucket (str | PathLike[str])

  • process_stage (str)

  • product (str)

  • sharing (dict | None)

__init__(set_name, bucket, process_stage='statistikk', product='', sharing=None)

Initialize the filesystem handler for a given dataset snapshot.

This method derives the directory structure from the dataset name, bucket, process stage, and product.

Parameters:
  • set_name (str)

  • bucket (str | PathLike[str])

  • process_stage (str)

  • product (str)

  • sharing (dict | None)

Return type:

None

last_version_number_by_regex(directory, pattern='*')

Return the highest version number found among the files in a directory that match a pattern.

Return type:

str

Parameters:
  • directory (str)

  • pattern (str)
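
The lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name `sketch_last_version_number`, the trailing `_v<N>` naming convention, the use of glob matching for `pattern`, and the `"0"` result for an empty match are all assumptions.

```python
import re
import tempfile
from pathlib import Path


def sketch_last_version_number(directory: str, pattern: str = "*") -> str:
    """Return the highest trailing ``_v<N>`` number among matching files, as a string."""
    numbers = []
    for path in Path(directory).glob(pattern):
        # Hypothetical convention: the file stem ends in "_v" followed by digits.
        match = re.search(r"_v(\d+)$", path.stem)
        if match:
            numbers.append(int(match.group(1)))
    # Compare as integers so that 10 ranks above 9; "0" means nothing matched.
    return str(max(numbers, default=0))


# Usage: three versions of one hypothetical snapshot in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for n in (1, 2, 10):
        (Path(tmp) / f"my_set_v{n}.parquet").touch()
    latest = sketch_last_version_number(tmp, "my_set*")
    none_found = sketch_last_version_number(tmp, "other*")
```

Comparing the extracted numbers as integers rather than strings matters here, because a plain string comparison would rank `"9"` above `"10"`.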

sharing_directory(path)

Return the directory path for sharing, creating it if it does not exist.

Return type:

str | PathLike[str]

Parameters:
  • path (str)
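
The create-if-missing behavior can be illustrated with the standard library. The helper name `sketch_sharing_directory` is hypothetical, and the sketch only assumes that the directory and any missing parents are created on first use.

```python
import tempfile
from pathlib import Path


def sketch_sharing_directory(path: str) -> Path:
    """Return ``path`` as a directory, creating it (and any parents) if needed."""
    directory = Path(path)
    # exist_ok=True makes repeated calls safe when the directory already exists.
    directory.mkdir(parents=True, exist_ok=True)
    return directory


# Usage: calling twice on the same location returns the same existing directory.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "shared" / "team_a"
    first = sketch_sharing_directory(str(target))
    second = sketch_sharing_directory(str(target))
    created = first.is_dir()
    same = first == second
```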

property snapshot_directory: str | PathLike[str]

Return the directory path for the snapshot.

The path is constructed from the configured bucket, process stage, product, and dataset name.
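
As an illustration of how such a path could be assembled, the sketch below joins the four components with `pathlib`. The function name `sketch_snapshot_directory`, the joining order, and the skipping of empty components (such as the default `product=''`) are assumptions, not confirmed library behavior.

```python
from pathlib import Path


def sketch_snapshot_directory(
    bucket: str, process_stage: str, product: str, set_name: str
) -> Path:
    """Join bucket, process stage, product, and dataset name into one path."""
    # Assumption: empty components are skipped instead of producing empty segments.
    parts = [part for part in (process_stage, product, set_name) if part]
    return Path(bucket).joinpath(*parts)


with_product = sketch_snapshot_directory("bucket", "statistikk", "prod", "my_set")
without_product = sketch_snapshot_directory("bucket", "statistikk", "", "my_set")
```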

snapshot_filename(as_of_utc=None, period_from='', period_to='')

Construct the full filename for the snapshot file.

The name includes the dataset name, period range, version timestamp, and an incrementing version number.

Return type:

str | PathLike[str]

Parameters:
  • as_of_utc (datetime | None)

  • period_from (str)

  • period_to (str)
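
To make the parts of the name concrete, the following sketch composes a filename from the four elements listed above. The layout (`_p` prefixes for the period, an `as_of` marker for the timestamp, a `_v` suffix for the version number, and a `.parquet` extension) is a hypothetical format chosen for illustration; the actual format used by the library may differ.

```python
from datetime import datetime, timezone


def sketch_snapshot_filename(
    set_name: str,
    period_from: str,
    period_to: str,
    as_of_utc: datetime,
    version_number: int,
    suffix: str = ".parquet",
) -> str:
    """Compose a name from dataset name, period range, timestamp, and version."""
    # Normalize the (timezone-aware) timestamp to UTC before formatting it.
    stamp = as_of_utc.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return (
        f"{set_name}_p{period_from}_p{period_to}"
        f"_as_of_{stamp}_v{version_number}{suffix}"
    )


example_name = sketch_snapshot_filename(
    "my_set",
    "2023-01",
    "2023-12",
    datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    3,
)
```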

write(sharing=None, as_of_tz=None, period_from=None, period_to=None, data_path='', meta_path='')

Copy snapshot files to their primary and shared storage locations.

Parameters:
  • sharing (dict | None) – A dictionary defining sharing configurations.

  • as_of_tz (datetime | None) – The version timestamp of the snapshot.

  • period_from (datetime | None) – The start of the data’s time period.

  • period_to (datetime | None) – The end of the data’s time period.

  • data_path (str) – The source path of the data file to copy.

  • meta_path (str) – The source path of the metadata file to copy.

Return type:

None
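
The copy step can be sketched with `shutil`. This is a simplified illustration under stated assumptions: the helper name `sketch_write` is hypothetical, and the sharing configuration is reduced to a plain list of target directories because the structure of the `sharing` dictionary is not documented here.

```python
import shutil
import tempfile
from pathlib import Path


def sketch_write(data_path, meta_path, snapshot_dir, sharing_dirs):
    """Copy the data and metadata files to the primary and each shared directory."""
    copied = []
    for directory in [snapshot_dir, *sharing_dirs]:
        target_dir = Path(directory)
        target_dir.mkdir(parents=True, exist_ok=True)
        for source in (data_path, meta_path):
            # copy2 also preserves file metadata such as modification time.
            destination = target_dir / Path(source).name
            copied.append(Path(shutil.copy2(source, destination)))
    return copied


# Usage: one primary location and one shared location, both in a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    data_file = root / "my_set-data.parquet"
    meta_file = root / "my_set-metadata.json"
    data_file.write_bytes(b"data")
    meta_file.write_text("{}")
    copied = sketch_write(data_file, meta_file, root / "primary", [root / "shared"])
    copy_count = len(copied)
    all_exist = all(path.is_file() for path in copied)
```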

version_from_file_name(file_name, pattern='as_of', group=2)

Extract a version marker from a filename using known patterns.

Return type:

str

Parameters:
  • file_name (str)

  • pattern (str | Versioning)

  • group (int)
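
One way to read the `pattern` and `group` parameters is as a named regular expression and the capture group to return. The sketch below follows that reading; the function name, the two regular expressions, the fallback of treating an unknown `pattern` as a raw regex, and the empty-string result on no match are all assumptions made for illustration.

```python
import re

# Hypothetical patterns; each has two groups so that group=2 is the version marker.
KNOWN_PATTERNS = {
    "as_of": r"(as_of)_([0-9]{8}T[0-9]{6}Z)",
    "names": r"(_v)([0-9]+)",
}


def sketch_version_from_file_name(
    file_name: str, pattern: str = "as_of", group: int = 2
) -> str:
    """Return the requested capture group, or an empty string if nothing matches."""
    # Unknown pattern names are treated as raw regular expressions (an assumption).
    regex = KNOWN_PATTERNS.get(pattern, pattern)
    match = re.search(regex, file_name)
    return match.group(group) if match else ""


as_of_marker = sketch_version_from_file_name("my_set_as_of_20240102T030405Z-data.parquet")
number_marker = sketch_version_from_file_name("my_set_p2023_p2024_v3.parquet", "names")
no_marker = sketch_version_from_file_name("unversioned.parquet")
```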