ssb_timeseries.io.snapshot
Provides a file-based I/O handler for persisting dataset snapshots.
This handler stores data in a versioned directory structure that adheres to the naming conventions of Statistics Norway.
- class FileSystem(set_name, bucket, process_stage='statistikk', product='', sharing=None)
Bases: object
A filesystem abstraction for writing dataset snapshots.
- Parameters:
set_name (str)
bucket (PathStr)
process_stage (str)
product (str)
sharing (dict | None)
- __init__(set_name, bucket, process_stage='statistikk', product='', sharing=None)
Initialize the filesystem handler for a given dataset snapshot.
This method calculates the necessary directory structure based on the dataset’s name and other contextual attributes.
- Parameters:
set_name (str)
bucket (str | PathLike[str])
process_stage (str)
product (str)
sharing (dict | None)
- Return type:
None
- last_version_number_by_regex(directory, pattern='*')
Return the max version number from files in a directory matching a pattern.
- Return type:
str
- Parameters:
directory (str)
pattern (str)
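The lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `last_version_number`, the assumption that file names end in a `_v<N>` suffix before the extension, and the `"0"` fallback for an empty directory are all hypothetical.

```python
import re
from pathlib import Path


def last_version_number(directory: str, pattern: str = "*") -> str:
    """Return the highest `_v<N>` number among files matching a glob pattern.

    Hypothetical helper: the `_v<N>` suffix and the "0" fallback are assumptions.
    """
    version_re = re.compile(r"_v(\d+)$")
    versions = []
    for path in Path(directory).glob(pattern):
        # path.stem drops the extension, e.g. "x_v10.parquet" -> "x_v10".
        match = version_re.search(path.stem)
        if match:
            versions.append(int(match.group(1)))
    # Compare as integers so that v10 ranks above v9.
    return str(max(versions, default=0))
```

Comparing the numbers as integers rather than strings matters once a dataset passes version 9, since `"10" < "9"` under string ordering.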
- sharing_directory(path)
Return the directory path for sharing, creating it if it does not exist.
- Return type:
str | PathLike[str]
- Parameters:
path (str)
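The "create it if it does not exist" behaviour is commonly written with `pathlib`. The sketch below shows that idiom only; the helper name `ensure_directory` is hypothetical, and the library may handle remote storage differently.

```python
from pathlib import Path


def ensure_directory(path: str) -> str:
    """Create `path` (and any missing parents) if needed, then return it."""
    directory = Path(path)
    # exist_ok=True makes the call idempotent: no error if it already exists.
    directory.mkdir(parents=True, exist_ok=True)
    return str(directory)
```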
- property snapshot_directory: str | PathLike[str]
Return the directory path for the snapshot.
The path is constructed from the configured bucket, process stage, product, and dataset name.
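As a rough illustration of how such a path could be assembled, consider the sketch below. The segment order simply follows the order in which the description lists the parts, and skipping an empty product is a guess based on its `''` default; both are assumptions, and `build_snapshot_directory` is a hypothetical name.

```python
import posixpath


def build_snapshot_directory(bucket, process_stage, product, set_name) -> str:
    """Join the parts into one path, skipping empty segments.

    The order (bucket / process_stage / product / set_name) is an assumption.
    """
    # product defaults to '' in the class signature; dropping empty parts
    # avoids producing a doubled separator in the result.
    parts = [part for part in (process_stage, product, set_name) if part]
    return posixpath.join(str(bucket), *parts)


print(build_snapshot_directory("/data/bucket", "statistikk", "", "my_dataset"))
# prints "/data/bucket/statistikk/my_dataset"
```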
- snapshot_filename(as_of_utc=None, period_from='', period_to='')
Construct the full filename for the snapshot file.
The name includes the dataset name, period range, version timestamp, and an incrementing version number.
- Return type:
str | PathLike[str]
- Parameters:
as_of_utc (datetime | None)
period_from (str)
period_to (str)
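A file name with the four components listed above might be composed as in the following sketch. The exact layout is not stated here, so the `p<period>` prefixes, the `as_of_` token, the timestamp format, the `.parquet` extension, and the helper name `build_snapshot_filename` are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def build_snapshot_filename(
    set_name,
    as_of_utc=None,
    period_from="",
    period_to="",
    version=1,
    extension="parquet",
):
    """Compose '<name>_p<from>_p<to>_as_of_<timestamp>_v<N>.<ext>' (assumed layout)."""
    parts = [set_name]
    if period_from:
        parts.append(f"p{period_from}")
    if period_to:
        parts.append(f"p{period_to}")
    if as_of_utc is not None:
        # The datetime is assumed to already be expressed in UTC.
        parts.append("as_of_" + as_of_utc.strftime("%Y%m%dT%H%M%S"))
    parts.append(f"v{version}")
    return "_".join(parts) + "." + extension


stamp = datetime(2024, 1, 31, 12, 0, 0, tzinfo=timezone.utc)
print(build_snapshot_filename("my_dataset", stamp, "2020", "2023", version=3))
# prints "my_dataset_p2020_p2023_as_of_20240131T120000_v3.parquet"
```

In practice the version number would come from inspecting the files already present in the snapshot directory and adding one.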
- write(sharing=None, as_of_tz=None, period_from=None, period_to=None, data_path='', meta_path='')
Copy snapshot files to their primary and shared storage locations.
- Parameters:
sharing (dict | None) – A dictionary defining sharing configurations.
as_of_tz (datetime | None) – The version timestamp of the snapshot.
period_from (datetime | None) – The start of the data’s time period.
period_to (datetime | None) – The end of the data’s time period.
data_path (str) – The source path of the data file to copy.
meta_path (str) – The source path of the metadata file to copy.
- Return type:
None
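The copy step can be pictured with the standard library alone. The sketch below is an approximation under stated assumptions: the structure of the `sharing` dictionary is not documented here, so a plain sequence of target directories stands in for it, files keep their original names instead of being renamed, and `copy_snapshot` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def copy_snapshot(data_path, meta_path, primary_dir, shared_dirs=()):
    """Copy the data and metadata files into the primary directory and
    into each shared directory. Returns the list of destination paths."""
    copied = []
    for target in (primary_dir, *shared_dirs):
        target_dir = Path(target)
        target_dir.mkdir(parents=True, exist_ok=True)
        for source in (data_path, meta_path):
            if source:  # skip empty paths, mirroring the '' defaults
                destination = target_dir / Path(source).name
                # copy2 also preserves file metadata such as timestamps.
                copied.append(str(shutil.copy2(source, destination)))
    return copied
```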
- version_from_file_name(file_name, pattern='as_of', group=2)
Extract a version marker from a filename using known patterns.
- Return type:
str
- Parameters:
file_name (str)
pattern (str | Versioning)
group (int)
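One way to read the `pattern` and `group` parameters is as a regular expression plus the index of the capture group that holds the version. The sketch below follows that reading; the concrete regular expressions, the empty-string result when nothing matches, and the name `extract_version` are assumptions rather than the library's known patterns.

```python
import re


def extract_version(file_name, pattern=r"(as_of)_(\d{8}T\d{6})", group=2):
    """Return the requested capture group, or '' when the pattern is absent."""
    match = re.search(pattern, file_name)
    return match.group(group) if match else ""


name = "my_dataset_p2020_p2023_as_of_20240131T120000_v3.parquet"
print(extract_version(name))                       # prints "20240131T120000"
print(extract_version(name, pattern=r"(_v)(\d+)"))  # prints "3"
```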