ssb_timeseries.io

The IO modules define the read and write functionality that the Dataset module can access.

These are internal service modules and are not meant to be called directly from user code. Instead, the configuration for each time series repository must identify the IO module to use, along with any parameters it requires. This allows multiple time series repositories to be configured with different storage locations and technologies. Data and metadata are conceptually separate, so a metadata catalog may be maintained per repository or shared by all repositories.

Some basic interaction patterns are built in, but the library can easily be extended with external IO modules.
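
The configuration-driven dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual implementation: the `DataHandlerProtocol` interface, the `InMemoryHandler` class, the `HANDLERS` registry, and the configuration keys (`handler`, `options`) are all hypothetical names chosen for the example.

```python
from typing import Any, Protocol


class DataHandlerProtocol(Protocol):
    """Hypothetical interface an external IO module might implement."""

    def read(self, set_name: str) -> Any: ...

    def write(self, set_name: str, data: Any) -> None: ...


class InMemoryHandler:
    """Toy handler that keeps data in a dict (stand-in for a real backend)."""

    def __init__(self, **options: Any) -> None:
        self.options = options
        self._store: dict[str, Any] = {}

    def read(self, set_name: str) -> Any:
        return self._store[set_name]

    def write(self, set_name: str, data: Any) -> None:
        self._store[set_name] = data


# Hypothetical registry: handler name -> handler class.
HANDLERS: dict[str, type] = {"memory": InMemoryHandler}

# Hypothetical configuration: one entry per repository.
CONFIG = {
    "repo_a": {"handler": "memory", "options": {"label": "A"}},
    "repo_b": {"handler": "memory", "options": {"label": "B"}},
}


def handler_for(repository: str) -> DataHandlerProtocol:
    """Instantiate the handler named in the repository's configuration."""
    entry = CONFIG[repository]
    handler_cls = HANDLERS[entry["handler"]]
    return handler_cls(**entry["options"])


handler_a = handler_for("repo_a")
handler_a.write("my_set", [1, 2, 3])
```

In a layout like this, an external module would add its own class to the registry under a new name, and a repository would opt in through its configuration entry. User code would not need to change.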

class DataIO(ds)

Bases: object

Generic IO for data of a specific dataset.

Parameters:

ds (Dataset)

__init__(ds)

Retrieve configuration and initiate data IO handler for Dataset.

Parameters:

ds (Dataset)

Return type:

None

property dh: DataHandler

Expose the IO handler.

class MetaIO(ds)

Bases: object

Generic IO for metadata of a specific dataset.

Parameters:

ds (Dataset)

__init__(ds)

Retrieve configuration and initiate metadata IO handler for Dataset.

Parameters:

ds (Dataset)

Return type:

None

property dh: MetadataHandler

Expose the IO handler.

find(set_name='', repository='', require_one=False, require_unique=False, **kwargs)

Search for datasets whose names match a pattern, in the specified repository or in all repositories.

Returns:

The dataset if there is exactly one match; a list if there are no matches or several.

Return type:

list[io.SearchResult] | Dataset | list[None]

Raises:

LookupError – If require_unique = True and a unique result is not found.

Parameters:
  • set_name (str)

  • repository (str | dict)

  • require_one (bool)

  • require_unique (bool)
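
The documented return shapes of find can be illustrated with a small stand-in that works on plain strings. This sketch is not the library's code: `find_sketch` is a hypothetical helper, glob-style matching via `fnmatch` is an assumption about what "pattern" means, and `require_one` is left out because its behaviour is not described here.

```python
from fnmatch import fnmatch


def find_sketch(names, pattern, require_unique=False):
    """Illustrate the documented return shapes with plain strings.

    Assumption: glob-style matching; the real matching rules may differ.
    """
    matches = [name for name in names if fnmatch(name, pattern)]
    if require_unique and len(matches) != 1:
        raise LookupError(
            f"Expected one match for {pattern!r}, found {len(matches)}."
        )
    if len(matches) == 1:
        return matches[0]  # single match: the item itself
    return matches  # no match or several matches: a list


catalog = ["prices_monthly", "prices_yearly", "wages_monthly"]
single = find_sketch(catalog, "wages_*")
several = find_sketch(catalog, "prices_*")
nothing = find_sketch(catalog, "missing_*")
```

Because the return type depends on the number of matches, callers that need exactly one dataset may prefer to pass require_unique=True and handle LookupError rather than inspect the result type.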

persist(ds)

Persist the dataset. Currently hardcoded to use snapshot.FileSystem; note that it depends on the other IO handlers to provide the path(s) to write to.

Return type:

None

Parameters:

ds (Dataset)

read_data(repository, set_name, as_of_tz=None)

Read data into an Arrow Table with the configured IO handlers.

Return type:

Union[narwhals.typing.IntoDataFrame, narwhals.typing.IntoLazyFrame]

Parameters:
  • repository (str | dict)

  • set_name (str)

  • as_of_tz (datetime | None)

read_metadata(repository, set_name)

Read the metadata dict with the configured IO handlers.

Return type:

dict

Parameters:
  • repository (str | dict)

  • set_name (str)

save(ds)

Write data and metadata using configured IO handlers.

Return type:

None

Parameters:

ds (Dataset)
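
One way to picture how save might combine the two wrapper classes is sketched below. All names here are hypothetical stand-ins: `RecordingHandler`, `DataIOSketch`, `MetaIOSketch`, `save_sketch`, and the handler `write` method are assumptions for illustration. The real DataIO and MetaIO take only the dataset and look up their handlers from configuration; the sketch passes handlers in explicitly to stay self-contained.

```python
class RecordingHandler:
    """Hypothetical handler that records what it was asked to write."""

    def __init__(self):
        self.written = []

    def write(self, payload):
        self.written.append(payload)


class DataIOSketch:
    """Stand-in for DataIO: holds a handler and exposes it as `dh`."""

    def __init__(self, ds, handler):
        self.ds = ds
        self._handler = handler

    @property
    def dh(self):
        return self._handler


class MetaIOSketch:
    """Stand-in for MetaIO: same shape, but with its own handler."""

    def __init__(self, ds, handler):
        self.ds = ds
        self._handler = handler

    @property
    def dh(self):
        return self._handler


def save_sketch(ds, data_handler, meta_handler):
    """Write data and metadata through their respective handlers."""
    DataIOSketch(ds, data_handler).dh.write(ds["data"])
    MetaIOSketch(ds, meta_handler).dh.write(ds["metadata"])


# A plain dict stands in for a Dataset in this sketch.
dataset = {"data": [1.0, 2.0], "metadata": {"name": "example"}}
data_handler, meta_handler = RecordingHandler(), RecordingHandler()
save_sketch(dataset, data_handler, meta_handler)
```

Keeping the data and metadata handlers separate mirrors the conceptual separation described at the top of this page: the two could point at different storage technologies.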

versions(ds, **kwargs)

Get list of all series version markers (as_of dates or version names).

Return type:

list[datetime | str]

Parameters:

ds (Dataset)
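
Since the returned list can mix datetime and str markers, code that consumes it has to handle both. The sketch below shows one possible way to pick the newest dated version at or before a given point in time. `latest_as_of` is a hypothetical helper, and both the sample markers and the decision to ignore named versions are assumptions made for the example.

```python
from datetime import datetime, timezone


def latest_as_of(markers, as_of):
    """Return the newest datetime marker at or before `as_of`, else None.

    Named (string) markers are ignored here; that choice is an assumption.
    """
    dated = [m for m in markers if isinstance(m, datetime) and m <= as_of]
    return max(dated, default=None)


# Made-up markers in the documented list[datetime | str] shape.
markers = [
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    "preliminary",
    datetime(2024, 3, 1, tzinfo=timezone.utc),
]
chosen = latest_as_of(markers, datetime(2024, 2, 15, tzinfo=timezone.utc))
```

The example uses timezone-aware datetimes throughout, because Python raises TypeError when comparing aware and naive datetimes.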