ssb_timeseries.io.protocols

Defines the structural contracts for I/O handlers using typing.Protocol.

This module specifies the formal API that a custom I/O plugin must adhere to. Because these are Protocols (structural typing), plugin authors can write handler classes that are compatible with ssb-timeseries without inheriting from any of its base classes: any class with matching method signatures conforms. This keeps plugins decoupled from the library's class hierarchy.
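
Structural typing means conformance is decided by a class's shape alone, not its ancestry. A minimal, self-contained sketch of the mechanism (the `Reader` and `CsvReader` names are illustrative and not part of ssb-timeseries):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Reader(Protocol):
    """A tiny protocol: any class with a matching read() conforms."""

    def read(self) -> str: ...


class CsvReader:
    """No inheritance from Reader, yet it satisfies the protocol."""

    def read(self) -> str:
        return "a;b;c"


assert isinstance(CsvReader(), Reader)  # structural match, not nominal
```

Note that `isinstance` checks against a `runtime_checkable` Protocol only verify that the methods exist; static type checkers such as mypy additionally verify the signatures.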

class DataReadWrite(repository, set_name, set_type, as_of_utc=None, **kwargs)

Bases: Protocol

Defines the contract (protocol) for data IO handlers.

Parameters:
  • repository (str | dict)

  • set_name (str)

  • set_type (str)

  • as_of_utc (datetime | None)

__init__(repository, set_name, set_type, as_of_utc=None, **kwargs)

Initialize the IO handler with configuration for a specific data storage.

This constructor is called by the IO dispatcher. It binds the handler instance to a specific repository, dataset, and (optionally) version.

Parameters:
  • repository (str | dict) – The data repository name or configuration.

  • set_name (str) – The dataset name.

  • set_type (str) – The data type for the dataset.

  • as_of_utc (datetime | None) – The version marker (should be timezone aware).

  • **kwargs – Any parameters defined for the handler in the configuration.

Return type:

None

property exists: bool

Check if the dataset exists in the configured storage.

read(*args, **kwargs)

Read data from the configured storage.

Return type:

typing.Any

Returns:

The dataset’s data in a dataframe-like format (e.g., PyArrow Table). If the data does not exist, an empty dataframe should be returned.

versions(*args, **kwargs)

Retrieve a list of available versions for the dataset.

Return type:

list[datetime | str]

Returns:

A sorted list of version identifiers (datetimes or strings).

write(data, tags=None)

Write the dataset’s data to the configured storage.

This method should handle both the creation of new data files and the updating/merging of data into existing files, depending on the versioning strategy of the dataset.

Parameters:
  • data (typing.Any) – The data to be written (e.g., a pandas DataFrame or PyArrow Table).

  • tags (dict | None) – A dictionary of metadata tags to be stored with the data, often in the file’s schema.

Return type:

None
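
To make the contract concrete, here is a toy handler whose methods mirror the DataReadWrite signatures above. It is a sketch only: the class name is hypothetical, and data lives in an in-memory dict keyed by (set_name, as_of_utc) rather than in real storage such as Parquet files.

```python
from datetime import datetime, timezone
from typing import Any


class InMemoryDataHandler:
    """Toy handler matching the DataReadWrite shape (illustrative only)."""

    _store: dict[tuple, Any] = {}  # (set_name, as_of_utc) -> data

    def __init__(self, repository, set_name, set_type, as_of_utc=None, **kwargs):
        self.repository = repository
        self.set_name = set_name
        self.set_type = set_type
        self.as_of_utc = as_of_utc

    @property
    def exists(self) -> bool:
        # True if any version of this dataset has been written.
        return any(key[0] == self.set_name for key in self._store)

    def read(self, *args, **kwargs) -> Any:
        # Return the stored data, or an empty container if nothing exists.
        return self._store.get((self.set_name, self.as_of_utc), [])

    def versions(self, *args, **kwargs) -> list:
        # Sorted version markers recorded for this dataset.
        return sorted(
            k[1] for k in self._store if k[0] == self.set_name and k[1] is not None
        )

    def write(self, data, tags=None) -> None:
        # A real handler would persist data (and tags) to storage here.
        self._store[(self.set_name, self.as_of_utc)] = data


as_of = datetime(2024, 1, 1, tzinfo=timezone.utc)
handler = InMemoryDataHandler("demo_repo", "toy_set", "simple", as_of_utc=as_of)
handler.write([1, 2, 3], tags={"unit": "count"})
assert handler.exists
assert handler.read() == [1, 2, 3]
```

A real plugin would return a dataframe-like object (e.g., a PyArrow Table) from `read` and map `repository` onto an actual storage location; the point here is only that matching the method names, signatures, and return types is sufficient to satisfy the protocol.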

class MetadataReadWrite(repository, set_name, **kwargs)

Bases: Protocol

Defines the contract (protocol) for metadata IO handlers.

Parameters:
  • repository (str | dict)

  • set_name (str)

__init__(repository, set_name, **kwargs)

Initialize the IO handler for a specific metadata storage.

Parameters:
  • repository (str | dict) – The metadata repository name or configuration.

  • set_name (str) – The dataset name to operate on.

  • **kwargs – Any parameters defined for the handler in the configuration.

Return type:

None

exists(name)

Check if metadata for a given dataset name exists.

Parameters:
  • name (str) – The dataset name to check for.

Return type:

bool

find(**kwargs)

Find datasets in the configured storage based on metadata criteria.

Return type:

bool

read(**kwargs)

Read metadata from the configured storage.

Return type:

dict[str, typing.Any]

Returns:

A dictionary containing the metadata tags for the dataset.

classmethod search(**kwargs)

Search and retrieve metadata from the configured storage.

This method should allow searching for datasets based on various metadata criteria.

Return type:

dict[str, typing.Any]

Returns:

A dictionary or list of dictionaries containing the search results.

write(**kwargs)

Write metadata to the configured storage.

Return type:

None
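
Analogously, a toy handler matching the MetadataReadWrite shape might look like the sketch below. The class name and in-memory storage are illustrative; note that `find` is declared to return a bool in the protocol above, which is read here as "does any dataset match the criteria?".

```python
from typing import Any


class InMemoryMetaHandler:
    """Toy handler matching the MetadataReadWrite shape (illustrative only)."""

    _meta: dict[str, dict[str, Any]] = {}  # set_name -> metadata tags

    def __init__(self, repository, set_name, **kwargs):
        self.repository = repository
        self.set_name = set_name

    def exists(self, name: str) -> bool:
        return name in self._meta

    def find(self, **kwargs) -> bool:
        # True if any stored dataset's tags match every given criterion.
        return any(
            all(tags.get(k) == v for k, v in kwargs.items())
            for tags in self._meta.values()
        )

    def read(self, **kwargs) -> dict[str, Any]:
        return self._meta.get(self.set_name, {})

    @classmethod
    def search(cls, **kwargs) -> dict[str, Any]:
        # All datasets whose tags match every given criterion.
        return {
            name: tags
            for name, tags in cls._meta.items()
            if all(tags.get(k) == v for k, v in kwargs.items())
        }

    def write(self, **kwargs) -> None:
        self._meta[self.set_name] = dict(kwargs)


meta = InMemoryMetaHandler("demo_repo", "toy_set")
meta.write(unit="count", frequency="monthly")
assert meta.exists("toy_set")
assert meta.read() == {"unit": "count", "frequency": "monthly"}
```

As with the data handler, the dispatcher only requires that the method names and signatures line up; how the metadata is actually stored (JSON files, a database, an object store) is entirely up to the plugin.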