ssb_timeseries.io.protocols
Defines the structural contracts for I/O handlers using typing.Protocol.
This module specifies the formal API that a custom I/O plugin must adhere to. Because the contracts are Protocols (structural typing), external users can create handler classes that are compatible with ssb-timeseries without inheriting from any of its base classes, which keeps plugins decoupled from the library's class hierarchy.
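As a minimal sketch of what structural typing means in practice, the example below uses a hypothetical stand-in protocol, `ReadsData`, rather than the library's own classes. The names `ReadsData`, `ListBackedReader` and `load` are illustrative assumptions, and the `runtime_checkable` decorator is added only so that `isinstance` can be demonstrated; whether the protocols in this module are runtime checkable is not stated here.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ReadsData(Protocol):
    """Hypothetical stand-in protocol, not the library's DataReadWrite."""

    def read(self, *args: Any, **kwargs: Any) -> Any: ...


class ListBackedReader:
    """Does not inherit from ReadsData, yet matches it structurally."""

    def __init__(self, rows: list) -> None:
        self._rows = rows

    def read(self, *args: Any, **kwargs: Any) -> Any:
        return list(self._rows)


def load(handler: ReadsData) -> Any:
    # A static type checker accepts any object with a compatible read().
    return handler.read()


reader = ListBackedReader([{"series": "a", "value": 1.0}])
rows = load(reader)

# The runtime check only verifies that a read attribute is present,
# not that its signature matches.
is_reader = isinstance(reader, ReadsData)
```

The class is accepted because of its shape, not its ancestry, which is the property that lets plugin authors avoid inheriting from library base classes.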
- class DataReadWrite(repository, set_name, set_type, as_of_utc=None, **kwargs)
Bases: Protocol
Defines the contract (protocol) for data IO handlers.
- Parameters:
repository (str | dict)
set_name (str)
set_type (str)
as_of_utc (datetime | None)
- __init__(repository, set_name, set_type, as_of_utc=None, **kwargs)
Initialize the IO handler with configuration for a specific data storage.
This constructor is called by the IO dispatcher. It configures the handler instance to operate within a specific context.
- Parameters:
repository (str | dict) – The data repository name or configuration.
set_name (str) – The dataset name.
set_type (str) – The data type for the dataset.
as_of_utc (datetime | None) – The version marker (should be timezone aware).
**kwargs – Any parameters defined for the handler in the configuration.
- Return type:
None
- property exists: bool
Check if the dataset exists in the configured storage.
- read(*args, **kwargs)
Read data from the configured storage.
- Return type:
typing.Any
- Returns:
The dataset’s data in a dataframe-like format (e.g., PyArrow Table). If the data does not exist, an empty dataframe should be returned.
- versions(*args, **kwargs)
Retrieve a list of available versions for the dataset.
- Return type:
list[datetime | str]
- Returns:
A sorted list of version identifiers (datetimes or strings).
- write(data, tags=None)
Write the dataset’s data to the configured storage.
This method should handle both the creation of new data files and the updating/merging of data into existing files, depending on the versioning strategy of the dataset.
- Parameters:
data (typing.Any) – The data to be written (e.g., a pandas DataFrame or PyArrow Table).
tags (dict | None) – A dictionary of metadata tags to be stored with the data, often in the file’s schema.
- Return type:
None
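The members documented above can be sketched as a small in-memory handler. This is an illustration under stated assumptions, not the library's implementation: the class name `InMemoryDataHandler`, the module-level `_STORE`, the `"name"` key looked up in a repository dict, the `"example_type"` string, and the choice to raise on a naive `as_of_utc` are all hypothetical. Plain lists of dicts stand in for a dataframe-like object so the sketch has no third-party dependencies.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for real storage:
# {(repository name, set_name): {version: rows}}
_STORE: dict = {}


class InMemoryDataHandler:
    """Hypothetical handler; matches the documented members by shape only."""

    def __init__(self, repository, set_name, set_type, as_of_utc=None, **kwargs):
        # The docs say the version marker should be timezone aware;
        # rejecting naive datetimes is a choice made for this sketch.
        if as_of_utc is not None and as_of_utc.tzinfo is None:
            raise ValueError("as_of_utc should be timezone aware")
        # Assumption: a repository dict carries its name under "name".
        if isinstance(repository, str):
            repo_name = repository
        else:
            repo_name = repository.get("name", "default")
        self.key = (repo_name, set_name)
        self.set_type = set_type
        self.as_of_utc = as_of_utc
        self.options = kwargs

    @property
    def exists(self) -> bool:
        return self.as_of_utc in _STORE.get(self.key, {})

    def read(self, *args, **kwargs):
        # An empty list stands in for the empty dataframe the docs ask for.
        return list(_STORE.get(self.key, {}).get(self.as_of_utc, []))

    def versions(self, *args, **kwargs):
        stored = _STORE.get(self.key, {})
        return sorted(v for v in stored if v is not None)

    def write(self, data, tags=None):
        # Overwrites the rows for this version; merging is left out here.
        _STORE.setdefault(self.key, {})[self.as_of_utc] = list(data)
        self.tags = dict(tags or {})


v1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
v2 = datetime(2024, 2, 1, tzinfo=timezone.utc)
for version in (v2, v1):
    h = InMemoryDataHandler("demo_repo", "my_set", "example_type", as_of_utc=version)
    h.write([{"valid_at": version, "x": 1.0}], tags={"name": "my_set"})

handler = InMemoryDataHandler("demo_repo", "my_set", "example_type", as_of_utc=v1)
missing = InMemoryDataHandler("demo_repo", "other_set", "example_type", as_of_utc=v1)
```

Here `handler.versions()` returns the two versions in ascending order even though they were written in reverse, and `missing.read()` returns an empty container instead of raising, mirroring the documented expectations for `versions` and `read`.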
- class MetadataReadWrite(repository, set_name, **kwargs)
Bases: Protocol
Defines the contract (protocol) for metadata IO handlers.
- Parameters:
repository (str | dict)
set_name (str)
- __init__(repository, set_name, **kwargs)
Initialize the IO handler for a specific metadata storage.
- Parameters:
repository (str | dict) – The metadata repository name or configuration.
set_name (str) – The dataset name to operate on.
**kwargs – Any parameters defined for the handler in the configuration.
- Return type:
None
- exists(name)
Check if metadata for a given dataset name exists.
- Return type:
bool
- Parameters:
name (str)
- find(**kwargs)
Find datasets in the configured storage based on metadata criteria.
- Return type:
bool
- read(**kwargs)
Read metadata from the configured storage.
- Return type:
dict[str, typing.Any]
- Returns:
A dictionary containing the metadata tags for the dataset.
- classmethod search(**kwargs)
Search and retrieve metadata from the configured storage.
This method should allow searching for datasets based on various metadata criteria.
- Return type:
dict[str, typing.Any]
- Returns:
A dictionary or list of dictionaries containing the search results.
- write(**kwargs)
Write metadata to the configured storage.
- Return type:
None
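A metadata handler with the members listed above might look like the following sketch. Several details are assumptions made for illustration, since the protocol leaves the keyword arguments open: the class name `InMemoryMetadataHandler`, a `tags` keyword for `write`, exact-match criteria in `search`, and `find` reporting whether any dataset matches (chosen to agree with its documented `bool` return type).

```python
from typing import Any


class InMemoryMetadataHandler:
    """Hypothetical handler; matches the documented members by shape only."""

    # Hypothetical class-wide store shared by all instances: {set_name: tags}.
    # The repository is recorded but not used as part of the key in this sketch.
    _store: dict = {}

    def __init__(self, repository, set_name, **kwargs):
        self.repository = repository
        self.set_name = set_name
        self.options = kwargs

    def exists(self, name: str) -> bool:
        return name in self._store

    def read(self, **kwargs) -> dict:
        # Returns a copy so callers cannot mutate the stored tags.
        return dict(self._store.get(self.set_name, {}))

    def write(self, **kwargs) -> None:
        # Assumption: tags arrive as a keyword argument named "tags".
        self._store[self.set_name] = dict(kwargs.get("tags", {}))

    @classmethod
    def search(cls, **kwargs) -> dict:
        # Assumption: keyword arguments are exact-match criteria on tags.
        return {
            name: dict(tags)
            for name, tags in cls._store.items()
            if all(tags.get(k) == v for k, v in kwargs.items())
        }

    def find(self, **kwargs) -> bool:
        # Assumption: True when at least one dataset matches the criteria.
        return bool(self.search(**kwargs))


meta = InMemoryMetadataHandler("demo_repo", "my_set")
meta.write(tags={"name": "my_set", "unit": "NOK"})
matches = InMemoryMetadataHandler.search(unit="NOK")
```

Because `search` is a classmethod, it can be called without constructing a handler for a particular dataset, which fits its role of looking across datasets.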