Reference

The ssb_timeseries package is a helper library for production and analysis of statistical data in the form of time series.

It is designed to make it as easy as possible to store data and metadata for datasets and series in ways that are consistent with the information model, and to facilitate integration with automated workflows.

Functionality includes:

Read and write data and metadata
Metadata maintenance: tagging, detagging, retagging
Search and filtering
Time algebra: downsampling and upsampling to other time resolutions
Linear algebra operations with sets (matrices) and series (column vectors)
Metadata aware calculations, like unit conversions and aggregation over taxonomy hierarchies
Basic plotting
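To make the "time algebra" item concrete, here is a minimal standard-library sketch of downsampling: monthly observations are summed into quarterly buckets. It does not use the ssb_timeseries API. The function name `downsample_to_quarters` and the choice of summation as the aggregation are assumptions made purely for illustration.

```python
from collections import defaultdict
from datetime import date


def downsample_to_quarters(monthly: dict[date, float]) -> dict[tuple[int, int], float]:
    """Sum monthly observations into (year, quarter) buckets.

    Hypothetical helper for illustration; not part of ssb_timeseries.
    """
    quarterly: dict[tuple[int, int], float] = defaultdict(float)
    for period_start, value in monthly.items():
        # Months 1-3 map to quarter 1, months 4-6 to quarter 2, and so on.
        quarter = (period_start.month - 1) // 3 + 1
        quarterly[(period_start.year, quarter)] += value
    return dict(quarterly)


# Six monthly observations with values 1.0 through 6.0.
monthly = {date(2024, month, 1): float(month) for month in range(1, 7)}
quarterly = downsample_to_quarters(monthly)
print(quarterly)
```

Summation suits flow variables; for stock variables a mean or last-observation rule would be the more natural choice.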

The most practical entry points are the ssb_timeseries.dataset and ssb_timeseries.catalog modules.

The ssb_timeseries.dataset module and its Dataset class are the core of the ssb_timeseries package, defining most of the key functionality.

The dataset is the unit of analysis for both the information model and workflow integration, and performance will benefit from linear algebra with sets as matrices consisting of series column vectors.

As described in the Information model, time series datasets may consist of any number of series of the same SeriesType. The series types are defined by dimensionality characteristics:

Versioning (NONE, AS_OF, NAMED)
Temporality (Valid AT point in time, or FROM and TO for duration)
The type of the value. For now only scalar values are supported.

Additional type determinants (sparsity, irregular frequencies, non-numeric or non-scalar values, …) are conceivable and may be introduced later. The types are crucial because they are reflected in the physical storage structure. That in turn has practical implications for how the series can be interacted with, and for methods working on the data.
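The combination of versioning and temporality can be modelled as a small value type. The sketch below is not the definition found in ssb_timeseries.properties. The class `SeriesTypeSketch`, the enum value strings, and the column names `valid_at`, `valid_from` and `valid_to` are all hypothetical, chosen only to show how a type could drive the storage layout.

```python
from dataclasses import dataclass
from enum import Enum


class Versioning(Enum):
    NONE = "none"
    AS_OF = "as_of"
    NAMED = "named"


class Temporality(Enum):
    AT = "at"            # valid at a point in time
    FROM_TO = "from_to"  # valid for a duration


@dataclass(frozen=True)
class SeriesTypeSketch:
    """Hypothetical stand-in for a series type; not the library's class."""

    versioning: Versioning
    temporality: Temporality

    def date_columns(self) -> list[str]:
        # Assumed column names, showing how the type could shape storage.
        if self.temporality is Temporality.AT:
            return ["valid_at"]
        return ["valid_from", "valid_to"]


simple = SeriesTypeSketch(Versioning.NONE, Temporality.AT)
estimate = SeriesTypeSketch(Versioning.AS_OF, Temporality.FROM_TO)
print(simple.date_columns(), estimate.date_columns())
```

Because the dataclass is frozen, instances are hashable and can be compared directly, which makes it cheap to check that all series in a set share one type.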

See also

The ssb_timeseries.catalog module provides tools for searching for datasets or series by name or metadata.

The ssb_timeseries.catalog module provides several tools for searching for datasets or series in every Repository of a Catalog.

The catalog is essentially just a logical collection of repositories, providing a search interface across all of them.

Searches can list or count sets, series or items (both). The search criteria can be complete names (equals), parts of names (contains), or metadata attributes (tags).
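The three kinds of criteria can be illustrated with a plain-Python filter over items pooled from several repositories. This is a sketch of the idea only. The `Item` class, the `search` function and its keyword arguments are assumptions made for this example and do not reflect the actual signatures in ssb_timeseries.catalog.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical catalog entry; not the library's CatalogItem."""

    name: str
    object_type: str  # "set" or "series"
    repository: str
    tags: dict[str, str] = field(default_factory=dict)


def search(items, *, equals=None, contains=None, tags=None, object_type=None):
    """Return the items matching every criterion that was supplied."""
    results = []
    for item in items:
        if object_type is not None and item.object_type != object_type:
            continue
        if equals is not None and item.name != equals:
            continue
        if contains is not None and contains not in item.name:
            continue
        if tags and any(item.tags.get(key) != value for key, value in tags.items()):
            continue
        results.append(item)
    return results


# Items pooled from two repositories, as a catalog-wide search would see them.
items = [
    Item("prices", "set", "repo_a", {"unit": "NOK"}),
    Item("prices_food", "series", "repo_a", {"unit": "NOK"}),
    Item("population", "set", "repo_b", {"unit": "persons"}),
]

by_name_part = search(items, contains="price")
by_tag = search(items, tags={"unit": "persons"})
set_count = len(search(items, object_type="set"))
print([item.name for item in by_name_part], [item.name for item in by_tag], set_count)
```

Counting is just the length of a listing here; a real implementation could count without materialising the results.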

A returned CatalogItem instance is identified by name and descriptive metadata. The repository, object type, and relationships to parent and child objects are also provided. Other information, such as lineage and data quality metrics, may be added later.

>>> from ssb_timeseries.catalog import Catalog
>>> everything = Catalog().items()

The other modules of the package are helpers used by these core modules, and are not intended for direct use.

Some notable exceptions are the taxonomy and hierarchy features of ssb_timeseries.meta and the type definitions in ssb_timeseries.properties. ssb_timeseries.config may be used for initial setup and for later switching between repositories, if needed. ssb_timeseries.io seeks to make storage agnostic of whether data and metadata are stored in files or databases, and ssb_timeseries.fs is an abstraction over local and GCS file systems.

The package includes several modules:

Package modules