ssb_timeseries.config

Configurations for the SSB timeseries library.

The environment variable TIMESERIES_CONFIG is expected to point to a JSON file with configurations. If the file exists, its contents are loaded into the Config object CONFIG when the configuration module is imported.
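The lookup described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the helper read_config_from_env and the example bucket value are hypothetical, and returning an empty dict for a missing file is an assumption.

```python
import json
import os
import tempfile
from pathlib import Path

ENV_VAR = "TIMESERIES_CONFIG"  # variable name taken from the text above


def read_config_from_env(env_var: str = ENV_VAR) -> dict:
    """Return the parsed JSON file named by env_var, or {} if unset or missing.

    Hypothetical helper; the empty-dict fallback is an assumption.
    """
    location = os.environ.get(env_var, "")
    if not location or not Path(location).is_file():
        return {}
    with open(location, encoding="utf-8") as f:
        return json.load(f)


# Demonstration: a temporary file stands in for a real configuration file.
previous = os.environ.get(ENV_VAR)
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "timeseries_config.json"
    config_path.write_text(json.dumps({"bucket": "example-bucket"}), encoding="utf-8")
    os.environ[ENV_VAR] = str(config_path)
    loaded = read_config_from_env()

# Restore the environment so the demonstration leaves no trace.
if previous is None:
    os.environ.pop(ENV_VAR, None)
else:
    os.environ[ENV_VAR] = previous

print(loaded)
```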

In most cases, this happens behind the scenes when ssb_timeseries.dataset or ssb_timeseries.catalog is imported.

Directly accessing the configuration module should only be required when manipulating configurations from Python code.

Example

>>> from ssb_timeseries.config import CONFIG
>>> CONFIG.catalog = 'gs://{bucket}/timeseries/metadata/'
>>> CONFIG.save()

For switching between preset configurations, use the timeseries-config command:

poetry run timeseries-config <option>

which is equivalent to:

python ./config.py <option>

See ssb_timeseries.config.main() for details on the named options.

CONFIG = <ssb_timeseries.config.Config object>

A Config object.

class Config(**kwargs)

Bases: object

Configuration class; for reading and writing timeseries configurations.

If instantiated with no parameters, a configuration file is expected to exist, either in the location specified by the environment variable TIMESERIES_CONFIG or in the default location in the user’s home directory. If not, an error is raised.

If the configuration_file parameter is specified, configurations will be loaded from that file. No other parameters are required. A FileNotFoundError or FileDoesNotExist error is raised if the file is not found. In this case, no attempt is made to load configurations from the locations specified by the environment variable or the defaults.

If any additional parameters are provided, they will override values from the configuration file. If the result is not a valid configuration, a ValidationError is raised.

If one or more parameters are provided, but the configuration_file parameter is not among them, configurations are identified by the environment variable TIMESERIES_CONFIG or the default configuration file location (in that order of priority). Provided parameters override values from the configuration file. If the result is not a valid configuration, an error is raised.

The returned configuration is held in memory only and is not saved until the save() method is called. The configuration is then written to a file, and the environment variable TIMESERIES_CONFIG is set to the location of that file.
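The order of priority described above (explicit configuration_file, then TIMESERIES_CONFIG, then the default location, with keyword arguments overriding file values) can be sketched as follows. The functions resolve_config_file and merge are hypothetical names used for illustration only; the default path is the one quoted in the save() documentation.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default location as quoted in the save() documentation.
DEFAULT_FILE = Path.home() / ".config" / "ssb_timeseries" / "timeseries_config.json"


def resolve_config_file(
    configuration_file: str = "",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the file to read: explicit argument, then env variable, then default."""
    env = os.environ if env is None else env
    if configuration_file:
        return str(configuration_file)
    return env.get("TIMESERIES_CONFIG") or str(DEFAULT_FILE)


def merge(file_values: dict, **overrides) -> dict:
    """Keyword arguments win over values read from the file."""
    return {**file_values, **overrides}


# A plain dict stands in for os.environ in this demonstration.
fake_env = {"TIMESERIES_CONFIG": "from_env.json"}
print(resolve_config_file("explicit.json", fake_env))
print(resolve_config_file("", fake_env))
print(merge({"bucket": "a", "catalog": "b"}, bucket="override"))
```

Validation of the merged result is left out of the sketch; per the text above, an invalid result would raise a ValidationError.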

__eq__(other)

Equality test.

Return type:

bool

Parameters:

other (Self | dict)

__getitem__(item)

Get the value of a configuration.

Return type:

str

Parameters:

item (str)
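The two special methods above suggest that a configuration can be read like a mapping and compared against either another configuration or a plain dict. A minimal sketch of that behaviour, using a hypothetical stand-in class MiniConfig rather than the real Config:

```python
class MiniConfig:
    """Hypothetical stand-in used only to illustrate __getitem__ and __eq__."""

    def __init__(self, **kwargs: str) -> None:
        self.__dict__.update(kwargs)

    def __getitem__(self, item: str) -> str:
        # Look the field up by name, like a read-only mapping.
        return getattr(self, item)

    def __eq__(self, other: object) -> bool:
        # Accept either another MiniConfig or a plain dict.
        if isinstance(other, dict):
            return vars(self) == other
        if isinstance(other, MiniConfig):
            return vars(self) == vars(other)
        return NotImplemented


cfg = MiniConfig(bucket="example-bucket", log_file="timeseries.log")
print(cfg["bucket"])
print(cfg == {"bucket": "example-bucket", "log_file": "timeseries.log"})
```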

__init__(**kwargs)

Initialize Config object from keyword arguments.

Keyword Arguments:
  • preset (str) – Optional. Name of a preset configuration. If provided, the preset configuration is loaded, and no other parameters are considered.

  • configuration_file (str) – Path to the configuration file. If the parameter is not provided, the environment variable TIMESERIES_CONFIG is used. If the environment variable is not set, the default configuration file location is used. If one of these identifies a valid JSON file, the configuration is loaded from that file and no other parameters are required.

  • timeseries_root (str) – Path to the root directory for time series data. If provided, this and the following parameters override values from the configuration file.

  • catalog (str) – Path to the catalog file.

  • log_file (str) – Path to the log file.

  • bucket (str) – Name of the GCS bucket.

Raises:
  • FileNotFoundError – If the configuration file implied by the parameters (or by their absence) does not exist.

  • ValidationError – If the resulting configuration is not valid.

  • MissingEnvironmentVariableError – If the environment variable TIMESERIES_CONFIG is not defined.

Return type:

None

Examples

Load an existing config from TIMESERIES_CONFIG or default location:

>>> from ssb_timeseries.config import Config
>>> config = Config.active()

__str__()

Return timeseries configurations as JSON string.

Return type:

str

classmethod active()

Force reload and return the configuration identified by ENV_VAR_NAME.

Return type:

Self

apply(configuration)

Set configuration values from a dictionary.

Return type:

None

Parameters:

configuration (dict)

bucket: str | PathLike[str]

The topmost level of the GCS bucket for the team.

catalog: str | PathLike[str]

The path to the metadata directory of a repository.

configuration_file: str | PathLike[str]

The path to the configuration file.

property is_valid: bool

Check if the configuration has all required fields.

log_file: str | PathLike[str]

The path to the log file.

save(path='')

Saves configurations to the JSON file defined by path or configuration_file.

If path is set, it takes precedence and configuration_file is set accordingly.

Parameters:

path (PathStr) – Full path of the JSON file to save to. If not specified, it will attempt to use the environment variable TIMESERIES_CONFIG before falling back to the default location $HOME/.config/ssb_timeseries/timeseries_config.json.

Raises:

ValueError – If path is not provided and configuration_file is not set.

Return type:

None
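The precedence rule for save() can be illustrated with a small stand-alone sketch. It is not the library's code: save_config is a hypothetical function operating on a plain dict, the env parameter exists only so the demonstration does not touch the real environment, and the fallback to TIMESERIES_CONFIG or the default location mentioned for path is omitted.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import MutableMapping, Optional


def save_config(
    values: dict,
    path: str = "",
    env: Optional[MutableMapping[str, str]] = None,
) -> str:
    """Write values as JSON; an explicit path wins over values['configuration_file']."""
    env = os.environ if env is None else env
    target = str(path or values.get("configuration_file", ""))
    if not target:
        raise ValueError("No path given and configuration_file is not set.")
    values["configuration_file"] = target  # keep the field in sync with the path used
    Path(target).parent.mkdir(parents=True, exist_ok=True)
    with open(target, "w", encoding="utf-8") as f:
        json.dump(values, f, indent=4)
    env["TIMESERIES_CONFIG"] = target  # point the variable at the saved file
    return target


with tempfile.TemporaryDirectory() as tmp:
    fake_env = {}  # stands in for os.environ
    settings = {"bucket": "example-bucket", "configuration_file": ""}
    saved_to = save_config(settings, str(Path(tmp) / "conf" / "timeseries_config.json"), env=fake_env)
    with open(saved_to, encoding="utf-8") as f:
        on_disk = json.load(f)
```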

timeseries_root: str | PathLike[str]

The root directory for data storage of a repository.

DAPLA_BUCKET = 'gs://-'

Returns the Dapla product bucket name for the current environment: gs://{DAPLA_TEAM}-{DAPLA_ENV}.

DAPLA_ENV = ''

Returns the Dapla environment: ‘prod’ | ‘test’ | ‘dev’.

DAPLA_TEAM = ''

Returns the Dapla team/project name.
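The documented value 'gs://-' is what the pattern gs://{DAPLA_TEAM}-{DAPLA_ENV} yields when both parts are empty, as they are in the values shown above. A short sketch of the composition, where dapla_bucket is a hypothetical helper and the team name is made up:

```python
def dapla_bucket(team: str = "", env: str = "") -> str:
    """Compose a bucket name following the gs://{DAPLA_TEAM}-{DAPLA_ENV} pattern."""
    return f"gs://{team}-{env}"


print(dapla_bucket())                     # both parts empty
print(dapla_bucket("some-team", "prod"))  # hypothetical team name
```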

class DictObject(dict_)

Bases: object

Helper class to convert dict to object.

Parameters:

dict_ (dict)

classmethod from_dict(d)

Parameters:

d (dict)
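A dict-to-object helper of this kind is commonly written as below. This is a sketch under assumptions, not the library's DictObject: the class name SimpleDictObject is hypothetical, and the recursive handling of nested dicts is an assumption about the intended behaviour.

```python
class SimpleDictObject:
    """Hypothetical helper exposing dictionary keys as attributes."""

    def __init__(self, dict_: dict) -> None:
        for key, value in dict_.items():
            # Assumption: nested dicts become nested objects.
            if isinstance(value, dict):
                value = SimpleDictObject(value)
            setattr(self, key, value)

    @classmethod
    def from_dict(cls, d: dict) -> "SimpleDictObject":
        return cls(d)


obj = SimpleDictObject.from_dict({"bucket": "example-bucket", "paths": {"log_file": "ts.log"}})
print(obj.bucket, obj.paths.log_file)
```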

exception MissingEnvironmentVariableError

Bases: Exception

The environment variable TIMESERIES_CONFIG must be defined.

exception ValidationError

Bases: Exception

Configuration validation error.

active_file(path='')

If a path is provided, sets environment variable ENV_VAR_NAME to specify the location of the configuration file.

Returns the value of the environment variable by way of get_active_file().

Return type:

str

Parameters:

path (str | PathLike[str])
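The set-then-read behaviour described for active_file() might look like the following. It is a minimal sketch: the value of ENV_VAR_NAME is assumed to be TIMESERIES_CONFIG, the env parameter is a hypothetical addition so a plain dict can replace os.environ, and the example path is made up.

```python
import os
from typing import MutableMapping, Optional

ENV_VAR_NAME = "TIMESERIES_CONFIG"  # assumed value of the constant named above


def active_file(path: str = "", env: Optional[MutableMapping[str, str]] = None) -> str:
    """Optionally set the variable, then return its current value ('' if unset)."""
    env = os.environ if env is None else env
    if path:
        env[ENV_VAR_NAME] = str(path)
    return env.get(ENV_VAR_NAME, "")


fake_env = {}  # stands in for os.environ
before = active_file(env=fake_env)
after = active_file("/tmp/timeseries_config.json", env=fake_env)
print(repr(before), repr(after))
```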

configuration_schema(version='0.3.1')

Return the JSON schema for the configuration file.

Return type:

dict

Parameters:

version (str)

is_valid_config(configuration)

Check if a dictionary is a valid configuration.

A valid configuration has the same keys as DEFAULTS.

Return type:

tuple[bool, object]

Parameters:

configuration (dict)
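The key comparison described above can be sketched as a set difference in both directions. The DEFAULTS shown here is hypothetical: its keys are taken from the Config attributes listed in this section, and the shape of the second tuple element is an assumption, since the documentation only says it is an object.

```python
from typing import Tuple

# Hypothetical defaults; keys mirror the Config attributes documented above.
DEFAULTS = {
    "configuration_file": "",
    "timeseries_root": "",
    "catalog": "",
    "log_file": "",
    "bucket": "",
}


def check_keys(configuration: dict) -> Tuple[bool, dict]:
    """Return (is_valid, details), where valid means: same keys as DEFAULTS."""
    missing = sorted(set(DEFAULTS) - set(configuration))
    unexpected = sorted(set(configuration) - set(DEFAULTS))
    details = {"missing": missing, "unexpected": unexpected}
    return (not missing and not unexpected, details)


ok, details = check_keys({"bucket": "example-bucket", "extra": "value"})
print(ok, details)
```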

load_json_file(path, error_on_missing=False)

Read configurations from a JSON file and return them as a dictionary.

Return type:

dict

Parameters:
  • path (str | PathLike[str])

  • error_on_missing (bool)
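One plausible reading of the error_on_missing flag is sketched below. The function load_json is a hypothetical stand-in, and returning an empty dict when the file is missing and the flag is False is an assumption, not documented behaviour.

```python
import json
from pathlib import Path


def load_json(path, error_on_missing: bool = False) -> dict:
    """Read a JSON file into a dict; a missing file raises only if requested."""
    if not Path(path).is_file():
        if error_on_missing:
            raise FileNotFoundError(f"No configuration file at {path}")
        return {}  # assumption: quietly fall back to an empty configuration
    with open(path, encoding="utf-8") as f:
        return json.load(f)


print(load_json("no/such/dir/timeseries_config.json"))
```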

main(*args)

Set configurations to predefined defaults when run from command line.

Use:

poetry run timeseries-config <option>

or

python ./config.py <option>

Parameters:

*args (str) – ‘home’ | ‘gcs’ | ‘jovyan’.

Raises:

ValueError – If args is not ‘home’ | ‘gcs’ | ‘jovyan’.

Return type:

None

migrate_to_new_config_location(file_to_copy='')

Copy existing configuration files to the new default location $HOME/.config/ssb_timeseries/.

The first file copied will be set to active.

Parameters:

file_to_copy (PathStr) – Optional. Path to an existing configuration file. If not provided, the function will look in the most common locations for SSB’s old JupyterLab and DaplaLab.

Return type:

str

path_str(*args)

Concatenate paths as string: str(Path(…)).

Return type:

str
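The expression in the description, str(Path(…)), corresponds to joining the parts with pathlib and converting the result to a string. A minimal sketch with made-up path components:

```python
from pathlib import Path


def path_str(*args) -> str:
    """Join path components and return the result as a string."""
    return str(Path(*args))


joined = path_str("data", "series", "example.json")
print(joined)
```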

presets(named_config)

Set configurations to predefined defaults.

Raises:

ValueError – If named_config is not ‘home’ | ‘gcs’ | ‘jovyan’.

Return type:

dict

Parameters:

named_config (str)

unset_env_var()

Unsets the environment variable ENV_VAR_NAME and returns the value that was unset.

Return type:

str