ssb_timeseries.config
Configurations for the SSB timeseries library.
The environment variable TIMESERIES_CONFIG is expected to point to a JSON file with configurations. If the file exists, its contents are loaded into a Config object, CONFIG, when the configuration module is loaded.
In most cases, this happens behind the scenes when ssb_timeseries.dataset or ssb_timeseries.catalog is imported.
Directly accessing the configuration module should only be required when manipulating configurations from Python code.
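As a minimal sketch of the setup described above, the following writes a JSON file and points TIMESERIES_CONFIG at it. The field names are taken from the Config attributes documented on this page; the temporary directory, the file name and all values are assumptions for illustration only, and the exact set of required fields is defined by the library's own schema.

```python
import json
import os
import tempfile
from pathlib import Path

# Hypothetical location and values, for illustration only.
config_dir = Path(tempfile.mkdtemp())
config_path = config_dir / "timeseries_config.json"

settings = {
    "configuration_file": str(config_path),
    "timeseries_root": str(config_dir / "series"),
    "catalog": str(config_dir / "metadata"),
    "log_file": str(config_dir / "timeseries.log"),
    "bucket": str(config_dir),
}

config_path.write_text(json.dumps(settings, indent=4), encoding="utf-8")

# The configuration module reads this variable when it is loaded,
# so it should be set before the library is imported.
os.environ["TIMESERIES_CONFIG"] = str(config_path)
```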
Example
>>> from ssb_timeseries.config import CONFIG
>>> CONFIG.catalog = 'gs://{bucket}/timeseries/metadata/'
>>> CONFIG.save()
For switching between preset configurations, use the timeseries-config command:
poetry run timeseries-config <option>
which is equivalent to:
python ./config.py <option>
See ssb_timeseries.config.main() for details on the named options.
- CONFIG = <ssb_timeseries.config.Config object>¶
A Config object.
- class Config(**kwargs)¶
Bases:
object
Configuration class; for reading and writing timeseries configurations.
If instantiated with no parameters, a configuration file is expected to exist, either in the location specified by the environment variable TIMESERIES_CONFIG or in the default location in the user’s home directory. If not, an error is raised.
If the configuration_file attribute is specified, configurations are loaded from that file. No other parameters are required. A FileNotFoundError or FileDoesNotExist error is raised if the file is not found. In this case, no attempt is made to load configurations from locations specified by the environment variable or defaults.
If any additional parameters are provided, they override values from the configuration file. If the result is not a valid configuration, a ValidationError is raised.
If one or more parameters are provided, but the configuration_file parameter is not among them, configurations are identified by the environment variable TIMESERIES_CONFIG or the default configuration file location (in that order of priority). Provided parameters override values from the configuration file. If the result is not a valid configuration, an error is raised.
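The lookup order and override behaviour described above can be sketched roughly as follows. This is not the library's code: resolve_config_path and merge_overrides are hypothetical helpers, ENV_VAR_NAME is assumed to hold the string TIMESERIES_CONFIG, and the default path is the one given in the documentation of save().

```python
import os
from pathlib import Path

ENV_VAR_NAME = "TIMESERIES_CONFIG"  # assumed value of the module constant
# Default location as stated in the documentation of save().
DEFAULT_PATH = Path.home() / ".config" / "ssb_timeseries" / "timeseries_config.json"


def resolve_config_path(configuration_file: str = "") -> Path:
    """Hypothetical helper: pick the configuration file by priority."""
    if configuration_file:
        # An explicit path wins; no fallback is attempted if it is missing.
        path = Path(configuration_file)
        if not path.is_file():
            raise FileNotFoundError(path)
        return path
    # Otherwise: the environment variable first, then the default location.
    from_env = os.environ.get(ENV_VAR_NAME, "")
    return Path(from_env) if from_env else DEFAULT_PATH


def merge_overrides(from_file: dict, **overrides: str) -> dict:
    """Hypothetical helper: keyword arguments override values read from the file."""
    return {**from_file, **overrides}
```

Validation of the merged result is left out of the sketch; in the library, an invalid result raises an error as described above.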
The returned configuration is not saved, but held in memory only until the save() method is called. The configuration is then saved to a file and the environment variable TIMESERIES_CONFIG is set to reflect the location of the file.
Initialize Config object from keyword arguments.
- Keyword Arguments:
preset (str) – Optional. Name of a preset configuration. If provided, the preset configuration is loaded, and no other parameters are considered.
configuration_file (str) – Path to the configuration file. If the parameter is not provided, the environment variable TIMESERIES_CONFIG is used. If the environment variable is not set, the default configuration file location is used.
timeseries_root (str) – Path to the root directory for time series data. If one of these identifies a valid JSON file, the configuration is loaded from that file and no other parameters are required. If provided, they will override values from the configuration file.
catalog (str) – Path to the catalog file.
log_file (str) – Path to the log file.
bucket (str) – Name of the GCS bucket.
- Raises:
FileNotFoundError – If the configuration file as implied by provided or not provided parameters does not exist.
ValidationError – If the resulting configuration is not valid.
MissingEnvironmentVariableError – If the environment variable TIMESERIES_CONFIG is not defined.
Examples
Load an existing config from TIMESERIES_CONFIG or default location:
>>> from ssb_timeseries.config import Config
>>> config = Config.active()
- __eq__(other)¶
Equality test.
- Return type:
bool
- Parameters:
other (Self | dict)
- __getitem__(item)¶
Get the value of a configuration.
- Return type:
str
- Parameters:
item (str)
- __init__(**kwargs)¶
Initialize Config object from keyword arguments.
- Keyword Arguments:
preset (str) – Optional. Name of a preset configuration. If provided, the preset configuration is loaded, and no other parameters are considered.
configuration_file (str) – Path to the configuration file. If the parameter is not provided, the environment variable TIMESERIES_CONFIG is used. If the environment variable is not set, the default configuration file location is used.
timeseries_root (str) – Path to the root directory for time series data. If one of these identifies a valid JSON file, the configuration is loaded from that file and no other parameters are required. If provided, they will override values from the configuration file.
catalog (str) – Path to the catalog file.
log_file (str) – Path to the log file.
bucket (str) – Name of the GCS bucket.
- Raises:
FileNotFoundError – If the configuration file as implied by provided or not provided parameters does not exist.
ValidationError – If the resulting configuration is not valid.
MissingEnvironmentVariableError – If the environment variable TIMESERIES_CONFIG is not defined.
- Return type:
None
Examples
Load an existing config from TIMESERIES_CONFIG or default location:
>>> from ssb_timeseries.config import Config
>>> config = Config.active()
- __str__()¶
Return timeseries configurations as JSON string.
- Return type:
str
- classmethod active()¶
Force reload and return the configuration identified by ENV_VAR_NAME.
- Return type:
Self
- apply(configuration)¶
Set configuration values from a dictionary.
- Return type:
None
- Parameters:
configuration (dict)
- bucket: str | PathLike[str]¶
The topmost level of the GCS bucket for the team.
- catalog: str | PathLike[str]¶
The path to the metadata directory of a repository.
- configuration_file: str | PathLike[str]¶
The path to the configuration file.
- property is_valid: bool¶
Check if the configuration has all required fields.
- log_file: str | PathLike[str]¶
The path to the log file.
- save(path='')¶
Saves configurations to the JSON file defined by path or configuration_file.
If path is set, it takes precedence and configuration_file is set accordingly.
- Parameters:
path (PathStr) – Full path of the JSON file to save to. If not specified, it will attempt to use the environment variable TIMESERIES_CONFIG before falling back to the default location $HOME/.config/ssb_timeseries/timeseries_config.json.
- Raises:
ValueError – If path is not provided and configuration_file is not set.
- Return type:
None
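A rough sketch of the save behaviour, using only the standard library. save_config is a hypothetical stand-in operating on a plain dict, not the Config.save() implementation. It follows the Raises clause above (a ValueError when neither path nor configuration_file is available) and leaves out the fallback to the environment variable and default location mentioned for the path parameter.

```python
import json
import os
from pathlib import Path


def save_config(settings: dict, path: str = "") -> str:
    """Hypothetical stand-in for the documented save() behaviour."""
    # An explicit path takes precedence over the stored configuration_file.
    target = path or settings.get("configuration_file", "")
    if not target:
        raise ValueError("No path given and configuration_file is not set.")
    settings["configuration_file"] = str(target)
    Path(target).parent.mkdir(parents=True, exist_ok=True)
    Path(target).write_text(json.dumps(settings, indent=4), encoding="utf-8")
    # Point the environment variable at the file that was just written.
    os.environ["TIMESERIES_CONFIG"] = str(target)
    return str(target)
```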
- timeseries_root: str | PathLike[str]¶
The root directory for data storage of a repository.
- DAPLA_BUCKET = 'gs://-'¶
Returns the Dapla product bucket name for the current environment: gs://{DAPLA_TEAM}-{DAPLA_ENV}.
- DAPLA_ENV = ''¶
Returns the Dapla environment: ‘prod’ | ‘test’ | ‘dev’.
- DAPLA_TEAM = ''¶
Returns the Dapla team/project name.
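The bucket name pattern can be illustrated with a small sketch. dapla_bucket is a hypothetical helper, not part of the module; it only shows how a gs://{DAPLA_TEAM}-{DAPLA_ENV} pattern produces the value 'gs://-' listed above when both the team and the environment are empty strings, as they are on this page.

```python
def dapla_bucket(team: str = "", env: str = "") -> str:
    """Hypothetical helper mirroring the gs://{DAPLA_TEAM}-{DAPLA_ENV} pattern."""
    return f"gs://{team}-{env}"


print(dapla_bucket())                   # gs://-
print(dapla_bucket("my-team", "prod"))  # gs://my-team-prod
```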
- class DictObject(dict_)¶
Bases:
object
Helper class to convert dict to object.
- Parameters:
dict_ (dict)
- classmethod from_dict(d)¶
- Parameters:
d (dict)
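One common way to expose dictionary keys as attributes is sketched below. DictAsObject is a hypothetical class written for illustration; the actual DictObject implementation is not shown on this page and may differ, for example in how nested dictionaries are handled.

```python
class DictAsObject:
    """Hypothetical sketch; not the library's DictObject implementation."""

    def __init__(self, dict_: dict) -> None:
        # Expose every key as an attribute on the instance.
        self.__dict__.update(dict_)

    @classmethod
    def from_dict(cls, d: dict) -> "DictAsObject":
        return cls(d)


obj = DictAsObject.from_dict({"bucket": "gs://example", "log_file": "ts.log"})
print(obj.bucket)  # gs://example
```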
- exception MissingEnvironmentVariableError¶
Bases:
Exception
The environment variable TIMESERIES_CONFIG must be defined.
- exception ValidationError¶
Bases:
Exception
Configuration validation error.
- active_file(path='')¶
If a path is provided, sets environment variable ENV_VAR_NAME to specify the location of the configuration file.
Returns the value of the environment variable by way of get_active_file().
- Return type:
str
- Parameters:
path (str | PathLike[str])
- configuration_schema(version='0.3.1')¶
Return the JSON schema for the configuration file.
- Return type:
dict
- Parameters:
version (str)
- is_valid_config(configuration)¶
Check if a dictionary is a valid configuration.
A valid configuration has the same keys as DEFAULTS.
- Return type:
tuple[bool, object]
- Parameters:
configuration (dict)
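The key comparison described above might look roughly like this. The DEFAULTS dictionary here is a hypothetical stand-in built from the attribute names documented on this page, and has_default_keys is an illustrative helper; the real function may report its second tuple element differently.

```python
# Hypothetical stand-in for the module's DEFAULTS; the real values are not shown here.
DEFAULTS = {
    "configuration_file": "",
    "timeseries_root": "",
    "catalog": "",
    "log_file": "",
    "bucket": "",
}


def has_default_keys(configuration: dict) -> tuple[bool, object]:
    """Return (is_valid, details); details lists missing and unexpected keys."""
    missing = sorted(DEFAULTS.keys() - configuration.keys())
    unexpected = sorted(configuration.keys() - DEFAULTS.keys())
    is_valid = not missing and not unexpected
    return is_valid, {"missing": missing, "unexpected": unexpected}
```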
- load_json_file(path, error_on_missing=False)¶
Read configurations from a JSON file into a Config object.
- Return type:
dict
- Parameters:
path (str | PathLike[str])
error_on_missing (bool)
- main(*args)¶
Set configurations to predefined defaults when run from command line.
- Use:
poetry run timeseries-config <option>
- or
python ./config.py <option>
- Parameters:
*args (str) – ‘home’ | ‘gcs’ | ‘jovyan’.
- Raises:
ValueError – If args is not ‘home’ | ‘gcs’ | ‘jovyan’.
- Return type:
None
- migrate_to_new_config_location(file_to_copy='')¶
Copy existing configuration files to the new default location $HOME/.config/ssb_timeseries/.
The first file copied will be set to active.
- Parameters:
file_to_copy (PathStr) – Optional. Path to an existing configuration file. If not provided, the function will look in the most common locations for SSB’s old JupyterLab and DaplaLab.
- Return type:
str
- path_str(*args)¶
Concatenate paths as string: str(Path(…)).
- Return type:
str
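A minimal equivalent using pathlib is sketched below. join_as_str is a hypothetical name, and passing the arguments straight to Path is an assumption based on the one-line description above.

```python
from pathlib import Path


def join_as_str(*args: str) -> str:
    """Hypothetical equivalent: join path segments and return them as a string."""
    return str(Path(*args))
```

The separator in the result depends on the operating system, since Path resolves to the platform's concrete path class.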
- presets(named_config)¶
Set configurations to predefined defaults.
- Raises:
ValueError – If named_config is not ‘home’ | ‘gcs’ | ‘jovyan’.
- Return type:
dict
- Parameters:
named_config (str)
- unset_env_var()¶
Unsets the environment variable ENV_VAR_NAME and returns the value that was unset.
- Return type:
str
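The set, read and unset behaviour of active_file() and unset_env_var() can be approximated with the standard library as follows. set_active and unset_active are hypothetical names, ENV_VAR_NAME is assumed to hold the string TIMESERIES_CONFIG, and returning an empty string when the variable is not set is a choice made for this sketch only.

```python
import os

ENV_VAR_NAME = "TIMESERIES_CONFIG"  # assumed value of the module constant


def set_active(path: str = "") -> str:
    """Hypothetical sketch of active_file(): optionally set, then return the value."""
    if path:
        os.environ[ENV_VAR_NAME] = str(path)
    return os.environ.get(ENV_VAR_NAME, "")


def unset_active() -> str:
    """Hypothetical sketch of unset_env_var(): remove the variable, return the old value."""
    return os.environ.pop(ENV_VAR_NAME, "")
```

Changes made through os.environ affect the current process and any child processes it starts, not the parent shell.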