Reference

pre_system package

pre_system.additive_benchmark module

additive_benchmark(df_indicator, df_target, liste_km, startyear, endyear)

Adjust values in df_indicator to match df_target using an additive quota adjustment.

This function aggregates df_indicator to the frequency of df_target, calculates the difference between the two, and spreads that difference evenly across the corresponding periods of df_indicator. The adjusted df_indicator is then returned, with non-overlapping columns left untreated.

Author: Benedikt Goodman, National accounts Department and Vemund Rundberget, Macroeconomics Department, Research Division, SSB

Parameters:
  • df_indicator (DataFrame) – DataFrame containing the indicator values that will be benchmarked. Must have a PeriodIndex.

  • df_target (DataFrame) – DataFrame containing the values to benchmark additively against. Must have a PeriodIndex.

  • liste_km (list[str] | str) – List of series names (columns) to benchmark. If a single string is provided, it is automatically wrapped into a list.

  • startyear (int) – The starting year (inclusive) of the benchmarking period.

  • endyear (int) – The ending year (inclusive) of the benchmarking period. Must be <= 2099.

Returns:

An adjusted DataFrame with the same frequency as the original df_target, with non-overlapping columns left untreated.

Return type:

pd.DataFrame

Raises:
  • TypeError – If inputs are not DataFrames, indices are not PeriodIndex, parameters are of incorrect type, or required columns are missing.

  • AssertionError – If the chosen start or end years are not available in either the indicator or target DataFrames.

Notes

  • Any series containing only zeros, non-numeric values, or NaNs will be excluded from benchmarking.

  • The method ensures consistency: the sum of the adjusted indicator series over each target period will exactly match the corresponding target value.

  • This corresponds to an additive Denton-style temporal disaggregation.

Examples

>>> import pandas as pd
>>> from pre_system.additive_benchmark import additive_benchmark
>>> idx_monthly = pd.period_range("2022-01", periods=6, freq="M")
>>> idx_quarterly = pd.period_range("2022Q1", periods=2, freq="Q")
>>> df_indicator = pd.DataFrame({"value": [1, 2, 3, 4, 5, 6]}, index=idx_monthly)
>>> df_target = pd.DataFrame({"value": [21, 15]}, index=idx_quarterly)
>>> additive_benchmark(df_indicator, df_target, ["value"], 2022, 2022).sum()
value    36.0
dtype: float64
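
The even spreading described above can be traced by hand. The following is a minimal pure-Python sketch of the arithmetic only, using the numbers from the example; the helper name additive_spread is hypothetical, and the sketch assumes that every target period covers the same number of indicator periods. It is not the package's implementation.

```python
def additive_spread(indicator, targets, periods_per_target):
    """Spread each target-period gap evenly over its sub-periods (hypothetical helper)."""
    adjusted = []
    for i, target in enumerate(targets):
        start = i * periods_per_target
        chunk = indicator[start:start + periods_per_target]
        # Gap between the target and the aggregated indicator for this period.
        gap = target - sum(chunk)
        # Each sub-period receives an equal share of the gap.
        adjusted.extend(value + gap / periods_per_target for value in chunk)
    return adjusted


monthly = [1, 2, 3, 4, 5, 6]   # indicator values, as in the example above
quarterly = [21, 15]           # target values, as in the example above
adjusted = additive_spread(monthly, quarterly, 3)
print(adjusted)       # [6.0, 7.0, 8.0, 4.0, 5.0, 6.0]
print(sum(adjusted))  # 36.0
```

The first quarter is short by 15, so each of its months gains 5; the second quarter already matches its target and is left unchanged.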

pre_system.chaining module

chain_df(val_df, fp_df, serieslist=None, baseyear=None, startyear=None, endyear=None, appendvlname=False)

Chaining economic time series data.

Validates and prepares two dataframes for year-over-year chaining of economic time series, checking formats, types, and constraints. Warnings are issued for data issues such as NaN values or missing data columns.

This function validates the input dataframes, extracts overlapping column series based on user input or intersection of dataframe columns, and ensures proper filtering to only include non-problematic data for all chaining operations. Columns with issues such as non-numeric data, NaN values, or zero-only values are warned about, and will be excluded from chaining processing. The function also checks for start, end, and base year constraints, ensuring valid time ranges for chaining.

Parameters:
  • val_df (DataFrame) – Input dataframe containing current price values data used for chaining.

  • fp_df (DataFrame) – Input dataframe containing fixed price values data used for chaining.

  • serieslist (list[str] | str | None) – List of column names, a single string, or None specifying the series to chain. Uses the intersection of columns from the dataframes if None.

  • baseyear (int | None) – An integer specifying the base year for chaining operations. Must be within valid constraints.

  • startyear (int | None) – Specifies the start year for limiting the chaining range. Computes automatically based on data ranges if not provided.

  • endyear (int | None) – Specifies the end year for limiting the chaining range. Computes automatically based on data ranges if not provided.

  • appendvlname (bool) – Whether to append suffix/prefix to chained series. Defaults to False.

Returns:

A dataframe containing the chained series for all specified or detected valid columns in the input dataframes. Columns with detected issues are excluded from the output.

Return type:

pd.DataFrame

Raises:
  • TypeError – If invalid types or indices are encountered in inputs, or if year constraints fail.

  • AssertionError – If the selected start or end year is not present in either of the input DataFrames (val_df or fp_df).
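
As an illustration of year-over-year chaining in general, here is a hedged sketch using plain dicts keyed by year. It assumes that the fixed-price values are expressed in the previous year's prices, that the chained series equals the current-price value in the base year, and that years are consecutive. None of these assumptions is confirmed by the documentation above, and the helper name chain_annual is hypothetical.

```python
def chain_annual(val, fp, baseyear):
    """Chain-link annual values (hypothetical helper, not the package's code).

    val: {year: value in current prices}
    fp:  {year: value in the previous year's prices} (assumption)
    """
    years = sorted(val)
    chained = {baseyear: float(val[baseyear])}
    # Forward from the base year: apply the volume growth fp[t] / val[t - 1].
    for year in years:
        if year > baseyear:
            chained[year] = chained[year - 1] * fp[year] / val[year - 1]
    # Backward from the base year: undo the growth into the following year.
    for year in reversed(years):
        if year < baseyear:
            chained[year] = chained[year + 1] * val[year] / fp[year + 1]
    return dict(sorted(chained.items()))


val = {2020: 100.0, 2021: 120.0, 2022: 150.0}
fp = {2021: 110.0, 2022: 132.0}
chained = chain_annual(val, fp, baseyear=2021)
print(chained)
```

With these numbers the volume grows by 10 percent in each year, so the chained series keeps the base-year level of 120 in 2021 and moves by that growth rate in both directions.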

pre_system.convert module

convert(input_df, to_freq)

Upsamples or downsamples a DataFrame with a PeriodIndex to the specified frequency.

Parameters:
  • input_df (pd.DataFrame) – The input DataFrame with a PeriodIndex to be converted.

  • to_freq (str) – The target frequency to which the DataFrame will be converted. Valid options: ‘A’ (annual), ‘Q’ (quarterly), or ‘M’ (monthly).

Returns:

The converted DataFrame with the specified frequency.

Return type:

pd.DataFrame

Raises:
  • TypeError – If the DataFrame does not have a PeriodIndex.

  • ValueError – If the conversion is not possible for the given input frequency.

Notes

  • The conversion is performed by resampling the input DataFrame using the sum aggregation function.

  • If the target frequency is lower than or equal to the input frequency, the conversion is done using resampling only.

  • If the target frequency is higher, a first-order conditions matrix is constructed and solved to fill missing values.
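
The downsampling case, where values are aggregated by summation, can be sketched without pandas. The example below is a simplified illustration with (year, month) tuples standing in for a PeriodIndex; the helper name monthly_to_quarterly is hypothetical, and the upsampling case is not covered.

```python
from collections import defaultdict


def monthly_to_quarterly(monthly):
    """Sum monthly values into quarters (hypothetical helper)."""
    quarterly = defaultdict(float)
    for (year, month), value in monthly.items():
        quarter = (month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        quarterly[(year, quarter)] += value
    return dict(quarterly)


monthly = {(2022, m): float(m) for m in range(1, 7)}
quarterly = monthly_to_quarterly(monthly)
print(quarterly)  # {(2022, 1): 6.0, (2022, 2): 15.0}
```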

convert_step(input_df, to_freq)

Converts a time series DataFrame from one periodic frequency to another.

The input DataFrame is expected to have a PeriodIndex as its index and consist of numeric columns. The conversion can be performed for annual, quarterly, or monthly frequencies, specified in the to_freq parameter.

Parameters:
  • input_df (DataFrame) – The input DataFrame with a PeriodIndex as its index and numeric columns representing time series data.

  • to_freq (str) – The desired periodic frequency to which the input DataFrame should be converted. Valid values are “A” (Annual), “Q” (Quarterly), or “M” (Monthly).

Returns:

A DataFrame resampled and converted to the desired frequency, preserving the numeric structure of the input time series.

Return type:

pd.DataFrame

Raises:
  • TypeError – If the input DataFrame does not have a PeriodIndex or contains non-numeric columns.

  • ValueError – If the conversion is not possible for the given input frequency.

pre_system.formula module

class AddCorr(formula, correction_name)

Bases: Formula

Apply an additive correction to a formula.

Adds a named correction series to the evaluated series.

Parameters:
  • formula (Formula)

  • correction_name (str)

property baseyear: int | None

Base year used for the additive correction formula.

Returns:

The base year if set, otherwise None.

Return type:

int | None

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the data using the provided DataFrames and return the evaluated series.

Parameters:
  • annual_df (pd.DataFrame) – The DataFrame containing annual data.

  • indicators_df (pd.DataFrame) – The DataFrame containing indicator data.

  • weights_df (pd.DataFrame, optional) – The DataFrame containing weight data. Defaults to None.

  • correction_df (pd.DataFrame, optional) – The DataFrame containing correction data. Defaults to None.

  • test_dfs (bool)

Returns:

The evaluated series.

Return type:

pd.Series

Raises:
  • ValueError – If the baseyear is not set.

  • TypeError – If any of the input DataFrames is not of type pd.DataFrame.

  • AttributeError – If the index of any DataFrame is not of type pd.PeriodIndex or has incorrect frequency.

  • IndexError – If the baseyear is out of range for any of the DataFrames.

  • NameError – If the required column names are not present in the DataFrames.

indicators_weights(trace=True)

Indicator-weight pairs for this indicator formula.

This method uses the _formula.indicators_weights to compute the indicator weights when trace is set to True. If trace is False, it returns an empty list.

Parameters:

trace (bool) – A flag to indicate whether to calculate the weights with or without tracing.

Returns:

A list of tuples where each tuple contains the indicator name as a string and its corresponding weight as a float. Returns an empty list if trace is False.

Return type:

list[tuple[str, float]]

property what: str

Textual representation of this additive correction formula.

Returns:

Expression showing the additive correction applied to the formula.

Return type:

str
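
What an additive correction amounts to can be sketched with plain dicts keyed by period instead of pandas objects. The helper name add_correction and the sample values are hypothetical, and treating periods without a correction value as unchanged is an assumption; the sketch illustrates the idea rather than the class's implementation.

```python
def add_correction(series, correction):
    """Add a correction to a series period by period (hypothetical helper)."""
    # Assumption: periods without a correction value are left unchanged.
    return {period: value + correction.get(period, 0.0)
            for period, value in series.items()}


evaluated = {"2022Q1": 10.0, "2022Q2": 12.0, "2022Q3": 11.0}
correction = {"2022Q2": -2.0, "2022Q3": 0.5}
corrected = add_correction(evaluated, correction)
print(corrected)  # {'2022Q1': 10.0, '2022Q2': 10.0, '2022Q3': 11.5}
```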

class FDeflate(name, formula, indicators, weights=None, correction=None, normalise=False)

Bases: Formula

Deflate a base formula by indicator(s).

Computes a series proportional to a base formula divided by (possibly weighted) indicator(s), optionally normalised and corrected. See __init__ for parameter details.

Parameters:
  • name (str)

  • formula (Formula)

  • indicators (list[str])

  • weights (list[str] | list[float] | None)

  • correction (str | None)

  • normalise (bool)

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the deflation formula.

Parameters:
  • annual_df (DataFrame) – Annual level series.

  • indicators_df (DataFrame) – Indicator series.

  • weights_df (DataFrame | None) – Optional weights used to combine indicators.

  • correction_df (DataFrame | None) – Optional correction series applied after deflation.

  • test_dfs (bool) – Whether to validate inputs (recommended).

Returns:

The deflated series aligned to the indicator frequency.

Return type:

pd.Series

Raises:
  • NameError – If a required column name is missing from one of the DataFrames.

  • AttributeError – If the index is not of type pd.PeriodIndex or has an incorrect frequency.

property indicators: list[str]

Indicator column names used by this deflation formula.

Returns:

Indicator identifiers expected in indicators_df.

Return type:

list[str]

indicators_weights(trace=True)

Indicator-weight pairs for this indicator formula.

Parameters:

trace (bool) – Ignored for this class; included for API symmetry.

Returns:

Pairs of indicators with their numeric weights. Raises TypeError if any weight is not a float.

Return type:

list[tuple[str, float]]

property weights: list[str] | list[float]

Weights to apply to indicators in deflation.

Returns:

Either names of weight columns (from weights_df) or constant numeric weights. Defaults to 1.0 per indicator when not provided.

Return type:

list[str] | list[float]

property what: str

Textual representation of this deflation formula.

Returns:

Expression showing base formula divided by weighted indicators (optionally normalised and corrected) and scaled to the base year.

Return type:

str

class FDiv(name, formula1, formula2)

Bases: Formula

Element-wise division of two formulas.

Divides one formula series by another with matching indices.

Parameters:
  • name (str)

  • formula1 (Formula)

  • formula2 (Formula)

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the data using the provided DataFrames and return the evaluated series.

Parameters:
  • annual_df (pd.DataFrame) – The DataFrame containing annual data.

  • indicators_df (pd.DataFrame) – The DataFrame containing indicator data.

  • weights_df (pd.DataFrame, optional) – The DataFrame containing weight data. Defaults to None.

  • correction_df (pd.DataFrame, optional) – The DataFrame containing correction data. Defaults to None.

  • test_dfs (bool)

Returns:

The evaluated series.

Return type:

pd.Series

Raises:
  • ValueError – If the baseyear is not set, or if formula1 or formula2 does not evaluate.

  • TypeError – If any of the input DataFrames is not of type pd.DataFrame.

  • AttributeError – If the index of any DataFrame is not of type pd.PeriodIndex or has incorrect frequency.

  • IndexError – If the baseyear is out of range for any of the DataFrames.

indicators_weights(trace=True)

Indicator-weight pairs for this indicator formula.

This method aggregates the weights of indicators from a collection of formulas. If trace is set to True, the weights from each formula’s indicators are retrieved and combined.

Parameters:

trace (bool) – A flag to indicate whether to calculate the weights with or without tracing.

Returns:

A list of tuples where each tuple contains the indicator name as a string and its corresponding weight as a float.

Return type:

list[tuple[str, float]]

property what: str

Textual representation of this division formula.

Returns:

Expression showing the division of two operand formulas.

Return type:

str

class FInflate(name, formula, indicators, weights=None, correction=None, normalise=False)

Bases: Formula

Inflate a base formula by indicator(s).

Computes a series proportional to a base formula multiplied by (possibly weighted) indicator(s), optionally normalised and corrected. See __init__ for parameter details.

Parameters:
  • name (str)

  • formula (Formula)

  • indicators (list[str])

  • weights (list[str] | list[float] | None)

  • correction (str | None)

  • normalise (bool)

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the inflation formula.

Parameters:
  • annual_df (DataFrame) – Annual level series.

  • indicators_df (DataFrame) – Indicator series.

  • weights_df (DataFrame | None) – Optional weights used to combine indicators.

  • correction_df (DataFrame | None) – Optional correction series applied after inflation.

  • test_dfs (bool) – Whether to validate inputs (recommended).

Returns:

The inflated series aligned to the indicator frequency.

Return type:

pd.Series

Raises:
  • NameError – If a required column name is missing from one of the DataFrames.

  • AttributeError – If the index is not of type pd.PeriodIndex or has an incorrect frequency.

property indicators: list[str]

Indicator column names used by this inflation formula.

Returns:

Indicator identifiers expected in indicators_df.

Return type:

list[str]

indicators_weights(trace=True)

Indicator-weight pairs for this indicator formula.

Parameters:

trace (bool) – Ignored for this class; included for API symmetry.

Returns:

Pairs of indicators with their numeric weights. Raises TypeError if any weight is not a float.

Return type:

list[tuple[str, float]]

property weights: list[str] | list[float]

Weights to apply to indicators in inflation.

Returns:

Either names of weight columns (from weights_df) or constant numeric weights. Defaults to 1.0 per indicator when not provided.

Return type:

list[str] | list[float]

property what: str

Textual representation of this inflation formula.

Returns:

Expression showing base formula multiplied by weighted indicators (optionally normalised and corrected) and scaled to the base year.

Return type:

str

class FJoin(name, formula1, formula0, from_year)

Bases: Formula

Join two formulas at a given year.

Uses one formula up to (from_year - 1) and another from from_year onward.

Parameters:
  • name (str)

  • formula1 (Formula)

  • formula0 (Formula)

  • from_year (int)

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the data using the provided DataFrames and return the evaluated series.

Parameters:
  • annual_df (pd.DataFrame) – The DataFrame containing annual data.

  • indicators_df (pd.DataFrame) – The DataFrame containing indicator data.

  • weights_df (pd.DataFrame, optional) – The DataFrame containing weight data. Defaults to None.

  • correction_df (pd.DataFrame, optional) – The DataFrame containing correction data. Defaults to None.

  • test_dfs (bool)

Returns:

The evaluated series.

Return type:

pd.Series

Raises:
  • ValueError – If the baseyear is not set.

  • TypeError – If any of the input DataFrames is not of type pd.DataFrame.

  • AttributeError – If the index of any DataFrame is not of type pd.PeriodIndex or has incorrect frequency.

  • IndexError – If the baseyear is out of range for any of the DataFrames.

  • NameError – If the required column names are not present in the DataFrames.

property indicators: list[str]

All indicator names used by both joined formulas.

Returns:

Unique indicator names from both formulas.

Return type:

list[str]

indicators_weights(trace=True)

Indicator-weight pairs for this indicator formula.

This method uses the _formula.indicators_weights to compute the indicator weights when trace is set to True. If trace is False, it returns an empty list.

Parameters:

trace (bool) – A flag to indicate whether to calculate the weights with or without tracing.

Returns:

A list of tuples where each tuple contains the indicator name as a string and its corresponding weight as a float. Returns an empty list if trace is False.

Return type:

list[tuple[str, float]]

property what: str

Textual representation of this join formula.

Returns:

Expression showing which formula is used for each year.

Return type:

str
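
Joining two series at a given year can be sketched as follows, with dicts keyed by year standing in for the evaluated series. The helper name join_at and the argument names early and late are hypothetical; which of formula1 and formula0 plays which role is not stated above, so the sketch deliberately avoids those names.

```python
def join_at(early, late, from_year):
    """Use `early` up to from_year - 1 and `late` from from_year onward."""
    joined = {year: v for year, v in early.items() if year < from_year}
    joined.update({year: v for year, v in late.items() if year >= from_year})
    return dict(sorted(joined.items()))


early = {2019: 1.0, 2020: 2.0, 2021: 3.0}
late = {2020: 20.0, 2021: 30.0, 2022: 40.0}
joined = join_at(early, late, 2021)
print(joined)  # {2019: 1.0, 2020: 2.0, 2021: 30.0, 2022: 40.0}
```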

class FMult(name, formula1, formula2)

Bases: Formula

Element-wise product of two formulas.

Multiplies two formula series with matching indices.

Parameters:
  • name (str)

  • formula1 (Formula)

  • formula2 (Formula)

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the data using the provided DataFrames and return the evaluated series.

Parameters:
  • annual_df (pd.DataFrame) – The DataFrame containing annual data.

  • indicators_df (pd.DataFrame) – The DataFrame containing indicator data.

  • weights_df (pd.DataFrame, optional) – The DataFrame containing weight data. Defaults to None.

  • correction_df (pd.DataFrame, optional) – The DataFrame containing correction data. Defaults to None.

  • test_dfs (bool)

Returns:

The evaluated series.

Return type:

pd.Series

Raises:
  • ValueError – If the baseyear is not set, or if formula1 or formula2 does not evaluate.

  • TypeError – If any of the input DataFrames is not of type pd.DataFrame.

  • AttributeError – If the index of any DataFrame is not of type pd.PeriodIndex or has incorrect frequency.

  • IndexError – If the baseyear is out of range for any of the DataFrames.

indicators_weights(trace=True)

Indicator-weight pairs for this indicator formula.

This method aggregates the weights of indicators from a collection of formulas. If trace is set to True, the weights from each formula’s indicators are retrieved and combined.

Parameters:

trace (bool) – A flag to indicate whether to calculate the weights with or without tracing.

Returns:

A list of tuples where each tuple contains the indicator name as a string and its corresponding weight as a float.

Return type:

list[tuple[str, float]]

property what: str

Textual representation of this product formula.

Returns:

Expression showing the product of two operand formulas.

Return type:

str

class FSum(name, *formulae)

Bases: Formula

Sum of multiple formulas.

Produces a series that is the element-wise sum of its operand formulas.

Parameters:
  • name (str)

  • formulae (Formula)

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the data using the provided DataFrames and return the evaluated series.

Parameters:
  • annual_df (pd.DataFrame) – The DataFrame containing annual data.

  • indicators_df (pd.DataFrame) – The DataFrame containing indicator data.

  • weights_df (pd.DataFrame, optional) – The DataFrame containing weight data. Defaults to None.

  • correction_df (pd.DataFrame, optional) – The DataFrame containing correction data. Defaults to None.

  • test_dfs (bool)

Returns:

The evaluated series.

Return type:

pd.Series

Raises:
  • ValueError – If any of the formulae do not evaluate.

  • TypeError – If any of the input DataFrames is not of type pd.DataFrame.

  • AttributeError – If the index of any DataFrame is not of type pd.PeriodIndex or has incorrect frequency.

  • IndexError – If the baseyear is out of range for any of the DataFrames.

  • NameError – If the required column names are not present in the DataFrames.

indicators_weights(trace=True)

Indicator-weight pairs for this indicator formula.

This method uses the _formula.indicators_weights to compute the indicator weights when trace is set to True. If trace is False, it returns an empty list.

Parameters:

trace (bool) – A flag to indicate whether to calculate the weights with or without tracing.

Returns:

A list of tuples where each tuple contains the indicator name as a string and its corresponding weight as a float. Returns an empty list if trace is False.

Return type:

list[tuple[str, float]]

property what: str

Textual representation of this sum formula.

Returns:

Expression showing the sum of operand formulas.

Return type:

str

class FSumProd(name, formulae, weights)

Bases: Formula

Weighted sum of products of formulas.

Computes a linear combination of operand formulas with either numeric coefficients or weight column names.

Parameters:
  • name (str)

  • formulae (list[Formula])

  • weights (list[float] | list[str])

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the data using the provided DataFrames and return the evaluated series.

Parameters:
  • annual_df (pd.DataFrame) – The DataFrame containing annual data.

  • indicators_df (pd.DataFrame) – The DataFrame containing indicator data.

  • weights_df (pd.DataFrame, optional) – The DataFrame containing weight data. Defaults to None.

  • correction_df (pd.DataFrame, optional) – The DataFrame containing correction data. Defaults to None.

  • test_dfs (bool)

Returns:

The evaluated series.

Return type:

pd.Series

Raises:
  • ValueError – If any of the formulae do not evaluate.

  • TypeError – If any of the input DataFrames is not of type pd.DataFrame.

  • AttributeError – If the index of any DataFrame is not of type pd.PeriodIndex or has incorrect frequency.

  • IndexError – If the baseyear is out of range for any of the DataFrames.

  • NameError – If the required column names are not present in the DataFrames.

indicators_weights(trace=True)

Indicator-weight pairs for this indicator formula.

This method aggregates the weights of indicators from a collection of formulas. If trace is set to True, the weights from each formula’s indicators are retrieved and combined.

Parameters:

trace (bool) – A flag to indicate whether to calculate the weights with or without tracing.

Returns:

A list of tuples where each tuple contains the indicator name as a string and its corresponding weight as a float.

Return type:

list[tuple[str, float]]

property what: str

Textual representation of this weighted sum-product formula.

Returns:

Expression showing the weighted sum of products of operand formulas.

Return type:

str
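
A linear combination with numeric coefficients can be sketched as below. The helper name weighted_sum and the sample data are hypothetical, the operand series are represented as dicts sharing the same period keys, and the case where weights are column names is not covered.

```python
def weighted_sum(series_list, coefficients):
    """Element-wise linear combination of series (hypothetical helper)."""
    if len(series_list) != len(coefficients):
        raise ValueError("one coefficient per series is required")
    # Assumption: every series has the same period keys as the first one.
    periods = series_list[0].keys()
    return {p: sum(c * s[p] for s, c in zip(series_list, coefficients))
            for p in periods}


a = {"2022Q1": 1.0, "2022Q2": 2.0}
b = {"2022Q1": 10.0, "2022Q2": 20.0}
result = weighted_sum([a, b], [0.5, 2.0])
print(result)  # {'2022Q1': 20.5, '2022Q2': 41.0}
```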

class Formula(name)

Bases: object

Abstract base class for all pre-system formulas.

Provides a common interface and input validation for computing time series from annual levels, indicator series, optional weights, and optional corrections. Subclasses supply the concrete computation in the what property, indicators_weights, and evaluate.

Variables:
  • _name (str) – Lower-cased identifier of the formula.

  • _baseyear (int | None) – Base year used for normalisation and alignment.

  • _calls_on (dict[str, Formula]) – Dependency formulas used by this formula.

Parameters:

name (str)

property baseyear: int | None

Base year used by the formula.

Returns:

The base year if set; otherwise None.

Return type:

int | None

property calls_on: dict[str, Formula]

Dependencies this formula uses.

Returns:

Mapping of dependency names to Formula instances.

Return type:

dict[str, Formula]

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the formula using the provided data.

This function is only used by subclasses to check preconditions. In this baseclass it returns a dummy pd.Series object which is not used.

Parameters:
  • annual_df (DataFrame) – The annual data used for evaluation.

  • indicators_df (DataFrame) – The indicator data used for evaluation.

  • weights_df (DataFrame | None) – The weight data used for evaluation. Optional and defaults to None.

  • correction_df (DataFrame | None) – The correction data used for evaluation. Optional and defaults to None.

  • test_dfs (bool) – If dataframes should be tested or not.

Returns:

A dummy pd.Series object. The return value is only valid for subclasses.

Return type:

Series

Raises:
  • ValueError – If the base year is not set or is out of range for the provided data.

  • AttributeError – If the index of any input DataFrame is not a Pandas PeriodIndex or if the frequency is incorrect.

property indicators: list[str]

Indicator column names referenced by this formula.

Returns:

Indicator identifiers expected in indicators_df.

Return type:

list[str]

indicators_weights(trace=True)

List indicator-weight pairs contributing to this formula.

Parameters:

trace (bool) – If True, include pairs from dependencies as well.

Returns:

List of (indicator, weight) pairs. The weight may be a float or a name that resolves to a weight series.

Return type:

list[tuple[str, float]]

info(i=0)

Print a tree view of this formula and its dependencies.

Parameters:

i (int) – Indentation level used internally for recursion.

Return type:

None
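
The recursive tree view can be illustrated with a small stand-in class. Everything here is hypothetical: the class Node, the helper tree_lines, the two-space indentation, and the "name = expression" line format are assumptions made for the sketch, not the output format of the package.

```python
class Node:
    """Hypothetical stand-in for a formula with named dependencies."""

    def __init__(self, name, what, calls_on=None):
        self.name = name.lower()        # names are stored lower-cased
        self.what = what                # human-readable expression
        self.calls_on = calls_on or {}  # mapping of name -> dependency

    def tree_lines(self, i=0):
        """Return one line per formula, indented by depth."""
        lines = [" " * (2 * i) + f"{self.name} = {self.what}"]
        for child in self.calls_on.values():
            lines.extend(child.tree_lines(i + 1))
        return lines

    def info(self, i=0):
        """Print this node and, recursively, its dependencies."""
        print("\n".join(self.tree_lines(i)))


xa = Node("XA", "annual * indicator_a")
xb = Node("XB", "annual * indicator_b")
total = Node("Total", "xa + xb", {"xa": xa, "xb": xb})
total.info()
```

The parameter i plays the same role as in the signature above: callers leave it at 0, and each level of recursion passes i + 1 to its dependencies.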

property name: str

Formula name.

Returns:

Lower-cased unique name of the formula.

Return type:

str

property weights: list[str] | list[float]

Weights used by the formula.

Returns:

Either names of weight columns (to be read from weights_df) or constant numeric weights.

Return type:

list[str] | list[float]

property what: str

Algebraic representation of the formula.

Returns:

A human-readable expression describing how the series is computed.

Return type:

str

class Indicator(name, annual, indicators, weights=None, correction=None, normalise=False, aggregation='sum')

Bases: Formula

Indicator-based disaggregation formula.

Uses one or more indicator series, optionally with weights and an optional correction factor, to distribute an annual level across sub-annual periods. See __init__ for parameter details.

Parameters:
  • name (str)

  • annual (str)

  • indicators (list[str])

  • weights (list[str] | list[float] | None)

  • correction (str | None)

  • normalise (bool)

  • aggregation (str)

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the data using the provided DataFrames and return the evaluated series.

Parameters:
  • annual_df (pd.DataFrame) – The DataFrame containing annual data.

  • indicators_df (pd.DataFrame) – The DataFrame containing indicator data.

  • weights_df (pd.DataFrame, optional) – The DataFrame containing weight data. Defaults to None.

  • correction_df (pd.DataFrame, optional) – The DataFrame containing correction data. Defaults to None.

  • test_dfs (bool)

Returns:

The evaluated series.

Return type:

pd.Series

Raises:
  • ValueError – If the baseyear is not set.

  • TypeError – If any of the input DataFrames is not of type pd.DataFrame.

  • AttributeError – If the index of any DataFrame is not of type pd.PeriodIndex or has incorrect frequency.

  • IndexError – If the baseyear is out of range for any of the DataFrames.

  • NameError – If the required column names are not present in the DataFrames.

property indicators: list[str]

Indicator column names used by this formula.

Returns:

Indicator identifiers expected in indicators_df.

Return type:

list[str]

indicators_weights(trace=True)

Indicator-weight pairs for this indicator formula.

Parameters:

trace (bool) – Ignored for this class; included for API symmetry.

Returns:

Pairs of indicators with their numeric weights. Raises TypeError if any weight is not a float.

Return type:

list[tuple[str, float]]

Raises:

TypeError – If any weight is not a float.

property weights: list[str] | list[float]

Weights to apply to indicators.

Returns:

Either names of weight columns (from weights_df) or constant numeric weights. Defaults to 1.0 per indicator when not provided.

Return type:

list[str] | list[float]

property what: str

Textual representation of this indicator formula.

Returns:

Expression showing how the annual level is distributed by indicators (optionally weighted, normalised, and corrected).

Return type:

str
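
One plausible reading of distributing an annual level by an indicator is a pro-rata split, where each sub-annual period receives a share proportional to its indicator value. The sketch below assumes exactly that, with a single unweighted indicator and no correction or normalisation; the helper name distribute_annual is hypothetical and the class may compute something different, for example relative to the base year.

```python
def distribute_annual(annual_level, indicator):
    """Split an annual level in proportion to indicator values (assumed rule)."""
    total = sum(indicator.values())
    if total == 0:
        raise ValueError("indicator sums to zero; cannot distribute")
    return {period: annual_level * value / total
            for period, value in indicator.items()}


indicator_2022 = {"2022Q1": 1.0, "2022Q2": 2.0, "2022Q3": 3.0, "2022Q4": 4.0}
quarterly = distribute_annual(200.0, indicator_2022)
print(quarterly)
# {'2022Q1': 20.0, '2022Q2': 40.0, '2022Q3': 60.0, '2022Q4': 80.0}
```

Under this rule the quarterly values sum back to the annual level, which corresponds to the default aggregation='sum' in the signature.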

class MultCorr(formula, correction_name)

Bases: Formula

Apply a multiplicative correction to a formula.

Multiplies the evaluated series by a named correction series.

Parameters:
  • formula (Formula)

  • correction_name (str)

property baseyear: int | None

Base year used for the correction formula.

Returns:

The base year if set, otherwise None.

Return type:

int | None

evaluate(annual_df, indicators_df, weights_df=None, correction_df=None, test_dfs=True)

Evaluate the data using the provided DataFrames and return the evaluated series.

Parameters:
  • annual_df (pd.DataFrame) – The DataFrame containing annual data.

  • indicators_df (pd.DataFrame) – The DataFrame containing indicator data.

  • weights_df (pd.DataFrame, optional) – The DataFrame containing weight data. Defaults to None.

  • correction_df (pd.DataFrame, optional) – The DataFrame containing correction data. Defaults to None.

  • test_dfs (bool)

Returns:

The evaluated series.

Return type:

pd.Series

Raises:
  • ValueError – If the baseyear is not set.

  • TypeError – If any of the input DataFrames is not of type pd.DataFrame.

  • AttributeError – If the index of any DataFrame is not of type pd.PeriodIndex or has incorrect frequency.

  • IndexError – If the baseyear is out of range for any of the DataFrames.

  • NameError – If the required column names are not present in the DataFrames.

indicators_weights(trace=True)

Indicator-weight pairs for this indicator formula.

This method uses the _formula.indicators_weights to compute the indicator weights when trace is set to True. If trace is False, it returns an empty list.

Parameters:

trace (bool) – A flag to indicate whether to calculate the weights with or without tracing.

Returns:

A list of tuples where each tuple contains the indicator name as a string and its corresponding weight as a float. Returns an empty list if trace is False.

Return type:

list[tuple[str, float]]

property what: str

Textual representation of this multiplicative correction formula.

Returns:

Expression showing the multiplicative correction applied to the formula.

Return type:

str

pre_system.mind4 module

mind4(mnr, rea, liste_d4, basisaar, startaar, freq='M')

Executes benchmarking of monthly or quarterly data against annual data for specific series, using the MinD4 method.

The function validates input data types, checks for consistency, and addresses edge cases such as missing values or invalid contents. The function allows scaling adjustments for leading and trailing periods using specified start and basis years, which determine the timeframe of analysis.

Author: Vemund Rundberget, Seksjon for makroøkonomi, Forskningsavdelingen, SSB

Parameters:
  • mnr (DataFrame) – DataFrame with monthly or quarterly data having a pd.PeriodIndex.

  • rea (DataFrame) – DataFrame with annual data having a pd.PeriodIndex.

  • liste_d4 (list[str] | str) – List or single string of series names to benchmark.

  • basisaar (int) – Final year to include in the analysis.

  • startaar (int) – Initial year to include in the analysis.

  • freq (Literal['M', 'Q']) – Frequency of the time series data (‘M’ for monthly, ‘Q’ for quarterly). Default is “M”.

Returns:

Dataframe containing the benchmarking results, after MinD4 adjustments.

Return type:

pd.DataFrame

Raises:

TypeError – If freq is not “M” or “Q”; if liste_d4 is not a list or string; if mnr or rea is not a DataFrame with a PeriodIndex; if required series are missing from mnr or rea; if startaar or basisaar is not an integer; if basisaar >= 2050; if basisaar < startaar; or if the monthly/quarterly data contain years that are missing from the yearly data.
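The conditions above can be checked before calling mind4. The following is a minimal sketch only: preflight_mind4 is a hypothetical helper written for illustration, not part of pre_system, and it simply mirrors the documented TypeError conditions using plain pandas.

```python
import pandas as pd


def preflight_mind4(mnr, rea, liste_d4, basisaar, startaar, freq="M"):
    """Hypothetical pre-check mirroring the documented TypeError conditions."""
    if freq not in ("M", "Q"):
        raise TypeError("freq must be 'M' or 'Q'")
    if isinstance(liste_d4, str):
        liste_d4 = [liste_d4]  # a single name is treated as a one-item list
    if not isinstance(liste_d4, list):
        raise TypeError("liste_d4 must be a list or a string")
    for label, df in (("mnr", mnr), ("rea", rea)):
        if not isinstance(df, pd.DataFrame) or not isinstance(df.index, pd.PeriodIndex):
            raise TypeError(f"{label} must be a DataFrame with a PeriodIndex")
        missing = [name for name in liste_d4 if name not in df.columns]
        if missing:
            raise TypeError(f"{label} is missing series: {missing}")
    if not isinstance(startaar, int) or not isinstance(basisaar, int):
        raise TypeError("startaar and basisaar must be integers")
    if basisaar >= 2050 or basisaar < startaar:
        raise TypeError("need startaar <= basisaar < 2050")
    # Every year in the high-frequency data must also exist in the annual data.
    missing_years = set(mnr.index.year) - set(rea.index.year)
    if missing_years:
        raise TypeError(f"years missing in rea: {sorted(missing_years)}")


mnr = pd.DataFrame({"x": range(1, 25)},
                   index=pd.period_range("2020-01", "2021-12", freq="M"))
rea = pd.DataFrame({"x": [100.0, 200.0]},
                   index=pd.period_range("2020", "2021", freq="Y"))
preflight_mind4(mnr, rea, "x", basisaar=2021, startaar=2020)  # passes silently
```

Running such a check first gives a clearer picture of which input is at fault, since the function itself reports all of these cases as TypeError.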

pre_system.minm4 module

minm4(mnr, rea, liste_m4, basisaar, startaar, freq='M')

Perform benchmarking of monthly or quarterly data against yearly data using the MinD4 method.

This function takes two sets of data (monthly/quarterly and yearly), validates them, and applies the MinD4 method to “benchmark” or adjust the input data accordingly. It ensures consistency between the series in terms of data structure, formats, and numerical properties.

Author: Vemund Rundberget, Macroeconomics Department, Research Division, SSB

Parameters:
  • mnr (DataFrame) – DataFrame containing monthly or quarterly data to be benchmarked. The index must be a pandas PeriodIndex.

  • rea (DataFrame) – DataFrame containing yearly data for benchmarking. The index must be a pandas PeriodIndex.

  • liste_m4 (list[str] | str) – A list of series (column names from the DataFrame) to be benchmarked, or a single series as a string.

  • basisaar (int) – The final year for benchmarking (benchmarking range end).

  • startaar (int) – The start year for benchmarking (benchmarking range start).

  • freq (Literal['M', 'Q']) – Frequency of the data. Use “M” for monthly and “Q” for quarterly. Defaults to “M”.

Returns:

A DataFrame with the benchmarked values for the specified series in liste_m4.

Return type:

DataFrame

Raises:

TypeError – If the input parameters or dataframes do not meet the expected types, structures, or contents.

pre_system.multiplicative_benchmark module

multiplicative_benchmark(df_indicator, df_target, liste_km, startyear, endyear)

Perform multiplicative benchmarking of high-frequency indicator data against low-frequency target data over a given time range.

This method adjusts a high-frequency indicator series (e.g., monthly data) so that its aggregated values match a lower-frequency target series (e.g., annual data). The adjustment is multiplicative: indicators are divided by a ratio of their aggregated sums to the target values, and then interpolated back to the high-frequency index.

Author: Magnus Helliesen Kvåle, National accounts Department and Vemund Rundberget, Macroeconomics Department, Research Division, SSB

Parameters:
  • df_indicator (DataFrame) – DataFrame with the high-frequency indicator series. Must have a PeriodIndex at a higher frequency (e.g., monthly).

  • df_target (DataFrame) – DataFrame with the low-frequency target (benchmark) series. Must have a PeriodIndex at a lower frequency (e.g., yearly).

  • liste_km (list[str] | str) – Column names to benchmark. If a single string is provided, it will be wrapped in a list.

  • startyear (int) – The starting year (inclusive) of the benchmarking period.

  • endyear (int) – The ending year (inclusive) of the benchmarking period. Must be <= 2099.

Returns:

A DataFrame with the benchmarked indicator series, indexed by the same frequency as df_indicator.

Return type:

pd.DataFrame

Raises:
  • TypeError – If inputs are not DataFrames, indices are not PeriodIndex, or parameters have incorrect types.

  • AssertionError – If the selected start or end year is not present in either the indicator or target DataFrames.

Warns:

UserWarning – If zero-only series, non-numeric values, or NaNs are detected. Such series are excluded from benchmarking.

Notes

  • Series containing only zeros, non-numeric values, or NaNs are excluded.

  • Consistency is ensured: the sum of the adjusted indicator over a target period exactly matches the corresponding target value.

  • Equivalent to the multiplicative Denton method used in temporal disaggregation of time series.

Examples

>>> import pandas as pd
>>> from pre_system.multiplicative_benchmark import multiplicative_benchmark
>>> idx_monthly = pd.period_range("2018-01", "2019-12", freq="M")
>>> idx_yearly = pd.period_range("2018", "2019", freq="Y")
>>> df_indicator = pd.DataFrame({"A": range(len(idx_monthly))}, index=idx_monthly)
>>> df_target = pd.DataFrame({"A": [66, 210]}, index=idx_yearly)
>>> multiplicative_benchmark(df_indicator, df_target, "A", 2018, 2019).head()
           A
2018-01  0.0
2018-02  1.0
2018-03  2.0
2018-04  3.0
2018-05  4.0
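
In the example above the annual sums of the indicator already equal the targets (66 and 210), so the output is unchanged. To show a case where the adjustment is visible, the following is a minimal sketch of the proportional scaling described in the summary: aggregate the indicator per year, form the ratio to the target, and divide each month by its year's ratio. prorata_benchmark is a hypothetical stand-in written with plain pandas, not the library's implementation, and it assumes no annual indicator sum is zero.

```python
import pandas as pd


def prorata_benchmark(indicator: pd.Series, target: pd.Series) -> pd.Series:
    """Hypothetical sketch: scale a monthly series so annual sums hit the target."""
    years = indicator.index.year
    # Aggregate the high-frequency indicator to annual sums.
    annual_sum = indicator.groupby(years).sum()
    # Re-key the annual target by calendar year so the two series align.
    target_by_year = pd.Series(target.to_numpy(), index=target.index.year)
    # Ratio of aggregated indicator to target, one value per year.
    ratio = annual_sum / target_by_year
    # Divide every month by the ratio of the year it belongs to.
    return indicator / ratio.loc[years].to_numpy()


months = pd.period_range("2018-01", "2019-12", freq="M")
indicator = pd.Series(range(1, 25), index=months, dtype=float)
target = pd.Series([156.0, 111.0], index=pd.period_range("2018", "2019", freq="Y"))

adjusted = prorata_benchmark(indicator, target)
print(adjusted.groupby(adjusted.index.year).sum())
```

Here the 2018 indicator sums to 78 against a target of 156, so every 2018 value is doubled; the 2019 indicator sums to 222 against a target of 111, so every 2019 value is halved.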

pre_system.overlay module

overlay(*dfs)

Combines multiple pandas DataFrames or Series by overlaying their values based on index alignment.

Parameters:

*dfs (DataFrame | Series) – Multiple DataFrames or Series to be combined.

Returns:

Combined DataFrame or Series with overlaid values.

Return type:

DataFrame | Series

Raises:
  • TypeError – If the input is a mixture of DataFrames and Series.

  • AttributeError – If not all DataFrames/Series have a pandas PeriodIndex, or if they don’t share the same frequency.

Notes

This function overlays values from multiple DataFrames or Series, aligning them based on their indices. It creates a new DataFrame or Series by combining the input objects. The index of the returned object is based on the union of indices from all input DataFrames or Series.
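
The overlay behaviour can be pictured with plain pandas. The sketch below is an assumption-laden illustration, not the library's code: overlay_sketch is a hypothetical name, and it assumes that earlier inputs take precedence and later inputs only fill their gaps, which this reference does not state. It uses pandas' combine_first, whose result is indexed by the union of the inputs' indices.

```python
from functools import reduce

import pandas as pd


def overlay_sketch(*objs):
    """Hypothetical sketch: earlier inputs win, later inputs fill missing values."""
    # Reject a mixture of DataFrames and Series, as the documented function does.
    if len({type(obj) for obj in objs}) > 1:
        raise TypeError("inputs must be all DataFrames or all Series")
    # combine_first keeps the left values and fills NaNs from the right,
    # returning an object indexed by the union of both indices.
    return reduce(lambda left, right: left.combine_first(right), objs)


a = pd.Series([1.0, float("nan")],
              index=pd.period_range("2020Q1", periods=2, freq="Q"))
b = pd.Series([10.0, 20.0, 30.0],
              index=pd.period_range("2020Q1", periods=3, freq="Q"))

combined = overlay_sketch(a, b)
print(combined)
```

Under the stated precedence assumption, the first quarter keeps the value from a, while the second and third quarters are filled from b.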

pre_system.pre_system module

class PreSystem(name)

Bases: object

Container for formulas and their input data.

Manages registration of Formula objects and the DataFrames required to evaluate them (annuals, indicators, and optional weights and corrections).

Variables:
  • name (str) – Name of this pre-system.

  • baseyear (int | None) – Common base year applied to all formulas.

  • formulae (dict[str, Formula]) – Registered formulas by name.

  • annuals_df (pd.DataFrame | None) – Annual level series (PeriodIndex with annual frequency).

  • indicators_df (pd.DataFrame | None) – Indicator series (PeriodIndex).

  • weights_df (pd.DataFrame | None) – Optional weights (annual series).

  • corrections_df (pd.DataFrame | None) – Optional corrections aligned to indicator frequency.
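
As an illustration of the shapes listed above, the following sketch builds the four input frames with plain pandas. The column names and values are made up for the example, and how the frames are attached to a PreSystem instance is not shown because it is not covered in this section.

```python
import pandas as pd

# Annual and monthly period indices covering the same two years.
annual_idx = pd.period_range("2020", "2021", freq="Y")
monthly_idx = pd.period_range("2020-01", "2021-12", freq="M")

# Annual level series and optional annual weights (hypothetical column names).
annuals_df = pd.DataFrame({"x": [100.0, 110.0]}, index=annual_idx)
weights_df = pd.DataFrame({"x_ind": [1.0, 1.0]}, index=annual_idx)

# Indicator series, plus optional corrections at the indicator frequency.
indicators_df = pd.DataFrame({"x_ind": range(1, 25)}, index=monthly_idx)
corrections_df = pd.DataFrame({"x_ind": 1.0}, index=monthly_idx)

# Corrections share the indicators' index; weights share the annuals' index.
print(corrections_df.index.equals(indicators_df.index))
print(weights_df.index.equals(annuals_df.index))
```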

Parameters:

name (str)

add_formula(formula)

Add a formula to the PreSystem.

Parameters:

formula (Formula) – The Formula object to be added.

Raises:
  • TypeError – If formula is not of type Formula.

  • KeyError – If any of the dependencies of the formula are not registered.

  • NameError – If a formula with the same name already exists and points to a different formula.

Return type:

None
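
The three registration rules can be sketched in isolation. MiniFormula and MiniRegistry below are hypothetical stand-ins, not the package's Formula and PreSystem classes; the sketch assumes that dependencies are referenced by name and that "a different formula" means a different object.

```python
class MiniFormula:
    """Hypothetical stand-in for a formula with named dependencies."""

    def __init__(self, name, dependencies=()):
        self.name = name
        self.dependencies = tuple(dependencies)


class MiniRegistry:
    """Hypothetical stand-in for the formula registry."""

    def __init__(self):
        self.formulae = {}

    def add_formula(self, formula):
        if not isinstance(formula, MiniFormula):
            raise TypeError("formula must be a MiniFormula")
        # Every dependency must already be registered.
        for dep in formula.dependencies:
            if dep not in self.formulae:
                raise KeyError(dep)
        # Re-adding the same object is harmless; a name clash is an error.
        existing = self.formulae.get(formula.name)
        if existing is not None and existing is not formula:
            raise NameError(f"{formula.name} is already registered")
        self.formulae[formula.name] = formula


registry = MiniRegistry()
base = MiniFormula("base")
registry.add_formula(base)
registry.add_formula(MiniFormula("derived", dependencies=["base"]))
registry.add_formula(base)  # same object again: accepted
```

One consequence of the dependency rule is that formulas have to be added in dependency order, with the formulas that others build on registered first.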

property annuals_df: DataFrame | None

Annual level data.

Returns:

Annual series indexed by PeriodIndex with annual frequency.

Return type:

pd.DataFrame | None

property baseyear: int | None

Gets the base year value.

This property retrieves the value of the base year, if set. If the base year is not defined, it will return None.

Returns:

The base year value if defined, otherwise None.

Return type:

int | None

property corrections_df: DataFrame | None

Optional corrections to adjust indicator series.

Returns:

Corrections indexed by PeriodIndex with same frequency as indicators.

Return type:

pd.DataFrame | None

property evaluate: DataFrame

Evaluate all registered formulas using the provided data.

Returns:

The evaluated formulas as a DataFrame.

Return type:

pd.DataFrame

evaluate_formula(name)

Evaluate a specific formula using the provided data.

Parameters:

name (str) – The name of the formula to evaluate.

Returns:

The evaluated formula as a Series.

Return type:

Series

evaluate_formulae(*names)

Evaluate specific formulae using the provided data.

Parameters:

*names (str) – The names of the formulae to evaluate.

Returns:

The evaluated formulae as a DataFrame.

Return type:

DataFrame

formula(name)

Get a formula from the PreSystem.

Parameters:

name (str) – The name of the formula to retrieve.

Returns:

The requested formula, or None if it doesn’t exist.

Return type:

Formula | None

property formulae: dict[str, Formula]

Mapping of formula names to Formula instances.

Returns:

Registered formulas keyed by name.

Return type:

dict[str, Formula]

property indicators: list[str]

All unique indicator names referenced by registered formulas.

Returns:

List of unique indicator identifiers used across formulas.

Return type:

list[str]

property indicators_df: DataFrame | None

Indicator series used by formulas.

Returns:

Indicator series indexed by PeriodIndex.

Return type:

pd.DataFrame | None

info()

Print a human-readable summary of the PreSystem configuration.

Return type:

None

Notes

Writes to stdout; intended for quick inspection during development.

property name: str

Provides access to the private _name attribute as a read-only property.

Returns:

The name attribute.

Return type:

str

property weights_df: DataFrame | None

Optional weights used by some formulas.

Returns:

Annual weights indexed by PeriodIndex with annual frequency.

Return type:

pd.DataFrame | None