SSB Coefficient Maker¶
Features¶
Arbitrary decimal precision support using mpmath for more accurate calculations
Validation system for detecting and handling invalid values (NaN, Inf, pd.NA)
Coefficient calculation based on formula definitions stored in a dataframe
Comprehensive error reporting with detailed diagnostics for debugging formulas
Support for mixed operations between DataFrames and Series
Configurable precision and error handling to suit different use cases
Flexible column naming in coefficient definition tables
Requirements¶
python >=3.10
click >=8.0.1
pandas >=2.2.3
numpy >=2.2.3
sympy >=1.13.3
mpmath >=1.3.0
pydantic >=2.10.6
Installation¶
You can install SSB Coefficient Maker via pip from PyPI:
pip install ssb-coefficient-maker
# or alternatively, if you're using poetry
poetry add ssb-coefficient-maker
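After installing, it can help to confirm which versions of the listed dependencies are actually present in the environment. The following is a minimal sketch using only the standard library. The helper name installed_versions is hypothetical and not part of SSB Coefficient Maker, and the script only reports versions rather than comparing them against the minimums listed under Requirements.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the Requirements list.
REQUIRED = ["click", "pandas", "numpy", "sympy", "mpmath", "pydantic"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(REQUIRED)
print("python", ".".join(str(part) for part in sys.version_info[:3]))
for name, installed in report.items():
    print(name, installed if installed is not None else "not installed")
```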
Usage¶
Basic Formula Evaluation¶
The FormulaEvaluator allows you to evaluate mathematical expressions using pandas DataFrames and Series:
import pandas as pd
from ssb_coefficient_maker import FormulaEvaluator

# Create some sample data
data = {
    'matrix_a': pd.DataFrame({
        'col1': [1.0, 2.0, 3.0],
        'col2': [4.0, 5.0, 6.0],
        'col3': [7.0, 8.0, 9.0],
    }),
    # Note: length matches the number of columns in matrix_a
    'vector_b': pd.Series([10.0, 20.0, 30.0]),
}

# Initialize the evaluator with default settings
evaluator = FormulaEvaluator(data)

# Evaluate a formula
result = evaluator.evaluate_formula('matrix_a * vector_b')
print(result)
This would produce output similar to:
   col1   col2   col3
0  10.0   80.0  210.0
1  20.0  100.0  240.0
2  30.0  120.0  270.0
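The output above is consistent with each column of matrix_a being scaled by the element of vector_b in the same position. As a cross-check, that arithmetic can be reproduced in plain pandas, without the library. This sketch assumes positional matching is the intended behavior; it does not show how FormulaEvaluator works internally.

```python
import pandas as pd

matrix_a = pd.DataFrame({
    "col1": [1.0, 2.0, 3.0],
    "col2": [4.0, 5.0, 6.0],
    "col3": [7.0, 8.0, 9.0],
})
vector_b = pd.Series([10.0, 20.0, 30.0])

# Converting the Series to a NumPy array drops its index, so the values are
# matched to the columns by position rather than by label.
expected = matrix_a * vector_b.to_numpy()
print(expected)
```

Multiplying by vector_b directly in plain pandas would instead align the Series index (0, 1, 2) with the column labels and yield only NaN values, which is why the sketch converts to an array first.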
Computing Multiple Coefficients¶
import pandas as pd
from ssb_coefficient_maker import CoefficientCalculator

# Create input data
data = {
    'input_matrix': pd.DataFrame({
        'A': [1.0, 2.0],
        'B': [3.0, 4.0],
    }),
    # Series with column names as index
    'adjustment': pd.Series([0.9, 1.1], index=['A', 'B']),
}

# Define coefficient formulas
coef_map = pd.DataFrame({
    'coefficient_name': ['adjusted_matrix', 'squared_matrix'],
    'formula': ['input_matrix * adjustment', 'input_matrix * input_matrix'],
})

# Create calculator with custom column names and safe settings
calculator = CoefficientCalculator(
    data,
    coef_map,
    result_name_col='coefficient_name',  # Specify which column contains result names
    formula_name_col='formula',          # Specify which column contains formulas
    adp_enabled=True,                    # Use arbitrary precision
    fill_invalid=True,                   # Replace invalid values with zeros
    verbose=True,                        # Print detailed information during calculation
)

# Compute all coefficients
results = calculator.compute_coefficients()

# Access the results
adjusted = results['adjusted_matrix']
squared = results['squared_matrix']
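To make the idea of a formula table concrete, the sketch below evaluates the same two formulas with plain pandas and Python's built-in eval. This is an illustration only, not how CoefficientCalculator is implemented: it uses float64 rather than arbitrary precision, performs no validation, and should only be run on formulas you trust. The variable name expected is hypothetical.

```python
import pandas as pd

data = {
    "input_matrix": pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]}),
    "adjustment": pd.Series([0.9, 1.1], index=["A", "B"]),
}
coef_map = pd.DataFrame({
    "coefficient_name": ["adjusted_matrix", "squared_matrix"],
    "formula": ["input_matrix * adjustment", "input_matrix * input_matrix"],
})

expected = {}
for row in coef_map.itertuples(index=False):
    # Names in the formula are looked up in `data`; builtins are disabled.
    expected[row.coefficient_name] = eval(row.formula, {"__builtins__": {}}, data)

print(expected["adjusted_matrix"])
print(expected["squared_matrix"])
```

Here the Series index (A, B) matches the column labels, so pandas scales column A by 0.9 and column B by 1.1.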
Handling Division by Zero¶
import pandas as pd
from ssb_coefficient_maker import FormulaEvaluator

# Data with potential division by zero
data = {
    'numerator': pd.DataFrame({'A': [1.0, 2.0], 'B': [3.0, 4.0]}),
    'denominator': pd.DataFrame({'A': [1.0, 0.0], 'B': [0.0, 2.0]}),
}

# Safe evaluator that replaces Inf/NaN with zeros
safe_eval = FormulaEvaluator(data, fill_invalid=True)
result = safe_eval.evaluate_formula('numerator / denominator')
print(result)
Output:
     A    B
0  1.0  0.0
1  0.0  2.0
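For comparison, the same replacement can be written by hand in plain pandas. This sketch assumes that "invalid" means NaN or an infinite value and that the fill value is zero, as in the output above; it is not the library's implementation.

```python
import numpy as np
import pandas as pd

numerator = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})
denominator = pd.DataFrame({"A": [1.0, 0.0], "B": [0.0, 2.0]})

# Float division by zero yields inf (or NaN for 0/0) instead of raising.
raw = numerator / denominator

# Turn +/-inf into NaN first, then fill every NaN with zero.
cleaned = raw.replace([np.inf, -np.inf], np.nan).fillna(0.0)
print(cleaned)
```

Filling with zero hides the difference between a true zero and an undefined ratio, so it may be worth inspecting raw before cleaning.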
Working with High Precision¶
import pandas as pd
from ssb_coefficient_maker import FormulaEvaluator

# Create data with fractions that produce repeating decimals
data = {
    'numerator': pd.Series([1, 2, 1]),
    'denominator': pd.Series([3, 3, 7]),
}

# Compare precision differences in division operations
print("Arbitrary precision result (50 digits):")
high_prec = FormulaEvaluator(data, decimal_precision=50)
print(high_prec.evaluate_formula('numerator / denominator'))

print("\nStandard precision result (float64):")
std_prec = FormulaEvaluator(data, adp_enabled=False)
print(std_prec.evaluate_formula('numerator / denominator'))
The actual representation of these values would be:
# Arbitrary precision result (50 digits):
# Each value is stored as an mpmath.mpf object with 50 digits of precision
0    0.33333333333333333333333333333333333333333333333333
1    0.66666666666666666666666666666666666666666666666667
2    0.14285714285714285714285714285714285714285714285714
dtype: object

# Standard precision result (float64):
# Each value is stored as a 64-bit floating point number with ~15-17 significant digits
0    0.3333333333333333
1    0.6666666666666666
2    0.14285714285714285
dtype: float64
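The difference between the two results can be reproduced with mpmath directly. This is a minimal sketch that assumes 50 decimal digits of working precision, matching decimal_precision=50 above; it does not go through FormulaEvaluator, and the names high and gaps are hypothetical.

```python
import mpmath
import pandas as pd

numerator = pd.Series([1, 2, 1])
denominator = pd.Series([3, 3, 7])
float_result = numerator / denominator  # ordinary float64 division

# workdps sets the working precision (in decimal digits) inside the block only.
with mpmath.workdps(50):
    high = pd.Series(
        [mpmath.mpf(int(n)) / mpmath.mpf(int(d)) for n, d in zip(numerator, denominator)],
        dtype=object,
    )
    # Distance between each float64 quotient (converted exactly) and the
    # 50-digit quotient.
    gaps = [abs(mpmath.mpf(float(f)) - h) for f, h in zip(float_result, high)]
    print(high)
    print([mpmath.nstr(g, 5) for g in gaps])
```

Each gap is tiny but nonzero, because none of these fractions can be represented exactly in binary floating point. The printing happens inside the with block because mpmath formats values using the precision that is active at print time.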
Please see the Reference Guide for more detailed examples and advanced usage.
Contributing¶
Contributions are very welcome. To learn more, see the Contributor Guide.
License¶
Distributed under the terms of the MIT license, SSB Coefficient Maker is free and open source software.
Issues¶
If you encounter any problems, please file an issue along with a detailed description.
Credits¶
This project was generated from Statistics Norway’s SSB PyPI Template.