control_framework_base module
- class ControlFrameworkBase(partitions, partitions_skjema, conn)
Bases: object
Base class for running control checks.
Designed to work on partitioned data following the recommended altinn3 data structure. Manages inserts and updates to the ‘kontrollutslag’ table via a connection interface.
To use this class, set up a subclass like this:
class MyControls(ControlFrameworkBase):
    def __init__(self, partitions: list[int | str], partitions_skjema: dict[str, int | str], conn: object) -> None:
        super().__init__(partitions, partitions_skjema, conn)

    def a_control_func(self):
        # Your code here
        return dataframe
The flow of updating the control table works like this:
- Call ‘execute_controls’. This begins the entire process.
- ‘control_updates’ is run. It checks existing controls, runs all controls by calling ‘run_all_controls’ (which in turn calls ‘run_control’ for each individual control), and creates a dataframe with all results.
- The results from ‘control_updates’ are used to check whether anything has changed since the controls were last executed. If there are no changes, the process stops here.
- Otherwise, an update query is generated from the results of ‘control_updates’. Each observation whose control result has changed gets updated in the ‘kontrollutslag’ table.
- The update query is run, and the process is complete.
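The flow above can be sketched roughly as follows. This is a toy illustration of the orchestration only, not the actual implementation: `FakeConn`, its `query` method, `FlowSketch`, and the placeholder query text are all assumptions made for the example.

```python
import pandas as pd


class FakeConn:
    """Hypothetical stand-in for the connection interface; it only records queries."""

    def __init__(self) -> None:
        self.executed = []

    def query(self, sql: str) -> None:
        self.executed.append(sql)


class FlowSketch:
    """Toy orchestration only; this is not the real ControlFrameworkBase."""

    def __init__(self, conn: FakeConn, changed: pd.DataFrame) -> None:
        self.conn = conn
        # Pretend outcome of comparing fresh control results with stored ones.
        self._changed = changed

    def control_updates(self) -> pd.DataFrame:
        # The documented method runs all controls and compares with 'kontrollutslag'.
        return self._changed

    def generate_update_query(self, df_updates: pd.DataFrame) -> str:
        # Placeholder text; the real query format is not shown in this documentation.
        return f"-- update {len(df_updates)} row(s) in kontrollutslag"

    def execute_controls(self) -> int:
        df_updates = self.control_updates()
        if df_updates.empty:
            return 0  # no changes, so the process stops here
        self.conn.query(self.generate_update_query(df_updates))
        return len(df_updates)
```

With an empty dataframe of changes, `execute_controls` returns 0 and no query reaches the connection; with one changed row, exactly one query is recorded.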
- Parameters:
partitions (list[int | str])
partitions_skjema (dict[str, int | str])
conn (object)
- control_new_rows()
Identifies new rows that are not already present in ‘kontrollutslag’.
- Returns:
DataFrame of new rows to insert.
- Return type:
pd.DataFrame
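One common way to find rows that are not yet present in a table is an anti-join with pandas. The sketch below is an assumption about how such a check could look, not the method's actual code; the helper `find_new_rows` and the columns `ident` and `utslag` are hypothetical.

```python
import pandas as pd


def find_new_rows(results: pd.DataFrame, existing: pd.DataFrame, keys: list) -> pd.DataFrame:
    """Return the rows of `results` whose key combination is absent from `existing`."""
    merged = results.merge(
        existing[keys].drop_duplicates(),
        on=keys,
        how="left",
        indicator=True,  # adds a '_merge' column telling where each row was found
    )
    new_rows = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
    return new_rows.reset_index(drop=True)


# Hypothetical data: fresh control results and what is already stored.
results = pd.DataFrame(
    {"kontrollid": ["K1", "K1", "K2"], "ident": ["a", "b", "a"], "utslag": [True, False, True]}
)
existing = pd.DataFrame({"kontrollid": ["K1"], "ident": ["a"], "utslag": [False]})

new_rows = find_new_rows(results, existing, keys=["kontrollid", "ident"])
```

Here only the (K1, a) combination already exists, so the other two rows are reported as new.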
- control_updates()
Identifies rows in ‘kontrollutslag’ where the control output has changed.
- Returns:
DataFrame of rows that need to be updated.
- Return type:
pd.DataFrame
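Detecting changed control output can be illustrated by joining fresh results to stored ones on their keys and keeping the rows where the values differ. This is a sketch under assumptions: the helper `find_changed_rows` and the columns `ident` and `utslag` are hypothetical, and the real comparison may work differently.

```python
import pandas as pd


def find_changed_rows(
    results: pd.DataFrame, existing: pd.DataFrame, keys: list, value_col: str
) -> pd.DataFrame:
    """Return rows of `results` whose `value_col` differs from the stored value."""
    merged = results.merge(
        existing[keys + [value_col]],
        on=keys,
        how="inner",  # only rows already stored can count as updates
        suffixes=("", "_stored"),
    )
    stored_col = f"{value_col}_stored"
    changed = merged.loc[merged[value_col] != merged[stored_col]]
    return changed.drop(columns=stored_col).reset_index(drop=True)


# Hypothetical data: the result for ident 'a' flipped, 'b' is unchanged.
results = pd.DataFrame({"kontrollid": ["K1", "K1"], "ident": ["a", "b"], "utslag": [True, False]})
existing = pd.DataFrame({"kontrollid": ["K1", "K1"], "ident": ["a", "b"], "utslag": [False, False]})

changed = find_changed_rows(results, existing, keys=["kontrollid", "ident"], value_col="utslag")
```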
- execute_controls()
Executes control checks and updates existing rows in ‘kontrollutslag’ if needed.
- Returns:
Number of rows updated.
- Return type:
int
- generate_update_query(df_updates)
Generates a SQL UPDATE query for updating rows in ‘kontrollutslag’.
- Parameters:
df_updates (pd.DataFrame) – DataFrame with updates to apply.
- Returns:
SQL query string.
- Return type:
str
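To give an idea of what turning a dataframe of updates into SQL might look like, here is a minimal sketch. The query format actually produced by this method is not shown in this documentation; the helper `build_update_query` and the columns `kontrollid`, `ident`, and `utslag` are assumptions for illustration. Building SQL by string interpolation is only acceptable for trusted values, and parameterized queries are generally safer.

```python
import pandas as pd


def build_update_query(df_updates: pd.DataFrame) -> str:
    """Build one UPDATE statement per row (illustrative format only)."""
    statements = []
    for row in df_updates.itertuples(index=False):
        # Escape single quotes so the string literals stay well-formed.
        kontrollid = str(row.kontrollid).replace("'", "''")
        ident = str(row.ident).replace("'", "''")
        utslag = "TRUE" if row.utslag else "FALSE"
        statements.append(
            f"UPDATE kontrollutslag SET utslag = {utslag} "
            f"WHERE kontrollid = '{kontrollid}' AND ident = '{ident}';"
        )
    return "\n".join(statements)


# Hypothetical single changed row.
df_updates = pd.DataFrame({"kontrollid": ["K1"], "ident": ["a"], "utslag": [True]})
query = build_update_query(df_updates)
```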
- insert_new_rows()
Inserts any new control results that are not already in ‘kontrollutslag’.
- Returns:
Number of rows inserted.
- Return type:
int
- Raises:
AttributeError – If ‘conn’ does not have an ‘insert’ method.
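The documented AttributeError suggests that the connection object is checked for an ‘insert’ method before any rows are written. A sketch of that kind of guard is shown below; `RecordingConn`, the helper `insert_rows`, and the `insert(table, df)` signature are hypothetical, and the real connection interface may differ.

```python
import pandas as pd


class RecordingConn:
    """Hypothetical connection; the real insert signature may differ."""

    def __init__(self) -> None:
        self.inserted = []

    def insert(self, table: str, df: pd.DataFrame) -> None:
        self.inserted.append((table, df))


def insert_rows(conn: object, new_rows: pd.DataFrame) -> int:
    """Insert `new_rows` through `conn` and return how many rows were sent."""
    if not hasattr(conn, "insert"):
        raise AttributeError("conn must provide an 'insert' method")
    if new_rows.empty:
        return 0  # nothing new to insert
    conn.insert("kontrollutslag", new_rows)
    return len(new_rows)


conn = RecordingConn()
n_inserted = insert_rows(
    conn, pd.DataFrame({"kontrollid": ["K2"], "ident": ["a"], "utslag": [True]})
)
```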
- run_all_controls()
Runs the control methods named ‘control_<kontrollid>’, where <kontrollid> is in self.controls.
- Returns:
Combined DataFrame with all control results.
- Return type:
pd.DataFrame
- Raises:
TypeError – If the ‘df’ variable to be returned is not a pd.DataFrame.
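The naming convention can be illustrated with a small stand-alone class that looks up each method by name and concatenates the results. This is a sketch, not the library's code: `ToyControls` does not inherit from ControlFrameworkBase, and the shape of `controls` (a list of kontrollid values) and the columns `ident` and `utslag` are assumptions.

```python
import pandas as pd


class ToyControls:
    """Toy class for illustration; it does not inherit from ControlFrameworkBase."""

    # Assumed shape of self.controls: a list of kontrollid values.
    controls = ["K1", "K2"]

    def control_K1(self) -> pd.DataFrame:
        return pd.DataFrame({"kontrollid": ["K1"], "ident": ["a"], "utslag": [True]})

    def control_K2(self) -> pd.DataFrame:
        return pd.DataFrame(
            {"kontrollid": ["K2", "K2"], "ident": ["a", "b"], "utslag": [False, True]}
        )

    def run_all_controls(self) -> pd.DataFrame:
        frames = []
        for kontrollid in self.controls:
            # Look the method up by the 'control_<kontrollid>' naming convention.
            method = getattr(self, f"control_{kontrollid}")
            frames.append(method())
        return pd.concat(frames, ignore_index=True)


combined = ToyControls().run_all_controls()
```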
- run_control(control)
Runs a single control.
- Parameters:
control (str) – Name of a control method to run implemented in the supplied control class built upon ControlFrameworkBase.
- Returns:
DataFrame containing results from the control.
- Return type:
pd.DataFrame
- Raises:
TypeError – If the control method does not return a pd.DataFrame.
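Running a control by name and validating its return type could look something like the sketch below. It is an illustration under assumptions rather than the actual method: `ToyControls`, its two methods, and the free function `run_control` are hypothetical.

```python
import pandas as pd


class ToyControls:
    """Toy class with one well-behaved control and one that returns the wrong type."""

    def control_ok(self) -> pd.DataFrame:
        return pd.DataFrame({"ident": ["a"], "utslag": [False]})

    def control_bad(self):
        return [("a", False)]  # a list, not a DataFrame


def run_control(obj: object, control: str) -> pd.DataFrame:
    """Call the method named `control` on `obj` and check what it returns."""
    result = getattr(obj, control)()
    if not isinstance(result, pd.DataFrame):
        raise TypeError(
            f"{control} must return a pd.DataFrame, got {type(result).__name__}"
        )
    return result


ok_result = run_control(ToyControls(), "control_ok")
```

Calling `run_control(ToyControls(), "control_bad")` raises TypeError because the method returns a list.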