ssb_timeseries.dataframes

A collection of helper functions to work with dataframes.

are_equal(*frames)

Check if dataframes are equal.

Return type:

bool

Parameters:

frames (narwhals.typing.IntoDataFrame | narwhals.typing.IntoLazyFrame)

copy(df)

Return an (eager) copy of the dataframe.

Return type:

typing.Any

Parameters:

df (narwhals.typing.FrameT)

eager(df)

Return an eager dataframe, collecting or computing it if the implementation is lazy.

Return type:

typing.Any

Parameters:

df (narwhals.typing.Frame)

empty_frame(*, columns=None, schema=None, implementation='arrow')

Return a dataframe or Arrow table with no data.

Return type:

typing.Any

Parameters:
  • columns (list[str] | None)

  • schema (Schema | None)

  • implementation (str)

infer_datatype(df, **kwargs)

Checks dataframe columns and kwargs to identify SeriesType.

Parameters:
  • df (Union[narwhals.typing.IntoDataFrame, narwhals.typing.IntoLazyFrame]) – The dataframe to check.

  • **kwargs – Override values for the keys ‘versioning’ and ‘temporality’.

Return type:

SeriesType

Returns:

SeriesType if the dataframe and/or kwargs provides sufficient information to infer Versioning and Temporality.

Versioning is determined by checking, in order of priority:
  • the kwarg value for the key ‘versioning’, if provided;

  • whether the dataframe columns contain ‘as_of’, ‘as_of_tz’ or ‘as_of_utc’;

  • whether the keys of kwargs contain ‘as_of’, ‘as_of_tz’ or ‘as_of_utc’.

Temporality is determined by checking, in order of priority:
  • the kwarg value for the key ‘temporality’, if provided;

  • whether the dataframe columns contain ‘valid_from’ and ‘valid_to’.
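The priority rules above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: sketch_infer is a hypothetical name, it takes column names rather than a dataframe, and the string "AS_OF" and the None results are placeholders for whatever SeriesType members the library actually uses.

```python
# Hypothetical sketch of the documented priority rules; not ssb_timeseries code.
AS_OF_NAMES = {"as_of", "as_of_tz", "as_of_utc"}


def sketch_infer(columns, **kwargs):
    """Return (versioning, temporality); None means 'could not be inferred'."""
    column_set = set(columns)

    # Versioning: explicit kwarg first, then columns, then kwarg keys.
    if "versioning" in kwargs:
        versioning = kwargs["versioning"]
    elif AS_OF_NAMES & column_set:
        versioning = "AS_OF"  # placeholder label (assumption)
    elif AS_OF_NAMES & set(kwargs):
        versioning = "AS_OF"  # placeholder label (assumption)
    else:
        versioning = None

    # Temporality: explicit kwarg first, then the valid_from/valid_to pair.
    if "temporality" in kwargs:
        temporality = kwargs["temporality"]
    elif {"valid_from", "valid_to"} <= column_set:
        temporality = "FROM_TO"
    else:
        temporality = None

    return versioning, temporality
```

An explicit kwarg always wins in this sketch, so a caller can override whatever the column names would otherwise suggest.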

is_df_like(obj)

Checks if an object is “dataframe-like” for Narwhals compatibility.

This is a robust, duck-typed alternative to isinstance(obj, IntoFrameT), a check that is not possible at runtime.

Parameters:

obj (typing.Any) – The object to check.

Return type:

bool

Returns:

True if the object has dataframe-like attributes, False otherwise.
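As a rough illustration of the duck-typing idea, the check can be written as a test for a few attributes. The function name looks_like_frame and the attribute names ‘columns’ and ‘shape’ are assumptions made for this sketch; the attributes that is_df_like actually inspects are not listed here.

```python
from typing import Any


def looks_like_frame(obj: Any, required: tuple = ("columns", "shape")) -> bool:
    """Hypothetical duck-typing check: True if obj has every required attribute."""
    return all(hasattr(obj, name) for name in required)


class TinyFrame:
    """Stand-in object that merely exposes dataframe-like attributes."""

    columns = ["a", "b"]
    shape = (0, 2)
```

A check like this accepts any object that exposes the expected attributes, regardless of which dataframe library it comes from.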

is_empty(df)

Check if dataframe is empty.

Return type:

bool

Parameters:

df (narwhals.typing.IntoDataFrame | narwhals.typing.IntoLazyFrame)

merge_data(old, new, date_cols, **kwargs)

Merge new data into an existing dataframe, handling overlaps for period-based data.

For AT temporality, it keeps the last entry for duplicates based on date columns. For FROM_TO temporality, it uses an anti-join to replace rows with matching valid_from and valid_to pairs.

Return type:

Table

Parameters:
  • old (IntoFrameT)

  • new (IntoFrameT)

  • date_cols (Iterable[str])
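The two merge strategies described for merge_data can be sketched with plain lists of dicts. This is only an illustration of the keep-last and anti-join ideas: merge_at and merge_from_to are hypothetical helpers, and the real function operates on dataframes and returns an Arrow table.

```python
def merge_at(old: list, new: list, date_cols: list) -> list:
    """Keep the last row for each combination of date column values."""
    merged = {}
    for row in [*old, *new]:
        key = tuple(row[col] for col in date_cols)
        merged[key] = row  # a later row replaces an earlier one with the same key
    return list(merged.values())


def merge_from_to(old: list, new: list) -> list:
    """Anti-join: drop old rows whose (valid_from, valid_to) pair occurs in new."""
    new_keys = {(row["valid_from"], row["valid_to"]) for row in new}
    kept = [
        row for row in old
        if (row["valid_from"], row["valid_to"]) not in new_keys
    ]
    return kept + list(new)


# Example data (made up for illustration).
old_at = [{"valid_at": "2024-01", "v": 1}, {"valid_at": "2024-02", "v": 2}]
new_at = [{"valid_at": "2024-02", "v": 20}, {"valid_at": "2024-03", "v": 30}]
merged_at = merge_at(old_at, new_at, ["valid_at"])

old_ft = [
    {"valid_from": "2024-01", "valid_to": "2024-02", "v": 1},
    {"valid_from": "2024-02", "valid_to": "2024-03", "v": 2},
]
new_ft = [{"valid_from": "2024-02", "valid_to": "2024-03", "v": 20}]
merged_ft = merge_from_to(old_ft, new_ft)
```

In both cases a row from new wins over a row from old that covers the same date or period.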

rename_columns(df, substitutions)

Rename columns of dataframe.

Return type:

typing.Any

Parameters:
  • df (Any)

  • substitutions (dict[str, str])

to_arrow(df, schema=None)

Convert any Narwhals-compatible dataframe to a PyArrow table, casting to the schema if one is provided.

Return type:

Table

Parameters:
  • df (narwhals.typing.IntoDataFrame | narwhals.typing.IntoLazyFrame)

  • schema (Schema | None)

to_numpy(df, dtype=None, *, output_type='homogeneous')

Converts a dataframe to a NumPy ndarray.

Parameters:
  • dtype (type | DTypeLike | None) – The desired dtype for the resulting array. If None and output_type is ‘homogeneous’, columns with mixed types are upcast to ‘object’. dtype is ignored when output_type is ‘structured’, because column dtypes are preserved.

  • output_type (Literal['homogeneous', 'structured']) – ‘homogeneous’ returns a 2D array, possibly with dtype ‘object’ for mixed types; ‘structured’ returns a 1D structured array, preserving individual column dtypes.

  • df (narwhals.typing.Frame)

Return type:

numpy.ndarray

Returns:

A NumPy array representation of the data.
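The difference between the two output types can be seen with NumPy alone. The sketch below does not call to_numpy; it builds both array shapes by hand from a made-up dict of columns to show what ‘homogeneous’ and ‘structured’ mean.

```python
import numpy as np

# Made-up column data for illustration: one integer and one float column.
columns = {"id": [1, 2, 3], "value": [0.5, 1.5, 2.5]}

# 'homogeneous': a 2D array with one row per record and a single common dtype.
# The integer column is upcast to float so that both columns fit one dtype.
homogeneous = np.column_stack([columns["id"], columns["value"]])

# 'structured': a 1D array of records in which each field keeps its own dtype.
structured = np.array(
    list(zip(columns["id"], columns["value"])),
    dtype=[("id", "i8"), ("value", "f8")],
)
```

A homogeneous array suits numeric work across all columns at once, while a structured array keeps column names and per-column dtypes available as fields, for example structured["id"].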