ssb_timeseries.fs

The main purpose of the ssb_timeseries.fs module is to allow file-based I/O that works on both a local file system and Google Cloud Storage (GCS).

cp(from_path, to_path)

Copy file from one location to another.

This function handles copying files between local and GCS paths, automatically selecting the correct backend.

Parameters:
  • from_path (str | PathLike[str]) – The path to the source file.

  • to_path (str | PathLike[str]) – The path to the destination file.

Return type:

None

existing_subpath(path)

Return the existing part of a path on local or GCS file system.

Return type:

str | PathLike[str]

Parameters:

path (str | PathLike[str])
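
Example (illustrative only): the idea of an "existing part of a path" can be sketched for local paths with the standard library. The name existing_subpath_sketch is hypothetical and the logic below is an assumption about the behaviour, not the module's implementation; GCS paths are not handled.

```python
from pathlib import Path
import tempfile


def existing_subpath_sketch(path) -> str:
    """Return the longest leading part of a local path that exists."""
    candidate = Path(path)
    # Walk up towards the root until something that exists is found.
    while not candidate.exists() and candidate != candidate.parent:
        candidate = candidate.parent
    return str(candidate)


with tempfile.TemporaryDirectory() as root:
    # Only `root` exists; the nested directories and the file do not.
    missing = Path(root) / "a" / "b" / "data.parquet"
    found = existing_subpath_sketch(missing)
    found_is_root = found == str(Path(root))
```

Here found_is_root is True, since the temporary directory is the deepest part of the path that exists.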

exists(path)

Check if a given (local or GCS) path exists.

Return type:

bool

Parameters:

path (str | PathLike[str])

file_count(path, create=False)

Count files in path. Should work regardless of whether the path is on the local file system or GCS.

Return type:

int

Parameters:
  • path (str | PathLike[str])

  • create (bool)

find(search_path, equals='', contains='', pattern='', search_sub_dirs=True, full_path=False, replace_root=False)

Find files and subdirectories with names matching pattern. Should work for both local and GCS filesystems.

Return type:

list[str]

Parameters:
  • search_path (str | PathLike[str])

  • equals (str)

  • contains (str)

  • pattern (str)

  • search_sub_dirs (bool)

  • full_path (bool)

  • replace_root (bool)
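
Example (illustrative only): a local-only sketch of name-based searching. It assumes that equals matches the whole name, contains matches a substring, and pattern is a shell-style glob; the reference above does not confirm these semantics. The name find_sketch is hypothetical, and full_path and replace_root are left out.

```python
import fnmatch
import os
import tempfile


def find_sketch(search_path, equals="", contains="", pattern="", search_sub_dirs=True):
    """Return sorted names of files and directories under search_path that match."""
    matches = []
    for _dirpath, dirnames, filenames in os.walk(search_path):
        for name in dirnames + filenames:
            if equals and name != equals:
                continue
            if contains and contains not in name:
                continue
            if pattern and not fnmatch.fnmatch(name, pattern):
                continue
            matches.append(name)
        if not search_sub_dirs:
            break  # os.walk yields the top level first; stop after it
    return sorted(matches)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("sales.parquet", "notes.txt", os.path.join("sub", "sales_2024.parquet")):
        open(os.path.join(root, rel), "w").close()
    all_sales = find_sketch(root, contains="sales")
    top_level_sales = find_sketch(root, contains="sales", search_sub_dirs=False)
```

With this layout, all_sales holds both Parquet file names, while top_level_sales holds only the one in the top directory.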

fs_type(path)

Check filesystem type (local or GCS) for a given path.

Return type:

str

Parameters:

path (str | PathLike[str])

is_gcs(path)

Check if path is on GCS.

Return type:

bool

Parameters:

path (str | PathLike[str])

is_local(path)

Check if path is local.

Return type:

bool

Parameters:

path (str | PathLike[str])
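
Example (illustrative only): one plausible way to tell the two filesystem types apart is to look at the path prefix. The function names and the return values "gcs" and "local" below are assumptions made for this sketch; the strings actually returned by fs_type() are not stated in this reference.

```python
import os
from pathlib import PurePosixPath


def fs_type_sketch(path) -> str:
    """Classify a path as 'gcs' or 'local' from its prefix alone."""
    text = os.fspath(path)  # accepts both str and PathLike objects
    return "gcs" if text.startswith("gs://") else "local"


def is_gcs_sketch(path) -> bool:
    return fs_type_sketch(path) == "gcs"


def is_local_sketch(path) -> bool:
    return fs_type_sketch(path) == "local"


bucket_kind = fs_type_sketch("gs://bucket/series/data.parquet")
home_kind = fs_type_sketch(PurePosixPath("/home/user/data.parquet"))
```

No file system is touched here: the classification depends only on the text of the path.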

ls(path, pattern='*', create=False)

List files. Should work regardless of whether the filesystem is local or GCS.

Return type:

list[str]

Parameters:
  • path (str)

  • pattern (str)

  • create (bool)

mk_parent_dir(path)

Ensure a parent directory exists. … regardless of whether the filesystem is local or GCS.

Return type:

None

Parameters:

path (str | PathLike[str])
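
Example (illustrative only): for a local path, ensuring that the parent directory exists can be sketched with pathlib. The name mk_parent_dir_sketch is hypothetical; how the module treats GCS paths is not shown here.

```python
from pathlib import Path
import tempfile


def mk_parent_dir_sketch(path) -> None:
    """Create the directory that will contain `path`, if it is missing."""
    # parents=True creates intermediate directories; exist_ok=True makes
    # the call safe to repeat.
    Path(path).parent.mkdir(parents=True, exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "new" / "nested" / "file.json"
    mk_parent_dir_sketch(target)
    parent_created = target.parent.is_dir()
    file_created = target.exists()
```

Only the directory is created: parent_created is True while file_created stays False.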

mkdir(path)

Make a directory, regardless of whether the filesystem is local or GCS.

Return type:

None

Parameters:

path (str | PathLike[str])

mv(from_path, to_path)

Move file from one location to another.

This function handles moving files between local and GCS paths, automatically selecting the correct backend.

Parameters:
  • from_path (str | PathLike[str]) – The path to the source file.

  • to_path (str | PathLike[str]) – The path to the destination file.

Return type:

None
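
Example (illustrative only): a move can be thought of as a copy followed by removal of the source. The sketch below shows that idea for local files with the standard library. It is an assumption for illustration, not a description of how mv() or cp() are implemented; mv_sketch is a hypothetical name.

```python
from pathlib import Path
import shutil
import tempfile


def mv_sketch(from_path, to_path) -> None:
    """Move a local file by copying it and then removing the source."""
    Path(to_path).parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(from_path, to_path)  # copies content and metadata
    Path(from_path).unlink()  # remove the source only after the copy succeeded


with tempfile.TemporaryDirectory() as root:
    source = Path(root) / "in" / "data.txt"
    source.parent.mkdir()
    source.write_text("42", encoding="utf-8")
    destination = Path(root) / "out" / "data.txt"
    mv_sketch(source, destination)
    source_gone = not source.exists()
    moved_content = destination.read_text(encoding="utf-8")
```

Stopping before the unlink() step gives the corresponding copy behaviour.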

path(*args)

Join args to form a path, making sure that GCS paths begin with a double slash: gs://…

Return type:

str

Parameters:

args (str | PathLike[str])
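
Example (illustrative only): generic path joining tends to collapse the double slash in a gs:// prefix, so the prefix has to be restored afterwards. The sketch below shows this with pathlib. join_path_sketch is a hypothetical name, and the regular expression is an assumption, not the module's code.

```python
from pathlib import PurePosixPath
import re


def join_path_sketch(*args) -> str:
    """Join path segments and keep the double slash in a gs:// prefix."""
    # PurePosixPath collapses repeated slashes, turning 'gs://' into 'gs:/'.
    joined = str(PurePosixPath(*args))
    # Restore the prefix if the path started with 'gs:'.
    return re.sub(r"^gs:/+", "gs://", joined)


gcs_joined = join_path_sketch("gs://bucket", "series", "data.parquet")
local_joined = join_path_sketch("/home/user", "series", "data.parquet")
print(gcs_joined)    # gs://bucket/series/data.parquet
print(local_joined)  # /home/user/series/data.parquet
```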

path_to_str(path)

Normalise as strings.

This is a trick to make automated tests pass on Windows.

Return type:

str | PathLike[str]

Parameters:

path (str | PathLike[str])

read_json(path)

Read a JSON file from path on either local fs or GCS.

Return type:

dict

Parameters:

path (str | PathLike[str])

read_parquet(path, lazy=False, implementation='pyarrow', **kwargs)

Read a Parquet file into a dataframe.

This function can read from both local and GCS paths.

Parameters:
  • path – The path to the Parquet file.

  • lazy – If True, returns a lazy dataframe. Defaults to False.

  • implementation – The backend to use for reading the file. Defaults to “pyarrow”.

  • **kwargs – Additional keyword arguments passed to the backend.

Returns:

A Narwhals dataframe.

Return type:

narwhals.typing.Frame

read_text(path, file_format='')

Read a text file from specified path on either local fs or GCS.

Return type:

dict

Parameters:
  • path (str | PathLike[str])

  • file_format (str)

remove_prefix(path)

Helper function to compensate for some os.* functions shortening gs://<path> to gs:/<path>.

Return type:

str

Parameters:

path (str | PathLike[str])
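
Example (illustrative only): the shortening mentioned above can be reproduced with posixpath.normpath, which collapses repeated slashes. The function name suggests that the gs:// prefix is stripped, but that is an assumption; strip_gs_prefix_sketch is a hypothetical helper that tolerates both the full and the shortened prefix.

```python
import posixpath
import re

# normpath collapses the double slash after the scheme.
shortened = posixpath.normpath("gs://bucket/series/data.parquet")


def strip_gs_prefix_sketch(path: str) -> str:
    """Drop a leading 'gs://' or shortened 'gs:/' from a path string."""
    return re.sub(r"^gs:/+", "", path)


stripped_full = strip_gs_prefix_sketch("gs://bucket/series/data.parquet")
stripped_short = strip_gs_prefix_sketch(shortened)
```

Both stripped_full and stripped_short come out as the same bucket-relative string, which is what makes a prefix-tolerant helper useful.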

rm(path)

Remove a file from either the local filesystem or GCS.

This function is non-recursive. For a recursive variant, see rmtree().

Parameters:

path (str | PathLike[str]) – The path to the file to be removed.

Return type:

None

rmtree(path)

Recursively remove a directory and all its subdirectories and files regardless of local or GCS filesystem.

Return type:

None

Parameters:

path (str)

same_path(*args)

Return the common part of the paths of two or more files. The files must be on the same file system, which can be either local or GCS.

Return type:

str | PathLike[str]
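
Example (illustrative only): for POSIX-style local paths, the common part of several paths can be computed with posixpath.commonpath. The name same_path_sketch is hypothetical. Note that commonpath also collapses a gs:// prefix, so GCS paths would need extra handling that is not shown here.

```python
import posixpath


def same_path_sketch(*args: str) -> str:
    """Return the longest common directory prefix of POSIX-style paths."""
    # commonpath compares whole path components, not single characters.
    return posixpath.commonpath(args)


siblings = same_path_sketch("/data/a/x.parquet", "/data/a/y.parquet")
cousins = same_path_sketch("/data/a/x.parquet", "/data/b/y.parquet")
print(siblings)  # /data/a
print(cousins)   # /data
```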

touch(path)

Touch a file regardless of whether the filesystem is local or GCS; return the path.

Return type:

str | PathLike[str]

Parameters:

path (str | PathLike[str])

wrap_return_as_str(func)

Decorator to normalise outputs using path_to_str().

Return type:

Callable

Parameters:

func (Callable)
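
Example (illustrative only): a decorator that normalises return values can be sketched with functools.wraps. The sketch assumes that normalising means converting path objects to forward-slash strings, which would fit the remark about tests on Windows but is not confirmed by this reference. All names ending in _sketch, and example_dir, are hypothetical.

```python
from functools import wraps
from pathlib import PurePath, PureWindowsPath


def path_to_str_sketch(value):
    """Convert path objects (or lists of them) to forward-slash strings."""
    if isinstance(value, list):
        return [path_to_str_sketch(item) for item in value]
    if isinstance(value, PurePath):
        return value.as_posix()
    return value  # anything else is passed through unchanged


def wrap_return_as_str_sketch(func):
    """Apply path_to_str_sketch to whatever func returns."""
    @wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        return path_to_str_sketch(func(*args, **kwargs))
    return wrapper


@wrap_return_as_str_sketch
def example_dir():
    """Return a Windows-style path object (hypothetical example)."""
    return PureWindowsPath("C:\\data\\series")


normalised = example_dir()
print(normalised)  # C:/data/series
```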

write_json(path, content)

Write a JSON file to path on either local fs or GCS.

Return type:

None

Parameters:
  • path (str | PathLike[str])

  • content (str | dict)
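
Example (illustrative only): a local round trip with the standard json module. The sketch assumes that a dict is serialised before writing and that a str is written as is, which is one reading of the content (str | dict) parameter. The function names are hypothetical and GCS is not covered.

```python
import json
from pathlib import Path
import tempfile


def write_json_sketch(path, content) -> None:
    """Write a dict (or an already serialised JSON string) to a local file."""
    text = content if isinstance(content, str) else json.dumps(content)
    Path(path).write_text(text, encoding="utf-8")


def read_json_sketch(path) -> dict:
    """Read a local JSON file back into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "meta.json"
    write_json_sketch(target, {"name": "series_a", "versions": [1, 2]})
    round_trip = read_json_sketch(target)
```

After the round trip, round_trip equals the dict that was written.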

write_parquet(data, path, schema=None, **kwargs)

Write a dataframe to a Parquet file.

This function can write to both local and GCS paths, automatically selecting the correct filesystem backend. It also handles schema validation.

Parameters:
  • data (Union[Table, narwhals.typing.IntoDataFrame, narwhals.typing.IntoLazyFrame]) – The dataframe to write (can be a PyArrow Table or any Narwhals-compatible dataframe).

  • path (str | PathLike[str]) – The destination path for the Parquet file.

  • schema (Schema | None) – An optional PyArrow schema to validate against before writing.

  • **kwargs – Additional keyword arguments passed to the backend.

Return type:

None

write_text(path, content, file_format)

Write a text file to path on either local fs or GCS.

Return type:

None

Parameters:
  • path (str | PathLike[str])

  • content (str | dict)

  • file_format (str)