ssb_timeseries.fs
The main purpose of the ssb_timeseries.fs module is to allow file-based I/O that works on both a local file system and Google Cloud Storage (GCS).
- cp(from_path, to_path)
Copy file from one location to another.
This function handles copying files between local and GCS paths, automatically selecting the correct backend.
- Parameters:
from_path (str | PathLike[str]) – The path to the source file.
to_path (str | PathLike[str]) – The path to the destination file.
- Return type:
None
- existing_subpath(path)
Return the existing part of a path on local or GCS file system.
- Return type:
str | PathLike[str]
- Parameters:
path (str | PathLike[str])
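The idea of an "existing part" of a path can be illustrated for the local case with the standard library alone. The sketch below is not the module's implementation: `existing_subpath_sketch` is a hypothetical name, and it assumes that the result is the longest leading portion of the path that exists. GCS paths are not handled.

```python
import tempfile
from pathlib import Path


def existing_subpath_sketch(path):
    """Return the longest leading part of `path` that exists (local paths only)."""
    candidate = Path(path)
    # Walk upwards until a component exists; stop at the top-most parent.
    while not candidate.exists() and candidate != candidate.parent:
        candidate = candidate.parent
    return str(candidate)


with tempfile.TemporaryDirectory() as tmp:
    # Nothing below the temporary directory has been created yet.
    missing = Path(tmp) / "not" / "yet" / "created.parquet"
    found = existing_subpath_sketch(missing)
    same_as_tmp = Path(found) == Path(tmp)
```

Here `same_as_tmp` is true because the temporary directory is the deepest component that exists.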
- exists(path)
Check if a given (local or GCS) path exists.
- Return type:
bool
- Parameters:
path (str | PathLike[str])
- file_count(path, create=False)
Count the files in path. Should work whether the path is on the local file system or GCS.
- Return type:
int
- Parameters:
path (str | PathLike[str])
create (bool)
- find(search_path, equals='', contains='', pattern='', search_sub_dirs=True, full_path=False, replace_root=False)
Find files and subdirectories with names matching pattern. Should work for both local and GCS filesystems.
- Return type:
list[str]
- Parameters:
search_path (str | PathLike[str])
equals (str)
contains (str)
pattern (str)
search_sub_dirs (bool)
full_path (bool)
replace_root (bool)
- fs_type(path)
Check filesystem type (local or GCS) for a given path.
- Return type:
str
- Parameters:
path (str | PathLike[str])
- is_gcs(path)
Check if path is on GCS.
- Return type:
bool
- Parameters:
path (str | PathLike[str])
- is_local(path)
Check if path is local.
- Return type:
bool
- Parameters:
path (str | PathLike[str])
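One plausible way to tell the two filesystem types apart is to look at the URL scheme. The sketch below is an illustration under that assumption, not the module's code: the `*_sketch` names are hypothetical, and the labels `"gcs"` and `"local"` are placeholders, since this page does not state what fs_type() actually returns.

```python
import os

GCS_PREFIX = "gs://"  # assumption: GCS paths are recognised by this scheme


def is_gcs_sketch(path):
    """Treat a path as GCS if it starts with the gs:// scheme."""
    return os.fspath(path).startswith(GCS_PREFIX)


def is_local_sketch(path):
    """In this sketch, anything that is not GCS counts as local."""
    return not is_gcs_sketch(path)


def fs_type_sketch(path):
    """Return an illustrative label; the real return values may differ."""
    return "gcs" if is_gcs_sketch(path) else "local"


kinds = [fs_type_sketch(p) for p in ("gs://bucket/data.parquet", "/home/user/data.parquet")]
```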
- ls(path, pattern='*', create=False)
List files. Should work regardless of whether the filesystem is local or GCS.
- Return type:
list[str]
- Parameters:
path (str)
pattern (str)
create (bool)
- mk_parent_dir(path)
Ensure a parent directory exists. … regardless of whether the filesystem is local or GCS.
- Return type:
None
- Parameters:
path (str | PathLike[str])
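For local paths, ensuring that a parent directory exists is a one-liner with pathlib. The sketch below shows that local case only; `mk_parent_dir_sketch` is a hypothetical name and says nothing about how the module handles GCS.

```python
import tempfile
from pathlib import Path


def mk_parent_dir_sketch(path):
    """Create the parent directory of `path`, including missing ancestors."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "a" / "b" / "data.json"
    mk_parent_dir_sketch(target)
    parent_created = target.parent.is_dir()
    file_created = target.exists()  # only the directory is created, not the file
    mk_parent_dir_sketch(target)  # calling it again is harmless
```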
- mkdir(path)
Make a directory, regardless of whether the filesystem is local or GCS.
- Return type:
None
- Parameters:
path (str | PathLike[str])
- mv(from_path, to_path)
Move file from one location to another.
This function handles moving files between local and GCS paths, automatically selecting the correct backend.
- Parameters:
from_path (str | PathLike[str]) – The path to the source file.
to_path (str | PathLike[str]) – The path to the destination file.
- Return type:
None
- path(*args)
Join args to form a path. Ensures that GCS paths begin with a double slash: gs://…
- Return type:
str
- Parameters:
args (str | PathLike[str])
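The double-slash guarantee matters because pathlib collapses repeated slashes, so a GCS path that has passed through a PathLike object loses half of its prefix. The sketch below reproduces that effect and shows one possible way to join segments while restoring the prefix. `join_path_sketch` is a hypothetical helper, not the module's path() function.

```python
import os
import posixpath
import re
from pathlib import PurePosixPath


def join_path_sketch(*args):
    """Join path segments, keeping the double slash in a leading gs:// prefix."""
    parts = [os.fspath(arg) for arg in args]
    joined = posixpath.join(*parts)
    # PathLike inputs may already have collapsed gs:// to gs:/, so restore it.
    return re.sub(r"^gs:/+", "gs://", joined)


# pathlib collapses repeated slashes, which shortens the GCS prefix.
collapsed = str(PurePosixPath("gs://bucket/dataset"))  # "gs:/bucket/dataset"

from_strings = join_path_sketch("gs://bucket", "dataset", "data.parquet")
from_pathlike = join_path_sketch(PurePosixPath("gs://bucket"), "dataset")
local = join_path_sketch("/home/user", "dataset")
```

Local paths pass through unchanged, because the substitution only applies to a leading `gs:` scheme.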
- path_to_str(path)
Normalise as strings.
This is a trick to make automated tests pass on Windows.
- Return type:
str | PathLike[str]
- Parameters:
path (str | PathLike[str])
- read_json(path)
Read a JSON file from path on either the local fs or GCS.
- Return type:
dict
- Parameters:
path (str | PathLike[str])
- read_parquet(path, lazy=False, implementation='pyarrow', **kwargs)
Read a Parquet file into a dataframe.
This function can read from both local and GCS paths.
- Parameters:
path – The path to the Parquet file.
lazy – If True, returns a lazy dataframe. Defaults to False.
implementation – The backend to use for reading the file. Defaults to “pyarrow”.
**kwargs – Additional keyword arguments passed to the backend.
- Returns:
A Narwhals dataframe.
- Return type:
narwhals.typing.Frame
- read_text(path, file_format='')
Read a text file from specified path on either local fs or GCS.
- Return type:
dict
- Parameters:
path (str | PathLike[str])
file_format (str)
- remove_prefix(path)
Helper function to compensate for some os.* functions shortening gs://<path> to gs:/<path>.
- Return type:
str
- Parameters:
path (str | PathLike[str])
- rm(path)
Remove a file from either the local filesystem or GCS.
This function is non-recursive. For a recursive variant, see rmtree().
- Parameters:
path (str | PathLike[str]) – The path to the file to be removed.
- Return type:
None
- rmtree(path)
Recursively remove a directory and all its subdirectories and files regardless of local or GCS filesystem.
- Return type:
None
- Parameters:
path (str)
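The difference between the non-recursive rm() and the recursive rmtree() mirrors the standard library's `os.remove` and `shutil.rmtree`. The sketch below uses those stdlib calls on a temporary local tree as an analogy only. Whether rm() raises when given a directory is not stated here, so that behaviour is shown for `os.remove` alone.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    tree = Path(tmp) / "dataset"
    (tree / "sub").mkdir(parents=True)
    leaf = tree / "sub" / "part.parquet"
    leaf.write_bytes(b"")

    # Non-recursive removal deletes a single file ...
    os.remove(leaf)
    file_gone = not leaf.exists()

    # ... but refuses to delete a directory.
    try:
        os.remove(tree)
        dir_refused = False
    except OSError:
        dir_refused = True

    # Recursive removal deletes the directory and everything below it.
    shutil.rmtree(tree)
    tree_gone = not tree.exists()
```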
- same_path(*args)
Return common part of path, for two or more files. Files must be on same file system, but the file system can be either local or GCS.
- Return type:
str|PathLike[str]
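For local POSIX-style paths, the "common part" of several paths corresponds to what `posixpath.commonpath` computes. The sketch below is a stdlib illustration under that assumption. `same_path_sketch` is a hypothetical name, and the module's own handling of GCS paths is not shown.

```python
import posixpath


def same_path_sketch(*args):
    """Return the longest common leading directory of the given paths."""
    return posixpath.commonpath([str(arg) for arg in args])


common = same_path_sketch(
    "/data/series/a/x.parquet",
    "/data/series/b/y.parquet",
)  # "/data/series"
```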
- touch(path)
Touch file regardless of whether the filesystem is local or GCS; return path.
- Return type:
str | PathLike[str]
- Parameters:
path (str | PathLike[str])
- wrap_return_as_str(func)
Decorator to normalise outputs using path_to_str().
- Return type:
Callable
- Parameters:
func (Callable)
- write_json(path, content)
Write a JSON file to path on either the local fs or GCS.
- Return type:
None
- Parameters:
path (str | PathLike[str])
content (str | dict)
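A local-only round trip gives a feel for how write_json() and read_json() pair up. The sketch below uses hypothetical stand-ins built on the standard `json` module and does not touch GCS. It assumes that string content is already serialised JSON text, which this page does not confirm.

```python
import json
import tempfile
from pathlib import Path


def write_json_sketch(path, content):
    """Write `content` as JSON to a local file, creating parent directories."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: a str is treated as JSON text that is already serialised.
    text = content if isinstance(content, str) else json.dumps(content)
    target.write_text(text, encoding="utf-8")


def read_json_sketch(path):
    """Read a local JSON file and return the parsed object."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "meta" / "tags.json"
    write_json_sketch(target, {"name": "example", "versions": [1, 2]})
    restored = read_json_sketch(target)
```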
- write_parquet(data, path, schema=None, **kwargs)
Write a dataframe to a Parquet file.
This function can write to both local and GCS paths, automatically selecting the correct filesystem backend. It also handles schema validation.
- Parameters:
data (Union[Table, narwhals.typing.IntoDataFrame, narwhals.typing.IntoLazyFrame]) – The dataframe to write (can be a PyArrow Table or any Narwhals-compatible dataframe).
path (str | PathLike[str]) – The destination path for the Parquet file.
schema (Schema | None) – An optional PyArrow schema to validate against before writing.
**kwargs – Additional keyword arguments passed to the backend.
- Return type:
None
- write_text(path, content, file_format)
Write a text file to path on either the local fs or GCS.
- Return type:
None
- Parameters:
path (str | PathLike[str])
content (str | dict)
file_format (str)