Filter by location

Get the rows of a GeoDataFrame that match and/or do not match a spatial predicate.

Note that if you want both the matching and the non-matching features, sfilter_split gives the same results as combining sfilter and sfilter_inverse, and will be twice as fast.
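
The speed-up is plausible because a single spatial-index query already identifies the matching rows, and the non-matching rows are simply the complement. The sketch below illustrates that idea with plain geopandas and NumPy. It is not the sgis implementation, and the variable names are illustrative only.

```python
import geopandas as gpd
import numpy as np
from shapely.geometry import Point

df1 = gpd.GeoDataFrame(geometry=[Point(0, 0), Point(0, 1)])
df2 = gpd.GeoDataFrame(geometry=[Point(0, 0), Point(1, 2)])

# One query against df1's spatial index. The result has two rows:
# positions in df2 (the query geometries) and positions in df1 (the tree).
_, df1_positions = df1.sindex.query(df2.geometry, predicate="intersects")

# Turn the matched positions into a boolean mask over df1.
mask = np.zeros(len(df1), dtype=bool)
mask[df1_positions] = True

# Both halves come from the same mask, so no second query is needed.
intersecting = df1.iloc[mask]
not_intersecting = df1.iloc[~mask]
```

Calling a filter and its inverse separately would run the spatial query twice, which is where the extra cost comes from.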

sfilter(gdf, other, predicate='intersects', distance=None, n_jobs=None, rtree_runner=None)

Filter a GeoDataFrame or GeoSeries by spatial predicate.

Does an sjoin and returns the rows of ‘gdf’ that matched, without duplicate rows and without columns from ‘other’. Works with both unique and non-unique indexes.

Like ‘select by location’ in ArcGIS/QGIS, except that the selection is permanent.

Parameters:

gdf (GeoDataFrame | GeoSeries) – The GeoDataFrame or GeoSeries to filter.
other (GeoDataFrame | GeoSeries | Geometry) – The geometry object to filter ‘gdf’ by.
predicate (str) – Spatial predicate to use. Defaults to ‘intersects’.
distance (int | float | None) – Max distance to allow if predicate=="dwithin".
n_jobs (int | None) – Number of workers.
rtree_runner (RTreeQueryRunner | None) – Optionally debug/manipulate the spatial indexing operations. See the ‘runners’ module for example implementations.

Return type:

GeoDataFrame

Returns:

A copy of ‘gdf’ with only the rows matching the spatial predicate with ‘other’.

Examples:

>>> import sgis as sg
>>> df1 = sg.to_gdf([(0, 0), (0, 1)])
>>> df1
                  geometry
0  POINT (0.00000 0.00000)
1  POINT (0.00000 1.00000)
>>> df2 = sg.to_gdf([(0, 0), (1, 2)])
>>> df2
                  geometry
0  POINT (0.00000 0.00000)
1  POINT (1.00000 2.00000)

Keep rows in df1 intersecting any geometry in df2.

>>> sg.sfilter(df1, df2)
                  geometry
0  POINT (0.00000 0.00000)

Equivalent to sjoin-ing and selecting based on an integer position column (needed in case of a non-unique index).

>>> df1["idx"] = range(len(df1))
>>> joined = df1.sjoin(df2)
>>> df1.loc[df1["idx"].isin(joined["idx"])].drop(columns="idx")
                  geometry
0  POINT (0.00000 0.00000)
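
To see why the helper column matters, consider a frame whose rows share an index label. The sketch below uses plain geopandas; the column name "pos" and the variable names are illustrative assumptions, not part of sgis.

```python
import geopandas as gpd
from shapely.geometry import Point

# Both rows of df1 carry the same index label, 0.
df1 = gpd.GeoDataFrame(geometry=[Point(0, 0), Point(0, 1)], index=[0, 0])
df2 = gpd.GeoDataFrame(geometry=[Point(0, 0), Point(1, 2)])

# Label-based selection: every row with a matched label is kept,
# including the row that does not intersect df2.
joined = df1.sjoin(df2)
by_label = df1.loc[df1.index.isin(joined.index)]

# Position-based selection: a helper column identifies each row uniquely,
# so only the row that actually intersects df2 is kept.
df1["pos"] = range(len(df1))
joined = df1.sjoin(df2)
by_position = df1.loc[df1["pos"].isin(joined["pos"])].drop(columns="pos")
```

Here by_label has two rows while by_position has one, which is the behaviour a filter should have regardless of how the index is labelled.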

Also equivalent to using the intersects method, which is often a lot slower since df2 must be dissolved:

>>> df1.loc[df1.intersects(df2.union_all())]
                  geometry
0  POINT (0.00000 0.00000)

sfilter_inverse(gdf, other, predicate='intersects', distance=None, n_jobs=1, rtree_runner=None)

Filter a GeoDataFrame or GeoSeries by inverse spatial predicate.

Returns the rows that do not match the spatial predicate.

Parameters:

gdf (GeoDataFrame | GeoSeries) – The GeoDataFrame or GeoSeries.
other (GeoDataFrame | GeoSeries | Geometry) – The geometry object to filter ‘gdf’ by.
predicate (str) – Spatial predicate to use. Defaults to ‘intersects’.
distance (int | float | None) – Max distance to allow if predicate=="dwithin".
n_jobs (int) – Number of workers.
rtree_runner (RTreeQueryRunner | None) – Optionally debug/manipulate the spatial indexing operations. See the ‘runners’ module for example implementations.

Return type:

GeoDataFrame | GeoSeries

Returns:

A copy of ‘gdf’ with only the rows that do not match the spatial predicate with ‘other’.

Examples:

>>> import sgis as sg
>>> df1 = sg.to_gdf([(0, 0), (0, 1)])
>>> df1
                  geometry
0  POINT (0.00000 0.00000)
1  POINT (0.00000 1.00000)
>>> df2 = sg.to_gdf([(0, 0), (1, 2)])
>>> df2
                  geometry
0  POINT (0.00000 0.00000)
1  POINT (1.00000 2.00000)

Keep the rows in df1 that do not intersect a geometry in df2.

>>> not_intersecting = sg.sfilter_inverse(df1, df2)
>>> not_intersecting
                  geometry
1  POINT (0.00000 1.00000)

Equivalent to sjoin-ing and selecting based on the index (which requires the index to be unique).

>>> df1 = df1.reset_index(drop=True)
>>> joined = df1.sjoin(df2)
>>> not_intersecting = df1.loc[~df1.index.isin(joined.index)]

Also equivalent to using the intersects method, which is often slower since df2 must be dissolved:

>>> not_intersecting = df1.loc[~df1.intersects(df2.union_all())]

sfilter_split(gdf, other, predicate='intersects', distance=None, n_jobs=1, rtree_runner=None)

Split a GeoDataFrame or GeoSeries by spatial predicate.

Like sfilter, but returns both the rows that do and do not match the spatial predicate as separate GeoDataFrames.

Parameters:

gdf (GeoDataFrame | GeoSeries) – The GeoDataFrame or GeoSeries to split.
other (GeoDataFrame | GeoSeries | Geometry) – The geometry object to filter ‘gdf’ by.
predicate (str) – Spatial predicate to use. Defaults to ‘intersects’.
distance (int | float | None) – Max distance to allow if predicate=="dwithin".
n_jobs (int) – Number of workers.
rtree_runner (RTreeQueryRunner | None) – Optionally debug/manipulate the spatial indexing operations. See the ‘runners’ module for example implementations.

Return type:

tuple[GeoDataFrame, GeoDataFrame]

Returns:

A tuple of GeoDataFrames, one with the rows that match the spatial predicate and one with the rows that do not.

Examples:

>>> import sgis as sg
>>> df1 = sg.to_gdf([(0, 0), (0, 1)])
>>> df1
                  geometry
0  POINT (0.00000 0.00000)
1  POINT (0.00000 1.00000)
>>> df2 = sg.to_gdf([(0, 0), (1, 2)])
>>> df2
                  geometry
0  POINT (0.00000 0.00000)
1  POINT (1.00000 2.00000)

Split df1 into the rows that do and do not intersect df2.

>>> intersecting, not_intersecting = sg.sfilter_split(df1, df2)
>>> intersecting
                  geometry
0  POINT (0.00000 0.00000)
>>> not_intersecting
                  geometry
1  POINT (0.00000 1.00000)

Equivalent to sjoin-ing and selecting based on the index (which requires the index to be unique).

>>> df1 = df1.reset_index(drop=True)
>>> joined = df1.sjoin(df2)
>>> intersecting = df1.loc[df1.index.isin(joined.index)]
>>> not_intersecting = df1.loc[~df1.index.isin(joined.index)]

Also equivalent to using the intersects method, which is often slower since df2 must be dissolved:

>>> filt = df1.intersects(df2.union_all())
>>> intersecting = df1.loc[filt]
>>> not_intersecting = df1.loc[~filt]