Duplicate and overlapping geometries
- get_intersections(gdf, geom_type=None, keep_geom_type=None, predicate='intersects', n_jobs=1)
Find geometries that intersect in a GeoDataFrame.
Intersects the GeoDataFrame with itself and keeps only the geometries that appear more than once.
Note that the returned GeoDataFrame in most cases contains two rows per intersection pair. It might also contain more than two overlapping polygons where several geometries overlap in the same place. The overlaps can be removed with update_geometries. See the example below.
- Parameters:
gdf (GeoDataFrame) – GeoDataFrame of polygons.
geom_type (str | None) – Optionally specify which geometry type to keep. Either “polygon”, “line” or “point”.
keep_geom_type (bool | None) – Whether to keep the original geometry type. If mixed geometry types and keep_geom_type=True, an exception is raised.
n_jobs (int) – Number of threads.
predicate (str | None) – Spatial predicate for the spatial tree.
- Return type:
GeoDataFrame
- Returns:
A GeoDataFrame of the overlapping polygons.
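The self-intersection described above can be pictured with a small pure-Python sketch. It is not how sgis implements get_intersections: axis-aligned boxes stand in for polygons, and the names `box_intersection` and `self_intersections` are hypothetical helpers made up for this illustration. It only shows why each overlapping pair yields two rows, one for each geometry on the left-hand side.

```python
from itertools import permutations
from typing import List, Optional, Tuple

# A box is (xmin, ymin, xmax, ymax); a simplified stand-in for a polygon.
Box = Tuple[float, float, float, float]


def box_intersection(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping box of a and b, or None if they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def self_intersections(boxes: List[Box]) -> List[Tuple[int, Box]]:
    """Intersect every box with every other box, keeping the left-hand index."""
    rows = []
    for i, j in permutations(range(len(boxes)), 2):
        overlap = box_intersection(boxes[i], boxes[j])
        if overlap is not None:
            rows.append((i, overlap))
    return rows


# Three boxes of width 2.4 centred on x = 0, 1 and 2, so all three pairs overlap.
boxes = [(x - 1.2, -1.2, x + 1.2, 1.2) for x in (0.0, 1.0, 2.0)]
rows = self_intersections(boxes)
for index, overlap in rows:
    print(index, overlap)
```

Each of the three overlapping pairs appears twice, once per ordering, so the sketch produces six rows whose left-hand indices repeat in the same pattern as the index of the example output below.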
Examples:
Create three partially overlapping polygons.
>>> import sgis as sg
>>> circles = sg.to_gdf([(0, 0), (1, 0), (2, 0)]).pipe(sg.buff, 1.2)
>>> circles.area
0    4.523149
1    4.523149
2    4.523149
dtype: float64
Get the duplicates.
>>> duplicates = sg.get_intersections(circles)
>>> duplicates["area"] = duplicates.area
>>> duplicates
                                            geometry      area
0  POLYGON ((1.19941 -0.03769, 1.19763 -0.07535, ...  2.194730
0  POLYGON ((1.19941 -0.03769, 1.19763 -0.07535, ...  0.359846
1  POLYGON ((0.48906 -1.08579, 0.45521 -1.06921, ...  2.194730
1  POLYGON ((2.19941 -0.03769, 2.19763 -0.07535, ...  2.194730
2  POLYGON ((0.98681 -0.64299, 0.96711 -0.61085, ...  0.359846
2  POLYGON ((1.48906 -1.08579, 1.45521 -1.06921, ...  2.194730
We get two rows for each intersection pair.
To get non-overlapping geometries, we can put the geometries on top of each other rowwise.
>>> updated = sg.update_geometries(duplicates)
>>> updated["area"] = updated.area
>>> updated
       area                                           geometry
0  2.194730  POLYGON ((1.19941 -0.03769, 1.19763 -0.07535, ...
1  1.834884  POLYGON ((2.19763 -0.07535, 2.19467 -0.11293, ...
It might be appropriate to sort the dataframe by columns first, or to put large polygons first and NaN values last.
>>> updated = (
...     sg.sort_large_first(duplicates)
...     .pipe(sg.sort_nans_last)
...     .pipe(sg.update_geometries)
... )
>>> updated
      area                                           geometry
0  2.19473  POLYGON ((1.19941 -0.03769, 1.19763 -0.07535, ...
1  2.19473  POLYGON ((2.19763 -0.07535, 2.19467 -0.11293, ...
- update_geometries(gdf, geom_type=None, keep_geom_type=None, grid_size=None, n_jobs=1, predicate='intersects')
Puts geometries on top of each other rowwise.
Since this operation is done rowwise, it’s important to first sort the GeoDataFrame appropriately. See example below.
- Parameters:
gdf (GeoDataFrame) – The GeoDataFrame to be updated.
keep_geom_type (bool | None) – If True, return only geometries of the original type when an intersection results in multiple geometry types or GeometryCollections. If False, return all resulting geometries (potentially mixed types).
geom_type (str | None) – Optionally specify which geometry type to keep if there are mixed geometry types. Must be either “polygon”, “line” or “point”.
grid_size (int | None) – Precision grid size to round the geometries to. Will use the highest precision of the inputs by default.
n_jobs (int) – Number of threads.
predicate (str | None) – Spatial predicate for the spatial tree.
- Return type:
GeoDataFrame
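The rowwise behaviour can be sketched in plain Python. This is a conceptual illustration only, not the sgis implementation: sets of grid cells stand in for polygons, and `update_rowwise` and `square` are hypothetical helpers invented for the sketch. Each row keeps only the part that no earlier row has already claimed, and rows that end up empty are dropped.

```python
from typing import List, Set, Tuple

# A "geometry" here is just a set of (x, y) grid cells.
Cell = Tuple[int, int]


def update_rowwise(rows: List[Set[Cell]]) -> List[Set[Cell]]:
    """Keep, for each row, only the cells not claimed by an earlier row."""
    claimed: Set[Cell] = set()
    result = []
    for cells in rows:
        remaining = cells - claimed
        if remaining:  # rows that are fully covered by earlier rows disappear
            result.append(remaining)
        claimed |= cells
    return result


def square(x0: int, y0: int, size: int) -> Set[Cell]:
    """Cells of a size-by-size square with lower-left corner (x0, y0)."""
    return {(x, y) for x in range(x0, x0 + size) for y in range(y0, y0 + size)}


large = square(0, 0, 4)  # 16 cells
small = square(3, 3, 2)  # 4 cells, one of which overlaps `large`

large_first = update_rowwise([large, small])
small_first = update_rowwise([small, large])

print([len(cells) for cells in large_first])
print([len(cells) for cells in small_first])
```

With the large square first it keeps all 16 cells and the small square is trimmed to 3. With the order reversed, the small square keeps its 4 cells and the large one is trimmed to 15. Row order therefore decides which geometry wins the overlap.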
Example:
Create two circles and get the overlap.
>>> import sgis as sg
>>> circles = sg.to_gdf([(0, 0), (1, 1)]).pipe(sg.buff, 1)
>>> duplicates = sg.get_intersections(circles)
>>> duplicates
   idx                                           geometry
0    1  POLYGON ((0.03141 0.99951, 0.06279 0.99803, 0....
1    2  POLYGON ((1.00000 0.00000, 0.96859 0.00049, 0....
The polygons are identical except for the order of the coordinates.
>>> poly1, poly2 = duplicates.geometry
>>> poly1.equals(poly2)
True
‘update_geometries’ gives different results based on the order of the GeoDataFrame.
>>> sg.update_geometries(duplicates)
   idx                                           geometry
0    1  POLYGON ((0.03141 0.99951, 0.06279 0.99803, 0....
>>> dups_rev = duplicates.iloc[::-1]
>>> sg.update_geometries(dups_rev)
   idx                                           geometry
1    2  POLYGON ((1.00000 0.00000, 0.96859 0.00049, 0....
It might be appropriate to put the largest polygons on top and sort all NaNs to the bottom.
>>> updated = (
...     sg.sort_large_first(duplicates)
...     .pipe(sg.sort_nans_last)
...     .pipe(sg.update_geometries)
... )
>>> updated
   idx                                           geometry
0    1  POLYGON ((0.03141 0.99951, 0.06279 0.99803, 0....
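The effect of chaining the two sorts can be illustrated with plain dictionaries. The sketch below rests on two loudly stated assumptions that this page does not confirm: that sort_large_first orders rows by descending area, and that sort_nans_last orders rows by their number of missing values while otherwise preserving the existing order. The `count_missing` helper and the sample rows are hypothetical.

```python
import math
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def count_missing(row: Row) -> int:
    """Number of values in the row that are None or NaN."""
    return sum(
        value is None or (isinstance(value, float) and math.isnan(value))
        for value in row.values()
    )


rows: List[Row] = [
    {"area": 1.0, "value": 5.0},
    {"area": 9.0, "value": None},
    {"area": 4.0, "value": 2.0},
]

# Assumed behaviour of sort_large_first: largest area first.
by_area = sorted(rows, key=lambda row: row["area"], reverse=True)

# Assumed behaviour of sort_nans_last: fewest missing values first.
# sorted() is stable, so the area ordering survives within each group.
ordered = sorted(by_area, key=count_missing)

print([row["area"] for row in ordered])
```

Under these assumptions the complete rows come first, largest area on top, and the row with a missing value goes last even though it has the largest area. Rows placed first are the ones that keep their full geometry when update_geometries is applied afterwards.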