Buffer, dissolve and explode

Functions that buffer, dissolve and/or explode geometries while fixing invalid geometries.

The functions do the same as the geopandas buffer, dissolve and explode methods, except for the following:

  • Geometries are made valid after buffer and dissolve.

  • The buffer resolution defaults to 50 (geopandas’ default is 16).

  • If ‘by’ is not specified, the index will be labeled 0, 1, …, n - 1 after exploding, instead of 0, 0, …, 0 as it would be with the geopandas defaults.

  • index_parts is set to False, which will be the default in a future version of geopandas.

  • The buff function returns a GeoDataFrame, whereas the geopandas buffer method returns a GeoSeries.
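
The index difference described in the list above can be reproduced with plain geopandas. The following is a minimal sketch, not sgis code: the three points and the buffer distance of 1 are made up for illustration, and ignore_index=True is used here only as one way to obtain the 0, 1, …, n - 1 labels.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical toy data: two buffers that overlap and one isolated buffer.
gdf = gpd.GeoDataFrame(geometry=[Point(0, 0), Point(1, 0), Point(10, 0)])
gdf["geometry"] = gdf.buffer(1, resolution=50)

# Dissolving without 'by' gives a single row with index 0.
dissolved = gdf.dissolve()

# Each exploded part inherits the index of the dissolved row.
repeated = dissolved.explode(index_parts=False)

# Relabeling gives the 0, 1, ..., n - 1 index described in the list above.
relabeled = dissolved.explode(index_parts=False, ignore_index=True)

print(repeated.index.tolist())   # [0, 0]
print(relabeled.index.tolist())  # [0, 1]
```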

buff(gdf, distance, resolution=50, copy=True, join_style='round', **buffer_kwargs)

Buffers a GeoDataFrame with high resolution and returns a new GeoDataFrame.

Parameters:
  • gdf (GeoDataFrame | GeoSeries) – the GeoDataFrame or GeoSeries that will be buffered.

  • distance (int | float) – the distance to buffer the geometry by, in the units of the crs (e.g. meters or degrees).

  • resolution (int) – The number of segments used to approximate a quarter circle. Here defaults to 50, as opposed to the default 16 in geopandas.

  • join_style (int | str) – Buffer join style.

  • copy (bool) – Whether to copy the GeoDataFrame before buffering. Defaults to True.

  • **buffer_kwargs – additional keyword arguments passed to geopandas’ buffer.

Return type:

GeoDataFrame

Returns:

A buffered GeoDataFrame.
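
To give a feel for what the higher resolution buys, the following sketch estimates how far a buffered point's outline can deviate from a true circle. It is plain arithmetic rather than sgis or geopandas code; the helper name max_arc_deviation is hypothetical, and it assumes each quarter circle is split into equal straight segments whose end points lie on the circle.

```python
import math


def max_arc_deviation(radius: float, resolution: int) -> float:
    """Largest gap between a true circle and its polygon approximation.

    Hypothetical helper: assumes each quarter circle is split into
    `resolution` equal segments with end points on the circle.
    """
    # Each segment spans an angle of (pi / 2) / resolution. The largest gap is
    # at the segment's midpoint: radius * (1 - cos(half of that angle)).
    half_angle = math.pi / (4 * resolution)
    return radius * (1 - math.cos(half_angle))


for resolution in (16, 50):
    print(resolution, round(max_arc_deviation(100, resolution), 4))
# 16 0.1205
# 50 0.0123
```

Under these assumptions, a 100 meter buffer deviates by up to roughly 12 centimeters with a resolution of 16 and roughly 1.2 centimeters with a resolution of 50.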

buffdiss(gdf, distance, resolution=50, copy=True, n_jobs=1, join_style='round', **dissolve_kwargs)

Buffers and dissolves geometries.

It takes a GeoDataFrame and buffers, fixes, dissolves and fixes the geometries again. If the ‘by’ parameter is not specified, the index will be labeled 0, 1, …, n - 1, instead of 0, 0, …, 0. If ‘by’ is specified, the ‘by’ columns will be the index.

Parameters:
  • gdf (GeoDataFrame) – the GeoDataFrame that will be buffered and dissolved.

  • distance (int | float) – the distance to buffer the geometry by, in the units of the crs (e.g. meters or degrees).

  • resolution (int) – The number of segments used to approximate a quarter circle. Here defaults to 50, as opposed to the default 16 in geopandas.

  • join_style (int | str) – Buffer join style.

  • copy (bool) – Whether to copy the GeoDataFrame before buffering. Defaults to True.

  • n_jobs (int) – Number of threads to use. Defaults to 1.

  • **dissolve_kwargs – additional keyword arguments passed to geopandas’ dissolve.

Return type:

GeoDataFrame

Returns:

A buffered GeoDataFrame where geometries are dissolved.

Examples:

Read some points and add columns with random values.

>>> import sgis as sg
>>> import numpy as np
>>> points = sg.read_parquet_url(
...     "https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/points_oslo.parquet"
... )[["geometry"]]
>>> points["group"] = np.random.choice([*"abd"], len(points))
>>> points["number"] = np.random.random(size=len(points))
>>> points
                           geometry group    number
0    POINT (263122.700 6651184.900)     a  0.878158
1    POINT (272456.100 6653369.500)     a  0.693311
2    POINT (270082.300 6653032.700)     b  0.323960
3    POINT (259804.800 6650339.700)     a  0.606745
4    POINT (272876.200 6652889.100)     a  0.194360
..                              ...   ...       ...
995  POINT (266801.700 6647844.500)     a  0.814424
996  POINT (261274.000 6653593.400)     b  0.769479
997  POINT (263542.900 6645427.000)     a  0.925991
998  POINT (269226.700 6650628.000)     b  0.431972
999  POINT (264570.300 6644239.500)     d  0.555239

Buffer by 100 meters and dissolve.

>>> sg.buffdiss(points, 100)
                                            geometry group    number
0  MULTIPOLYGON (((256421.833 6649878.117, 256420...     d  0.580157

Dissolve by ‘group’ and get sum of columns.

>>> sg.buffdiss(points, 100, by="group", aggfunc="sum")
                                                geometry      number
group
a      MULTIPOLYGON (((258866.258 6648220.031, 258865...  167.265619
b      MULTIPOLYGON (((258404.858 6647830.931, 258404...  171.939169
d      MULTIPOLYGON (((258180.258 6647935.731, 258179...  156.964300

To get the ‘by’ columns as regular columns instead of as the index, set as_index=False.

>>> sg.buffdiss(points, 100, by="group", as_index=False)
  group                                           geometry    number
0     a  MULTIPOLYGON (((258866.258 6648220.031, 258865...  0.323948
1     b  MULTIPOLYGON (((258404.858 6647830.931, 258404...  0.687635
2     d  MULTIPOLYGON (((258180.258 6647935.731, 258179...  0.580157
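
For comparison, the buffer, fix, dissolve, fix sequence described for buffdiss can be approximated in plain geopandas. This is a minimal sketch under stated assumptions, not the sgis implementation: buffdiss_sketch and the toy data are hypothetical, the geometry column is assumed to be named "geometry", and the copy and n_jobs parameters are left out.

```python
import geopandas as gpd
from shapely.geometry import Point


def buffdiss_sketch(gdf, distance, resolution=50, **dissolve_kwargs):
    """Hypothetical stand-in for the buffer, fix, dissolve, fix sequence.

    Assumes the geometry column is named "geometry".
    """
    out = gdf.copy()
    # Buffer with the higher resolution, then repair any invalid results.
    buffered = out.geometry.buffer(distance, resolution=resolution)
    out["geometry"] = buffered.make_valid()
    # Dissolve (optionally by columns), then repair again.
    out = out.dissolve(**dissolve_kwargs)
    out["geometry"] = out.geometry.make_valid()
    return out


# Hypothetical toy data: the first two buffers overlap, the third is isolated.
toy = gpd.GeoDataFrame(
    {"group": ["a", "a", "b"]},
    geometry=[Point(0, 0), Point(1, 0), Point(10, 0)],
)
everything = buffdiss_sketch(toy, 1)
per_group = buffdiss_sketch(toy, 1, by="group")

print(everything.geometry.iloc[0].geom_type)  # MultiPolygon
print(per_group.index.tolist())               # ['a', 'b']
```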

buffdissexp(gdf, distance, *, resolution=50, index_parts=False, copy=True, grid_size=None, n_jobs=1, join_style='round', **dissolve_kwargs)

Buffers and dissolves overlapping geometries.

It takes a GeoDataFrame and buffers, fixes, dissolves, fixes and explodes the geometries. If the ‘by’ parameter is not specified, the index will be labeled 0, 1, …, n - 1, instead of 0, 0, …, 0. If ‘by’ is specified, the ‘by’ columns will be the index.

Parameters:
  • gdf (GeoDataFrame) – the GeoDataFrame that will be buffered, dissolved and exploded.

  • distance (int | float) – the distance to buffer the geometry by, in the units of the crs (e.g. meters or degrees).

  • resolution (int) – The number of segments used to approximate a quarter circle. Here defaults to 50, as opposed to the default 16 in geopandas.

  • index_parts (bool) – If False (default), the index after dissolve is respected. If True, an integer index level is added during explode.

  • copy (bool) – Whether to copy the GeoDataFrame before buffering. Defaults to True.

  • grid_size (float | int | None) – Rounding of the coordinates. Defaults to None.

  • n_jobs (int) – Number of threads to use. Defaults to 1.

  • join_style (int | str) – Buffer join style.

  • **dissolve_kwargs – additional keyword arguments passed to geopandas’ dissolve.

Return type:

GeoDataFrame

Returns:

A buffered GeoDataFrame where overlapping geometries are dissolved.

buffdissexp_by_cluster(gdf, distance, *, resolution=50, copy=True, n_jobs=1, join_style='round', **dissolve_kwargs)

Buffers and dissolves overlapping geometries.

Works exactly like buffdissexp, except that before dissolving, the geometries are divided into clusters based on overlap (using the function sgis.get_polygon_clusters). The geometries are then dissolved by the resulting cluster column (and optionally other columns).

This can be many times faster than a regular buffdissexp if there are many non-overlapping geometries.

Parameters:
  • gdf (GeoDataFrame) – the GeoDataFrame that will be buffered, dissolved and exploded.

  • distance (int | float) – the distance to buffer the geometry by, in the units of the crs (e.g. meters or degrees).

  • resolution (int) – The number of segments used to approximate a quarter circle. Here defaults to 50, as opposed to the default 16 in geopandas.

  • join_style (int | str) – Buffer join style.

  • copy (bool) – Whether to copy the GeoDataFrame before buffering. Defaults to True.

  • n_jobs (int) – Number of threads to use. Defaults to 1.

  • **dissolve_kwargs – additional keyword arguments passed to geopandas’ dissolve.

Return type:

GeoDataFrame

Returns:

A buffered GeoDataFrame where overlapping geometries are dissolved.

diss(gdf, by=None, aggfunc='first', as_index=True, grid_size=None, n_jobs=1, **dissolve_kwargs)

Dissolves geometries.

It takes a GeoDataFrame and dissolves and fixes geometries.

Parameters:
  • gdf (GeoDataFrame) – the GeoDataFrame that will be dissolved.

  • by (str | Sequence[str] | None) – Columns to dissolve by.

  • aggfunc (str | Callable | dict[str, str | Callable]) – How to aggregate the non-geometry columns not in ‘by’.

  • as_index (bool) – Whether the ‘by’ columns should be returned as index. Defaults to True to be consistent with geopandas.

  • grid_size (float | int | None) – Rounding of the coordinates. Defaults to None.

  • n_jobs (int) – Number of threads to use. Defaults to 1.

  • **dissolve_kwargs – additional keyword arguments passed to geopandas’ dissolve.

Return type:

GeoDataFrame

Returns:

A GeoDataFrame with dissolved geometries.
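
The by, aggfunc and as_index parameters mirror those of geopandas’ dissolve, so their effect can be sketched with the geopandas method directly. The toy data and column names below are hypothetical, and the sketch does not include the geometry fixing that diss adds.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy data: two overlapping squares in group 'a', one in 'b'.
toy = gpd.GeoDataFrame(
    {
        "group": ["a", "a", "b"],
        "number": [1.0, 2.0, 5.0],
        "name": ["x", "y", "z"],
    },
    geometry=[box(0, 0, 2, 2), box(1, 1, 3, 3), box(10, 10, 11, 11)],
)

# A dict gives each non-geometry column its own aggregation.
summed = toy.dissolve(by="group", aggfunc={"number": "sum", "name": "first"})

# as_index=False keeps 'group' as a regular column with a 0, 1, ... index.
flat = toy.dissolve(
    by="group", aggfunc={"number": "sum", "name": "first"}, as_index=False
)

print(summed.loc["a", "number"])  # 3.0
print(flat.index.tolist())        # [0, 1]
```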

diss_by_cluster(gdf, predicate=None, n_jobs=1, **dissolve_kwargs)

Dissolves overlapping geometries through clustering with sjoin and networkx.

Works exactly like diss, except that before dissolving, the geometries are divided into clusters based on overlap (using the function sgis.get_polygon_clusters). The geometries are then dissolved by the resulting cluster column (and optionally other columns).

This can be many times faster than a regular diss if there are many non-overlapping geometries.

Parameters:
  • gdf (GeoDataFrame) – the GeoDataFrame that will be dissolved.

  • predicate – Spatial predicate to use.

  • n_jobs (int) – Number of threads to use. Defaults to 1.

  • **dissolve_kwargs – Keyword arguments passed to geopandas’ dissolve.

Return type:

GeoDataFrame

Returns:

A GeoDataFrame where overlapping geometries are dissolved.
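
The clustering idea can be sketched with a spatial self-join and networkx connected components. This is only an illustration of the approach named in the summary above, not the code behind sgis.get_polygon_clusters: dissolve_by_cluster_sketch, the "_cluster" column name and the toy squares are hypothetical, the geometry column is assumed to be named "geometry", and the default predicate "intersects" is an assumption made for the sketch.

```python
import geopandas as gpd
import networkx as nx
from shapely.geometry import box


def dissolve_by_cluster_sketch(gdf, predicate="intersects"):
    """Hypothetical illustration: label overlap clusters, then dissolve by them."""
    gdf = gdf.reset_index(drop=True)
    geoms = gdf[["geometry"]]
    # Self-join: every pair of geometries matching the predicate is an edge.
    pairs = geoms.sjoin(geoms, predicate=predicate)
    graph = nx.Graph()
    graph.add_nodes_from(gdf.index)
    graph.add_edges_from(zip(pairs.index, pairs["index_right"]))
    # Each connected component of the graph is one cluster.
    cluster_of = {
        node: cluster_id
        for cluster_id, component in enumerate(nx.connected_components(graph))
        for node in component
    }
    gdf["_cluster"] = gdf.index.map(cluster_of)
    # Only geometries in the same cluster are unioned with each other.
    return gdf.dissolve(by="_cluster").reset_index(drop=True)


# Hypothetical toy data: two overlapping squares and one isolated square.
toy = gpd.GeoDataFrame(
    geometry=[box(0, 0, 2, 2), box(1, 1, 3, 3), box(10, 10, 11, 11)]
)
clustered = dissolve_by_cluster_sketch(toy)
print(len(clustered))  # 2
```

Because each union only involves the geometries of one cluster, isolated geometries never take part in a large union, which is one plausible reason for the speedup mentioned above.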

dissexp(gdf, by=None, aggfunc='first', as_index=True, index_parts=False, grid_size=None, n_jobs=1, **dissolve_kwargs)

Dissolves overlapping geometries.

It takes a GeoDataFrame and dissolves, fixes and explodes geometries.

Parameters:
  • gdf (GeoDataFrame) – the GeoDataFrame that will be dissolved and exploded.

  • by (str | Sequence[str] | None) – Columns to dissolve by.

  • aggfunc (str | Callable | dict[str, str | Callable]) – How to aggregate the non-geometry columns not in ‘by’.

  • as_index (bool) – Whether the ‘by’ columns should be returned as index. Defaults to True to be consistent with geopandas.

  • index_parts (bool) – If False (default), the index after dissolve is respected. If True, an integer index level is added during explode.

  • grid_size (float | int | None) – Rounding of the coordinates. Defaults to None.

  • n_jobs (int) – Number of threads to use. Defaults to 1.

  • **dissolve_kwargs – additional keyword arguments passed to geopandas’ dissolve.

Return type:

GeoDataFrame

Returns:

A GeoDataFrame where overlapping geometries are dissolved.
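
The effect of index_parts when ‘by’ is given can be seen with geopandas’ own dissolve and explode methods. The sketch below uses hypothetical toy data and plain geopandas calls rather than dissexp itself.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy data: group 'a' consists of two disjoint squares.
toy = gpd.GeoDataFrame(
    {"group": ["a", "a", "b"]},
    geometry=[box(0, 0, 1, 1), box(5, 5, 6, 6), box(10, 10, 11, 11)],
)
dissolved = toy.dissolve(by="group")

# index_parts=False keeps the index from the dissolve step.
flat = dissolved.explode(index_parts=False)

# index_parts=True adds an integer level that numbers the parts.
nested = dissolved.explode(index_parts=True)

print(flat.index.tolist())    # ['a', 'a', 'b']
print(nested.index.tolist())  # [('a', 0), ('a', 1), ('b', 0)]
```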

dissexp_by_cluster(gdf, predicate='intersects', n_jobs=1, **dissolve_kwargs)

Dissolves overlapping geometries through clustering with sjoin and networkx.

Works exactly like dissexp, except that before dissolving, the geometries are divided into clusters based on overlap (using the function sgis.get_polygon_clusters). The geometries are then dissolved by the resulting cluster column (and optionally other columns).

This can be many times faster than a regular dissexp if there are many non-overlapping geometries.

Parameters:
  • gdf (GeoDataFrame) – the GeoDataFrame that will be dissolved and exploded.

  • predicate (str | None) – Spatial predicate to use.

  • n_jobs (int) – Number of threads to use. Defaults to 1.

  • **dissolve_kwargs – Keyword arguments passed to geopandas’ dissolve.

Return type:

GeoDataFrame

Returns:

A GeoDataFrame where overlapping geometries are dissolved.