Cleaning geometries

coverage_clean(gdf, tolerance, duplicate_action='fix', grid_sizes=(None,), n_jobs=1)

Fix thin gaps, holes, slivers and double surfaces.

The underlying operations may raise a GEOSException, so it may be necessary to set the ‘grid_sizes’ argument. It can also help to run coverage_clean twice to fill gaps left behind by these GEOSExceptions.
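The idea of passing several grid sizes can be illustrated with a small retry loop. This is only a sketch under stated assumptions, not the library's implementation: FakeGEOSException stands in for shapely.errors.GEOSException so the example runs without shapely, and run_with_grid_sizes and fragile_overlay are hypothetical names. It assumes the grid sizes are tried in the order given.

```python
class FakeGEOSException(Exception):
    """Stand-in for shapely.errors.GEOSException (assumption, keeps the sketch dependency-free)."""


def run_with_grid_sizes(operation, grid_sizes=(None,)):
    """Call operation(grid_size) for each grid size in order; return the first success."""
    if not grid_sizes:
        raise ValueError("grid_sizes must contain at least one entry")
    last_error = None
    for grid_size in grid_sizes:
        try:
            return operation(grid_size)
        except FakeGEOSException as err:
            # Remember the failure and fall through to the next grid size.
            last_error = err
    # Every grid size failed: re-raise the last error seen.
    raise last_error


def fragile_overlay(grid_size):
    """Hypothetical operation that fails unless coordinates are snapped to a grid."""
    if grid_size is None:
        raise FakeGEOSException("TopologyException: side location conflict at 10.5 20.25")
    return f"overlay succeeded with grid_size={grid_size}"


result = run_with_grid_sizes(fragile_overlay, grid_sizes=(None, 1e-4, 1e-3))
print(result)
```

Listing None first keeps full precision whenever the operation succeeds without snapping, and only falls back to coarser grids on failure.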

Rules:
  • Holes (interiors) thinner than the tolerance are closed.

  • Gaps between polygons are filled if thinner than the tolerance.

  • Sliver polygons thinner than the tolerance are eliminated into the neighbor polygon with the longest shared border.

  • Double surfaces thinner than the tolerance are eliminated. If duplicate_action is “fix”, thicker double surfaces will be updated.

  • Line and point geometries are removed with no warning.

  • MultiPolygons and GeometryCollections are exploded to Polygons.

  • Index is reset.
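The "longest shared border" rule for slivers can be sketched as a simple selection step. This is a minimal illustration, not the library's code: pick_neighbor is a hypothetical helper, and the shared border lengths are made-up numbers that would in practice come from intersecting the sliver's boundary with each neighbor.

```python
def pick_neighbor(shared_border_lengths):
    """Return the neighbor id with the longest shared border, or None without neighbors."""
    if not shared_border_lengths:
        return None
    return max(shared_border_lengths, key=shared_border_lengths.get)


# Hypothetical sliver touching three neighbors; values are shared border lengths.
sliver_borders = {"A": 12.5, "B": 40.0, "C": 3.2}
target = pick_neighbor(sliver_borders)
print(target)
```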

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame to be cleaned.

  • tolerance (int | float) – distance (usually meters) used as the thickness threshold below which polygons are eliminated. Any gap, hole, sliver or double surface that is empty after a negative buffer of tolerance / 2 is eliminated into the neighbor with the longest shared border.

  • duplicate_action (str) – Either “fix”, “error” or “ignore”. If “fix” (default), double surfaces thicker than the tolerance will be updated from top to bottom (function update_geometries) and then dissolved into the neighbor polygon with the longest shared border. If “error”, an Exception is raised if there are any double surfaces thicker than the tolerance. If “ignore”, double surfaces are kept as is.

  • grid_sizes (tuple[None | int]) – One or more grid sizes used in overlay and dissolve operations that might raise a GEOSException. Defaults to (None,), meaning no grid size is set.

  • n_jobs (int) – Number of threads.

Return type:

GeoDataFrame

Returns:

A GeoDataFrame with cleaned polygons.
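The tolerance rule of coverage_clean (a shape is eliminated if it is empty after a negative buffer of tolerance / 2) can be made concrete for the simplest case. The sketch below assumes an axis-aligned rectangle, where shrinking by a distance d removes 2 * d from each side length. The function is_eliminated is a hypothetical helper for illustration only; the library itself works on arbitrary polygons.

```python
def is_eliminated(width, height, tolerance):
    """Return True if a width x height rectangle vanishes after a negative buffer of tolerance / 2."""
    distance = tolerance / 2
    # A negative buffer moves every edge inwards by `distance`,
    # so each side length shrinks by twice that amount.
    shrunk_width = width - 2 * distance
    shrunk_height = height - 2 * distance
    return shrunk_width <= 0 or shrunk_height <= 0


# A 5 cm wide, 100 m long sliver with a 10 cm tolerance is eliminated.
print(is_eliminated(0.05, 100, tolerance=0.1))

# A 5 m wide polygon survives the same tolerance.
print(is_eliminated(5, 100, tolerance=0.1))
```

In other words, the tolerance acts as a thickness threshold: length does not matter, only the narrowest dimension.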

explore_geosexception(e, *gdfs, logger=None)

Extract the coordinates from a GEOSException message and show them on a map.

Parameters:
  • e (GEOSException) – The exception thrown by a GEOS operation, which potentially contains coordinate information.

  • *gdfs (GeoDataFrame) – One or more GeoDataFrames to display for context in the map.

  • logger (Any | None) – An optional logger to log the error with visualization. If None, uses standard output.

Return type:

None
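How coordinates might be pulled out of an exception message can be sketched with a regular expression. This is an assumption-laden illustration, not the function's actual implementation: it assumes the message ends with a pattern such as "at X Y" or "point X Y", which is how GEOS topology errors are commonly worded, and extract_points is a hypothetical helper name.

```python
import re

# A plain or scientific-notation number, e.g. 271257.98 or -1.5e-3.
_NUMBER = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
# "at X Y" or "point X Y" anywhere in the message.
_POINT = re.compile(rf"\b(?:at|point)\s+({_NUMBER})\s+({_NUMBER})")


def extract_points(message):
    """Return all (x, y) pairs found in a GEOS-style error message."""
    return [(float(x), float(y)) for x, y in _POINT.findall(message)]


message = "TopologyException: side location conflict at 271257.98 6651203.6"
points = extract_points(message)
print(points)
```

A message without coordinates yields an empty list, so callers can skip the map display in that case. The same point may appear more than once if the message repeats it.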

remove_spikes(gdf, tolerance, n_jobs=1)

Remove thin spikes from polygons.

Parameters:
  • gdf (GeoDataFrame) – A GeoDataFrame.

  • tolerance (int | float) – Spike tolerance.

  • n_jobs (int) – Number of threads.

Return type:

GeoDataFrame

Returns:

A GeoDataFrame.