NetworkAnalysis

Class for doing network analysis.

class NetworkAnalysis(network, rules, log=True, detailed_log=False)

Bases: object

Class for doing network analysis.

Parameters:

network (GeoDataFrame) – A GeoDataFrame of line geometries.
rules (NetworkAnalysisRules | dict) – The rules for the analysis, either as an instance of NetworkAnalysisRules or a dictionary with the parameters as keys.
log (bool) – If True (default), a DataFrame with information about each analysis run will be stored in the ‘log’ attribute.
detailed_log (bool) – If True, the log DataFrame will include columns for all arguments passed to the analysis method, plus standard deviation and percentiles (25th, 50th, 75th) of the weight column in the results. Defaults to False.

The class implements methods for doing network analysis based on GeoDataFrames of origin and destination points.

The ‘od_cost_matrix’ method is the fastest, and returns a DataFrame with only indices and travel costs between each origin-destination pair.

The ‘get_route’ method does the same, but also returns the line geometry of the routes. ‘get_k_routes’ can be used to find multiple routes between each OD pair.

The service_area methods only take a set of origins, and return the lines that can be reached within one or more breaks.

The ‘get_route_frequencies’ method is a bit different. It returns the individual line segments that were visited with an added column for how many times the segments were used.

network: A Network instance that holds the lines and nodes (points).

rules: NetworkAnalysisRules instance.

log: A DataFrame with information about each analysis run.

Examples:

Read example data.

>>> import sgis as sg
>>> roads = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/roads_oslo_2022.parquet")

Preparing the lines for directed network analysis.

>>> connected_roads = sg.get_connected_components(roads).query("connected == 1")

>>> directed_roads = sg.make_directed_network(
...     connected_roads,
...     direction_col="oneway",
...     direction_vals_bft=("B", "FT", "TF"),
...     minute_cols=("drivetime_fw", "drivetime_bw"),
...     dropnegative=True,
...     dropna=True,
... )

>>> rules = sg.NetworkAnalysisRules(weight="minutes", directed=True)
>>> nwa = sg.NetworkAnalysis(network=directed_roads, rules=rules, detailed_log=False)
>>> nwa
NetworkAnalysis(
    network=Network(6364 km, percent_bidirectional=87),
    rules=NetworkAnalysisRules(weight=minutes, directed=True, search_tolerance=250, search_factor=0, split_lines=False, ...),
    log=True, detailed_log=False,
)

Now we’re ready for network analysis.

copy(deep=True)

Returns a (deep) copy of the class instance.

Parameters:

deep (bool) – Whether to return a deep or shallow copy. Defaults to True.

Return type:

NetworkAnalysis

get_k_routes(origins, destinations, *, k, drop_middle_percent, rowwise=False, destination_count=None, cutoff=None)

Returns the geometry of 1 or more routes between origins and destinations.

Finds the route with the lowest cost (minutes, meters, etc.) from a set of origins to a set of destinations. Then the middle part of the route is removed from the graph and a new low-cost path is found. This is repeated k times. If k=1, the result is identical to that of the get_route method.

Parameters:

origins (GeoDataFrame) – GeoDataFrame of points from where the routes will originate.
destinations (GeoDataFrame) – GeoDataFrame of points from where the routes will terminate.
k (int) – the number of low-cost routes to find.
drop_middle_percent (int) – the percentage of the middle part of each route that should be removed from the graph before the next route is calculated. If set to 1, only the median edge will be removed. If set to 100, all but the first and last edge will be removed. The graph is copied for each OD pair.
rowwise (bool) – if False (default), it will calculate the cost from each origin to each destination. If True, it will calculate the cost from origin 1 to destination 1, origin 2 to destination 2 and so on.
destination_count (int | None) – number of closest destinations to keep for each origin. If None (default), all trips will be included. The number of destinations might be higher than the destination_count if trips have equal cost.
cutoff (int | float | None) – the maximum cost (weight) for the trips. Defaults to None, meaning all rows will be included. NaNs will also be removed if cutoff is specified.

Return type:

GeoDataFrame

Returns:

A DataFrame with the geometry of the k routes between origin and destination. Also returns the column ‘k’, a weight column and the columns ‘origin’ and ‘destination’, containing the indices of the origins and destinations GeoDataFrames.

Note

The percentage of the route that is dropped from the graph determines how many of the k routes will be found. If 100 percent of the route is dropped, it is very hard to find more than one path for each OD pair. If ‘drop_middle_percent’ is 1, the resulting routes might be very similar, depending on the layout of the network.

Raises:

ValueError – if drop_middle_percent is not between 0 and 100.
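
To make the effect of ‘drop_middle_percent’ concrete, here is a minimal sketch of how a middle share of a route's edges could be picked for removal. It is an illustration only: the helper name `middle_edges`, the rounding and the rule that the end edges are kept are assumptions, not the sgis implementation.

```python
def middle_edges(edges, drop_middle_percent):
    """Pick the middle share of a route's edges (hypothetical helper, not sgis code)."""
    if not 0 <= drop_middle_percent <= 100:
        raise ValueError("drop_middle_percent should be between 0 and 100.")
    # Assumption: the first and last edge are kept when the route has more than two edges.
    inner = edges[1:-1] if len(edges) > 2 else list(edges)
    # Assumption: at least one edge is dropped, so the next search cannot repeat the route.
    n_drop = max(1, int(len(inner) * drop_middle_percent / 100))
    start = (len(inner) - n_drop) // 2
    return inner[start:start + n_drop]


# A made-up route of ten edges, identified by name.
route = ["e0", "e1", "e2", "e3", "e4", "e5", "e6", "e7", "e8", "e9"]
one_percent = middle_edges(route, 1)          # a single edge near the middle
fifty_percent = middle_edges(route, 50)       # the central half of the inner edges
hundred_percent = middle_edges(route, 100)    # everything but the first and last edge
```

When only one edge is dropped, the next search can rejoin the previous route almost immediately, which is why low percentages tend to give similar routes. When everything but the end edges is dropped, an alternative path may not exist at all.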

Examples:

Create the NetworkAnalysis instance.

>>> import sgis as sg
>>> roads = sg.read_parquet_url('https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/roads_oslo_2022.parquet')
>>> directed_roads = sg.get_connected_components(roads).loc[lambda x: x["connected"] == 1].pipe(sg.make_directed_network_norway, dropnegative=True)
>>> rules = sg.NetworkAnalysisRules(weight="minutes", directed=True)
>>> nwa = sg.NetworkAnalysis(network=directed_roads, rules=rules, detailed_log=False)

Getting 10 fastest routes from one point to another point.

>>> points = sg.read_parquet_url('https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/points_oslo.parquet')
>>> point1 = points.iloc[[0]]
>>> point2 = points.iloc[[1]]

>>> k_routes = nwa.get_k_routes(
...             point1,
...             point2,
...             k=10,
...             drop_middle_percent=1
... )
>>> k_routes
   origin  destination    minutes   k                                           geometry
0       0            1  13.039830   1  MULTILINESTRING Z ((272281.367 6653079.745 160...
1       0            1  14.084324   2  MULTILINESTRING Z ((272281.367 6653079.745 160...
2       0            1  14.238108   3  MULTILINESTRING Z ((272281.367 6653079.745 160...
3       0            1  14.897682   4  MULTILINESTRING Z ((271257.900 6654378.100 193...
4       0            1  14.962593   5  MULTILINESTRING Z ((271257.900 6654378.100 193...
5       0            1  15.423934   6  MULTILINESTRING Z ((272281.367 6653079.745 160...
6       0            1  16.217271   7  MULTILINESTRING Z ((272281.367 6653079.745 160...
7       0            1  16.483982   8  MULTILINESTRING Z ((272281.367 6653079.745 160...
8       0            1  16.513253   9  MULTILINESTRING Z ((272281.367 6653079.745 160...
9       0            1  16.551196  10  MULTILINESTRING Z ((272281.367 6653079.745 160...

We got all 10 routes because only the middle 1 percent of the routes are removed in each iteration. Let’s compare with dropping middle 50 and middle 100 percent.

>>> k_routes = nwa.get_k_routes(
...             point1,
...             point2,
...             k=10,
...             drop_middle_percent=50
...         )
>>> k_routes
   origin  destination    minutes  k                                           geometry
0       0            1  13.039830  1  MULTILINESTRING Z ((272281.367 6653079.745 160...
1       0            1  14.238108  2  MULTILINESTRING Z ((272281.367 6653079.745 160...
2       0            1  20.139294  3  MULTILINESTRING Z ((272281.367 6653079.745 160...
3       0            1  23.506778  4  MULTILINESTRING Z ((265226.515 6650674.617 88....

>>> k_routes = nwa.get_k_routes(
...             point1,
...             point2,
...             k=10,
...             drop_middle_percent=100
...         )
>>> k_routes
   origin  destination   minutes  k                                           geometry
0       0            1  13.03983  1  MULTILINESTRING Z ((272281.367 6653079.745 160...

get_route(origins, destinations, *, rowwise=False, destination_count=None, cutoff=None, n_jobs=None)

Returns the geometry of the low-cost route between origins and destinations.

Finds the route with the lowest cost (minutes, meters, etc.) from a set of origins to a set of destinations. If the weight is meters, the shortest route will be found. If the weight is minutes, the fastest route will be found.

Parameters:

origins (GeoDataFrame) – GeoDataFrame of points from where the routes will originate.
destinations (GeoDataFrame) – GeoDataFrame of points from where the routes will terminate.
rowwise (bool) – if False (default), it will calculate the cost from each origin to each destination. If True, it will calculate the cost from origin 1 to destination 1, origin 2 to destination 2 and so on.
destination_count (int | None) – number of closest destinations to keep for each origin. If None (default), all trips will be included. The number of destinations might be higher than the destination_count if trips have equal cost.
cutoff (int | float | None) – the maximum cost (weight) for the trips. Defaults to None, meaning all rows will be included. NaNs will also be removed if cutoff is specified.
n_jobs (int | None) – Number of parallel jobs.

Return type:

GeoDataFrame

Returns:

A DataFrame with the geometry of the routes between origin and destination. Also returns a weight column and the columns ‘origin’ and ‘destination’, containing the indices of the origins and destinations GeoDataFrames.
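
The role of the weight can be illustrated with a small, self-contained sketch. It runs a plain Dijkstra search over a made-up edge list that carries both a ‘meters’ and a ‘minutes’ value, and shows that the chosen weight decides which route counts as the low-cost one. The function `lowest_cost_path` and the toy network are hypothetical; this is not how sgis builds or searches its graph.

```python
import heapq


def lowest_cost_path(edges, source, target, weight):
    """Dijkstra on a directed edge list; `weight` names the cost key to minimise."""
    graph = {}
    for edge in edges:
        graph.setdefault(edge["from"], []).append((edge["to"], edge[weight]))

    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, edge_cost in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + edge_cost, neighbour, path + [neighbour]))
    return None  # the target cannot be reached from the source


# Made-up toy network: a short but slow street (A-B-D) and a longer but faster road (A-C-D).
edges = [
    {"from": "A", "to": "B", "meters": 400, "minutes": 3.0},
    {"from": "B", "to": "D", "meters": 400, "minutes": 3.0},
    {"from": "A", "to": "C", "meters": 700, "minutes": 1.0},
    {"from": "C", "to": "D", "meters": 700, "minutes": 1.0},
]

shortest = lowest_cost_path(edges, "A", "D", weight="meters")   # goes via B
fastest = lowest_cost_path(edges, "A", "D", weight="minutes")   # goes via C
```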

Examples:¶

Create the NetworkAnalysis instance.

>>> import sgis as sg
>>> roads = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/roads_oslo_2022.parquet")
>>> directed_roads = sg.get_connected_components(roads).loc[lambda x: x["connected"] == 1].pipe(sg.make_directed_network_norway, dropnegative=True)
>>> rules = sg.NetworkAnalysisRules(weight="minutes", directed=True)
>>> nwa = sg.NetworkAnalysis(network=directed_roads, rules=rules, detailed_log=False)

Get routes from 1 to 1000 points.

>>> points = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/points_oslo.parquet")

>>> routes = nwa.get_route(points.iloc[[0]], points)
>>> routes
    origin  destination    minutes                                           geometry
0         1            2  12.930588  MULTILINESTRING Z ((272281.367 6653079.745 160...
1         1            3  10.867076  MULTILINESTRING Z ((270054.367 6653367.774 144...
2         1            4   8.075722  MULTILINESTRING Z ((259735.774 6650362.886 24....
3         1            5  14.659333  MULTILINESTRING Z ((272281.367 6653079.745 160...
4         1            6  14.406460  MULTILINESTRING Z ((257034.948 6652685.595 156...
..      ...          ...        ...                                                ...
992       1          996  10.858519  MULTILINESTRING Z ((266881.100 6647824.860 132...
993       1          997   7.461032  MULTILINESTRING Z ((262623.190 6652506.640 79....
994       1          998  10.698588  MULTILINESTRING Z ((263489.330 6645655.330 11....
995       1          999  10.109855  MULTILINESTRING Z ((269217.997 6650654.895 166...
996       1         1000  14.657289  MULTILINESTRING Z ((264475.675 6644245.782 114...

[997 rows x 4 columns]

get_route_frequencies(origins, destinations, weight_df=None, default_weight=None, rowwise=False, strict=False, frequency_col='frequency', n_jobs=None)

Finds the number of times each line segment was visited in all trips.

Finds the route with the lowest cost (minutes, meters, etc.) from a set of origins to a set of destinations and summarises the number of times each segment was used. The aggregation is done on the line indices, which is much faster than getting the geometries and then dissolving.

The trip frequencies can be weighted (multiplied) based on ‘weight_df’. See example below.

Parameters:

origins (GeoDataFrame) – GeoDataFrame of points from where the routes will originate.
destinations (GeoDataFrame) – GeoDataFrame of points from where the routes will terminate.
weight_df (DataFrame | None) – A long-format DataFrame where each row contains the indices of an origin-destination pair and the number to multiply the frequency for this route by. The DataFrame can either contain three columns (origin index, destination index and weight, in that order) or only a weight column and a MultiIndex where level 0 is the origin index and level 1 is the destination index.
default_weight (int | float | None) – If set, OD pairs not represented in ‘weight_df’ will be given a default weight value.
rowwise (bool) – if False (default), it will calculate the cost from each origin to each destination. If True, it will calculate the cost from origin 1 to destination 1, origin 2 to destination 2 and so on.
strict (bool) – If True, all OD pairs must be in weight_df if specified. Defaults to False.
frequency_col (str) – Name of column with the number of times each road was visited. Defaults to ‘frequency’.
n_jobs (int | None) – Number of parallel jobs.

Return type:

GeoDataFrame

Returns:

A GeoDataFrame with all line segments that were visited at least once, with a column with the number of times the line segment was used in the individual routes.

Note

The resulting lines will keep all columns of the ‘gdf’ of the Network.

Raises:

ValueError – If weight_df is not a DataFrame with one or three columns that contain weights and all indices of ‘origins’ and ‘destinations’.
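
The weighting can be pictured as adding each OD pair's weight, rather than 1, to every line segment on that pair's route. The sketch below shows this with plain dictionaries. The function `route_frequencies` and its inputs are hypothetical, and the handling of pairs that have neither a weight nor a default is an assumption made for the sketch, not documented sgis behaviour.

```python
from collections import Counter


def route_frequencies(routes, weights=None, default_weight=None, strict=False):
    """Count edge usage over routes, optionally weighted per OD pair (conceptual sketch).

    routes: dict mapping (origin, destination) -> list of edge ids on the route.
    weights: dict mapping (origin, destination) -> multiplier, or None for unweighted.
    """
    frequencies = Counter()
    for od_pair, edge_ids in routes.items():
        if weights is None:
            weight = 1
        elif od_pair in weights:
            weight = weights[od_pair]
        elif strict:
            raise ValueError(f"OD pair {od_pair} is missing from the weights.")
        elif default_weight is not None:
            weight = default_weight
        else:
            # Assumption made for this sketch only: pairs without a weight are skipped.
            continue
        for edge_id in edge_ids:
            frequencies[edge_id] += weight
    return frequencies


# Made-up routes: three OD pairs and the ids of the line segments they traverse.
routes = {(0, 25): [101, 102, 103], (0, 26): [101, 104], (1, 25): [102, 103]}
unweighted = route_frequencies(routes)

# Two pairs get a weight of 10; the third falls back to the default weight of 1.
weights = {(0, 25): 10, (0, 26): 10}
weighted = route_frequencies(routes, weights, default_weight=1)
```

Counting on segment ids in this way needs no geometry operations, which is in line with the note above that aggregating on line indices is faster than dissolving geometries.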

Examples:

Create the NetworkAnalysis instance.

>>> import sgis as sg
>>> import pandas as pd
>>> roads = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/roads_oslo_2022.parquet")
>>> directed_roads = sg.get_connected_components(roads).loc[lambda x: x["connected"] == 1].pipe(sg.make_directed_network_norway, dropnegative=True)
>>> rules = sg.NetworkAnalysisRules(weight="minutes", directed=True)
>>> nwa = sg.NetworkAnalysis(network=directed_roads, rules=rules, detailed_log=False)

Get some points.

>>> points = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/points_oslo.parquet")
>>> origins = points.iloc[:25]
>>> destinations = points.iloc[25:50]

Get number of times each road was visited for trips from 25 to 25 points.

>>> frequencies = nwa.get_route_frequencies(origins, destinations)
>>> frequencies[["source", "target", "frequency", "geometry"]]
       source target  frequency                                           geometry
160188  77264  79112        1.0  LINESTRING Z (268641.225 6651871.624 111.355, ...
153682  68376   4136        1.0  LINESTRING Z (268542.700 6652162.400 121.266, ...
153679  75263  75502        1.0  LINESTRING Z (268665.600 6652165.400 117.466, ...
153678  75262  75263        1.0  LINESTRING Z (268660.000 6652167.100 117.466, ...
153677  47999  75262        1.0  LINESTRING Z (268631.500 6652176.800 118.166, ...
...       ...    ...        ...                                                ...
151465  73801  73802      103.0  LINESTRING Z (265368.600 6647142.900 131.660, ...
151464  73800  73801      103.0  LINESTRING Z (265362.800 6647137.100 131.660, ...
151466  73802  73632      103.0  LINESTRING Z (265371.400 6647147.900 131.660, ...
151463  73799  73800      123.0  LINESTRING Z (265359.600 6647135.400 131.660, ...
152170  74418  74246      130.0  LINESTRING Z (264579.835 6651954.573 113.209, ...

[8556 rows x 4 columns]

The frequencies can be weighted for each origin-destination pair by specifying ‘weight_df’. This can be a DataFrame with three columns, where the first two contain the indices of the origin and destination (in that order), and the third the number to multiply the frequency by. ‘weight_df’ can also be a DataFrame with a 2-leveled MultiIndex, where level 0 is the origin index and level 1 is the destination.

Construct a DataFrame with all OD-pair combinations and give all rows a weight of 10.

>>> od_pairs = pd.MultiIndex.from_product(
...     [origins.index, destinations.index], names=["origin", "destination"]
... )
>>> weight_df = pd.DataFrame(index=od_pairs).reset_index()
>>> weight_df["weight"] = 10
>>> weight_df
     origin  destination  weight
0         0           25      10
1         0           26      10
2         0           27      10
3         0           28      10
4         0           29      10
..      ...          ...     ...
620      24           45      10
621      24           46      10
622      24           47      10
623      24           48      10
624      24           49      10

[625 rows x 3 columns]

All frequencies will now be multiplied by 10.

>>> frequencies = nwa.get_route_frequencies(origins, destinations, weight_df=weight_df)
>>> frequencies[["source", "target", "frequency", "geometry"]]
       source target  frequency                                           geometry
160188  77264  79112       10.0  LINESTRING Z (268641.225 6651871.624 111.355, ...
153682  68376   4136       10.0  LINESTRING Z (268542.700 6652162.400 121.266, ...
153679  75263  75502       10.0  LINESTRING Z (268665.600 6652165.400 117.466, ...
153678  75262  75263       10.0  LINESTRING Z (268660.000 6652167.100 117.466, ...
153677  47999  75262       10.0  LINESTRING Z (268631.500 6652176.800 118.166, ...
...       ...    ...        ...                                                ...
151465  73801  73802     1030.0  LINESTRING Z (265368.600 6647142.900 131.660, ...
151464  73800  73801     1030.0  LINESTRING Z (265362.800 6647137.100 131.660, ...
151466  73802  73632     1030.0  LINESTRING Z (265371.400 6647147.900 131.660, ...
151463  73799  73800     1230.0  LINESTRING Z (265359.600 6647135.400 131.660, ...
152170  74418  74246     1300.0  LINESTRING Z (264579.835 6651954.573 113.209, ...

[8556 rows x 4 columns]

‘weight_df’ can also be a DataFrame with one column (the weight) and a MultiIndex.

>>> weight_df = pd.DataFrame(index=od_pairs)
>>> weight_df["weight"] = 10
>>> weight_df
       weight
0  25      10
   26      10
   27      10
   28      10
   29      10
...       ...
24 45      10
   46      10
   47      10
   48      10
   49      10

[625 rows x 1 columns]

od_cost_matrix(origins, destinations, *, rowwise=False, destination_count=None, cutoff=None, lines=False)

Fast calculation of many-to-many travel costs.

Finds the lowest cost (minutes, meters, etc.) from a set of origins to a set of destinations. The indices of the origins and destinations are used as values in the returned columns ‘origin’ and ‘destination’.

Parameters:

origins (GeoDataFrame) – GeoDataFrame of points from where the trips will originate.
destinations (GeoDataFrame) – GeoDataFrame of points from where the trips will terminate.
rowwise (bool) – if False (default), it will calculate the cost from each origin to each destination. If True, it will calculate the cost from origin 1 to destination 1, origin 2 to destination 2 and so on.
destination_count (int | None) – number of closest destinations to keep for each origin. If None (default), all trips will be included. The number of destinations might be higher than the destination_count if trips have equal cost.
cutoff (int | float | None) – the maximum cost (weight) for the trips. Defaults to None, meaning all rows will be included. NaNs will also be removed if cutoff is specified.
lines (bool) – if True, returns a geometry column with straight lines between origin and destination. Defaults to False.

Return type:

DataFrame | GeoDataFrame

Returns:

A DataFrame with the weight column and the columns ‘origin’ and ‘destination’, containing the indices of the origins and destinations GeoDataFrames. If lines is True, also returns a geometry column with straight lines between origin and destination.
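
The semantics of ‘rowwise’, ‘destination_count’ and ‘cutoff’ can be sketched with plain tuples. The two helpers below, `od_pairs` and `filter_trips`, are hypothetical and only mirror the parameter descriptions above. How the library itself treats ties and missing routes internally is an assumption here.

```python
from itertools import product


def od_pairs(origin_ids, destination_ids, rowwise=False):
    """Return the OD pairs to evaluate (sketch of the 'rowwise' semantics)."""
    if rowwise:
        if len(origin_ids) != len(destination_ids):
            raise ValueError("rowwise pairing needs equally many origins and destinations.")
        return list(zip(origin_ids, destination_ids))
    return list(product(origin_ids, destination_ids))


def filter_trips(trips, destination_count=None, cutoff=None):
    """Filter (origin, destination, cost) tuples; cost is None when no route was found.

    destination_count is assumed to be at least 1.
    """
    if cutoff is not None:
        # Trips without a route are dropped together with trips above the cutoff.
        trips = [t for t in trips if t[2] is not None and t[2] <= cutoff]
    if destination_count is not None:
        kept = []
        for origin in dict.fromkeys(t[0] for t in trips):
            costs = sorted(t[2] for t in trips if t[0] == origin and t[2] is not None)
            if not costs:
                continue
            # Cost of the n-th closest destination; ties at this cost are all kept.
            threshold = costs[min(destination_count, len(costs)) - 1]
            kept += [t for t in trips
                     if t[0] == origin and t[2] is not None and t[2] <= threshold]
        trips = kept
    return trips


many_to_many = od_pairs([0, 1], [100, 101])
one_to_one = od_pairs([0, 1], [100, 101], rowwise=True)

# Made-up costs in minutes; origin 1 has no route to destination 100.
trips = [(0, 100, 8.0), (0, 101, 6.0), (0, 102, 6.0), (0, 103, 13.0),
         (1, 100, None), (1, 101, 4.0)]
closest = filter_trips(trips, destination_count=1)
within_ten = filter_trips(trips, cutoff=10)
```

Origin 0 keeps two destinations in `closest` even though ‘destination_count’ is 1, because two trips share the lowest cost.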

Examples:

Create the NetworkAnalysis instance.

>>> import sgis as sg
>>> roads = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/roads_oslo_2022.parquet")
>>> directed_roads = sg.get_connected_components(roads).loc[lambda x: x["connected"] == 1].pipe(sg.make_directed_network_norway, dropnegative=True)
>>> rules = sg.NetworkAnalysisRules(weight="minutes", directed=True)
>>> nwa = sg.NetworkAnalysis(network=directed_roads, rules=rules, detailed_log=False)

Create some origin and destination points.

>>> points = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/points_oslo.parquet")
>>> origins = points.loc[:99, ["geometry"]]
>>> origins
                          geometry
0   POINT (263122.700 6651184.900)
1   POINT (272456.100 6653369.500)
2   POINT (270082.300 6653032.700)
3   POINT (259804.800 6650339.700)
4   POINT (272876.200 6652889.100)
..                             ...
95  POINT (270348.000 6651899.400)
96  POINT (264845.600 6649005.800)
97  POINT (263162.000 6650732.200)
98  POINT (272322.700 6653729.100)
99  POINT (265622.800 6644644.200)

[100 rows x 1 columns]

>>> destinations = points.loc[100:199, ["geometry"]]
>>> destinations
                           geometry
100  POINT (265997.900 6647899.400)
101  POINT (263835.200 6648677.700)
102  POINT (265764.000 6644063.900)
103  POINT (265970.700 6651258.500)
104  POINT (264624.300 6649937.700)
..                              ...
195  POINT (258175.600 6653694.300)
196  POINT (258772.200 6652487.600)
197  POINT (273135.300 6653198.100)
198  POINT (270582.300 6652163.800)
199  POINT (264980.800 6647231.300)

[100 rows x 1 columns]

Travel time from 100 to 100 points.

>>> od = nwa.od_cost_matrix(origins, destinations)
>>> od
      origin  destination    minutes
0          0          100   8.765621
1          0          101   6.383407
2          0          102  13.482324
3          0          103   6.410121
4          0          104   5.882124
...      ...          ...        ...
9995      99          195  20.488644
9996      99          196  16.721241
9997      99          197  19.977029
9998      99          198  15.233163
9999      99          199   6.439002

[10000 rows x 3 columns]

Assign aggregated values onto the origins (or destinations).

>>> origins["minutes_min"] = od.groupby("origin")["minutes"].min()
>>> origins["minutes_mean"] = od.groupby("origin")["minutes"].mean()
>>> origins["n_missing"] = len(destinations) - od.groupby("origin")["minutes"].count()
>>> origins
                          geometry  minutes_min  minutes_mean  n_missing
0   POINT (263122.700 6651184.900)     0.966702     11.628637          0
1   POINT (272456.100 6653369.500)     2.754545     16.084722          0
2   POINT (270082.300 6653032.700)     1.768334     15.304246          0
3   POINT (259804.800 6650339.700)     2.776873     14.044023          0
4   POINT (272876.200 6652889.100)     0.541074     17.565747          0
..                             ...          ...           ...        ...
95  POINT (270348.000 6651899.400)     1.529400     15.427027          0
96  POINT (264845.600 6649005.800)     1.336207     11.239592          0
97  POINT (263162.000 6650732.200)     1.010721     11.904372          0
98  POINT (272322.700 6653729.100)     3.175472     17.579399          0
99  POINT (265622.800 6644644.200)     1.116209     12.185800          0

[100 rows x 4 columns]

Join the results onto the ‘origins’ via the index.

>>> joined = origins.join(od.set_index("origin"))
>>> joined
                          geometry  destination    minutes
0   POINT (263122.700 6651184.900)          100   8.765621
0   POINT (263122.700 6651184.900)          101   6.383407
0   POINT (263122.700 6651184.900)          102  13.482324
0   POINT (263122.700 6651184.900)          103   6.410121
0   POINT (263122.700 6651184.900)          104   5.882124
..                             ...          ...        ...
99  POINT (265622.800 6644644.200)          195  20.488644
99  POINT (265622.800 6644644.200)          196  16.721241
99  POINT (265622.800 6644644.200)          197  19.977029
99  POINT (265622.800 6644644.200)          198  15.233163
99  POINT (265622.800 6644644.200)          199   6.439002

[10000 rows x 3 columns]

Keep only travel times of 10 minutes or less. This is the same as using the cutoff parameter.

>>> ten_min_or_less = od.loc[od.minutes <= 10]
>>> joined = origins.join(ten_min_or_less.set_index("origin"))
>>> joined
                          geometry  destination   minutes
0   POINT (263122.700 6651184.900)        100.0  8.765621
0   POINT (263122.700 6651184.900)        101.0  6.383407
0   POINT (263122.700 6651184.900)        103.0  6.410121
0   POINT (263122.700 6651184.900)        104.0  5.882124
0   POINT (263122.700 6651184.900)        106.0  9.811828
..                             ...          ...       ...
99  POINT (265622.800 6644644.200)        173.0  4.305523
99  POINT (265622.800 6644644.200)        174.0  6.094040
99  POINT (265622.800 6644644.200)        177.0  5.944194
99  POINT (265622.800 6644644.200)        183.0  8.449906
99  POINT (265622.800 6644644.200)        199.0  6.439002

[2195 rows x 3 columns]

Keep the three fastest times from each origin. This is the same as using the destination_count parameter.

>>> three_fastest = od.loc[od.groupby("origin")["minutes"].rank() <= 3]
>>> joined = origins.join(three_fastest.set_index("origin"))
>>> joined
                          geometry  destination   minutes
0   POINT (263122.700 6651184.900)        135.0  0.966702
0   POINT (263122.700 6651184.900)        175.0  2.202638
0   POINT (263122.700 6651184.900)        188.0  2.931595
1   POINT (272456.100 6653369.500)        171.0  2.918100
1   POINT (272456.100 6653369.500)        184.0  2.754545
..                             ...          ...       ...
98  POINT (272322.700 6653729.100)        184.0  3.175472
98  POINT (272322.700 6653729.100)        189.0  3.179428
99  POINT (265622.800 6644644.200)        102.0  1.648705
99  POINT (265622.800 6644644.200)        134.0  1.116209
99  POINT (265622.800 6644644.200)        156.0  1.368926

[294 rows x 3 columns]

Use set_index to use a column as identifier instead of the index.

>>> import numpy as np
>>> origins["areacode"] = np.random.choice(["0301", "3401"], len(origins))
>>> od = nwa.od_cost_matrix(
...    origins.set_index("areacode"),
...    destinations
... )
>>> od
     origin  destination    minutes
0      0301          100   8.765621
1      0301          101   6.383407
2      0301          102  13.482324
3      0301          103   6.410121
4      0301          104   5.882124
...     ...          ...        ...
9995   3401          195  20.488644
9996   3401          196  16.721241
9997   3401          197  19.977029
9998   3401          198  15.233163
9999   3401          199   6.439002

[10000 rows x 3 columns]

Travel time from 1000 to 1000 points rowwise.

>>> points_reversed = points.iloc[::-1]
>>> od = nwa.od_cost_matrix(points, points_reversed, rowwise=True)
>>> od
     origin  destination    minutes
0         0          999  14.692667
1         1          998   8.452691
2         2          997  16.370569
3         3          996   9.486131
4         4          995  16.521346
..      ...          ...        ...
995     995            4  16.794610
996     996            3   9.611700
997     997            2  19.968743
998     998            1   9.484374
999     999            0  14.892648

[1000 rows x 3 columns]

precice_service_area(origins, breaks, *, dissolve=True)

Precise, but slow, version of the service_area method.

It finds all the network lines that can be reached within each break. Lines that are partly within the break will be split at the point where the weight value is exactly correct. Note that this takes more time than the regular ‘service_area’ method.

Parameters:

origins (GeoDataFrame) – GeoDataFrame of points from where the service areas will originate
breaks (int | float | tuple[int | float]) – one or more integers or floats which will be the maximum weight for the service areas. Calculates multiple areas for each origin if multiple breaks are given.
dissolve (bool) – If True (default), each service area will be dissolved into one long multilinestring. If False, the individual line segments will be returned.

Return type:

GeoDataFrame

Returns:

A GeoDataFrame with one row per break per origin, with a dissolved line geometry. If dissolve is False, it will return all the columns of the network.gdf as well.

Examples:

Create the NetworkAnalysis instance.

>>> import sgis as sg
>>> roads = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/roads_oslo_2022.parquet")
>>> directed_roads = sg.get_connected_components(roads).loc[lambda x: x["connected"] == 1].pipe(sg.make_directed_network_norway, dropnegative=True)
>>> rules = sg.NetworkAnalysisRules(weight="minutes", directed=True)
>>> nwa = sg.NetworkAnalysis(network=directed_roads, rules=rules, detailed_log=False)

10 minute service area for one origin point.

>>> points = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/points_oslo.parquet")

>>> sa = nwa.precice_service_area(
...         points.iloc[[0]],
...         breaks=10,
...     )
>>> sa
    idx  minutes                                           geometry
0    1       10  MULTILINESTRING Z ((264348.673 6648271.134 17....

Service areas of 5, 10 and 15 minutes from two origin points.

>>> sa = nwa.precice_service_area(
...         points.iloc[:2],
...         breaks=[5, 10, 15],
...     )
>>> sa
    idx  minutes                                           geometry
0    1        5  MULTILINESTRING Z ((265378.000 6650581.600 85....
1    1       10  MULTILINESTRING Z ((264348.673 6648271.134 17....
2    1       15  MULTILINESTRING Z ((263110.060 6658296.870 154...
3    2        5  MULTILINESTRING Z ((273330.930 6653248.870 208...
4    2       10  MULTILINESTRING Z ((266909.769 6651075.250 114...
5    2       15  MULTILINESTRING Z ((264348.673 6648271.134 17....

service_area(origins, breaks, *, dissolve=True)

Returns the lines that can be reached within breaks (weight values).

It finds all the network lines that can be reached within each break. Lines that are only partly within the break will not be included. The index of the origins is used as values in the ‘origin’ column.

Parameters:

origins (GeoDataFrame) – GeoDataFrame of points from where the service areas will originate
breaks (int | float | tuple[int | float]) – one or more integers or floats which will be the maximum weight for the service areas. Calculates multiple areas for each origin if multiple breaks are given.
dissolve (bool) – If True (default), each service area will be dissolved into one long multilinestring. If False, the individual line segments will be returned.

Return type:

GeoDataFrame

Returns:

A GeoDataFrame with one row per break per origin, with the origin index and a dissolved line geometry. If dissolve is False, it will return each line that is part of the service area.
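
The difference between this method and the precise variant documented above can be sketched on a toy graph. The code below computes the lowest cost to every node with Dijkstra and then sorts edges into those that fit entirely within the break and those that only fit partly, together with the fraction that fits. The functions and the network are hypothetical, and the test used for "partly within the break" is an assumption rather than the sgis implementation.

```python
import heapq


def node_costs(edges, origin):
    """Dijkstra: lowest cost from `origin` to every reachable node."""
    graph = {}
    for start, end, cost in edges:
        graph.setdefault(start, []).append((end, cost))
    costs = {}
    queue = [(0.0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node in costs:
            continue
        costs[node] = cost
        for neighbour, edge_cost in graph.get(node, []):
            if neighbour not in costs:
                heapq.heappush(queue, (cost + edge_cost, neighbour))
    return costs


def service_area_edges(edges, origin, break_value):
    """Split edges into fully reachable ones and partly reachable ones.

    Returns (whole, partial), where `partial` maps an edge to the fraction of
    its cost that fits within the break.
    """
    costs = node_costs(edges, origin)
    whole, partial = [], {}
    for start, end, cost in edges:
        if start not in costs or costs[start] >= break_value:
            continue  # the edge cannot be entered within the break
        remaining = break_value - costs[start]
        if cost <= remaining:
            whole.append((start, end))
        else:
            partial[(start, end)] = remaining / cost
    return whole, partial


# Made-up directed toy network: (from node, to node, minutes).
edges = [("A", "B", 4.0), ("B", "C", 4.0), ("C", "D", 4.0), ("A", "E", 12.0)]
whole, partial = service_area_edges(edges, "A", break_value=10)
```

In this picture the regular method would return only the `whole` edges, while the precise method would also return each `partial` edge cut at the given fraction, at the price of the extra splitting work.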

Examples:

Create the NetworkAnalysis instance.

>>> import sgis as sg
>>> roads = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/roads_oslo_2022.parquet")
>>> directed_roads = sg.get_connected_components(roads).loc[lambda x: x["connected"] == 1].pipe(sg.make_directed_network_norway, dropnegative=True)
>>> rules = sg.NetworkAnalysisRules(weight="minutes", directed=True)
>>> nwa = sg.NetworkAnalysis(network=directed_roads, rules=rules, detailed_log=False)

10 minute service area for three origin points.

>>> points = sg.read_parquet_url("https://media.githubusercontent.com/media/statisticsnorway/ssb-sgis/main/tests/testdata/points_oslo.parquet")
>>> service_areas = nwa.service_area(
...         points.loc[:2],
...         breaks=10,
... )
>>> service_areas
   origin  minutes                                           geometry
0       0       10  MULTILINESTRING Z ((264348.673 6648271.134 17....
1       1       10  MULTILINESTRING Z ((266909.769 6651075.250 114...
2       2       10  MULTILINESTRING Z ((266909.769 6651075.250 114...

Service areas of 5, 10 and 15 minutes from two origin points.

>>> service_areas = nwa.service_area(
...         points.iloc[:2],
...         breaks=[5, 10, 15],
... )
>>> service_areas
   origin  minutes                                           geometry
0       0        5  MULTILINESTRING Z ((265378.000 6650581.600 85....
1       0       10  MULTILINESTRING Z ((264348.673 6648271.134 17....
2       0       15  MULTILINESTRING Z ((263110.060 6658296.870 154...
3       1        5  MULTILINESTRING Z ((273330.930 6653248.870 208...
4       1       10  MULTILINESTRING Z ((266909.769 6651075.250 114...
5       1       15  MULTILINESTRING Z ((264348.673 6648271.134 17....