CVRPTW example over Homberger dataset

[ ]:
import cudf
from cuopt import routing
from cuopt.routing import utils
from scipy.spatial import distance
import numpy as np
import pandas as pd
import geopandas as gpd
import veroviz as vrv

Download Homberger dataset for 1000 nodes

We are using a Gehring & Homberger instance with 1,000 locations. Each location has a specific demand with delivery time windows. Vehicles have capacities. cuOpt finds efficient routes to fulfill orders from all nodes with minimal number of vehicles, reducing the cost to delivery.

Read more

More information about the dataset can be found here

GH - H. Gehring and J. Homberger, “A Parallel Two-phase Metaheuristic for Routing Problems with Time Windows” Asia-Pacific Journal of Operational Research, 18, 35-47, (2001)

[ ]:
wget https://www.sintef.no/globalassets/project/top/vrptw/homberger/1000/homberger_1000_customer_instances.zip
mkdir homberger_1000_customer_instances
unzip -o homberger_1000_customer_instances.zip -d homberger_1000_customer_instances/

Functions to read and extract data from one of the instances

We have a built-in utility in cuOpt that reads and processes the instance file to return vehicle capacity, number of vehicles and a CUDA DataFrame containing the order information. This utility is specific to a Gehring & Homberger instance file.

[ ]:
# Reads one of the homberger's instance definition to read dataset
# This function is specifically designed only to read homberger's instance definition
def read_data(filename):
    df, vehicle_capacity, n_vehicles = utils.create_from_file(filename)
    return df, vehicle_capacity, n_vehicles

# Extract the data
df, vehicle_capacity, n_vehicles = read_data('homberger_1000_customer_instances/C1_10_1.TXT')

Let us take a look at the extracted data

[ ]:
print("Number of locations          : ", df["demand"].shape[0]-1)
print("Number of vehicles available : ", n_vehicles)
print("Capacity of each vehicle     : ", vehicle_capacity)
print("Initial Orders information")

Let’s Run cuOpt

We have all our data ready, it’s time to run cuOpt! But, before we launch cuOpt and get our optimized routes, let’s dig a little deeper into the workflow step-by-step.

Step 1: Get the cost matrix ready

To be able to find and optimize the routes for the vehicles, we first need to have information about the delivery locations and the distance between each location. This can either be provided as a cost matrix or a set of X and Y coordinates of all the locations.

This function converts the X and Y coordinates from the dataset into a Euclidean distance matrix. Skip this step if you already have a cost matrix ready or want to initialize your data model directly with X and Y coordinates (We’ll talk more on this later)

[ ]:
# Build euclidean distance matrix from the x, y coordinates obtained from the dataset
# This later helps in mapping.
def build_matrix(df):
    coords = list(zip(df['xcord'].to_arrow().to_pylist(), df['ycord'].to_arrow().to_pylist()))
    distances = distance.cdist(coords, coords, 'euclidean')
    return cudf.DataFrame(distances).astype(np.float32)

Step 2: Create the Data Model

First, we need to create and initialize the data model with the number of vehicles in the fleet and number of nodes (delivery locations) to be visited. Next, we need to provide the cost/distance information. This can be provided in two ways using set_matrix or set_coordinates API.

add_cost_matrix takes in a square cost matrix containing the cost of travel which can be distance, time or any other metric, taken pairwise, between all locations.

Read more

Diagonal elements should be 0.cuOpt supports unsymmetric matrices.

We can additionally set the vehicle priorities if desired using set_vehicle_priorities where 0 is the highest priority and INT_MAX is the lowest priority.

Read more

Higher priority vehicles will be used before lower priority vehicles. The vehicle priorities are a hint to the solver, but the main goal is to minimize the number of vehicles. The size of this array must be equal to fleet_size.

cuOpt also supports secondary matrix initialization with add_transit_time_matrix. This provides an option to add a secondary matrix, for example time matrix, that can be used to check the constraints satisfiability.

Read more

For instance, a secondary matrix can be used to model the time between locations with time windows referring to it while the solver could optimize for distance.

[ ]:
def data_model_initialization(df, nodes, n_vehicles, vehicle_priorities, vehicle_capacity):

    my_data_model = routing.DataModel(nodes, n_vehicles)

    # Add cost matrix information
    distances = build_matrix(df)

    # Set vehicle priorities
    if len(vehicle_priorities) > 0:

    capacity = cudf.Series(vehicle_capacity)

    # Add capacity dimension
    my_data_model.add_capacity_dimension("demand", df['demand'], capacity)

    # Add delivery time windows
    my_data_model.set_order_time_windows(df['earliest_time'], df['latest_time'])

    return my_data_model

Step 3: Initialize and set up the Solver

Once we have our data model, we need to initialize our solver with the data model and all the time, capacity and solver constraints.

Adding capacity constraints:

A vehicle can have various capacity constraints like weight, volume and the number of orders it can carry. To add these, we can use the add_capacity_dimension which takes in information regarding the demand value for each location and the capacity value for each vehicle in the fleet. cuOpt supports multiple capacity dimensions, however, we have restricted support to only a single capacity dimension for this demonstration.

Adding time constraints:

Each order demand has a time slot for delivery, that is, a time constraint that denotes the earliest and latest time the order needs to be delivered. These constraints can be added using set_order_time_windows which takes in the earliest and latest time for each delivery. Each order demand can also have a service time associated with fulfilling the order. These can be added using set_order_service_times which takes in the service time for each delivery and an optional vehicle id to model vehicle dependent service times. We can additionally provide the penalties of each location using set_order_penalties allowing us to model node priority.

Read more

Consider a scenario at peak times where you might have a few orders to deliver but not enough vehicles. It might turn out that a solution that meets all delivery time window requirements is not feasible. In these cases, even though an optimal solution cannot be found, it might be a better strategy to allow a little deviation from our delivery time windows instead of leaving our customers without their orders. We can use set_solution_scope to soft time windows that expand the route search to unfeasible regions. This feature is not available in the restricted version of cuOpt we are using for this demonstration.

Adding solver constraints:

Now consider an opposite scenario where we have more number of vehicles than required for delivery. By default, cuOpt minimizes the number of vehicles in its solution. However, if we have more vehicles available, why not use all of them? For this purpose, we can utilize the set_min_vehicles API to request a minimum number of vehicles in the solution although the resulting solution may not be optimal.

set_number_of_climbers can be used to set the number of climbers for the local search optimization. Number of climbers are number of instances trying to find solutions that start at different initial statuses. Hence, the higher the number of climbers, the higher the probability of getting better results in the long run.

Read more

However, if the preference is to have good results in a short time, a lower number of climbers is better. By default, the number of climbers is chosen by considering the occupancy of a small GPU and experimented run-time vs the number of climbers trade-off (taht is, the best result in the shortest time).

Solving time varies greatly with the number of locations and the deault value may not be appropriate in some scenarios. So use set_time_limit, set a solving time and experiment with it. After a while, the cost of the solution will converge.

Read more

The higher the solving time, the higher the accuracy, which can impact the accuracy. By default, it is set to num_locations/5 by considering number of climbers vs run-time trade-off.

More options

By default cuOpt includes the return trip back to the depot in all computed routes. In certain situations, for example if we use contractor vehicles instead of store vehicles, we don’t intend for the vehicle to return to the depot. In this case we can use the set_drop_return_trips to control if individual vehicles in the fleet return to the depot after the last stop.

cuOpt also provides the flexibility of optimizing routes for you if you have previously computed routes. This can be done by adding the initial solution via add_initial_solution

We could also budget to let vehicles wait at locations up to a specified time with set_slack_max. By default vehicles can wait an infinite amount of time.

[ ]:
def solver_initialization():
    solver_settings = routing.SolverSettings()

    # Set seconds update and climbers

    return solver_settings

Step 4: Call solve and get the routes

Now that you have fed cuOpt all your data and requirements, it’s time to call for solve and let cuOpt find you your optimized routes!

Once cuOpt has found your solution, you can view the various details of the route as well as the route itself. get_cost gives you the total cost of the final solution. get_vehicle_count returns the number of vehicles used in the optimal solution. Finally, to view routes you can use get_routes to get the route, vehicle ids for each stop and the arrival stamp in a CUDA Dataframe.

[ ]:
def call_solve(my_data_model, solver_settings):
    routing_solution = routing.Solve(my_data_model, solver_settings)
    final_cost = routing_solution.get_cost()
    vehicle_count = routing_solution.get_vehicle_count()
    cu_status = routing_solution.get_status()
    if cu_status != 0:
          !!!!!!!!!!!!        Failed: Solution within constraints could not be found     !!!!!!!!!!!!!!!!
        -------------------------------------------------------------------------------------------------- """)
        print("Final Cost         : ", final_cost)
        print("Number of Vehicles : ", vehicle_count)
    return routing_solution

Let’s combine these four simple steps to make a function that launches cuOpt on our initial data

[ ]:
# Run cuopt on a given dataset, for number vehicles with particular capacity and vehicle_priorities.
def run_cuopt(df, n_vehicles, vehicle_capacity, vehicle_priorities=[]):

    nodes = df["demand"].shape[0]

    my_data_model = data_model_initialization(df, nodes, n_vehicles, vehicle_priorities, vehicle_capacity)

    solver_settings =  solver_initialization()

    # Solve for routes and cost
    routing_solution = call_solve(my_data_model, solver_settings)

    return routing_solution

Looks like we are all set to see cuOpt in action! Solve on Gehring & Homeberger instance data

We have our Homeberger instance data read into a CUDA Dataframe. It specifies the number of vehicles as 250 and the capacity as 200. So, let’s try to find a solution for it with 250 vehicles each with a capacity of 200.

[ ]:
veh_capacity = cudf.Series([vehicle_capacity]*n_vehicles)
solution = run_cuopt(df, n_vehicles, veh_capacity)

Unfeasible solution case

Suppose we only have 70 vehicles each with capacity 200. We can observe that finding a routing solution that satisfies the time and capacity constraints is not possible.

[ ]:
n_vehicles = 70
vehicle_capacity = 200
vehicle_capacity = [vehicle_capacity]*n_vehicles
solution = run_cuopt(df, n_vehicles, vehicle_capacity)

Mixed Fleet and Priorities

Image of Fleet

What if we have 130 more backup vehicles with a capacity of 100 available to us. However, these should only be used when we have no more store vehicles.

This can be done by initializing a mixed fleet and setting priorities.
Store vehicles are set to have higher priorities (priority set to 0) than backup vehicles (priority set to 1)
[ ]:
store_vehicles = 70
store_vehicle_capacity = 200
backup_vehicles = 130
backup_vehicle_capacity = 100
n_vehicles = store_vehicles + backup_vehicles
vehicle_capacity = [store_vehicle_capacity]*store_vehicles + [backup_vehicle_capacity]*backup_vehicles
vehicle_priorities = cudf.Series([0 for i in range(0, 70)] + [100 for i in range(70, 200)])

# Run cuopt with mixed fleet and priorities
solution = run_cuopt(df, n_vehicles, vehicle_capacity, vehicle_priorities)

You can observe in the resulting routes that all 70 store vehicles are used first.

Let’s Try to Visualize Our Routes

Great! Now that we have found our optimized routes it’s time to send the vehicles on their way

Note : Veroviz is used just to visualize the routes and it’s not part of cuOpt. And we are not comparing cuOpt against veroviz as routing solution.

We have used veroviz to help us with the visualization.
Suppose we have three vehicles ready to be loaded; let’s pick three routes, one for each vehicle, and try to visualize the path they will be taking.
[ ]:
# Get X and Y coordinates for nodes in 3 selected routes
routes_df = solution.get_route()
truck_ids = routes_df['truck_id'].unique()[3:6].to_arrow().to_pylist()+routes_df['truck_id'].unique()[90:93].to_arrow().to_pylist()
routes_df = routes_df[routes_df['truck_id'].isin(truck_ids)]
pdf = df.to_pandas()
routes_df = routes_df.to_pandas()
routes_df = routes_df.merge(pdf, how='left', left_on='route', right_on='vertex')
routes_df = routes_df[['vertex', 'xcord', 'ycord', 'truck_id']]

Map generic X and Y coordinates to actual Longitude and Latitude

This will help in visualization on a map. We have scaled the coordinates to select NVIDIA headquarters as our depot!

[ ]:
# Function to convert generic X and Y coordinates to longitude and latitude, so it can be visualized on a map.
def map_XY_to_LongLat(routes_df):
  gdf = gpd.GeoDataFrame(routes_df, geometry=gpd.points_from_xy(routes_df.xcord*30, routes_df.ycord*30))
  gdf.crs = {'init': 'epsg:3310'}
  gdf['xy_geometry'] = gdf['geometry']
  gdf.to_crs({'init': 'epsg:4326'}, inplace=True)
  gdf.rename(columns={'geometry': 'lat_long_geometry'}, inplace=True)

  gdf.lat_long_geometry = gdf.lat_long_geometry.apply(lambda p: [p.y -0.71, p.x -2.05 ])
  routes = []
  nodes = [gdf['lat_long_geometry'].iloc[0]]
  for i in truck_ids:
    gdf_id = gdf[gdf['truck_id']==i]
    nodes = nodes + gdf_id['lat_long_geometry'].tolist()[1:-1]
  return nodes, routes
[ ]:
nodes, routes = map_XY_to_LongLat(routes_df)

Mark nodes and find paths using veroviz

[ ]:
# Creates leaflet to display nodes and routes on the map as per given input
def get_vrv_leaflet(nodes, routes):
  # Single depot node:
  nodesDF = vrv.createNodesFromLocs(
    locs             = [nodes[0]],
    nodeType         = 'depot',
    leafletColor     = 'red',
    leafletIconType  = 'home')

  # 3 Customer nodes:
  nodesDF = vrv.createNodesFromLocs(
    initNodes       = nodesDF,
    locs            = nodes[1:],
    leafletColor    = 'blue',
    leafletIconType = 'star')

  assignmentsDF = vrv.initDataframe('assignments')
  color = ['green', 'red', 'blue', 'black', 'brown', 'purple']
  for i,route in enumerate(routes):
          for location in range(len(route)-1):
            startloc = route[location]
            endloc = route[location+1]
            shapepointsDF = vrv.getShapepoints2D(
                    startLoc         = startloc,
                    endLoc           = endloc,
                    routeType        = 'fastest',
                    leafletColor     = color[i],
                    leafletWeight    = 6,
                    leafletOpacity   = 0.6,
                    dataProvider = 'OSRM-online',
                    useArrows = False,
            assignmentsDF = pd.concat([assignmentsDF, shapepointsDF], ignore_index=True, sort=False)

  return vrv.createLeaflet(nodes = nodesDF, arcs = assignmentsDF, mapBackground = 'Cartodb Positron')

leaflet = get_vrv_leaflet(nodes, routes)

Display nodes and routes on the map

[ ]:


Copyright (c) 2021-2022, NVIDIA CORPORATION. Unless required by applicable law or agreed to in writing, software distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. It is not available for commercial and production purposes.