As part of our collaboration with the World Bank to develop analysis tools that support improved transportation outcomes in cities around the globe, we have added functionality to OpenTripPlanner that allows efficient calculation of best, worst, and average travel times.

Trip planning services usually tell you the fastest or easiest way to reach a destination given a specific departure or arrival time. Spatial analyses based on this kind of trip planning can produce misleading results because the travel times used are quite sensitive to the exact departure time specified. For example, if the analysis is performed at 9:00 AM and a train connecting two important locations runs only once per hour, departing at 8:55 and 9:55, travel time results for that connection may include an unnecessary 55-minute wait. Better travel times might be achieved by departing at a different time within a window: for instance, service frequency might decrease after 9:00 AM, in which case departing sometime in the previous hour would allow a faster journey.
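To make the sensitivity to departure time concrete, here is a minimal sketch that evaluates the hourly train above for every minute in a departure window rather than at 9:00 AM alone. The 20-minute ride time, the extra 10:55 departure, and the function names are assumptions made for illustration; this is not OpenTripPlanner code.

```python
from bisect import bisect_left

def travel_time(depart_minute, departures, ride_minutes):
    """Wait for the next departure at or after depart_minute, then ride."""
    i = bisect_left(departures, depart_minute)
    if i == len(departures):
        raise ValueError("no departure at or after the requested time")
    return departures[i] - depart_minute + ride_minutes

def summarize_window(start, end, departures, ride_minutes):
    """Best, worst, and average travel time for each minute in [start, end)."""
    times = [travel_time(t, departures, ride_minutes) for t in range(start, end)]
    return min(times), max(times), sum(times) / len(times)

# Times are minutes after midnight: 8:55, 9:55, and an assumed 10:55 departure.
TRAIN_DEPARTURES = [535, 595, 655]
RIDE_MINUTES = 20  # hypothetical ride time, not given in the text

at_nine = travel_time(540, TRAIN_DEPARTURES, RIDE_MINUTES)
best, worst, average = summarize_window(540, 600, TRAIN_DEPARTURES, RIDE_MINUTES)
```

Under these assumptions the single 9:00 AM query reports 75 minutes, while the window from 9:00 to 9:59 ranges from 20 to 79 minutes with an average of 49.5.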

When summarizing the performance of a transportation system, one number (the best achievable travel time, or the travel time at a single predetermined departure time) is not adequate. One number cannot capture the ways in which service frequency affects the variance and uncertainty of travel time, and therefore the reliability of the system in daily use by riders. Rather than a single travel time, such analyses should aim to reflect the range of travel times that may be experienced by the rider on a given itinerary, or ideally even the statistical distribution of travel times.

The form of this distribution and the way it can be calculated vary depending on whether transit is running on fixed schedules or as a frequency-based system. In a frequency-based system, stop-to-stop travel times are known but exact departure times are not, with the operator simply maintaining a published service frequency. The latter case is much simpler to visualize, so we will begin by taking it as an example.

Consider an itinerary from home to work requiring one transfer, where both legs of the journey use frequency-based services. The headway for the first bus is 10 minutes and the ride time to the transfer point is 15 minutes. There the passenger will alight and wait to board the second bus, which has a headway of 5 minutes. The second ride to reach the final destination also lasts 15 minutes. There are several sources of variation in travel time: different scheduled travel times for the same route over the course of the day, unintentional deviation from scheduled travel times due to traffic congestion, and time spent waiting for a vehicle to arrive. For now we will ignore the first two sources and concentrate on wait time.

After a short walk to the bus stop (of constant and negligible duration for the sake of this discussion), we face an unknown wait for the first bus. This may last anywhere from zero minutes if the vehicle pulls up just as we arrive at the stop, up to ten minutes (the full headway of the route) if the vehicle has just pulled away as we reach the stop. Once on board the vehicle, the duration of the ride is fixed; the total elapsed time to reach the transfer point will be fifteen minutes plus zero to ten minutes. The probability density of arrival at the transfer point is constant over the range from 15 to 25 minutes after our initial arrival at the origin stop, and that density is equal to the inverse of the route’s headway (i.e. its frequency). Because this distribution is so simple, the minimum and maximum travel times suffice to describe it, with the average or “expected” travel time falling right in the middle of the range at 20 minutes.
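The single-leg case can be written down in a few lines. This is only a sketch of the uniform-wait model described above, and the function name is hypothetical.

```python
def leg_arrival_range(headway_minutes, ride_minutes):
    """Describe the uniform arrival-time distribution for one frequency-based leg.

    Assumes the wait is uniformly distributed between zero and one full headway,
    and that the ride time is fixed.
    """
    earliest = ride_minutes                    # vehicle pulls up as we arrive
    latest = ride_minutes + headway_minutes    # vehicle has just pulled away
    expected = (earliest + latest) / 2         # midpoint of a uniform range
    density = 1 / headway_minutes              # probability per minute
    return earliest, latest, expected, density

first_leg = leg_arrival_range(headway_minutes=10, ride_minutes=15)
```

For the first bus this gives arrival between 15 and 25 minutes, an expected 20 minutes, and a density of 0.1 per minute.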

At the transfer point we experience the same thing for the second bus: an indeterminate wait from zero to five minutes, followed by a fixed ride time of fifteen minutes. But this same wait and ride can begin with equal probability anywhere in the range resulting from the first ride. The resulting total trip time is the sum of two random variables (the wait times for the two individual legs) shifted to the right by the constant ride times. Its distribution is therefore the convolution of the distributions for those legs, which is trapezoidal or triangular in shape depending on the relative headways of the two routes. Due to the simplicity and symmetry of both distributions, the expected travel time is still halfway between the minimum and maximum. Each additional ride on a frequency-based route smooths and spreads the distribution a bit further, and after only three or four rides it begins to approximate the familiar bell-shaped Gaussian curve that is so common in nature.
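The convolution can be illustrated with a small discrete sketch. As a simplifying assumption, waits are rounded down to whole minutes, so a 10-minute headway becomes ten equally likely waits of 0 to 9 minutes; the continuous case described above has the same shape but extends a little further to the right. All names here are hypothetical.

```python
from fractions import Fraction

def uniform_wait(headway_minutes):
    """Equal probability for each whole-minute wait from 0 to headway - 1."""
    return [Fraction(1, headway_minutes)] * headway_minutes

def convolve(p, q):
    """Distribution of the sum of two independent discrete variables."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out

def trip_distribution(headways, ride_minutes):
    """Map total travel time (minutes) to probability for legs ridden in sequence."""
    waits = [Fraction(1)]  # zero legs: total wait is 0 with certainty
    for headway in headways:
        waits = convolve(waits, uniform_wait(headway))
    offset = sum(ride_minutes)  # fixed ride times shift the distribution right
    return {offset + k: prob for k, prob in enumerate(waits)}

two_legs = trip_distribution(headways=[10, 5], ride_minutes=[15, 15])
```

With headways of 10 and 5 minutes the result is a trapezoid: probability climbs from 1/50 at 30 minutes to a plateau of 1/10 between 34 and 39 minutes, then falls back to 1/50 at 43 minutes. Passing a third or fourth headway to the same function shows the distribution rounding off toward a bell shape.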

Now that we’ve seen how to model the effect of multiple rides in sequence, we can look at how multiple rides combine in parallel. Imagine that for a home to work trip there are three different options available. The express train takes only twenty minutes to reach the destination, but runs once every forty minutes. Bus A is frequent, running every ten minutes, but the trip takes 45 minutes. Bus B is infrequent, running every forty minutes, and takes 50 minutes. All of these routes converge on the work location with different probability distributions. We can model the collective experience of a large number of people making this journey as a mixture distribution, which is essentially the weighted sum of all the individual distributions, with weights determined by the proportion of the time each itinerary will be superior to all the others. The key idea here is that people will not randomly choose from all possible itineraries converging on a point. Only those itineraries which stand a chance of being fastest or easiest at one time or another are likely to be chosen.
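One way to check which itineraries "stand a chance of being fastest" is to compare each option's best case against the worst case of every other option. The sketch below assumes the waits for different routes are independent and uniform over one headway, so any option's worst case can coincide with another's best case. The function name and the extra "slow bus" in the usage note are hypothetical.

```python
# Each option maps to (headway_minutes, ride_minutes), taken from the example above.
OPTIONS = {
    "express train": (40, 20),
    "bus A": (10, 45),
    "bus B": (40, 50),
}

def possibly_fastest(options):
    """Return the options whose best case beats every other option's worst case."""
    candidates = []
    for name, (_, ride) in options.items():
        best_case = ride  # zero wait
        # The tightest worst case (ride + full headway) among the other options.
        others_worst = min(
            r + h for other, (h, r) in options.items() if other != name
        )
        if best_case < others_worst:
            candidates.append(name)
    return candidates

candidates = possibly_fastest(OPTIONS)
```

All three options in the example qualify: even Bus B, at 50 minutes with no wait, can beat Bus A's worst case of 55 minutes and the train's worst case of 60. A hypothetical fourth option taking 95 minutes would be dropped, since Bus A never takes longer than 55.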

The proportion of people choosing each possible itinerary is conditioned by several factors, the strongest of which are the availability of vehicle arrival predictions, aversion to travel time risk, and vehicle capacity. Depending on our assumptions about these factors, the distributions for the different itineraries will need to be weighted differently. We can envision three strategies, in which the rider A. boards one of the routes that offers the best minimum or maximum travel time (the Pareto-optimal routes on min or max) with no knowledge of when the others will arrive; B. boards any of the routes that is sometimes fastest, but with no knowledge of when the others will arrive; or C. has good arrival predictions available and always takes the fastest option at the moment.
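As a rough illustration of strategy C, the sketch below enumerates every combination of whole-minute waits for the three example routes and records which one arrives first. It assumes the routes' waits are independent and uniform, that predictions are perfect, and that ties go to the option listed first. The names are hypothetical and the enumeration is a stand-in for illustration, not the method used in OpenTripPlanner.

```python
from collections import Counter
from itertools import product

# Each option maps to (headway_minutes, ride_minutes), taken from the example above.
OPTIONS = {
    "express train": (40, 20),
    "bus A": (10, 45),
    "bus B": (40, 50),
}

def fastest_shares(options):
    """Share of the time each option is fastest, plus the mean achieved travel time."""
    names = list(options)
    wait_ranges = [range(options[name][0]) for name in names]  # 0 .. headway - 1
    wins = Counter()
    total_time = 0
    count = 0
    for waits in product(*wait_ranges):
        times = [wait + options[name][1] for name, wait in zip(names, waits)]
        best = min(times)
        wins[names[times.index(best)]] += 1  # first listed option wins ties
        total_time += best
        count += 1
    shares = {name: wins[name] / count for name in names}
    return shares, total_time / count

shares, mean_time = fastest_shares(OPTIONS)
```

Under these assumptions the express train is fastest roughly three quarters of the time, Bus A most of the remainder, and Bus B only rarely. The shares are the kind of weights the mixture distribution would use for a rider with good arrival predictions.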