# Simulation-based Optimization (SO)

The use of simulation‐based models for optimization is known as simulation‐based optimization (SO). Stakeholders worldwide use high‐resolution urban mobility and traffic simulators to support their decisions, but rarely do so for optimization due to the high computational runtime costs of evaluating the simulation models. Thus, there is a need for efficient optimization algorithms, which are designed to identify solutions with improved performance within a tight computational budget (i.e., within few simulation evaluations). Efficiency is critical for stakeholders. However, existing SO algorithms are general‐purpose algorithms (i.e., they are not transportation specific) designed to achieve asymptotic performance properties. Our research has pioneered the design of computationally efficient SO algorithms for urban transportation problems.

## Computationally efficient algorithms

Efficiency is achieved by exploiting problem structure. Unlike existing SO algorithms that treat the simulator as a black‐box and exploit little to no problem structure, our breakthroughs in the field of SO have been achieved by coupling the simulation observations with analytical problem‐specific approximations of the simulation‐based objective function. This approximation is known as a metamodel. More specifically, for a given optimization problem, we formulate a metamodel and embed it within the SO algorithm. The challenge is to formulate a metamodel that is a good approximation of the unknown problem‐specific simulation‐based objective function in the entire feasible region, while remaining computationally tractable, differentiable and scalable. Our metamodels achieve these goals. A few examples are listed below.

### Congestion pricing

Based on a Singapore congestion pricing case study, the algorithm of Osorio & Atasoy outperformed a general‐purpose algorithm by reducing the time needed to identify good quality solutions by one order of magnitude: what took days now takes hours. The algorithm also identified solutions with an average 18% improvement in revenue. More generally, the case study indicates that the analytical network model provides the SO algorithm with analytical structural information that enables the algorithm to: (i) identify good quality solutions fast, (ii) become robust to both the quality of the initial points and to the simulators stochasticity.

### Signal control

We have addressed traffic signal control problems of dimension 50 to 260 with computational budgets of 50 to 150 simulation runs, while past approaches used budgets 2 orders of magnitude higher (i.e., they used approximately 40,000 simulation runs). Some examples: Osorio & Chong; Osorio & Nanduri(1); Osorio & Nanduri(2); Osorio & Selvam; Chong & Osorio; Chen, Osorio & Santos; Osorio & Bierlaire; Osorio et al.

### Model calibration

Transportation agencies as well as private mobility stakeholders worldwide are increasingly interested in developing large-scale models of their networks. Thus, there is a critical need for efficient calibration algorithms that facilitate the development of valid and reliable models. We have designed algorithms that enable OD (origin‐destination matrix) calibration problems of dimension 4000 to over 16000 to be addressed with computational budgets of 20 to 50 simulation runs. These algorithms identified solutions with objective function estimates that outperformed standard benchmark methods by 1 to 2 orders of magnitude: Osorio(1); Osorio(2).

## Algorithms for high-dimensional problems

### Continuous SO

Continuous SO problems with about 200 decision variables (referred to as `of dimension ~200’) are considered high‐dimensional. We have pushed the frontiers of high‐dimensional SO for transportation by 1 to 2 orders of magnitude, addressing problems of dimension ~2500 (Zhang & Osorio), ~4000 (Osorio(1)), and ~16200 (Osorio(2)).

For instance, when applied to a Singapore dynamic OD calibration problem of dimension 16200, the algorithm of Osorio identified solutions with estimated objective function values that were one order of magnitude better than those of traditional benchmark algorithms. It led to solutions with an average improvement to fit to link counts of 77%.

### Discrete SO

Discrete SO problems of dimension in the order of 100 variables are considered high-dimensional. The algorithm of Zhou et al. addressed a class of problems with a dimension of approximately 300. Fundamental transportation optimization problems are naturally formulated as high-dimensional discrete problems. Hence, these algorithmic ideas lay the foundations for a variety of important discrete transportation SO problems to be addressed efficiently.

## Scalable algorithms for large-scale networks

We have designed SO algorithms that enable simulators of large-scale networks to be efficiently used for optimization. Scalability is achieved by formulating metamodels based on analytical traffic network models that scale linearly with the number of links in the network and scale independently of other link or network attributes such as link length or the dimension of the route choice set. A few examples are listed below.

### singapore - Congestion pricing

The work of Osorio & Atasoy formulates a traffic network model that, for a network with n links, is specified as a system of n nonlinear equations, making it a scalable model. It is used to jointly optimize 16 tolls distributed throughout Singapore considering a traffic simulator with over 1000 links, over 4000 ODs (origin‐destination pairs) and over 18000 routes.

### Berlin Metropolitan area - calibration

The work of Zhang & Osorio tackles an OD calibration problem of dimension approximately 2500 for the Berlin metropolitan area. Scalability is achieved by formulating an analytical network model that, for a network with n links, is specified as a simple system of n linear equations. Hence, the model scales linearly with the number of links in the network, and scales independently of other link attributes (e.g., link length) or other network (e.g., route choice set) attributes. When compared to a benchmark method for the Berlin metropolitan area case study, the proposed method yields an average 65% improvement in the quality of the solution, as measured by its objective function estimates, while simultaneously reducing the computational runtimes by an average 30%. It is the analytical structural information, provided by this simple system of linear equations, that enables the calibration algorithm to become robust to both the quality of the initial points and to the stochasticity of the simulator. It enables the algorithm to identify solutions with significantly improved performance, compared to initial solutions, as of the first algorithmic iteration, when few simulation observations are available. Our work with the Berlin case study started off with the design of calibration algorithms for disaggregate route choice models (Zhang et al.).

### New york city - signal control

For traffic signal control, problems with about 15 controlled intersections and 70 links are considered large‐scale. Our signal control work for Midtown Manhattan (NYC) controlled 96 intersections in a network of about 900 links and 3600 lanes (Osorio et al.), allowing us to push the frontiers of large-scale signal control. This work was carried out in collaboration with the New York City Department of Transportation (NYCDOT). With them, we also designed signal plans that mitigate peak‐period congestion in an area of Manhattan that controls the access to the NYCDOT’s busiest bridge, the Queensboro Bridge. When compared via simulation to the field signal plans, our proposed signal plans led to an improvement of 26% in average queue-lengths, 10% in average trip travel times, 25% in the occurrence of vehicular spillbacks and 2% in average throughput. Dr. Mohamad Talas, NYCDOT Director of System Engineering, considers the SO approach “economically viable, with cost savings for any jurisdiction that needs to assess and improve traffic conditions for a large area of the transportation network” (Chandler). These NYC case studies also highlight the importance of formulating, and using, analytical information of between-link interactions (e.g., occurrence and impact of spillbacks) for the control of congested urban networks.

**BOSTON AREA - CAR-SHARING**

This algorithm of Zhou et al. optimizes the design of car-sharing services based on the use of disaggregate reservation data, simulation models and analytical mathematical programming models. In collaboration with Ford Motor Company and Zipcar, the algorithm was applied to optimize the spatial allocation of Zipcar’s Boston car-sharing fleet, considering over 300 car-sharing stations and a fleet size of approximately 900 vehicles. Compared, via simulation, to a solution deployed in the field, the proposed solution improved, on average, profit by 6% and vehicle utilization by 3%. Scalability is achieved by embedding a scalable mixed-integer program (MIP) within the general-purpose discrete SO algorithm. The MIP provides the SO algorithm with analytical problem-specific structural information, enabling it to address high-dimensional problems for large-scale car-sharing networks. The work illustrates how disaggregate mobility data (such as car-sharing reservation data) can be used without aggregation for optimization. This work was then extended to study other major markets of Boston, NYC, Chicago, San Francisco and Toronto. Ongoing work develops algorithms that exploit the disaggregate information in user search data.

## Discrete SO

Discrete SO algorithms are general-purpose algorithms designed to achieve asymptotic performance properties. This generality comes at the cost of a lack of scalability (i.e., suitability for high-dimensional problems) and of computational efficiency. The work of Zhou et al. shows how the combination of analytical optimization methods (e.g., mathematical programming models such as MIPs) with general purpose discrete SO algorithms enables the design of algorithms with both asymptotic performance guarantees and good short-term (e.g., few simulation evaluations) performance. This combination enables general-purpose discrete SO algorithms to become scalable , computationally efficient and robust to the quality of the initial solutions. The approach combines the merits of both analytical optimization methods and simulation‐based optimization methods, and paves the way for the abundant analytical discrete optimization literature to be used to enhance the performance of discrete SO algorithms.

The framework of Zhou et al. shows how our metamodel ideas, originally designed for continuous problems, carry over to discrete problems. More specifically, the metamodel is formulated based on a MIP. Since, analytical MIP formulations are available in the literature for a broad variety of discrete transportation problems, the proposed metamodel framework can be readily used to tackle various challenging and important transportation problems.

The algorithm is applied to a car-sharing fleet assignement problem for Zipcar’s Boston market based on the use of disaggregate reservation data. It illustrates how disaggregate mobility data can be used without aggregation for high-dimensional SO. This lays the foundations for a variety of discrete transportation problems to be addressed efficiently with high-resolution disaggregate data.

## Dynamic SO & real-time SO

Dynamic SO problems have time-dependent decision variables. Our SO metamodel ideas rely on the formulation and use of problem-specific analytical network models that approximate the simulation-based objective function. Hence, the main challenge in formulating a metamodel approach for dynamic SO problems is the need to formulate an analytical network model that accounts for the temporal dependencies in the dynamic problem, yet remains sufficiently scalable and efficient so that the overall SO algorithm remains scalable and efficient. The work of Chong & Osorio has achieved this by formulating an analytical dynamic (i.e., time-dependent) network model that remains differentiable, scalable and, most importantly, sufficiently computationally efficient to address large-scale SO problems. The model combines ideas from transient queueing theory, finite capacity queueing theory and traffic flow theory. This is the first computationally efficient algorithm for dynamic large‐scale transportation SO problems. When applied to a city-wide dynamic signal control problem, the proposed signal plans improved, via simulation, average trip travel times by 25% compared to a signal plan derived with a widely used mainstream commercial signal control software.

Until now, the computational cost of high‐resolution mobility models precluded their use for optimization, let alone for real‐time optimization. The algorithm of Chong & Osorio has laid the foundations for our ongoing work that designs real‐time SO algorithms for traffic management, congestion pricing, and model calibration problems. This pioneering real‐time SO work will enable mobility services to be tailored in real‐time to the prevailing needs and behaviors of individual users, as well as to the prevailing congestion patterns across the city. More generally, it enables high‐resolution real‐time mobility data to be used to enhance the flexibility, real‐time responsiveness and resiliency of our mobility systems.

To tackle dynamic SO problems, the work of Osorio proposes to the use of metamodels that rely on time-independent analytical network models. The temporal dependencies are then captured in a very simple way through the use of parametric general-purpose functions, such as polynomials. The resulting approach is computationally efficient and scalable. Interestingly, the Singapore dynamic OD calibration case study of this work shows that if sufficient spatial dependency information is captured analytically (in this case through the use of a time-independent analytical network model), then it can be sufficient to capture temporal dependency in a simple (i.e., more tractable) way, compared to the use of a time-dependent analytical network model. More specifically, for a problem with t time intervals and m OD pairs, the corresponding SO problem is a high-dimensional problem of dimension t∙m. The proposed approach solves, at every iteration of the SO algorithm, a set of t decoupled analytical optimization problems each of dimension m. This decoupling yields a scalable SO approach with a complexity that scales independently of the number of time periods (i.e., t problems can be solved independently). This scalable formulation has established the foundations for our ongoing real‐time calibration research.

## Multimodel or multifidelity SO

To design efficient SO algorithms for large-scale networks, the work of Osorio & Selvam combined ideas from SO with ideas from multi-model optimization. It proposed an algorithm that embeds two simulators: one is more efficient but less accurate than the other. When compared to an algorithm that evaluates only the most accurate simulator, the proposed algorithm identifies solutions with equivalent performance at half the computational cost. These multi-model ideas lay the foundations for our work in real-time SO.

# Analytical Probabilistic Traffic Flow Theory

Major transportation agencies in the US and in Europe are increasingly interested in estimating and improving the reliability, resilience and robustness of their networks. To contribute to this challenge, we have formulated analytical probabilistic network models that can yield analytical approximations of link, path and network probability distributions. These models account for uncertainties in demand and in supply. Moreoever, our work formulates traffic theoretic and queueing theoretic analytical models that are suitable for optimization (i.e., are differentiable and tractable). These models are the foundations of our ongoing work in large‐scale real‐time network optimization.

The models are based on a stochastic formulation of the link transmission model (LTM), which itself is an operational formulation of Newell’s simplified theory of kinematic waves. More specifically, the proposed models are analytical probabilistic and differentiable formulations of the deterministic and non-differentiable LTM. The proposed models yield a probabilistic description of the link’s upstream and downstream boundary conditions. In this area, we have taken a unique approach by combining ideas from deterministic traffic flow theory with probabilistic finite (space) capacity queueing theory, and transient queueing theory. The main challenge is to formulate scalable models because the dimension of the state‐space increases exponentially with the number of links in the network or with the link’s space capacity. To address this curse of dimensionality, we have proposed formulations that combine ideas from decomposition techniques as well as aggregation‐disaggregation techniques.

## Stochastic network models

Flötteröd & Osorio formulates a stochastic network link transmission model, which extends our stochastic link models. The scalability of this traffic‐theoretic network model is achieved by decomposing a general topology network into a set of overlapping subnetworks and approximating the transient joint distribution of each subnetwork. The subnetwork overlap allows to approximate stochastic dependencies across multiple subnetworks with a complexity that is linear in the number of subnetworks. Additionally, the network model maintains mutually consistent overlapping subnetwork distributions.

## Stochastic Link models

Lu & Osorio(2) formulates a model that tracks the transient probabilities of two of the link’s boundary states. This leads to a model with a state space dimension that is constant (i.e., it does not depend on any link attributes, such as link length). In other words, the model has constant complexity, whereas our past formulations have a complexity that scales either linearly or cubically with link length. This makes the model suitable for large-scale network optimization. Compared to benchmark models, our model yields estimates with comparable accuracy, while the computational efficiency is enhanced by at least one order of magnitude. The model is used to address a city-wide traffic signal control problem. Compared to a benchmark model, the proposed model enhances computational efficiency by 2 orders of magnitude, while deriving signal plans with similar performance. It yields signal plans that outperform those derived by a widely used commercial signal control software.

Osorio & Flötteröd formulates a model that derives the joint transient probability distribution of the link’s upstream and downstream boundary conditions. For a link with space capacity s, the model complexity is cubic in s (i.e., it is in the order of *O*(s^3). Lu & Osorio(1) derive the marginal, rather than the joint, distribution of the link’s upstream boundary conditions and the marginal distribution of the link’s downstream boundary conditions. This provides a simplified description of the spatial and temporal dependencies between the upstream and the downstream boundary conditions. This simplification leads to a model complexity that is linear in s, rather than cubic. Hence, the model has enhanced computational efficiency. The model yields significant gains in computational efficiency while preserving accuracy. The computational runtimes of the model increase linearly with the link’s space capacity, while our past model (that with cubic complexity) has an exponential increase in runtimes. The model yields accurate distributional approximations of the link’s boundary conditions. When used to address a city-wide signal control problem, the model is shown to be robust to the quality of the initial signal plans. It yields signal plans that systematically outperform both initial plans, as well as a plan derived by a widely used commercial signal control software.

Other queueing based analytical link & network models include: Osorio et al., Chong & Osorio; Osorio & Wang; Osorio & Yamani; Osorio & Bierlaire.

# High-resolution data & high-resolution models

Some additional examples of our recent research in the design of methods that enable the use of high-resolution mobility data and models to inform decision making are given below.

### Car-sharing demand estimation

Fields et al. proposes a demand estimation technique for mobility services based on the use of disaggregate high-resolution data. This work estimates car-sharing demand based on the use of detailed reservation data. The proposed method is a sampling technique that detruncates and uncensors disaggregate reservation data and infers (disaggregate) demand. The method is applied to the Boston area. It has also been applied to other major markets of Zipcar, such as New York City, Toronto, San Francisco and Chicago. Ongoing work develops algorithms that exploit the disaggregate information in Zipcar’s user search data.

This demand estimation method was used in Zhou et al. to inform the spatial allocation of car-sharing vehicular fleet. Together the works of Fields et al. and of Zhou et al. illustrate how our metamodel ideas carry over to address data‐driven optimization. More generally, this line of work contributes to the use of disaggregate mobility data to inform the design of next generation personalized mobility services, which will be tailored to the unique needs and preferences of individual clients.

### calibration of microscopic car-following models

The general idea of our metamodel simulation-based optimization (SO) work is the formulation of an analytical network model that approximates the relationship between the calibration parameters (i.e., simulation inputs such as an OD matrix) and the aggregate network performance metrics (i.e., simulation outputs such as link flows). It was an open question whether these ideas could be extended for the calibration of microscopic model parameters (such as car-following model parameters), for which the simulator’s input-output relationship is particularly difficult to approximate analytically. The work of Osorio & Punzo formulated an analytical network model that approximates well the relationship between the microscopic car-following parameters and the network performance metrics. This new car-following calibration algorithm allows, for the first time, microscopic calibration problems to be addressed with very tight computational budgets (i.e., few simulation runs). The increasing heterogeneity of vehicle technologies, accentuated by advancements in vehicle automation and connectivity, makes the calibration of microscopic vehicle‐specific models a crucial problem for the next generation of mobility models. This line of research anticipates and responds to these needs.

### Sustainable Traffic Management

There is a pressing necessity to mitigate the impacts of urban congestion on our environment. We have designed efficient simulation‐based signal control algorithms that enhance sustainability by mitigating vehicular emissions (Osorio & Nanduri(1)) and fuel consumption (Osorio & Nanduri(2)). These algorithms can contribute to enhance our understanding of the relationship between vehicle technology and city‐wide energy and environmental impacts. The design of a metamodel algorithm for sustainable simulation‐based signal control problems requires the formulation of an analytical approximation of the mapping between signal plans and city‐wide vehicular emissions or fuel consumption. Metrics such as city‐wide emissions or fuel consumption are difficult to approximate analytically because they depend on individual vehicle technologies and instantaneous vehicle performance (e.g., acceleration). In both of these papers, we proposed analytical and differentiable expressions for these intricate metrics. These expressions were then used to formulate metamodels for the design of efficient sustainable simulation‐based signal control algorithms.