When close is better than optimal
By Douglas A. Samuelson
Suppose you want to reach the highest point on a mountain. On the ground, with good information about the terrain, you know how to do that. You may have to find ways around obstacles and recognize local peaks that aren’t the one you want, but the problem is straightforward.
Now suppose you’re trying to get there by parachute. There are clouds partially blocking your vision, and there are shifting winds. Suppose also that landing at a high altitude, not necessarily the highest possible, is the objective, so it’s worthwhile to miss the highest peak if you reduce your chances of going down a crevasse.
This example, courtesy of John Sall, executive vice president (and programming guru) of SAS Institute, typifies many production control and other risk management applications. To illustrate, he showed an example of a Google Earth map of the area around Long’s Peak in Rocky Mountain National Park (see Figure 1).
Figure 1: Google Earth map of the area around Long's Peak, Colorado. Aiming for "The Loft" gives you a better chance of landing at a high altitude than trying for the peak.
You’re better off aiming for “The Loft” and being fairly sure of ending up above 13,400 feet than trying for the 14,200-foot peak and possibly ending up below 12,500 feet in one of the chasms near the peak.
Welcome to the world of stochastic optimization. Instead of seeking the maximum of an objective function, we try to maximize the expected value of that function, taking uncertainties and data perturbations into account. This technique has been known and applied for decades. It is the sensible thing to do in risk management and for many production situations. One of John Sall’s main current interests is a good stochastic optimization routine for production problems, to be included in JMP. He was the lead developer for SAS and JMP, so what he wants to get programmed is likely to get done.
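To make the distinction concrete, here is a minimal sketch in Python of the parachute example. Everything in it is an assumption made for illustration: the one-dimensional terrain function `altitude`, its heights and widths, and the normally distributed wind drift are invented, not taken from John Sall's example. The deterministic answer picks the aim point with the highest altitude; the stochastic answer picks the aim point with the highest average landing altitude over many simulated drifts.

```python
import math
import random

def altitude(x):
    """Hypothetical 1-D terrain in feet: narrow summit at x=8, broad shelf at x=4."""
    summit = 2200.0 * math.exp(-((x - 8.0) / 0.3) ** 2)
    shelf = 1400.0 * math.exp(-((x - 4.0) / 2.0) ** 2)
    return 12000.0 + max(summit, shelf)

def expected_altitude(aim, drifts):
    """Average landing altitude when the wind shifts the landing point by each drift."""
    return sum(altitude(aim + d) for d in drifts) / len(drifts)

# Assumed wind drift: normal with mean 0 and standard deviation 0.5.
# The same draws are reused for every aim point (common random numbers),
# so aim points are compared under identical simulated winds.
rng = random.Random(0)
drifts = [rng.gauss(0.0, 0.5) for _ in range(2000)]

aim_points = [i * 0.05 for i in range(201)]  # candidate aim points from 0 to 10

# Deterministic optimization: highest point on the terrain.
best_deterministic = max(aim_points, key=altitude)

# Stochastic optimization: highest expected landing altitude.
best_expected = max(aim_points, key=lambda a: expected_altitude(a, drifts))

print("aim for the summit at x =", round(best_deterministic, 2),
      "expected altitude:", round(expected_altitude(best_deterministic, drifts)))
print("aim for the shelf at x =", round(best_expected, 2),
      "expected altitude:", round(expected_altitude(best_expected, drifts)))
```

In this toy terrain the summit is the deterministic optimum, but the broad shelf gives the higher expected landing altitude, because a small drift away from the summit lands in the low ground beside it.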
The challenge is specifying the objective function, somehow specifying all those random factors and their effects. This challenge becomes more difficult as one tries to implement the specification in most optimization packages, since the usual linear programming matrix notation doesn’t work for this kind of objective function.
Meanwhile, on the other side of the traditional divide within operations research, simulation has steadily expanded its reach. The data requirements are more modest, as the modeler can just make some assumptions about random distributions of unknowns and then test, by multiple replications with varying assumed values, the sensitivity of the results to the assumptions. As the assumptions proliferate, however, the number of replications grows exponentially. Therefore, many simulation packages now include a statistical design of experiments capability or an easy interface to one. Efficient designs make it possible to estimate a whole response surface, which in turn suggests where the low-cost or high-output combinations of input are. Courtesy of John Sall again, Figure 2 presents an example for a production problem.
Figure 2: In this chemical production example, the highest possible yield is at the intersection of the two black grids, 535 degrees and .08 time units, but small variations quickly reduce the yield.
The intersection of the grids identifies the optimum, but it is near a sharp drop-off. Cooking more slowly, at a lower temperature, yields a more robust solution. In this case, the number of dimensions is small enough that we can see the better approach. But what if there were many dimensions, and we needed a computer to help us find the best solution, taking risk into account?
So here’s the next breakthrough in risk minimization/production maximization with uncertainty: simulate, produce a whole response surface, and put that surface as the objective function into a stochastic optimization routine. Given the progress SAS was making a year and a half ago, I had hoped that by now I’d be describing how to do it, complete with worked examples. We’re not there yet, but there is progress to report, enough that OR/MS analysts might benefit from thinking about how they would use this capability and urging the developers to get the implementation done.
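For readers who want to think through what such a pipeline would involve, the three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any of the packages discussed here work: `simulate_yield` is a made-up stand-in for a simulation model (coded 0-to-1 units, with an assumed sharp drop-off above a temperature of 0.8), the response surface is simply a table of averaged runs with bilinear interpolation, and the stochastic optimization is a brute-force search that averages the surface over assumed random setpoint errors.

```python
import random

rng = random.Random(42)
NODES = [i / 10 for i in range(11)]  # design levels 0.0 .. 1.0 for both factors

def simulate_yield(temp, time):
    """Hypothetical stand-in for one replication of a process simulation."""
    if temp <= 0.8:
        mean = 50.0 + 40.0 * temp - 200.0 * (time - 0.5) ** 2
    else:
        mean = 30.0  # past the assumed drop-off, the batch is spoiled
    return mean + rng.gauss(0.0, 0.5)

# Step 1: simulate. Run 10 replications at every design point and average them.
table = [[sum(simulate_yield(t, s) for _ in range(10)) / 10 for s in NODES]
         for t in NODES]

# Step 2: response surface. Bilinear interpolation over the table of averages.
def surface(temp, time):
    u = min(max(temp, 0.0), 1.0) * 10  # clamp to the design region
    v = min(max(time, 0.0), 1.0) * 10
    i, j = min(int(u), 9), min(int(v), 9)
    a, b = u - i, v - j
    return ((1 - a) * (1 - b) * table[i][j] + a * (1 - b) * table[i + 1][j]
            + (1 - a) * b * table[i][j + 1] + a * b * table[i + 1][j + 1])

# Step 3: stochastic optimization. Maximize the expected surface value when
# the actual settings deviate from the setpoint (assumed normal, sd 0.05).
shocks = [(rng.gauss(0.0, 0.05), rng.gauss(0.0, 0.05)) for _ in range(500)]

def expected_yield(temp, time):
    return sum(surface(temp + dt, time + ds) for dt, ds in shocks) / len(shocks)

candidates = [(round(0.5 + 0.02 * i, 2), round(0.3 + 0.1 * j, 1))
              for i in range(21) for j in range(5)]

nominal_best = max(candidates, key=lambda p: surface(*p))
robust_best = max(candidates, key=lambda p: expected_yield(*p))

print("nominal optimum (temp, time):", nominal_best)
print("robust optimum (temp, time):", robust_best)
```

In this sketch the nominal optimum sits right at the edge of the drop-off, while the robust optimum backs away from it to a lower temperature, which is the same qualitative behavior as in the cooking example. A brute-force search like this one does not scale to many dimensions, which is exactly why a real stochastic optimization routine is needed.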
Progress Assessments
Several developers are, in fact, working hard to provide the integrated capability analysts need. SAS released its Simulation Studio product in 2009. It offers the power and convenience of importing data directly from its formatted files into the simulation routine, fitting probability distributions to the actual data, then embedding the simulation in the experimental design routine to generate a response surface. Early reports from users indicate that SAS has maintained its usual high quality of implementation and of integration with its other products. SAS certainly has abundant resources to devote to continuing development and support. So far, so good.
Importing the response surface to a stochastic optimization, however, is not a near-term offering. Simulation Studio runs only in mainframe SAS and Windows. Therefore, even the version integrated with JMP runs only in the Windows version of JMP – somewhat ironic, since JMP was originally designed as a Mac product, but that’s the situation. The stochastic optimization routine, meanwhile, is planned for only the Mac version of JMP for the foreseeable future, and focused on production problems. Hence the bright promise is still there, but the wait is likely to be another few years.
Frontline Systems is another provider that clearly gets the idea. Their “Backgrounder on Robust Optimization, Stochastic Programming and Simulation Optimization,” available online, lays out the possibilities clearly and in detail. This reporter has not yet had the opportunity to assess their current and in-progress software. They recently won an INFORMS Impact Prize for the Excel Solver, so they deserve to be taken seriously. It is not obvious, however, how much having everything embedded in spreadsheets, which made the package easy to use at the beginning, may eventually limit its capabilities and flexibility. In particular, this limitation has kept Frontline Systems identified more with optimization than with simulation, as many simulation analysts strongly prefer GUI input and animation output.
Many current simulation practitioners learned by using ARENA, the dominant package in university instruction in the 1990s and early 2000s. ARENA also has a good experimental design routine and the ability to produce a response surface. Data input is difficult, however, usually requiring a separate distribution fitting routine such as simulation guru Averill Law’s ExpertFit. ARENA is designed around Graphical User Interface (GUI) inputs, so manipulating and editing actual data is complicated. More important, ARENA – the company, not just the software – was purchased by Rockwell International in 2001, and many aspects of the design got changed in ways that appeared to sacrifice features and ease of use for internal computational design considerations. The company and the package seem more focused now on maintaining the presence in instruction and improving on existing applications. In this reporter’s opinion, ARENA is not the company to bet on to win the race to develop the integrated capability, although they certainly do merit watching.
Several of the major players in ARENA before the Rockwell acquisition have now emerged with a new company and product, Simio. As often occurs when the team on a successful product development gets the chance to do another one like it, taking into account what they’ve learned, this is an impressive offering. The GUI input is easy to learn and use, with many options available for more advanced users but sensibly defaulted for the less experienced. Data input is easy via spreadsheets or some database files. The animation output is outstanding. There is an experimental design package in which the user can set up and run a range of replications, generating a response surface.
The question remains as to how to export a response surface in usable form for input to a stochastic optimization package. The scalability of the package for large models is also a question this reporter has wondered about but not explored, as the beautiful graphics most likely impose substantial computational overhead. Simio is available for Windows only and looks unlikely to expand to Mac or other platforms any time soon, so, like SAS Simulation Studio, it isn’t integrated with the JMP stochastic optimization. Exporting the response surface to a spreadsheet and importing those values, with some smoothing and interpolation probably needed somewhere in the process, would be required. Still, Simio looks like another of the major contenders to watch.
CrystalBall, now part of ORACLE, is another possible contender. This package, too, relies on spreadsheet intermediary products to transfer a response surface to the optimization routine. This reporter has not explored the package and therefore has little to say about it, but it does connect smoothly to data input – that’s what ORACLE wanted when they bought it – and their developers, based on a brief conversation at the INFORMS national meeting following this reporter’s presentation about this new trend in software, seem to have the right idea about completing the integrated capability.
This brings us to ExtendSim, formerly called just Extend, which this reporter regards as the closest to the overall capability at this time. ExtendSim has the ease of learning and use and the robust user base to stay in this market, good animation features, and full integration with data input, via spreadsheets or from some database packages, including fitting probability distributions to these input data sets. The latest version also includes a Scenario Manager that tracks changes in inputs and associated output files. It creates a record of runs and results and thereby supports a general experimental management activity. This also facilitates communication with more advanced, full-featured experimental design routines.
Ironically, since ExtendSim does run on Macs, it also has a smooth integration with JMP’s experimental design features, enabling the user to perform complicated factorial experiments and produce a response surface. As with the other packages, this output then has to be transmitted to an optimization routine via spreadsheets or some other such intermediate structure. However, when the JMP stochastic optimization routine is available, ExtendSim appears likely to have a cleaner path to it than SAS Simulation Studio. (While not generally a fan of conspiracy theories, this reporter apologizes to the user community if this observation somehow leads to a delay in the release of the JMP stochastic optimizer until the SAS interface to it is also ready. Then again, maybe ImagineThat, the developer of Extend, will end up collaborating with Frontline Systems.)
Next Steps
As developers progress toward delivering the integrated capability to specify a problem, use simulation to explore the effects of random variation and pass the results to a stochastic optimization that will yield a risk-advised robust best solution, some new issues for research arise. For one, experimental design programs smooth response surfaces automatically; we need to know more about how to identify peaks and drops of interest that might be understated by the generated response surface. On the other hand, stochastic optimization on rough response surfaces poses new modeling and computational challenges. In plain language, the need to search for local ups and downs poses difficulties that could substantially reduce the practical value of the integrated simulation-experimental design-stochastic optimization approach.
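The smoothing concern is easy to illustrate. The short Python sketch below uses an invented one-dimensional response with a broad hill and a narrow, higher spike, and a simple moving average as a stand-in for the smoothing a fitted response surface performs. Both the response function and the choice of smoother are assumptions for illustration; experimental design packages typically fit low-order polynomial models instead, but the effect on narrow features is similar in kind.

```python
import math

def true_response(x):
    """Hypothetical response: broad hill near x=0.4 plus a narrow spike at x=0.7."""
    hill = 10.0 * math.exp(-((x - 0.4) / 0.3) ** 2)
    spike = 8.0 * math.exp(-((x - 0.7) / 0.02) ** 2)
    return hill + spike

def moving_average(values, half_width):
    """Smooth by averaging each point with its neighbors (window shrinks at the ends)."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half_width): i + half_width + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

xs = [i / 100 for i in range(101)]
raw = [true_response(x) for x in xs]
smooth = moving_average(raw, half_width=5)

raw_peak_x = xs[raw.index(max(raw))]
smooth_peak_x = xs[smooth.index(max(smooth))]

print("peak of the true response at x =", raw_peak_x,
      "height:", round(max(raw), 2))
print("peak of the smoothed surface at x =", smooth_peak_x,
      "height:", round(max(smooth), 2))
print("smoothed height at the true peak:", round(smooth[raw.index(max(raw))], 2))
```

Here the smoothed surface understates the spike so much that its apparent peak moves to the broad hill. An optimizer working only from the smoothed surface would never see the feature the analyst most needs to know about, whether that feature is an opportunity or a hazard.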
At an even more fundamental level, moving from deterministic optimization, with its nearly insatiable data requirements and tendency to be highly sensitive to random variations and perturbations, to stochastic optimization represents a huge shift in thinking. Which method to apply where is an evolving question, dependent on what techniques and software are available but also on the analyst’s ability to explain results to the decision-making client. The science of how to make decisions, including what constitutes evidence, how we assess how good the evidence is, and what we do to identify and account for alternative assumptions, will never be a closed subject. This, too, is a fruitful area both for further research and for continuous improvement in practice.
Conclusion
Combining simulation, design of experiments and stochastic optimization offers a major improvement in our ability to solve problems of risk minimization and of maximizing expected production under uncertainty. Recent developments in software offer an exciting new integrated capability to formulate and solve such problems. While the software development is taking longer than its advocates had hoped, and some challenges in how to use the capability remain, the approach merits considerable attention, as it looks like one of the most promising advances in analytics methods for the coming decade.
Doug Samuelson (samuelsondoug@yahoo.com) is a senior operations research analyst for IBM in Herndon, Va., working on defense-related analyses. He is also president and chief scientist of InfoLogix, Inc., a small R&D and consulting firm in Annandale, Va.
Author’s note:
This article is about the prospects for one analytical approach, based on what the author happens to know about, and is not meant as a comprehensive survey or comparative evaluation of relevant vendors. Anyone with a good story to tell about current or in-progress software and modeling methods to advance this approach is welcome to contact me about it.
References
- John Sall, 2008a, http://blogs.sas.com/jmp/index.php?/archives/110-The-Challenge-of-Optimizing-Products-and-Processes.html
- John Sall, 2008b, http://blogs.sas.com/jmp/index.php?/archives/115-Cooking-Optimization-Should-You-Cook-Hot-and-Fast,-or-Warm-and-Slow.html
- John Sall, 2008c, http://blogs.sas.com/jmp/index.php?/archives/117-Experiments-on-Experiments,-Models-of-Models.html
- www.arenasimulation.com/arena_Home.aspx
- www.crystalballservices.com/Software/OracleCrystalBall.aspx
- www.extendsim.com/index.html
- www.sas.com
- www.simio.com/index.html
- www.solver.com/press200710.htm (Frontline Systems)
