Complexity, Terror and Murphy’s Law
By Patricia M. McCormick, Greg McNeill, Doug Hendrix and Tom McCormick
In the nine years since the attacks of 9/11, a number of advances have been made in the use of modeling and simulation to examine the threat and to conduct hazards risk analysis in support of Homeland Security. However, the principal, and as yet unmet, challenge to the operations research community is the development of an approach that can support timely, effective, enterprise-wide analysis of all-hazards risk for Homeland Security.
A survey of the modeling and simulation literature reveals numerous tools available to support such efforts. For example, a recent RAND report identified 37 separate tools that could be used to support risk assessment and disaster planning [1]. Sandia National Laboratories has developed and made available a suite of nine risk assessment methodology tools for infrastructure such as dams, power transmission facilities or municipal water systems [2]. Detailed simulation work has been done to examine both the phenomenology of terror attacks or natural disasters and the likely impact of such attacks (see, for example, “Mass Egress and Post-disaster Responses – Homeland Security Institute analyzes the aftermath of catastrophes” [3]).
In almost every case, however, the modeling and simulation efforts have been directed at the examination of a single attack phenomenology or a single facility. There is a distinct lack of any capability to address the enterprise-wide impact of a set of systematic terrorist attacks inside the United States, of natural disasters or, in the worst of all worlds, of a set of opportunistic terrorist attacks during the immediate aftermath of a major natural disaster. Yet this is precisely the evolving type of threat that Secretary Napolitano described in her recent testimony to the U.S. Senate: “The terrorist threat changes quickly, and we have observed important changes in the threat even since this Committee convened a similar hearing last year. The threat is evolving in several ways that make it more difficult for law enforcement or the intelligence community to detect and disrupt plots.
One overarching theme of this evolution is the diversification of the terrorist threat on many levels. These include the sources of the threat, the methods that terrorists use and the targets that they seek to attack” [4].
The Requirement
The need for a modeling and simulation capability to address enterprise-wide risk assessments has been clearly recognized. In his article “Enabling Homeland Security with Modeling and Simulation,” Charles Hutchings, director for Modeling and Simulation in the Tests and Standards Division of the Department of Homeland Security, notes: “Despite studies and reports indicating the value of M&S, DHS does not have an enterprise approach or policy to develop, evaluate and use M&S capabilities for homeland security” [5].
Why is it important to have an enterprise-wide approach? In a word: complexity. A quick review of the appropriate documentation is very revealing. In the National Infrastructure Protection Plan, DHS identified 18 sectors of the national infrastructure as “Critical Infrastructure and Key Resources,” or CIKR [6]. The interdependence of these CIKR sectors is particularly worth noting: each depends on from four to 11 other CIKR sectors in order to function. Certain sectors, such as energy or communications, are vital to almost every other sector. The risk of a cascading failure if attacks or disasters were to impact these sectors simultaneously is clear.
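The cascade risk can be made concrete with a small simulation. The Python sketch below propagates failures across a handful of interdependent sectors; the sector names, the dependency edges and the rule that a sector fails as soon as any sector it depends on fails are illustrative assumptions, not data from the NIPP.

```python
# Minimal sketch of cascading failure across interdependent sectors.
# The sectors, edges and failure rule are illustrative assumptions,
# not the NIPP's actual CIKR dependency data.

# dependencies[s] = the set of sectors that s requires in order to function
dependencies = {
    "energy":         set(),
    "communications": {"energy"},
    "water":          {"energy", "communications"},
    "transportation": {"energy", "communications"},
    "healthcare":     {"energy", "water", "transportation"},
}

def cascade(initial_failures):
    """Return every sector that fails once the initial failures propagate."""
    failed = set(initial_failures)
    changed = True
    while changed:
        changed = False
        for sector, needs in dependencies.items():
            # Pessimistic rule: losing any one dependency takes a sector down.
            if sector not in failed and needs & failed:
                failed.add(sector)
                changed = True
    return failed

print(cascade({"energy"}))  # an energy outage cascades to all five sectors
print(cascade({"water"}))   # a water outage takes down only healthcare
```

Even this toy version shows why outages in the energy and communications sectors dominate the cascade risk.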
More important, however, the effort needed to mitigate the impacts of such attacks or disasters, or to repair post-attack damage, is equally complex. The National Response Framework (NRF), developed by DHS in partnership with state, local, tribal and territorial (SLTT) governments and U.S. industry, defines 15 emergency support functions (ESFs) needed to mitigate damage or restore infrastructure and services following an attack or disaster [7]. The NRF identifies some 47 discrete organizations or groups of organizations that play a vital role in executing these ESFs. Execution of any one ESF can require the cooperation of up to 17 different agencies or organizations, and mitigation or restoration of services for any one CIKR sector requires the invocation of from four to 11 of these ESFs. The sheer number of organizations involved guarantees that Murphy’s Law will impact the effort to keep every player’s situational awareness high enough to contribute rapidly and effectively.
This is where an enterprise-wide modeling and simulation capability becomes critical. Such a capability would allow planners to test and exercise plans before they are implemented. It would enable them to identify likely failure modes and to perform root-cause analysis of cascading failures. In short, it would allow planners to see the error of their ways before the next Katrina.
One Approach
A great deal of work has been done to examine the way in which such a capability could be developed. A number of papers by Sanjay Jain, Charles McLean and Y. Tina Lee have discussed the use of a modeling and simulation federation to instantiate such a capability. Indeed, DHS has developed a program, “Complex Event Modeling, Simulation and Analysis” (CEMSA), to put such a capability in place. The approach recommended by Jain, McLean and Lee is a modular, federated modeling and simulation capability, as shown in Figure 1.
Figure 1: M&S conceptual model per McLean and Jain’s “Components for an Incident Management Simulation and Gaming Framework and Related Developments” [8].
The key points discussed by Jain, McLean and Lee in their seminal papers include:
- No single “monolithic” simulation can encompass all of the relevant scenarios, phenomenology, infrastructure issues and mission requirements that an enterprise-level simulation must be able to support. Modularity is required.
- The modular structure of the simulation should be governed by the type of simulation best suited to each module’s function. A modular simulation could incorporate agent-based modules, discrete event modules, computational models and other capabilities as required. Modularity thus increases flexibility.
- When an enterprise-level simulation is put to operational use, a modular approach significantly enhances the ability to ensure that standards are met and that verification and validation (V&V) can be successfully accomplished.
- In the actual employment of the M&S capability, a modular simulation gives analysts flexibility in the timing and level of detail of their approach. By running modules offline and caching results, or by running modules in parallel, an analyst can save time and cost, both of which are usually critical factors.
- Finally, the use of a modular approach, with well-defined standard interfaces among the modules, allows the relatively rapid and efficient update of simulations as capabilities evolve (see the sketch following this list). This precludes the need to redevelop major portions of simulation code and helps control the cost of simulation maintenance, which can be significant.
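To make the modular idea concrete, here is a minimal Python sketch of a federation that drives heterogeneous modules through a common time loop over a standard interface. The interface, the two stub modules and the shared-state dictionary are hypothetical simplifications, not the design proposed by McLean, Jain and Lee or the CEMSA architecture.

```python
# Minimal sketch of a modular, federated simulation loop. All names and
# behaviors are hypothetical stand-ins for real federated modules.
from abc import ABC, abstractmethod

class Module(ABC):
    """The standard interface every federated module implements."""
    @abstractmethod
    def step(self, t: float, world: dict) -> None:
        """Advance this module to time t, reading/writing shared state."""

class InfrastructureModule(Module):
    def step(self, t, world):
        # stand-in for a discrete event model: the power grid fails at t = 2
        world["grid_ok"] = t < 2.0

class ResponseModule(Module):
    def step(self, t, world):
        # stand-in for an agent-based responder model reacting to the grid
        world["responders_dispatched"] = not world["grid_ok"]

class Federation:
    """Drives all modules through a common clock via the shared state."""
    def __init__(self, modules):
        self.modules = modules
        self.world = {}

    def run(self, t_end, dt=1.0):
        t = 0.0
        while t <= t_end:
            for m in self.modules:
                m.step(t, self.world)  # each module sees the others' outputs
            t += dt
        return self.world

fed = Federation([InfrastructureModule(), ResponseModule()])
print(fed.run(t_end=3.0))  # {'grid_ok': False, 'responders_dispatched': True}
```

Because each module touches the rest of the federation only through this interface, a low-fidelity stub can later be swapped for a higher-fidelity replacement without disturbing the other modules.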
An Example
Earlier this year, the authors of this article were part of a team tasked to develop just such an enterprise-wide modeling and simulation capability for a Department of Defense (DoD) customer. The key element of the effort was a multi-departmental workshop in which representatives from DoD, as well as the Department of Homeland Security (DHS), the Department of Justice (DoJ) and the Department of State (DoS), interacted in a carefully planned counter-terrorism scenario. The purpose of the workshop was to evaluate the operational utility of eight separate candidate architectures for exchanging information among the players in order to forestall a set of four terrorist attacks based on the DHS National Planning Scenarios. The scenario put a premium on the ability of the departments involved to rapidly and effectively exchange mission-critical information.
To support the workshop, the team essentially used the approach described by McLean, Jain and Lee, with the addition of two critical modules – one to simulate information flow in the enterprise and one to track the “state of the world.”
The workshop was based on a complex scenario in which the Federal Enterprise players were required to share information collected outside the United States, within the United States and at locations along the U.S. border, with the express purpose of preventing a series of attacks planned by a hostile force. The scenario consisted of a series of 10 operationally linked vignettes encompassing approximately 30 specific events in which the Federal Enterprise players could interact with varying numbers of the more than 30 hostile players in the scenario.
The team linked this set of events and vignettes by means of an event trace matrix that provided, at the end of each event, a snapshot of the status of each of the five major databases maintained by the Federal Enterprise players. The result was a time-stamped history of the state of knowledge across the entire scenario for each of the eight alternative architecture evaluations. This “state of the world” data supported a Bayes inference approach to the states of knowledge represented in the five departmental databases in the simulation.
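As a rough illustration of such an inference, the toy Python sketch below updates one department’s belief that a plot is active as time-stamped scenario events arrive. The prior and the likelihood values are invented for illustration; the team’s actual Bayes inference engine over the five databases is not reproduced here.

```python
# Toy Bayes update of a "state of knowledge" as scenario events arrive.
# The prior and likelihoods are invented numbers, for illustration only.

def bayes_update(prior, p_obs_given_plot, p_obs_given_no_plot):
    """Posterior P(plot | observation) from a single observation."""
    numerator = p_obs_given_plot * prior
    denominator = numerator + p_obs_given_no_plot * (1.0 - prior)
    return numerator / denominator

belief = 0.01  # prior probability that a hostile plot is active
# each event: (likelihood of seeing it if plot, likelihood if no plot)
events = [(0.6, 0.1), (0.7, 0.3), (0.9, 0.05)]
for p_plot, p_noise in events:
    belief = bayes_update(belief, p_plot, p_noise)
    print(f"belief after event: {belief:.3f}")
```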
The resulting architectural approach as instantiated by the team is shown in Figure 2.
Figure 2: The simulation team enterprise-level simulation architecture.
It is worth noting that, exactly as discussed by McLean, Jain and Lee, the federation contained a variety of simulation types. The telecommunications simulation, for example, is an object-oriented discrete event simulation. The business process simulation is a high-fidelity process model. Both the specialized event and enterprise operations simulations are agent-based, complex adaptive systems models. Finally, as noted above, the “state of the world” simulation is a Bayes inference engine.
Using this approach, the team was able, within a period of just six months, to develop the simulation, conduct the necessary simulation runs to evaluate eight separate alternative information sharing architectures, mine the voluminous data produced by the simulations, and conduct a successful workshop involving four federal departments.
Implementation Lessons Learned
The simulation team learned several valuable lessons. Categorized as operational-, process- or simulation-related lessons, they include:
Operational lessons learned:
- Enterprise-wide assessment: In evaluating the operational utility of the various architectures considered in this study, it quickly became clear that solutions dealing with only one department or domain were not adequate to defeat the hostile plan. Only an enterprise-wide solution focused on the broadest possible information sharing produced a consistently favorable outcome.
- Information standards: Information standards are critical. Information sharing is often directly driven by the ability of one party to effectively use the information provided by another. The extra time consumed in the scenario, due to a sharing process slowed by translation and/or formatting requirements, meant the difference between success and failure for the Federal Enterprise players in several of the architecture cases studied.
- Information sharing: As noted in the National Research Council report, the flow of shared information among the members of the disaster response team is critical. This includes sharing information not only with partners at the federal level but, as noted in the National Response Framework, with the state, local and tribal levels of government as well. Once the standards issue was solved, this was the most significant contributor to the success of the Federal Enterprise players. This bears out the findings of a recent GAO report on government information sharing [9].
Process lessons learned:
- Scenario structure: The development of the scenario is critical to the success of the simulation process. Each of the four planned attacks to be carried out by the hostile force in the overall scenario was based on one of the 15 National Planning Scenarios developed by DHS. This approach made coordination of the scenarios among the workshop participants significantly easier and significantly reduced the need to model attack-related phenomenology (e.g., downwind hazard plumes).
- Scenario richness: The richness of the scenario used (e.g. the number of hostile players portrayed, the number of encounter events) and the level of fidelity as to information collection, transmission, processing and storage led to a geometric increase in the amount of data to be mined in the post-processing effort. However, this richness paid off in the ability to identify trends, strategies and key failure modes.
Simulation lessons learned:
- Coordination: The ability to coordinate on a close and continuing basis among the team members was critical. Early in-process reviews and continuous interchange of information among the team members were vital to avoiding potentially fatal errors.
- Federation: The federated approach to simulation, as described by McLean, Jain and Lee, is immensely effective. It also facilitates the entire verification and validation (V&V) process and the ability to track down problems in various modules.
- Data collection and mining: Data to be mined from the simulations were identified early, in a simulation plan that specified the measures of utility, measures of performance and measures of effectiveness to be used; the sources of data for each, by simulation module; and the analytic approach to be used in their evaluation (a sketch of such a plan follows this list). This facilitated a rigorous analytic effort and significantly simplified the risk analysis portion of the problem.
- Dealing with the unknowns – both known and unknown: “Murphy,” disguised as a wildly cockeyed optimist, eventually made his entrance into the process. Simulation teams should plan for at least one major “ah ha!” event in any such effort. It will happen, and it will take place at the worst possible time. A management reserve of hours and patience is critical to success at that point.
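As an illustration of the data-collection plan referenced above, the sketch below ties each measure to its source module and planned analytic treatment up front; every name and entry is hypothetical rather than drawn from the team’s actual plan.

```python
# Hypothetical skeleton of a simulation data-collection plan: every measure
# is tied, before any runs, to its source module and analytic approach.
from dataclasses import dataclass

@dataclass
class Measure:
    name: str           # what is measured
    kind: str           # "MOU" (utility), "MOP" (performance), "MOE" (effectiveness)
    source_module: str  # which federated module logs the raw data
    analysis: str       # planned analytic treatment

plan = [
    Measure("time_to_share_report", "MOP", "telecom_sim", "distribution by architecture"),
    Measure("plots_disrupted", "MOE", "ops_sim", "comparison across eight architectures"),
    Measure("analyst_workload", "MOU", "business_process_sim", "trend across vignettes"),
]

for m in plan:
    print(f"{m.kind}: {m.name} <- {m.source_module} ({m.analysis})")
```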
Conclusions
The requirement for a federated set of simulations that can support the timely, effective, enterprise-wide analysis needed for Homeland Security can be met with the tools and techniques at hand. With the inclusion of the information flow simulation and the “state of the world” simulation developed by the team, the concept outlined by Jain, McLean and Lee is clearly executable.
Patricia M. McCormick (pmccor@inforead.com), Alpha Informatics, Limited, is a 1982 graduate of USMA. She holds a bachelor’s degree in computer science and a master’s degree (with distinction) in operations research. She led the simulation team and was directly responsible for the information flow simulation. She has more than 25 years’ experience in the application of modeling and simulation to solve complex operational problems, 14 of those years as the president/CEO of Alpha Informatics.
Greg McNeill, ExoAnalytic Solutions, Inc., is a recognized expert in the development and application of agent-based, complex adaptive systems models. He holds a bachelor’s and a master’s degree in physics. McNeill was responsible for the operational simulation capability used by the simulation team. He has more than 35 years’ experience in the application of modeling and simulation to complex issues. He is president of ExoAnalytic Solutions.
Doug Hendrix, ExoAnalytic Solutions, holds a bachelor’s degree, a master’s degree and a Ph.D. in physics. He was responsible for the development and application of the Bayes inference engine used by the simulation team. He has more than 20 years’ experience in the application of modeling and simulation to complex issues. He is chief executive officer of ExoAnalytic Solutions.
Tom McCormick, Alpha Informatics, Limited, holds a bachelor’s degree in history, a master’s degree in international relations/strategic studies, and a master’s degree in systems management. He was responsible for development and instantiation of the operational scenario used by the study team, as well as the planning for and post-processing analysis of data. He has more than 40 years’ experience in the application of modeling and simulation to complex issues. He is a senior analyst for Alpha Informatics, Limited.
References
1. Moore, M., Wermuth, M., et al., “Bridging the Gap: Developing a Tool to Support Local Civilian and Military Disaster Preparedness,” RAND Center for Military Health Policy Research, April 2010.
2. Sandia National Laboratories, “Security Risk Assessment Methodologies,” www.sandia.gov/ram, accessed Sept. 13, 2010.
3. Samuelson, D., “Mass Egress and Post-disaster Responses – Homeland Security Institute analyzes the aftermath of catastrophes,” OR/MS Today, October 2007.
4. Testimony of Secretary of Homeland Security Janet Napolitano to the Senate Committee on Homeland Security and Governmental Affairs, Sept. 22, 2010.
5. Hutchings, C.W., “Enabling Homeland Security with Modeling and Simulation (M&S),” MSIAC Journal, Summer 2010.
6. “National Infrastructure Protection Plan,” Department of Homeland Security, June 2009.
7. “National Response Framework,” Department of Homeland Security, January 2008.
8. Jain, S., McLean, C., and Lee, Y.T., “Towards Standards for Integrated Gaming and Simulation for Incident Management,” NIST, June 2007.
9. GAO-10-41, “Information Sharing: Federal Agencies Are Sharing Border and Terrorism Information with Local and Tribal Law Enforcement Agencies, but Additional Efforts Are Needed,” U.S. Government Accountability Office, December 2009.
