Election analytics

University of Illinois research group uses advanced analytics to forecast the results of national elections.

By Jason J. Sauppe and Sheldon H. Jacobson

The weeks leading up to Election Day are fraught with uncertainty as each candidate attempts to put their best foot forward, distinguishing themselves from the opposing candidate. The hysteria of negative ads, false accusations and uncertain voter turnout makes the race to Nov. 6 an exercise in anxiety. Amidst all this activity are pollsters collecting data from likely voters or registered voters, seeking to foretell the winds of opinion to determine the next president of the United States.  

In 2008, the Election Analytics research group in the Department of Computer Science at the University of Illinois introduced the Web site, election08.cs.illinois.edu, as a tool to meld together all this polling data and come up with accurate forecasts for not only who will win the White House, but how many Electoral College votes each candidate will garner. National polling figures, although of interest to the media, provide limited value in assessing which candidate will win the presidential election. How the support for each candidate is distributed across the states will determine the ultimate winner. Assessing such information requires advanced analytics to blend the so-called battleground states together into a cohesive pathway for each candidate’s road to victory.
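To make the step from state-level support to an Electoral College total concrete, here is a minimal sketch, not necessarily the research group's actual model. It assumes each state is won independently with a known probability, and it builds the full distribution of electoral votes one state at a time. The function names, the three battleground entries, their probabilities and the count of "safe" votes are all hypothetical values chosen for illustration; the only fixed fact is that 270 electoral votes are needed to win.

```python
def electoral_vote_distribution(states):
    """Return dist, where dist[v] is the probability of winning exactly v votes.

    states is a list of (electoral_votes, p_win) pairs. States are treated
    as independent, which is a simplifying assumption of this sketch.
    """
    total_votes = sum(votes for votes, _ in states)
    dist = [0.0] * (total_votes + 1)
    dist[0] = 1.0  # before any state is counted, the tally is 0 for certain
    for votes, p_win in states:
        updated = [0.0] * (total_votes + 1)
        for tally, prob in enumerate(dist):
            if prob == 0.0:
                continue
            updated[tally] += prob * (1.0 - p_win)   # candidate loses the state
            updated[tally + votes] += prob * p_win   # candidate wins the state
        dist = updated
    return dist


def probability_of_reaching(dist, votes_needed):
    """Probability of winning at least votes_needed votes."""
    return sum(dist[max(votes_needed, 0):])


# Hypothetical inputs: three contested states and 247 votes assumed safe.
battlegrounds = [(29, 0.5), (18, 0.6), (13, 0.7)]
safe_votes = 247
dist = electoral_vote_distribution(battlegrounds)
p_win = probability_of_reaching(dist, 270 - safe_votes)
print(f"{p_win:.3f}")  # prints 0.710
```

In this toy case the candidate needs 23 more votes, which happens either by winning the 29-vote state or by winning both of the other two. The same table also answers how many electoral votes a candidate is likely to garner, not just who wins.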

The popularity of the 2008 site (it attracted more than 80,000 visitors during the eight weeks leading up to the election) led to the creation of the 2012 site, electionanalytics.cs.illinois.edu. Enhancements to the new site included a more broadly appealing design, an easy-to-follow trend chart, the opportunity to follow the Web site using social media (Twitter and Facebook), and forecasts (updated daily) of both who will win the White House and who will gain control of the United States Senate.

President Barack Obama and Gov. Romney

Additional improvements were made to the computational components used in the model. In particular, parallelization was employed to speed up computing the probabilities of each candidate winning a state (or a Senate seat). This, coupled with hardware improvements, resulted in a reduction in computational time from approximately one hour (in 2008) to just over a minute. An enthusiastic team of undergraduate students spearheaded the redesign of the Web site and was responsible for the day-to-day updates and social media communications.
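The per-state calculations lend themselves to parallelization because one state's probability does not depend on another's. The sketch below illustrates that pattern only; it is not the group's code. It assumes a simple stand-in calculation (Monte Carlo draws from a Beta posterior on a candidate's two-party share, given one poll) and uses Python's standard `concurrent.futures` module. All function names and the poll format are hypothetical.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def state_win_probability(supporters, respondents, draws=20000, seed=0):
    """Estimate P(true two-party share > 0.5) from a single poll.

    Stand-in model for this sketch: a uniform prior updated by the poll,
    giving a Beta(1 + supporters, 1 + respondents - supporters) posterior.
    """
    rng = random.Random(seed)
    alpha = 1 + supporters
    beta = 1 + respondents - supporters
    wins = sum(rng.betavariate(alpha, beta) > 0.5 for _ in range(draws))
    return wins / draws


def _run_job(job):
    # Top-level helper so worker processes can pickle it.
    return state_win_probability(*job)


def all_state_probabilities(polls, executor=None, draws=20000):
    """polls maps state name -> (supporters, respondents).

    With executor=None the states are processed one after another;
    otherwise the jobs are spread across the executor's workers.
    """
    names = sorted(polls)
    jobs = [(polls[name][0], polls[name][1], draws, seed)
            for seed, name in enumerate(names)]
    mapper = executor.map if executor is not None else map
    return dict(zip(names, mapper(_run_job, jobs)))


def run_in_parallel(polls, draws=20000):
    """Same result as the serial path, computed by a pool of processes."""
    with ProcessPoolExecutor() as pool:
        return all_state_probabilities(polls, executor=pool, draws=draws)
```

Because each job carries its own seed, the serial and parallel paths return identical numbers, which makes the speedup easy to verify. When running this as a script, call `run_in_parallel` from inside an `if __name__ == "__main__":` guard so that worker processes can start cleanly on every platform.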

The 2012 Web site was officially launched in June 2012, though interest began to pick up significantly in late August (around the time of the Republican National Convention). A number of media sources mentioned the Web site, including The Daily Illini, the student newspaper at the University of Illinois.

Initial forecasts in June showed Obama leading with (approximately) a 99 percent chance of winning. This was due to polls indicating Colorado, Michigan, Pennsylvania, Virginia and Wisconsin all leaning toward Obama, with Florida, Iowa, North Carolina and Ohio being close. Since Romney’s electoral path to victory requires wins in many of these highly contested battleground states, the model has generally given him a low overall probability of winning.

As the summer progressed, Obama’s odds remained high due to favorable polling in almost all of the battleground states. The selection of Paul Ryan as the Republican vice presidential candidate on Aug. 11 led to a slight shift in polling, with Romney’s odds of winning reaching 8.5 percent by Aug. 21 (his odds were just over 22 percent in the “strong Republican” swing scenario). Polls leaned back toward Obama over the next several days, but shifted slightly toward Romney again during and after the Republican National Convention, which ended Aug. 30. However, no significant bounce in the state-by-state polls was observed for either candidate after the conventions, with Obama’s probability of winning staying between 97 percent and 99 percent. As of this writing (Sept. 12), Obama is predicted to win with 98.2 percent probability. More up-to-date forecasts can be found at electionanalytics.cs.illinois.edu/election12/.

The Senate site was launched on July 20, with initial forecasts showing a very close battle for control over the Senate: ≈40 percent chance of a Republican majority, ≈25 percent chance of a Democratic majority, and ≈35 percent chance of a tie (both parties securing exactly 50 seats). Close races were predicted in Florida, Massachusetts, Montana, Virginia and Wisconsin.
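A three-way split like this (Republican majority, Democratic majority, tie) can be derived from seat-by-seat probabilities. The following is a rough sketch under stated assumptions rather than the site's actual method: races are treated as independent, every seat goes to one of the two parties, and a tie means exactly 50 seats each. The function names, the safe-seat count and the race probabilities are hypothetical.

```python
def seat_count_distribution(p_wins):
    """dist[k] = probability that exactly k of the contested races are won."""
    dist = [1.0]
    for p in p_wins:
        updated = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            updated[k] += prob * (1.0 - p)   # race lost
            updated[k + 1] += prob * p       # race won
        dist = updated
    return dist


def senate_outcomes(rep_safe, p_rep_wins, total_seats=100):
    """Probabilities of a Republican majority, a Democratic majority, or a tie.

    rep_safe is the number of seats assumed certain for Republicans;
    p_rep_wins lists the Republican win probability in each contested race.
    Every remaining seat is assumed to go to the Democrats.
    """
    outcomes = {"republican_majority": 0.0,
                "democratic_majority": 0.0,
                "tie": 0.0}
    half = total_seats / 2
    for k, prob in enumerate(seat_count_distribution(p_rep_wins)):
        rep_seats = rep_safe + k
        if rep_seats > half:
            outcomes["republican_majority"] += prob
        elif rep_seats < half:
            outcomes["democratic_majority"] += prob
        else:
            outcomes["tie"] += prob
    return outcomes


# Hypothetical scenario: 47 safe Republican seats and five toss-up races.
result = senate_outcomes(rep_safe=47, p_rep_wins=[0.5] * 5)
for outcome, prob in result.items():
    print(f"{outcome}: {prob:.4f}")
```

With five even races, Republicans need at least four wins for a majority and exactly three for a tie, so even a handful of close contests can leave all three outcomes plausible at once.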

Republicans were favored to control the Senate during July, but polls shifted in favor of the Democrats during the first several weeks of August. Polls shifted back toward the Republicans at the end of August, despite an expected safe seat in Missouri switching parties. As of this writing (Sept. 12), the battle for the Senate is still close: ≈35 percent chance of a Republican majority, ≈33 percent chance of a Democratic majority, and ≈32 percent chance of a tie. More up-to-date forecasts can be found at electionanalytics.cs.illinois.edu/senate12/.

The Election Analytics Web site is an example of “analytics in action.” It provides a group of students with the opportunity to apply operations research methodologies to an issue of broad national interest. It supports the value of science, technology, engineering and mathematics (as well as analytics) in addressing a real-world problem. Election forecasting is not a textbook, toy problem. With millions of people interested in the outcomes of the election, election analytics highlights how operations research can, and does, make a difference.

Jason J. Sauppe is a Ph.D. candidate in the Department of Computer Science at the University of Illinois.

Sheldon H. Jacobson is a professor in the Department of Computer Science at the University of Illinois.

Acknowledgments

The authors would like to thank Steve Rigdon (St. Louis University) and Edward C. Sewell (Southern Illinois University Edwardsville) for their contributions to the research and the code used in the Web site. The authors also wish to thank Angela Ding, Taylor Fairbank, Bhargava Manja, Calvin Shipplett and Dimitriy Zavelevich, the students working on the project, for their contributions to help make this venture a success. Additionally, the authors would like to thank Joshua Gluck (Swarthmore College) for parallelizing the code used to compute the probabilities in the forecasts, and the Technology Services Group at the University of Illinois for hosting the Web site and providing technical support for the effort.