The results of the recent presidential election surprised most pollsters. Even though polling data indicated a tight race in the final weeks of the campaign, almost every major pollster predicted a Clinton win. Media outlets like the New York Times and Washington Post published aggregated polls that predicted a Clinton victory. Even Nate Silver’s FiveThirtyEight site, which was among the most pessimistic about Clinton, pegged her odds of winning at north of 70%.
Other pollsters were even more bullish: Sam Wang’s Princeton Election Consortium gave Clinton a 99% chance of victory. (And Wang kept his promise to eat a bug on live TV if Trump amassed more than 240 Electoral College votes.) In the post-mortems, some have claimed that the only predictions that turned out to be correct were those based on factors other than polling data, such as the likelihood of an out-party candidate succeeding an in-party president, economic indicators and right track-wrong track sentiment.
So, should the takeaway be that historical inferences are more accurate than data? Does the fact that the aggregated polls failed to predict the winner mean we shouldn’t rely on polls in the future? Before answering these questions, it’s important to acknowledge that since the race was so tight, the polls weren’t off by much. In fact, some individual polls, such as the USC-LA Times poll, got it right — that poll consistently showed Trump with a narrow lead in the final weeks.
According to its architect, the USC-LA Times poll got it right by considering data points other polls didn’t factor in: The survey asked potential voters to rate their likelihood of voting for each of the candidates and also their likelihood of voting at all. That’s a departure from the methods used by traditional “likely voter” polls, which make respondents choose between the candidates even if they have not yet made up their minds, and which discard data from potential voters who don’t meet a “likely voter” standard.
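To make the difference between the two approaches concrete, here is a minimal Python sketch. It is not the USC-LA Times poll’s actual procedure, which the description above does not spell out. The `Respondent` fields, the 0.7 cutoff and the five sample respondents are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    """One hypothetical survey answer; every probability is on a 0..1 scale."""
    p_vote: float  # self-rated chance of voting at all
    p_a: float     # self-rated chance of voting for candidate A
    p_b: float     # self-rated chance of voting for candidate B


def probabilistic_share(respondents):
    """Keep every respondent, weighting each preference by the chance of voting."""
    expected_turnout = sum(r.p_vote for r in respondents)
    if expected_turnout == 0:
        raise ValueError("no respondent reports any chance of voting")
    share_a = sum(r.p_vote * r.p_a for r in respondents) / expected_turnout
    share_b = sum(r.p_vote * r.p_b for r in respondents) / expected_turnout
    return share_a, share_b


def likely_voter_share(respondents, cutoff=0.7):
    """Drop respondents below the cutoff, then count one forced choice each.

    The cutoff value is an assumption made for this sketch.
    """
    kept = [r for r in respondents if r.p_vote >= cutoff]
    votes_a = sum(1 for r in kept if r.p_a > r.p_b)
    votes_b = sum(1 for r in kept if r.p_b > r.p_a)
    decided = votes_a + votes_b
    if decided == 0:
        raise ValueError("no respondent passes the screen with a preference")
    return votes_a / decided, votes_b / decided


# Invented data: three committed voters lean slightly toward A, while two
# wavering voters lean strongly toward B but may not turn out.
sample = [
    Respondent(p_vote=0.9, p_a=0.55, p_b=0.45),
    Respondent(p_vote=0.9, p_a=0.55, p_b=0.45),
    Respondent(p_vote=0.8, p_a=0.60, p_b=0.40),
    Respondent(p_vote=0.5, p_a=0.10, p_b=0.90),
    Respondent(p_vote=0.6, p_a=0.20, p_b=0.80),
]

screened = likely_voter_share(sample)
weighted = probabilistic_share(sample)
print("likely-voter screen:", screened)
print("probability-weighted:", weighted)
```

With this made-up sample, the screen discards both wavering respondents and shows candidate A ahead, while the probability-weighted estimate keeps them and shows candidate B ahead. That is the kind of gap the poll’s architect describes, though real polls also apply demographic weighting that this sketch leaves out.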
As the LA Times article notes, the poll’s methodology enabled the pollster to capture the uncertainty of many voters. Subsequent exit polling indicates that late deciders swung decisively for Trump, especially in the key battleground states of Florida, Michigan, Pennsylvania and Wisconsin. The accuracy of this poll may suggest that expanding the dataset to include wavering voters is a better way to call a very tight race.
That points to an untapped pool of data that the other pollsters missed. They may even have had data on hand about voters who weren’t sure which candidate they would support, but it wasn’t integrated with the “likely voter” data. This scenario mirrors the situation businesses often find themselves in when the data they collect resides in silos and they fail to unlock its potential by integrating it with other information.
It’s also important to keep in mind that pollsters are still struggling to adjust to a communication landscape that has been transformed by digital media. Several years ago, pollsters were caught short by relying on surveys conducted via landlines, which caused them to oversample older voters since many younger people rely on mobile phones exclusively.
An executive at the Pew Research Center noted that researchers are devising ways to incorporate data from social media and other public sentiment indicators, but the new methodologies are still in their infancy. Such an effort is a good way to break down information silos. The data is out there, including poll-generated data. But it will take a strategy that acknowledges a broad spectrum of data inputs and the ability to integrate and manage them effectively to generate true insight and improve polling’s predictive powers.
The pollsters’ failure to accurately predict the outcome of the 2016 election will likely be the subject of much industry introspection and will undoubtedly inspire study and analysis for years to come. The takeaway for businesses that rely on big data is this: breaking down silos and integrating and managing data across a variety of sources and formats is critical to gaining accurate insight.
While the outcome of the election was unexpected, the results show that accurate, integrated data is crucial for a variety of business purposes. At Liaison, we know that the future will be data-inspired. The volume of data created will continue to grow exponentially, so being able to access the necessary information and draw trustworthy, actionable conclusions from it is vital to operating a successful business in a data-centric world.