Election tech: Lies, damned lies, and statistics

In the aftermath of the election, pollsters and pundits are scrambling to account for misguided forecasts. What went wrong (and right)? Political analysts weigh in on the role of data in politics.

"There are three kinds of lies: lies, damned lies, and statistics."

  • Benjamin Disraeli

A bitterly cold wind whipped the proud flag around a pole as steely gray as the New Hampshire winter sky. "He can't win," she said on condition of anonymity, "because his big data is nonexistent. He has no ground game. He has no get-out-the-vote operation." As canvassers and candidates trudged through the snow, we chatted in a cozy Manchester diner, anxiously waiting for early primary results to roll in. Republicans had learned from their previous data-driven defeats, and candidates like John Kasich, Ted Cruz, Marco Rubio, and Jeb Bush had all invested heavily in big data, explained the respected DC insider. She was confident that tech was going to win the primary and the presidency.

She was wrong. And so was everyone else.

During the 2016 presidential campaign, on the trail from the early primaries to the debates, the RNC and DNC, and through Election Day, TechRepublic reported on the emerging power of technologies to win elections. Ted Cruz's campaign, powered by information tech created by UK firm Cambridge Analytica, attributed its success in Iowa to microtargeting backed by big data. The Clinton campaign benefited from the remnants of Obama's tech infrastructure and had a robust data and social media operation powered by prominent Silicon Valley technologists. The big winner was sure to be big data.

Even the Trump campaign was prepared for a loss. On Election Day, Trump headquarters in Manhattan was inaccessible, barricaded by a wall of trucks and protesters. As Bloomberg's Sasha Issenberg reported, Trump had a smart social media team but spent little on data, and even less on GOTV. The Republican nominee's grassroots support and social media strength, powered largely by robots and spammers, were also dubious.

The mood was jubilant at Clinton's rally across town. During the day, supporters traded smiles and hugs. Then the grim results rolled in.

"How could the polls be so wrong? How could our own data be so wrong?" Standing in a discreet corner under the blue-tinted glass ceiling of New York's Javits center the Clinton staffer choked up. Superficially, it appeared that technology, the polls, and big data failed the Clinton campaign.

And the campaign wasn't alone. Clinton's brutal loss hurt the polling industry, the data business, and the media. For weeks prior to the election, polls and prognosticators at respected news outlets like FiveThirtyEight and the New York Times all had Clinton ahead. After the election, pundits and the media offered several explanations: overconfidence in the precision of data, inaccurate polling samples, and factors that are hard to measure, like voter sentiment.

"Perhaps people told pollsters one thing, but voted differently. People change their minds over time. It's human," said political scientist and TechRepublic data partner William P. Stodden. "The tl;dr answer to why pollsters got it wrong," was, he explained, "garbage in, garbage out. We can't poll the heart."

And yet, undoubtedly, the 2016 election was the most high-tech campaign in history, and future candidates are likely to invest heavily in big data, social media, artificial intelligence, and other innovations. TechRepublic's Jason Hiner reports that social media played a critical role in engaging voters on both the left and right and helped raise funds for grassroots insurgents like Trump and Bernie Sanders.

In many ways, campaigns are early adopters that blaze tech trails later followed by enterprise companies and SMBs. Firms like L2 Political and Cambridge Analytica provide powerful insight about consumer demographics; Targeted Victory, incubated by the GOP and the Romney campaign, helps companies automate advertising and marketing; liberal-leaning NGP VAN uses mobile devices to mobilize voters and consumers; and NationBuilder helps companies and campaigns better understand social media users.

So why did big data get it wrong (and right) on election night? TechRepublic asked political and technology experts to explain the failures and successes of big data.

"The lesson to learn is that the data didn't fail. Perhaps polling needs to be more complex and more robust. But our data didn't fail, obviously. A lot of people are saying polling is wrong. But we used polling data to help inform our models. We had a robust polling platform that did show a difficult battle. Our polling showed him down or even in battleground states, and the path to victory looked difficult. But what happens if turnout is different than expected? One of the clear outcomes was if the rural vote turned out, and the African American does not, the race gets closer and winnable. Because we were the underdog, we adjusted our model based on those factors. We saw this two weeks to a month out. We saw early votes and absentee trends that suggested an increase in rural voters, especially in Florida, Pennsylvania, and Michigan. We advised [that the Trump campaign] allocate time and resources to those states."

  • Dr. David Wilkinson, lead data scientist at Trump's data firm Cambridge Analytica
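
The "what happens if turnout is different than expected?" question can be illustrated with a small scenario calculation. The sketch below is not Cambridge Analytica's model. The group names, electorate sizes, turnout rates, and support levels are invented purely for illustration, and `vote_share` is a hypothetical helper. It only shows how shifting turnout between groups moves a candidate's overall share even when no individual voter changes preference.

```python
def vote_share(groups, turnout_shift=None):
    """Candidate's share of all votes cast.

    groups maps a group name to (eligible_voters, turnout_rate, support_rate).
    turnout_shift optionally maps a group name to a change in turnout rate.
    """
    turnout_shift = turnout_shift or {}
    votes_for = 0.0
    votes_total = 0.0
    for name, (eligible, turnout, support) in groups.items():
        # Apply the scenario's turnout change, clamped to the range [0, 1].
        adjusted = min(1.0, max(0.0, turnout + turnout_shift.get(name, 0.0)))
        cast = eligible * adjusted
        votes_total += cast
        votes_for += cast * support
    return votes_for / votes_total


# All numbers below are hypothetical, not real 2016 data.
groups = {
    "rural": (2_000_000, 0.55, 0.65),
    "suburban": (4_000_000, 0.60, 0.48),
    "urban": (3_000_000, 0.58, 0.30),
}

baseline = vote_share(groups)
# Scenario: rural turnout rises 6 points, urban turnout falls 4 points.
scenario = vote_share(groups, {"rural": 0.06, "urban": -0.04})

print(f"baseline share: {baseline:.1%}")
print(f"scenario share: {scenario:.1%}")
```

With these made-up inputs the candidate's share rises by a little under one point in the scenario, which is the kind of movement that matters in a state decided by a narrow margin.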

"It's not about the data. It's what you do with it ... did Clinton even have a real GOTV in Michigan, for example?"

  • Joe Trippi, former Howard Dean campaign manager

"The election speaks to the power of brand loyalty. Once you bought into Donald Trump, you were unlikely to be moved. Maybe we didn't fully understand the sentiment of how voters felt. The race was so divided that if anyone said anything positive about either candidate they were attacked [on social media]. Those who were fully onboard with a campaign were not shy, but many went quiet and stopped participating. Data can't predict silence."

  • David Almacy, former White House Internet Director for the George W. Bush administration

"Success has many parents and by next week we're going to be hearing that the Trump data team was perfect and exactly predicted the win and that he had a huge GOTV effort that no one bothered to cover. Everyone who knows, knows neither of those are true but that's the way it goes. What pollsters missed, really, was two things: They missed the Trump vote in rural areas was as big as it is [and] they missed the Trump shift in blue collar suburbs. This is probably a sampling issue. A traditional statewide poll might have a margin of error of 10 points for the rural part of the state because urban and suburban areas take up so much of the sample. I do think there were some 'shy Trump' voters, particularly in more affluent Republican suburbs. It's probably a smaller story than the rural vote or the blue-collar suburban vote, but if you look at the high number of undecideds in late polling and then compare it to the final tallies you either had a really late break for Trump, or some number of voters who said they were undecided but picked Trump ... analytics is going to be critical for cobbling together winning coalitions at all levels."

  • Chris Wilson, CEO Perkins Allen Opinion Research
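
Wilson's point about a 10-point margin of error for the rural part of a state follows from the standard formula for the sampling error of a proportion. A minimal sketch, assuming simple random sampling and a 95% confidence level; the statewide sample size of 800 and the 12% rural share are hypothetical numbers chosen for illustration, not figures from any actual poll:

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


statewide_n = 800    # hypothetical statewide sample size
rural_share = 0.12   # hypothetical fraction of respondents who live in rural areas
rural_n = round(statewide_n * rural_share)

print(f"statewide (n={statewide_n}): +/- {margin_of_error(statewide_n) * 100:.1f} points")
print(f"rural only (n={rural_n}): +/- {margin_of_error(rural_n) * 100:.1f} points")
```

Under these assumptions the statewide estimate is good to roughly 3.5 points, while the rural subsample of 96 respondents carries an error of about 10 points, so a large rural swing can hide inside a poll that looks precise overall.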

About Dan Patterson

Dan is a Senior Writer for TechRepublic. He covers cybersecurity and the intersection of technology, politics and government.
