Updated 3/18/2018: On March 17, 2018, Facebook announced that Cambridge Analytica was suspended for violating the company’s standards and practices. TechRepublic’s Dan Patterson spoke with CBS News about how Cambridge Analytica could have harvested so many Facebook profiles. The interview has been added to this article.
“Nonsense,” laughed a former senior Trumpworld technology staffer. “Look, I’m not saying they lied but for Cambridge Analytica to run victory laps and claim they won the election for Trump is a huge exaggeration. Data can do a lot of things, but there’s a limit to how effective it is. Cambridge Analytica’s claims went far beyond that limit.”
The comments echo sentiment shared by a handful of GOP operatives, former Trump campaign workers, and current White House digital staffers that big data analytics firm Cambridge Analytica exaggerated its role in the campaign. Moreover, many insiders claim the company’s use of psychographic voter data was vastly overstated. “It’s all well and good to [employ] engineers,” said one operative, “but what was the real value proposition? By the end of the campaign they were doing very little.”
Tweets by White House chief digital officer Gerrit Lansing and by Gary Coby, President Trump’s former Director of Digital Advertising & Fundraising and current GOP Director of Advertising, bluntly called the company’s big data analysis a “total lie.”
In interviews with TechRepublic during the 2016 presidential campaign, Cambridge Analytica claimed to use a database of 240 million North American consumer records (a mixture of voter data, social media, and surveys) to forecast voter behavior. CEO Alexander Nix claimed, “we use nearly 5 thousand different data points about you to craft and target a message.”
“We are fundamentally politically agnostic and an apolitical organization,” Nix said. “The high volume of Republican primary candidates this cycle allowed us to enter a competitive market… Starting with politics, we’d like to replace blanket advertising with individualised targeted and engagement ads.”
The day after Trump’s victory, the company’s lead data scientist, David Wilkinson, told TechRepublic, “our data didn’t fail, obviously. A lot of people are saying polling is wrong. But we used polling data to help inform our models. We had a robust polling platform that did show a difficult battle… Because we were the underdog, we adjusted our model based on those factors. We saw this two weeks to a month out. We saw early votes and absentee trends that suggested an increase in rural voters, especially in Florida, Pennsylvania, and Michigan. We advised [that the Trump campaign] allocate time and resources to those states.”
After Cambridge Analytica came under fire, we asked the firm for comment. During a half-hour phone meeting, a spokesperson for Cambridge Analytica declined to be identified but articulated some technical components of the company’s model. Regarding allegations of exaggeration, the spokesperson stated, “we weren’t offended by [the Tweets]. We employ serious data scientists, and we do serious work. The [conservative] tech and data communities are relatively small and we have a good working relationship [with other actors]. But we’re somewhat new to the market. We’re not insiders, and our work is sometimes misunderstood.”
There is a nuanced middle ground between the utility of big data and the outrage over allegations of exaggerated claims, said Zack Christenson, founder and CEO of data marketing platform Crowdskout. As a former journalist in Chicago and Washington, D.C., Christenson understands the intricacies of politics, media, and data. “I think most people are seeing the forest and missing the trees,” Christenson said, regarding the big data revolution. “A lot of groups are really excited at the prospect of being able to consume 5,000 different data points on a person and create complex models using a person’s favorite color or what magazines they subscribed to last month. But this over-proliferation of data is the noise.”
It’s very hard to cut out the data points that really don’t matter, Christenson said, and figure out the few important data points that actually have real bearing on your goals. “That’s not to say that those 5,000 data points aren’t valuable,” he said, “but they’re not the silver bullet many people think.”
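One simple way to look for the few data points that matter is to score every feature against the outcome you care about and rank the results. The sketch below is illustrative only and is not Crowdskout's or Cambridge Analytica's method. It uses synthetic data, and the feature names (`voted_last_election`, `noise_0` and so on) and the `rank_features` helper are hypothetical. It ranks features by the absolute value of their Pearson correlation with an outcome.

```python
import random
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no signal
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def rank_features(rows, outcome):
    """Return (feature, |correlation|) pairs, strongest first.

    rows is a list of dicts mapping feature name to a numeric value;
    outcome is a list of numbers aligned with rows.
    """
    names = list(rows[0].keys())
    scores = {
        name: abs(pearson([row[name] for row in rows], outcome))
        for name in names
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic example: 20 irrelevant features and one that drives the outcome.
rng = random.Random(0)
rows, outcome = [], []
for _ in range(500):
    row = {f"noise_{i}": rng.random() for i in range(20)}
    row["voted_last_election"] = rng.randint(0, 1)  # hypothetical feature
    rows.append(row)
    # The outcome depends almost entirely on the one meaningful feature.
    outcome.append(row["voted_last_election"] + 0.1 * rng.random())

top = rank_features(rows, outcome)[:3]
for name, score in top:
    print(f"{name}: {score:.3f}")
```

Correlation only catches linear, one-feature-at-a-time relationships, so a real pipeline would likely pair a ranking like this with model-based selection and held-out validation.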
With big data science, process is important. “[Data] is like math,” Christenson said. “Each component is a building block, and it’s important to show your work.” To that end, Christenson explained the process at Crowdskout, his tech stack, and the future of data in politics.
What did you do and how did you do it?
The most important thing we did was put together an incredible team. We’d be nowhere if we didn’t hire the smartest and most talented people we could find. We hired the right people, pointed everyone in the right and same direction, and we were off to the races. It’s important to trust the team you’ve assembled and let them do their thing. We’re attempting to solve really challenging problems, so allowing everyone on the team to have the freedom to explore and solve those problems creates a scenario where we’re actually able to solve those problems and, I hope, creates a fulfilling work environment in which talented people want to work.
What is your tech stack and how does big data inform your decision-making process?
We tried to find the Goldilocks tech stack: new and cutting-edge enough to attract talent who wanted to work with emerging languages and tech, built on technology with a proven track record, and not so obscure that we couldn’t find anyone who knew the right languages and stack. Our front end is all AngularJS, and our back end is mainly PHP with some Python. We run on MongoDB, MySQL, and Elasticsearch on AWS. We knew we had to support big data out of the gate, so all of our technical decisions are made knowing we have to scale. That has spared us any major technical swap due to load or performance issues and lets us focus on serving our customers and building new features.
What worked, and what didn’t work?
Coming in, we benchmarked our performance against some of our competitors, but as we started to get our software in the hands of users we learned that the speed of pulling large lists was a major pain point and an opportunity for us to differentiate ourselves in the market. At that point we incorporated Elasticsearch into our database stack, which allowed us to do in under a second what can take other software minutes or hours. We’re proud of that. Since then Elasticsearch has been a workhorse in our tech stack, and we’ve been able to leverage its wide array of features to support some of our newer data visualization features and upcoming mapping functionality.
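Elasticsearch is built on Apache Lucene, which answers queries from an inverted index instead of scanning every record. The toy sketch below shows why that helps when pulling a filtered list. It is not Crowdskout's code and does not use the Elasticsearch API. The `InvertedIndex` class, the `linear_scan` helper, and the voter-style fields are hypothetical and exist only for illustration.

```python
from collections import defaultdict


class InvertedIndex:
    """Maps each (field, value) pair to the set of record ids that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, record_id, record):
        for field, value in record.items():
            self._postings[(field, value)].add(record_id)

    def search(self, **filters):
        """Return ids of records matching every field=value filter."""
        sets = [self._postings.get(item, set()) for item in filters.items()]
        if not sets:
            return set()
        sets.sort(key=len)  # start from the smallest posting set
        result = set(sets[0])
        for other in sets[1:]:
            result &= other
        return result


def linear_scan(records, **filters):
    """Baseline: check every record against every filter."""
    return {
        rid
        for rid, rec in records.items()
        if all(rec.get(f) == v for f, v in filters.items())
    }


# Hypothetical sample records.
records = {
    1: {"state": "FL", "party": "R", "voted_2016": True},
    2: {"state": "FL", "party": "D", "voted_2016": True},
    3: {"state": "PA", "party": "R", "voted_2016": False},
    4: {"state": "FL", "party": "R", "voted_2016": False},
}

index = InvertedIndex()
for rid, rec in records.items():
    index.add(rid, rec)

matches = index.search(state="FL", party="R")
print(sorted(matches))  # [1, 4]
```

The scan touches every record on every query, while the index does its work once at write time and then only intersects the sets of ids that already match each filter. A production search engine adds compression, sharding, and relevance scoring on top of the same basic idea.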
Could a small business or startup successfully replicate your process?
Of course. I think ideas are worth very little–if you can’t execute on your idea then it doesn’t matter. We’ve worked hard and we’ve built what we think is a pretty solid product that creates a lot of value for our customers. It certainly wasn’t easy, but the thing that I think sets us apart from other companies is we work really hard at getting things right and following through on what needs to be done–so if you can do that, you can definitely do what we’ve done.
One of the things that I think is often overlooked when building a company like this is how essential great customer service is. We built out a strong customer success team that’s focused 100% on maximizing our users’ experience–they’re real people that our customers can have real relationships with. That’s a major differentiating factor between what you get with us and some other tools out there.
What does the future of big data for campaigns, grassroots organizations, and companies look like?
Software aimed at the corporate and retail world is much further along than what you see in the political, non-profit and advocacy world. I think the political software industry is just getting started. Up to this point, you’ve seen two things–either large presidential-sized campaigns building in-house tools that get mothballed after a campaign, or data consultants doing customized work and shoehorning things into existing software built for different industries. There are a few software companies here and there building software specifically for this world–Crowdskout, NGP VAN, or NationBuilder–but I think we’ll see a commoditization of this type of work as more and more software is built and offered to campaigns and grassroots groups in a SaaS model. Things will start to get easier and more and more tools will become available to smaller groups with smaller budgets–and more people will become familiar with how this work actually gets done.