When Jason Atlas joined the online threat-protection firm IID two years ago, he and founders Rod Rasmussen and Lars Harvey could see that their
legacy database “was already falling over.” Atlas, Vice President of Engineering and Technology at IID, noted
that the database “was 180 GB in size, and we were clustering MySQLs and
response times for certain queries. It was becoming untenable. Relational
databases just could not scale to what we needed to do.”

IID aggregates and analyzes widely
sourced threat data to enable the protection of client assets and users. The
firm’s customers include major financial firms, government agencies, leading
e-commerce companies, and social networks and ISPs that use IID’s products to
detect and mitigate threats. IID does this by sharing information,
compiling and comparing data, analyzing it, and then performing actions on it.

With the legacy database, said Atlas, “we weren’t able to do
a lot of the other things we wanted to do down the road. So we had to start
looking at alternative technologies to solve these use cases. How do you store
enormous quantities of variant types of data? How do you then build a system
that allows all this different type of information to be readily retrievable,
and analyze it, and also being readily queryable, sortable, and viewable, as
well as obviously being highly secured?”

IID also found that “we’re sending data and information out
to organizations, and none of them are really talking to each other. There are
threats out there, but there wasn’t an enormous amount of coordination and data
sharing in the systems.”

“Rod and Lars,” said Atlas, “came to the realization
that they needed to build a big data system that allowed all of our customers
to share information and analytics with each other.”

DataStax Enterprise: Open source technology with full vendor support

“The first thing I did in order to test our technologies,”
explained Atlas, “was to take the biggest problem that we have—our
Internet protocol (IP) database, and add a zero to the amount of data, because
if we’re going to solve this problem then let’s solve it at a real scale, for
the future.”

“Using real-world examples of how people are querying our IP
data and accessing it, we removed it from MySQL and tested it on three
different NoSQL platforms.” Atlas stated, “Cassandra from DataStax blew them away. It met all of our requirements, it was
easy to work with, and we were able to get it up and running in an Amazon Web
Services (AWS) instance.”

“We are in the process of designing a massive-scale database,
up to a petabyte of information, that we are building our own dedicated data
center for,” said Atlas. He added that “Cassandra provided by
DataStax is definitively our key value store of choice.”

“I have not found anything that comes close to the linear
scale capabilities and the rewrite capabilities that Cassandra has,” said
Atlas. “And I’ve tested a lot of databases. Just as a benchmark: It
is three times faster than Microsoft Cosmos. I built a data intelligence
platform on Cosmos three years ago, and I can tell you that Cassandra is
actually, literally three times faster.”

“We realized we had to know how to index our database
within the NoSQL store,” said Atlas. “Solr
is also an Apache component, which we’re currently looking at. And we also need
a Hadoop MapReduce computational environment, a compute environment, again
for analytics, for crunching, for associations, computation, for queries of
that nature.”

“I was terrified of going to market with all of these new
technologies that didn’t have a vendor behind them to give me 24/7
support,” said Atlas. “The fact of the matter was I needed all these
technologies, and DataStax bundled them all for me.”

“In addition, they gave me this awesome ops console that allows me to monitor and measure things that basically
is the equivalent of a business activity monitor, including workflow elements
and the Cassandra store. And I have the support,” added Atlas, “but I
am not vendor locked, because I am sitting on open source systems.”

“I now have all the benefits of open source, with the support
and benefits of an actual vendor,” said Atlas. “In my mind I get the
best of both worlds, and I have the technology that really maps to my use
cases. I am actually quite pleased, to be perfectly honest.”

Results with DataStax

Quick project completion and time to revenue

“We migrated our entire product platform onto AWS in three
months,” said Atlas. “Our IP service is now a cloud service running
in a secured environment and is running on Cassandra as a full-blown product.
It is a ‘pay for’ product as we speak.”

“The new platform has been designed around DataStax
as our core engine,” added Atlas. “It’s what I call our ‘black box’
system. The data goes in and magic happens after that. And then the APIs go and
retrieve data and we present it to our customers.”

“We just shipped a demo. It’s a working functional
prototype,” said Atlas, “and it is being presented at RSA as we speak.”

Three core elements with cost savings

Atlas remarked, “the fact that they integrated three core
elements with an ops console on top of it that allows us to monitor and measure
was enormous.”

He explained, “I have three technology pieces in this stack:
1) computation, with MapReduce through Hadoop; 2) search and indexing, with
Apache Solr; 3) and store and security elements with Cassandra from

“And I have them all from the same vendor, integrated with
dev and 24/7 support, for one-fifth the price that I would pay for a relational

Top-level security at the atomic level

“One of the things that was key to us, that Cassandra
affords, is what I call atomic level auditing and governance in our system,” said Atlas. “We’re dealing with security companies, a lot of these places are audited.
They have to make sure that the information they distribute is only touched by
the right people, and only seen by the right people.”

Using DataStax Enterprise “we were able to audit at a low
level within Cassandra, essentially every single transaction, every time
somebody accesses somebody else’s data or they access their own data.”

“This is a very high level of making sure there is no
repudiation, making sure there’s no penetration of the systems,” said
Atlas. “So we really have built an incredibly secure ecosystem and we
would not have been able to use systems besides DataStax Enterprise, because we
couldn’t secure them and they would not allow that level of atomicity.”

Developer resources included

“NoSQL is a relatively new programming methodology. People
still think in relational terms,” explained Atlas. “They don’t
necessarily know how to work with a NoSQL system correctly.”

DataStax provides “the developer resources, the developer training programs, and onboarding programs
to get your developers up to speed. So that benefit was huge.”

The bottom line: Value and security

“Customers and enterprises are already very reluctant and
scared to share data,” said Atlas. “You need to give serious value,
and you need to make darn sure you’re protecting and treating their data, and
caring about it as much as they do.”

“And that’s not just a marketing pitch,” added Atlas,
“that’s how we designed the system, and Cassandra and DataStax are a core
part of that.”