Cloudlytics is a log analytics product from Amazon consulting partner BlazeClan Technologies. Cloudlytics analyzes logs for Amazon Elastic Load Balancing (ELB), CloudFront CDN download and streaming, and Simple Storage Service (S3). Cloudlytics is Software as a Service (SaaS) — customers use a web dashboard to view reports based on attributes like geography, client technology, or time.
Gurmeet Singh, founder of BlazeClan, described why his company built Cloudlytics, how it works, and what his customers — companies like the Finnish games company Supercell and US digital marketing company HubSpot — get out of it.
The need for Cloudlytics
BlazeClan produced Cloudlytics in response to the needs of one of its initial customers, a NASDAQ-listed, billion-dollar enterprise with a widely downloaded 3D modeling product. Singh said, "They see over a million downloads across the year. Talking to this customer, we understood the requirement for log analytics because they wanted to process these logs for their downloads and understand the customer behaviour." Cloudlytics came too late for that customer. "At the time they had a need, but we did not have a product. They built their own in-house infrastructure."
BlazeClan went on to build Cloudlytics. The company reasoned that, because building log-processing infrastructure in-house is time-consuming and running it is expensive, there was a global need for a third-party service that processes Amazon logs.
Customer activity creates the logs
AWS delivers log files to its customers with a maximum latency of 24 hours. Singh said the volume of logs to be processed each day "varies from customer to customer. As you can see from our pricing page, most of our customers are in the range of processing up to 20 GB of logs. However, we have a few customers who have requirements to process more than 20 GB of logs. We have processed up to 150 GB of logs for one customer at one time."
Cloudlytics processes the logs
As you might expect from an Amazon partner, BlazeClan built Cloudlytics on Amazon infrastructure. Singh said the first step is processing. "From S3, what we do is we run a Hadoop cluster, which uses AWS Elastic MapReduce (EMR). We feed in all the log files from a client bucket, and we start processing it. We analyse IP and get details like which country it is. We also find out details like what operating system, what browser is used." The second step is storage. "After processing we store this data directly into Redshift — that is a data warehouse."
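The enrichment step Singh describes (IP address to country, user agent to operating system and browser) can be illustrated with a minimal Python sketch. This is not BlazeClan's code: the function names, the record layout, and the tiny network-to-country table are all assumptions made for illustration, and a real system would use a proper GeoIP database and a fuller user-agent parser.

```python
import ipaddress
from typing import Tuple

# Hypothetical lookup table using documentation-only address ranges.
# A production system would query a GeoIP database instead.
COUNTRY_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "FI",
    ipaddress.ip_network("198.51.100.0/24"): "US",
}


def country_for(ip: str) -> str:
    """Return the country code for an IP address, or "unknown"."""
    addr = ipaddress.ip_address(ip)
    for network, country in COUNTRY_BY_NETWORK.items():
        if addr in network:
            return country
    return "unknown"


def client_technology(user_agent: str) -> Tuple[str, str]:
    """Guess (browser, operating system) from a user-agent string."""
    ua = user_agent.lower()
    # Order matters: Chrome user agents also contain "safari".
    if "firefox" in ua:
        browser = "Firefox"
    elif "chrome" in ua:
        browser = "Chrome"
    elif "safari" in ua:
        browser = "Safari"
    else:
        browser = "other"
    # Order matters: Android user agents also contain "linux",
    # and iPhone/iPad user agents contain "mac os x".
    if "windows" in ua:
        os_name = "Windows"
    elif "android" in ua:
        os_name = "Android"
    elif "iphone" in ua or "ipad" in ua:
        os_name = "iOS"
    elif "mac os x" in ua:
        os_name = "macOS"
    elif "linux" in ua:
        os_name = "Linux"
    else:
        os_name = "other"
    return browser, os_name


def enrich(record: dict) -> dict:
    """Add country, browser, and os fields to one parsed log record."""
    browser, os_name = client_technology(record["user_agent"])
    return {
        **record,
        "country": country_for(record["ip"]),
        "browser": browser,
        "os": os_name,
    }
```

In a Hadoop job, a function like `enrich` would run in the map phase over every parsed log line before the results are loaded into the warehouse.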
How Cloudlytics scales
Singh described how the Cloudlytics system is built to scale to match processing power with customer demand. "What we do is we calculate the number of log files we have to process. When we launch the EMR cluster we launch a particular number of spot instances. Suppose there are 1,000,000 log files that have to be processed — then we launch 15 or 20 spot instances of C3 or M3." This analysis and cluster build is all automated.
Singh described how load can vary. "Typically at the weekends we can expect to spin off smaller number of spot instances because our customers will have less traffic and less number of logs to be processed."
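The sizing rule can be sketched as simple arithmetic. Singh gives only one data point (about 1,000,000 log files leading to 15 or 20 spot instances), so the files-per-instance ratio and the minimum and maximum cluster sizes below are assumptions chosen to reproduce the upper figure, not Cloudlytics' real parameters.

```python
import math

# Assumed ratio: 1,000,000 files / 20 instances, from the upper figure quoted.
FILES_PER_INSTANCE = 50_000
# Hypothetical bounds on cluster size.
MIN_INSTANCES = 1
MAX_INSTANCES = 50


def spot_instances_needed(log_file_count: int) -> int:
    """Estimate how many spot instances to launch for a batch of log files."""
    if log_file_count < 0:
        raise ValueError("log_file_count must not be negative")
    needed = math.ceil(log_file_count / FILES_PER_INSTANCE)
    # Clamp the estimate to the allowed cluster size.
    return max(MIN_INSTANCES, min(MAX_INSTANCES, needed))
```

With these assumed values, a batch of 1,000,000 files yields 20 instances, while a quieter weekend batch of 200,000 files yields 4.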
Customers use the dashboard
Once the data is stored in Redshift, it is ready to be viewed by customers. Singh said, "You can query directly onto our data warehouse. There will be no processing because the data is already in Redshift. You just query onto it, and you get all the reports."
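To show why reports need no further processing, here is a small sketch of a report query over data that has already been enriched and stored. Python's built-in `sqlite3` module stands in for Redshift so the example runs anywhere, and the `requests` table with its columns is a hypothetical schema, not the one Cloudlytics uses.

```python
import sqlite3


def downloads_by_country(conn):
    """Return (country, request count) rows, busiest country first."""
    cursor = conn.execute(
        "SELECT country, COUNT(*) AS downloads "
        "FROM requests "
        "GROUP BY country "
        "ORDER BY downloads DESC, country"
    )
    return cursor.fetchall()


# In-memory stand-in for the warehouse, loaded with already-processed rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (country TEXT, browser TEXT, os TEXT)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        ("FI", "Chrome", "Windows"),
        ("US", "Firefox", "Linux"),
        ("FI", "Safari", "macOS"),
    ],
)

report = downloads_by_country(conn)
```

Because the country, browser, and operating system were worked out during processing, the dashboard only has to group and count rows at query time.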
The purpose of generating reports from activity logs is to gain business value. Since every organization has its own needs, it is worthwhile to try a range of services to see which fits best.
Cloudlytics is available as SaaS — the easiest type of service to try. If you have these Amazon logs, you can register for a free trial and see what you think.
Nick Hardiman builds and maintains the infrastructure required to run Internet services. Nick deals with the lower layers of the Internet - the machines, networks, operating systems, and applications. Nick's job stops there, and he hands over to the designers and developers who build the top layer that customers use.