Five apps to turn your data into big data

Reports and dashboards are just the beginning with big data - requirements now include predictive analyses and other more advanced tools.

Over the last few years, big data has become a big deal. Between sites like Data.gov, the massive amounts of data each person generates both privately and on social media, and every organization's rapidly growing databases, big data is one of the most important things IT professionals need to understand and manage. Reports and dashboards are just the beginning with big data - requirements now include predictive analyses and other more advanced tools.

For this edition of Five Apps, we take a look at five tools to help you analyze your big data.

This article is also available as a TechRepublic Screenshot Gallery.

Five Apps

1. Datameer

Datameer is, on its surface, a basic analysis tool. It has a spreadsheet-like interface and contains many of the same charts and graphs. However, it surpasses Excel and other spreadsheet programs by letting the user link to active data sources, import flat files, and join two tabs together into a third, much as you would join tables in a database.
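For readers less familiar with database joins, the idea of combining two tabs into a third can be sketched in plain Python. This is only an illustration of the general inner-join concept, not Datameer's own interface or API; the tab names, column names, and the inner_join helper are all hypothetical.

```python
# Hypothetical data: two "tabs" represented as lists of row dictionaries.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 250.0},
    {"order_id": 2, "customer_id": 11, "amount": 75.5},
    {"order_id": 3, "customer_id": 10, "amount": 120.0},
]
customers = [
    {"customer_id": 10, "name": "Acme"},
    {"customer_id": 11, "name": "Globex"},
]


def inner_join(left, right, key):
    """Combine rows from left and right that share the same value in `key`."""
    # Index the right-hand tab by key so each lookup is fast.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for left_row in left:
        # Rows with no match on the right are dropped (inner join).
        for right_row in index.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


# The "third tab": every order row extended with the matching customer name.
joined_tab = inner_join(orders, customers, "customer_id")
for row in joined_tab:
    print(row)
```

Each row of the resulting tab carries the columns of both source tabs, which is the same outcome a database join on customer_id would give.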

It is also much more column-focused than a spreadsheet - the tasks you perform, such as Group Bys, are all done with reference to a column and occupy a column of their own on the destination sheet. Because it is so columnar, you can also drag and drop columns into charts and graphs easily instead of having to specify ranges as in Excel. Charts and graphs come with many configuration options, including manual colors, font sizes, layout, and positioning.
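As a rough illustration of what a column-oriented Group By does, the sketch below groups rows by one column and writes the aggregate into a new column of its own. It is a generic Python example under assumed data, not Datameer's implementation; the sales rows and the group_by_sum helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical source sheet: one row per sale.
sales = [
    {"region": "East", "amount": 100},
    {"region": "West", "amount": 40},
    {"region": "East", "amount": 60},
]


def group_by_sum(rows, group_col, value_col):
    """Group rows by `group_col` and sum `value_col` into a new column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_col]] += row[value_col]
    # The destination sheet gets one column for the group and one for the sum.
    return [{group_col: key, f"sum_{value_col}": total} for key, total in totals.items()]


summary = group_by_sum(sales, "region", "amount")
print(summary)
```

The operation is expressed entirely in terms of column names rather than cell ranges, which is the contrast with a conventional spreadsheet that the paragraph above describes.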

The final notable feature of Datameer is Smart Analytics, which includes Clustering, Decision Trees, Recommendations (Heat Charts), and Column Dependencies tools. Datameer starts at $299/year for a single user and has Workgroup and Enterprise licensing available.


2. Jaspersoft

Jaspersoft is a drag-and-drop GUI that allows you to combine your data in various ways using the built-in charts, graphs, and crosstab views. You can see various types of data side by side by dropping them in as Columns and break them down by various categories as Rows. One of the nicest features of Jaspersoft is the Data Level filter at the top right. It allows you to scale back your Rows or Columns to a lower level of detail (such as viewing sales by Country instead of by Country and then by Store Type) without having to remove those data points from the graph altogether. Jaspersoft offers several editions of its software, from the free Community Edition to various on-site versions licensed by server processor to an AWS-based version licensed per hour. Pricing info is available from the sales team.
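The Data Level filter described above amounts to aggregating the same rows at a coarser or finer grouping. The following Python sketch shows that idea with made-up sales figures; it is not Jaspersoft code, and the rollup helper and the column names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sales rows with two levels of detail: country, then store type.
sales = [
    {"country": "US", "store_type": "Mall", "sales": 120},
    {"country": "US", "store_type": "Outlet", "sales": 80},
    {"country": "CA", "store_type": "Mall", "sales": 50},
    {"country": "CA", "store_type": "Outlet", "sales": 30},
]


def rollup(rows, levels, measure):
    """Sum `measure` for each distinct combination of the given `levels`."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[level] for level in levels)
        totals[key] += row[measure]
    return dict(totals)


# Full detail: sales by Country and then by Store Type.
detailed = rollup(sales, ["country", "store_type"], "sales")

# Scaled back: sales by Country only, using the same underlying rows.
by_country = rollup(sales, ["country"], "sales")

print(detailed)
print(by_country)
```

Because the underlying rows are untouched, switching between the two views only changes the list of levels passed in, which mirrors the filter's ability to scale back detail without removing data points.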


3. Pentaho

Instead of being a dynamic reporting tool, Pentaho allows you to create fixed-structure reports and dashboards, which are then tied to a dynamic data source. This is great for companies whose users lack the skill or are unwilling to take the time to create their own visualizations. Pentaho has the typical charts and graphs, such as pie, bar, and line, as well as crosstab views. It also has heat grid reports to compare performance among various measures. Like the other systems on this list, Pentaho can link up with various source databases. Pricing is available from the sales team.


4. SAS Visual Analytics

Easily recognized as the biggest name on this list, SAS has entered the big data fray with its Visual Analytics software. For the most part, however, it is roughly equal to the other products here. Data is brought into the system either by flat file or database links, and various charts, graphs, and visualizations are easily created.

It stands out, however, in the way that it displays that information. Where the other products were somewhat vague as to what the data values were, SAS Visual Analytics always seems to provide a legend, especially in geographical visualizations, heat maps, and the like. One visualization I did not see in its set was the pie chart; SAS seems to have replaced it with a treemap, which can have the same effect, although it may be harder for some to understand.

The other standout feature, to me, was the quick glance feature when selecting data filters. You can easily see the relative size of the data in each data point, so you have some idea of what you're getting into. SAS Visual Analytics pricing is available from the sales team.


5. Splunk

While it connects to traditional data sources like the other systems on the list, Splunk is the only product here that can connect to system event logs, system performance monitors, directory trees, TCP/UDP connections, and Active Directory systems. Given that vast array of non-traditional data sources, Splunk is a great solution for monitoring big data that, on the surface, doesn't seem like big data. Yet event log monitoring alone can generate as much raw data as enterprise EHR and CRM systems.

While it provides the common charts and graphs, Splunk also has its own query language, which makes it difficult to jump right into. Anything beyond basic charts requires knowledge of the query language. Pricing is simple: you pay by the gigabyte indexed per day by the system, whether that is on-site or cloud-based.


Bottom line

There are many more products available for analyzing your own big data; these are just a handful that offer the necessary features. Has your organization delved into analyzing its big data? If so, have you used any of the tools above or different tools? Share your thoughts in the comments below.



4 comments
mupini

"...turn your data into big data". At the risk of igniting the big data definition debate, I would say the title of this article shows the level of misunderstanding that the average organisation has regarding big data.

Making use of a particular tool doesn't magically turn a dataset into a big dataset. All the same, thanks for a good article.

dogknees

"can connect to system event logs, system performance monitors, directory trees, TCP/UDP connections, and Active Directory systems. Given that vast array of non-traditional data sources"

We've been using these for years. How are they non-traditional?

Reading through the article, I still can't really see what the difference is between "big data" and the sort of cross-platform, multi-source data matching and analysis I and many others have been doing for a decade or more.

Mark W. Kaelin moderator

Has your organization delved into analyzing its big data? If so, have you used any of the tools above or different tools? What other tools should we be taking a look at?

dogknees

@mupini - Agree!!

It seems to now mean any kind of data analysis regardless of the scale or source of the data.
