The Cloud Security Alliance, an organization dedicated to making a secure cloud universal, has just released a report on the 100 best big data security practices. The report presents 10 major challenges facing businesses that use the cloud and provides 10 strategies for addressing each one.

It’s a long report, but it’s one worth reading. Here is a summary of each category and an essential takeaway for each one.

1. Securing computation in distributed programming frameworks

The report points to Apache Hadoop as an example of a distributed programming framework and cites problems like information leaks, untrustworthy mappers, and policy compliance.

The theme of the first point is establishing clear security policies and being sure that each node on a distributed framework is trusted, clearly identified, and secure. Access control needs to be closely monitored, and nodes need to be regularly maintained to ensure continued security.

2. Securing non-relational data stores

NoSQL databases and similar systems are not the most secure data stores. Injection attacks are still common, so the report recommends a number of strategies for securing non-relational data stores.

Encryption is king in this part of the report. It recommends storing passwords securely with strong hashing algorithms and using TLS for all connections. Logging all connections, supporting pluggable authentication modules, and replicating data reliably to keep it consistent are also advised.
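
As a rough illustration of the password advice, here is a minimal sketch of salted password hashing using only Python's standard library. The function names and the iteration count are assumptions chosen for this example and do not come from the report; treat the numbers as placeholders to tune for your own environment.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Illustrative work factor (an assumption, not a value from the report).
ITERATIONS = 200_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the plaintext is never stored."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

Storing a per-user salt next to the digest means two users with the same password still end up with different stored values.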

3. Securing data storage and transaction logs

Storage management is the third category covered in the report. The actual placement of stored data isn’t recorded, making collusion attacks and malicious file modification a distinct possibility.

DRM, policy-based encryption, key rotation, and broadcast encryption are all ways to secure storage. If storage is in an untrusted location, the report recommends using a secure untrusted data repository (SUNDR) to increase the chance of detecting unauthorized changes.

4. Endpoint validation

BYOD is making endpoint validation and security a huge challenge. Devices can be spoofed, credentials can be faked, and single machines can even masquerade as multiple users.

Device trust and certificate usage are recurring themes in the tips listed in this section. Also mentioned are tools to manage devices (e.g., antivirus applications), outlier detection to catch bad data, and thorough resource testing.
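
To make the outlier detection idea concrete, the sketch below flags suspicious sensor readings with a modified z-score based on the median absolute deviation. This is one common statistical technique and is swapped in here for illustration; the report does not prescribe a specific method, and the function name and the 3.5 threshold are assumptions.

```python
import statistics
from typing import List


def flag_outliers(readings: List[float], threshold: float = 3.5) -> List[float]:
    """Return readings whose modified z-score exceeds the threshold."""
    median = statistics.median(readings)
    # Median absolute deviation: a spread measure that a few bad values cannot skew.
    mad = statistics.median(abs(x - median) for x in readings)
    if mad == 0:
        return []  # at least half the readings equal the median; this test cannot rank them
    return [x for x in readings if 0.6745 * abs(x - median) / mad > threshold]
```

For example, `flag_outliers([20.1, 19.8, 20.3, 20.0, 19.9, 95.0])` returns `[95.0]`, the kind of reading a spoofed or faulty endpoint might submit.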

5. Monitoring security and compliance in real time

Collecting data means using a variety of sensors, machines, and programs that raise new kinds of security concerns. Distributed hardware needs to be monitored in real time to catch the injection of false data and even the addition of untrusted devices to a cluster.

One of the easiest ways to avoid these kinds of attacks is to implement front-end systems like firewalls and routers. Also be sure to consider cloud-, application-, and cluster-level security, each of which can be an ingress route for an attack.

6. Ensuring user privacy in data

Anonymizing data has been shown to be insufficient to protect privacy. Anonymized data can be linked back to its owner, important information can be leaked, and data analysts aren’t always aware of the potential risks.

The report recommends implementing a separation of duty principle. In a separated-duty system, each user has access only to the bare minimum of data needed to do their job, which makes it harder to re-identify data. Also be sure that data at rest is fully encrypted and that staff are properly trained in security and the potential problems to watch out for.

7. Big data cryptography

This portion of the report is more about advances being made in cryptography and how it can help secure big data. Big data also means streaming data through the cloud, so cryptography is a fundamental part of security now and into the future.

Recent developments in cryptography include the ability to perform computations on fully encrypted data, group signature systems that prevent identification of individuals, and oblivious RAM, which shuffles data locations after each access.
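
To give a feel for computing on encrypted data, here is a toy sketch of the Paillier cryptosystem. Note the substitution: Paillier supports only addition on ciphertexts, so it is a simpler relative of the fully homomorphic schemes, not the technique the report describes. The primes are tiny and insecure, and all names are assumptions for this example.

```python
import math
import secrets

# Toy primes for illustration only; real keys use primes of 1024 bits or more.
P, Q = 104729, 1299709
N = P * Q            # public modulus
N_SQ = N * N
G = N + 1            # standard simple choice of generator
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private: lcm(p-1, q-1)
MU = pow(LAMBDA, -1, N)  # private: modular inverse (requires Python 3.8+)


def encrypt(message: int) -> int:
    """Encrypt an integer 0 <= message < N under the public key (N, G)."""
    while True:
        r = secrets.randbelow(N - 1) + 1  # random blinding factor in [1, N-1]
        if math.gcd(r, N) == 1:
            break
    return (pow(G, message, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(ciphertext: int) -> int:
    """Recover the plaintext using the private values LAMBDA and MU."""
    x = pow(ciphertext, LAMBDA, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiply two ciphertexts; the product decrypts to the sum of the plaintexts."""
    return (c1 * c2) % N_SQ
```

A party holding only ciphertexts can call `add_encrypted(encrypt(20), encrypt(22))` and hand back a value that decrypts to 42, without ever seeing 20 or 22.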

8. Granular access control

Granular access is a method of providing user privileges in the most minute way possible. Each small element of data can be controlled with granular access privileges, and some standard practices need to be used so that granular access is effective.

Make sure the granularity is regularly monitored and tightly controlled. This means maintaining access labels, tracking secrecy requirements, using SSO to track users, and developing thorough protocols for tracking access restrictions.
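
One way to picture access labels is a filter that releases only the fields a user has been granted. The sketch below is a simplified illustration under stated assumptions: the field names, labels, and `visible_fields` helper are hypothetical and not taken from the report.

```python
from typing import Dict, Set

# Hypothetical access labels attached to each field of a record.
FIELD_LABELS: Dict[str, str] = {
    "patient_id": "clinical",
    "diagnosis": "clinical",
    "billing_code": "finance",
    "zip_code": "public",
}


def visible_fields(record: Dict[str, object], granted: Set[str]) -> Dict[str, object]:
    """Return only the fields whose label the caller has been granted.

    Fields with no label are denied by default.
    """
    return {
        name: value
        for name, value in record.items()
        if FIELD_LABELS.get(name) in granted
    }
```

Denying unlabeled fields by default means a newly added column stays hidden until someone deliberately assigns it a label.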

9. Granular audits

Auditing should be done granularly as well. Users often miss legitimate security alerts, or simply ignore them, so everything needs to be audited regularly to ensure complete security.

Audits need to be complete, have an easy-to-follow trail, and be very timely. In addition, audit data has to be thoroughly secured so that it can be trusted, so secure hashing is advisable. Big data and audit data should also be separated, and access to audit data should be restricted and logged.
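
The secure hashing advice can be sketched as a hash-chained audit log, where each entry's hash covers the previous entry's hash, so that editing an earlier record breaks every later link. This is a minimal illustration, not the report's design; the entry layout and function names are assumptions.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: Dict[str, str]) -> str:
    """Hash the previous entry's hash together with this event's contents."""
    payload = prev_hash + json.dumps(event, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: List[dict], event: Dict[str, str]) -> None:
    """Append an event, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify_log(log: List[dict]) -> bool:
    """Recompute every link; any edited or reordered entry makes this False."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Someone who can rewrite the whole log could also recompute the chain, which is one reason to keep the latest hash, or the audit data itself, on a separate and restricted system.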

10. Data provenance

Provenance, in terms of big data, means a complete record of everyone having control, gaining access, and making changes to a data set. It’s the sum total of all interaction with that data and exists to give it legitimacy. Provenance is fundamental to big data security.

This tenth point almost reads like a summary of the entire report. Provenance is maintained through proper implementation of all nine previous points, and in the end it comes down to tightly controlling access, having the right protocols in place, and being sure data is properly encrypted and secured.

The 3 big takeaways for TechRepublic readers

  1. The report gives 10 areas to focus on and 10 recommendations in each one.
  2. While each area focuses on different elements of security, the focus can be reduced: keep things encrypted, tightly control access, and maintain a thorough paper trail of access and modification.
  3. With more and more data stored in the cloud, security is going to be fundamental to almost every business. The report might be long, but it’s worth your time as a CIO or IT professional to read it and consider implementing its suggestions.
