One of the constant frustrations of managing an enterprise data center is protecting sensitive data. Recent breaches have made companies painfully aware that credit card data isn't the only data that needs to be protected; personally identifiable information (PII), trade secrets, and sensitive correspondence have also proven valuable to bad actors.
Data loss prevention (DLP) solutions have been on the market for several years; most focus on either the endpoint or the network. More recently, there has been an effort to focus on the storage array. Storage vendor DataGravity tackles DLP at the array, and in this post I'll introduce the system and some considerations.
Technology is only part of the solution
Before getting into the specifics of DataGravity, I need to stress the importance of a strong enterprise security policy; DLP technology is an addition to a strong security program, not a replacement for one. I've implemented host-based DLP, and the most challenging aspect of implementation has consistently been the maturity of the organization's security training and program.
An example is the forced encryption of USB drives. I led the deployment of an endpoint DLP solution that encrypted all files copied to external media. Users found it too intrusive to their existing workflows, so instead of embracing an easy way to encrypt critical data, they found even less secure ways to transfer it, such as file-sharing sites. End-user education is the most important aspect of protecting sensitive data.
DLP should be deployed to help protect against unauthorized access rather than to force end users into strong data-protection habits.
The unstructured data challenge
Sensitive data exists all over the data center and the enterprise. The obvious place to look is application data: well-written applications control the flow of and access to sensitive data via controls within the application. However, data escapes these walled gardens and appears as unstructured data on disk, and identifying that unstructured sensitive data can be a challenge.
One way to identify and control sensitive data located on the network is to leverage host-based DLP. Products from traditional enterprise security companies such as McAfee and Symantec use centralized management consoles to dictate data policies on local servers and workstations. In theory, endpoint-based DLP inspects all unstructured data that traverses the endpoint; the method is very similar to virus protection, and it shares some of virus protection's disadvantages. The endpoint approach consumes significant CPU resources and requires an agent to be installed on every endpoint.
DataGravity's unique approach
DataGravity incorporates the metadata needed to track sensitive data on the storage array itself. Identifying and tagging sensitive data is one of the most challenging and compute-heavy aspects of data protection; DataGravity shifts the identification burden to the array and provides deeper visibility into the data.
DataGravity can identify sensitive unstructured data using algorithmic patterns, such as validated credit card and Social Security numbers. End users can also define their own patterns and manually tag sensitive data. Once data is identified, DataGravity can report on it or prevent access to the PII.
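To illustrate the idea of algorithmic pattern matching with validation, here is a minimal sketch of the kind of scan such a system might run against unstructured text. This is not DataGravity's implementation; the regexes, the `scan_text` helper, and the use of the Luhn checksum to validate card-number candidates are all illustrative assumptions.

```python
import re

def luhn_valid(number: str) -> bool:
    """Validate a candidate card number with the Luhn checksum."""
    checksum = 0
    # Walk digits right to left, doubling every second digit;
    # subtract 9 when doubling produces a two-digit value.
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

# Illustrative candidate patterns: 16-digit card numbers and US SSNs.
CARD_RE = re.compile(r"\b(?:\d[ -]?){15}\d\b")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def scan_text(text: str) -> list[tuple[str, str]]:
    """Return (type, match) pairs for likely PII in unstructured text."""
    hits = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        # Checksum validation cuts false positives from random digit runs.
        if luhn_valid(digits):
            hits.append(("credit_card", m.group()))
    for m in SSN_RE.finditer(text):
        hits.append(("ssn", m.group()))
    return hits

print(scan_text("Card: 4111 1111 1111 1111, SSN: 123-45-6789"))
```

The validation step is what separates "algorithmic patterns" from naive regex matching: a 16-digit string that fails the checksum is almost certainly not a real card number, which keeps false-positive reports manageable at array scale.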
I spoke with DataGravity's CEO Paula Long, who discussed potential use cases for the technology. One of the chief use cases is looking for PII within virtual machine (VM) images that reside on the array. Inspecting the VM image removes the barrier and performance overhead of running DLP inside each VM, and sensitive data is tracked regardless of the VM's power state. If the VM image file moves, DataGravity tracks the movement of the associated PII. The recently announced version of DataGravity integrates with VMware vRealize to automate policy enforcement.
DataGravity adds a resource in the fight to control sensitive data, but it isn't a magic bullet for DLP; email, endpoints, and legacy arrays still need to be considered. Nor is there yet a solution that provides a single pane of glass or unified reporting for DLP across all of these sources.
Is array-based DLP a solution to a problem you are experiencing, or is it a solution looking for a problem? I'd love to read your thoughts in the comments.
Keith Townsend is a technology management consultant with more than 15 years of experience designing, implementing, and managing data center technologies. His areas of expertise include virtualization, networking, and storage solutions for Fortune 500 organizations. He holds a BA in computing and an MS in information technology from DePaul University.