If companies store and access their big data onsite, the processing and storage costs can be substantial, but accessing and querying that data is faster than it would be in the cloud. Conversely, cloud big data processing and storage offers scalability, no investment in on-premises infrastructure, and security and management options that are almost as good as you’d have onsite, provided IT enacts the same policies and procedures in cloud operations as it does onsite.
The catch is that IT might not enact these big data management policies and procedures as thoroughly in the cloud as it does on-prem. This makes it essential that organizations set strategies and practices for cloud-based big data management.
What types of strategies should be considered for big data in the cloud? Here are three.
How to select your cloud architecture
The four cloud models for big data are private cloud, public cloud, hybrid cloud and multicloud.
Private clouds are the most expensive, but they are dedicated to your use alone and are therefore ideal for data that is highly confidential and proprietary. It is up to your IT staff to manage private clouds. Public clouds are cheaper. The cloud provider takes over management tasks, but you share the cloud with others. Hybrid clouds combine the concepts of private and public clouds, and a multicloud approach means that you spread your big data over more than one cloud.
While it’s easy to see the pros and cons of private versus public clouds, deciding when to use multiple clouds or a hybrid cloud concept is far more complex. It requires careful thought about how you will use your big data, exercise governance, coordinate and process data, and enforce security across multiple clouds.
Because cloud big data needs and architecture design can be highly complex, defining the big data cloud architecture should be the first task on a big data cloud strategy list.
How to plan your big data security and management on the cloud
Depending on your cloud architecture, defining and managing security and operations on the cloud will vary from the simple to the complex. Each cloud that you subscribe to will have its own tools for security and performance management. If your IT staff is managing these areas directly, it will have to learn to use multiple sets of tools. You can contract with the cloud provider to perform management tasks for you, but you must be clear about the levels of security, logging, traceability and performance that you want—and confident that the cloud provider can meet these expectations.
Whether you’re directly managing or consigning management of the cloud to a cloud provider, all cloud operational standards and procedures should be documented and tracked, just as they are in your own data center. Deciding how documentation and tracking will be done is central to any big data cloud strategy.
How to manage the costs of your cloud storage
Costs for processing and storage in the cloud can be lower than they are on-prem, but they won’t stay that way for long if you don’t have a way of tracking your scale-outs of processing and storage on the cloud, and what these scale-outs are being used for. At the end of processing cycles, procedures should be in place for deallocating processing and storage so you don’t continue to pay for them. If your big data changes rapidly, old data should be purged or at least moved to low-cost cold storage on the cloud. Finally, cloud cost accounting should be reviewed upfront with your cloud providers. Do you understand their bills? If it’s difficult to decipher costs, develop a way with your cloud provider to make it simple so you can track your usage.
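One way to picture the tracking described above is a small inventory that records what each scale-out is for and when its processing cycle ends. The Python sketch below is illustrative only: the `Allocation` and `Dataset` records, the function names and the 90-day idle threshold are all assumptions made for this example, not features of any cloud provider's billing tools.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Allocation:
    """One scale-out of processing or storage (hypothetical record)."""
    name: str
    purpose: str          # what the scale-out is being used for
    monthly_cost: float   # cost in your billing currency
    cycle_end: date       # when the processing cycle that needed it ends


@dataclass
class Dataset:
    """One stored big data set (hypothetical record)."""
    name: str
    last_accessed: date


def allocations_to_release(allocations, today):
    """Return allocations whose processing cycle has already ended."""
    return [a for a in allocations if a.cycle_end < today]


def datasets_for_cold_storage(datasets, today, max_idle_days=90):
    """Return datasets not accessed within max_idle_days.

    The 90-day default is an assumed threshold; pick one that matches
    how quickly your own big data goes stale.
    """
    cutoff = today - timedelta(days=max_idle_days)
    return [d for d in datasets if d.last_accessed < cutoff]


def monthly_savings(allocations):
    """Sum the monthly cost of the given allocations."""
    return sum(a.monthly_cost for a in allocations)
```

Running a report like this at the end of each processing cycle would give IT a list of resources to deallocate and data sets to purge or archive. The actual deallocation and archiving steps depend on each cloud provider's own tools.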
Cost management should be part of every big data cloud strategy.