Big Data

Getting on top of the Big Data life cycle

If we don't start thinking about how we are going to manage this incoming mass of Big Data in our data centers--where images, videos and documents are growing at a clip of 80 percent--we may never be able to lift our heads from under it.

Big data, like other kinds of data, has a life cycle, but how many organizations think about big data in this way? Right now, probably not many.

By comparison, it has taken us decades to get on top of data life cycles for standard transactional data, with its fixed record lengths, coming from baseline "systems of record" in the enterprise. Even now, it is not uncommon for IT to sit down with the business functions deemed to "own" this data to determine both business and regulatory requirements for data retention and storage.

With big data, which can be unpredictable and come in many different sizes and formats, the process isn't so easy. Yet, if we don't start thinking about how we are going to manage this incoming mass of unstructured and semi-structured data in our data centers--where images, videos and documents are growing at a clip of 80 percent--we may never be able to lift our heads from under it!

IT's heritage will tell us that the easiest way to manage big data is by throwing more storage at it, because storage is relatively cheap, it's often easy to increment without sending out any budget "alerts" to the CFO, and it's also what organizations have always done. If you stick deduplication technology in front of your storage, you can also weed out duplicate data and use data compression to further assist in managing the storage footprint.
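
As a rough illustration of what deduplication in front of storage does, here is a minimal sketch of content-hash deduplication at the whole-object level. The function name `dedupe` and the sample file names are hypothetical, and commercial products typically work on blocks or variable-size chunks rather than whole objects, so treat this as a simplified model and not as any vendor's method.

```python
import hashlib


def dedupe(blobs):
    """Keep each distinct payload once, keyed by its SHA-256 digest.

    `blobs` maps a logical name (hypothetical file path) to its bytes.
    Returns (store, catalog): store maps digest -> payload, and catalog
    maps each logical name -> the digest of its content.
    """
    store = {}
    catalog = {}
    for name, payload in blobs.items():
        digest = hashlib.sha256(payload).hexdigest()
        # Only the first copy of identical content is physically kept.
        store.setdefault(digest, payload)
        catalog[name] = digest
    return store, catalog


if __name__ == "__main__":
    # Hypothetical sample: two departments hold the same baseline file.
    blobs = {
        "radiology/scan.img": b"baseline instrument data",
        "oncology/scan.img": b"baseline instrument data",
        "oncology/notes.txt": b"derived analysis",
    }
    store, catalog = dedupe(blobs)
    logical = sum(len(p) for p in blobs.values())
    physical = sum(len(p) for p in store.values())
    print(f"{len(blobs)} logical objects, {len(store)} physically stored")
    print(f"{logical} logical bytes, {physical} stored bytes")
```

Every logical name still resolves to its content through the catalog, so users see all of their files while the store holds each distinct payload only once.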

Unfortunately, what you do in the data center doesn't affect the distributed servers in different areas of the company that continue to house copies of the same big data.

Ash Ashutosh, CEO of Actifio, a data management software provider, cites an example of a medical research facility that generates 100 terabytes of data from the various instruments that it uses.

According to Ashutosh, the research facility has 18 different research departments that further process the same big data, with each department adding five terabytes of additional synthesized data to the baseline data.

"Now they must manage a total of over a petabyte of data, of which less than 150 terabytes is unique," said Ashutosh. "Yet, the entire petabyte of data is backed up, moved to a disaster recovery site, consuming additional power and space used to store it all. So now, the medical center has used over 10 petabytes of storage to manage less than 150 terabytes of real unique data. This is not efficient."

It isn't efficient, and it's the kind of big data problem that can't be solved by just sitting down with various departments and identifying data retention policies.
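
A back-of-the-envelope check makes the scale of the duplication concrete. The sketch below assumes that each of the 18 departments keeps its own full copy of the 100-terabyte baseline and that none of the synthesized data overlaps. Neither assumption is stated in the example, and the function name `footprint` is hypothetical. Under these assumptions the primary copies alone come to roughly 1.9 petabytes, in line with "over a petabyte," while the unique data comes to about 190 terabytes, somewhat above the quoted figure, which suggests the quoted numbers are rounded or assume some overlap in the synthesized data.

```python
BASELINE_TB = 100          # instrument data, from the example
DEPARTMENTS = 18           # research departments, from the example
DERIVED_TB_PER_DEPT = 5    # synthesized data added by each department


def footprint(baseline_tb, departments, derived_tb):
    """Return (stored_tb, unique_tb) for primary copies only.

    Assumes every department holds a full baseline copy plus its own
    derived data, with no overlap in the derived data (both are
    assumptions, not facts from the source).
    """
    stored = departments * (baseline_tb + derived_tb)
    unique = baseline_tb + departments * derived_tb
    return stored, unique


if __name__ == "__main__":
    stored, unique = footprint(BASELINE_TB, DEPARTMENTS, DERIVED_TB_PER_DEPT)
    print(f"stored: {stored} TB, unique: {unique} TB")
    print(f"duplication factor before backups: {stored / unique:.1f}x")
```

Backups and disaster recovery copies multiply the stored figure further, which is how the total in the example climbs toward 10 petabytes.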

Ashutosh recommends virtualizing all of this data and relocating it to the data center. The data center has techniques that can identify and eliminate duplicated data; IT staff have data management experience that business users lack; and technologies like virtualization from single-source servers in the data center can provide on-demand service and access to big data for end users throughout the organization.

There is also an operational side to this that involves data and process ownership, and that can become quite political.

The various departments involved must agree to surrender their physical servers and data, and to work off centralized and virtualized data that is maintained in the data center. This is where the CIO and other C-level executives come in, because people throughout the organization have to understand and support a virtual data policy and the data management guidelines that come with it.

In most organizations, this is still a work in progress.

What are the takeaways for IT?

First, some old-fashioned data management meetings, this time about big data, should be held at both the strategic and operational levels. These meetings will undoubtedly be about policies, but they will also be about control.

Second, if IT hasn't already done so, it should get aggressive in the data center, putting into play technologies that are proven and ready to harness the big data that daily enters corporate portals.

Now is the time, before the digital flood tides sweep you away.

About

Mary E. Shacklett is president of Transworld Data, a technology research and market development firm. Prior to founding the company, Mary was Senior Vice President of Marketing and Technology at TCCU, Inc., a financial services firm; Vice President o...
