Organizations are only just digging into big data, and already portents of change are forming.
This time, the change appears to be a rethinking by HPC (high-performance computing) vendors of which hardware best handles big data payloads that must be processed and analyzed as quickly as possible.
Big data’s HPC heritage is grounded in universities and research institutes sharing supercomputers with immense footprints and operating costs. But this isn’t the space where enterprises and SMBs (small and medium-sized businesses) operate. These organizations want affordable, scalable computing power for their big data that can be budgeted into their data centers. Unless they opt for cloud service providers to host and run their big data processing and analytics, they’re also going to be looking at bare-metal (not virtualized) HPC platforms, because HPC and big data workloads don’t tend to perform well in virtualized environments.
Thus far, the platform of choice in the enterprise data center for big data processing has been the x86 server. Part of the reason is the ease with which x86 servers scale into big data processing clusters as organizations expand their processing capabilities. Another is their comparative affordability, even though they must be specially configured to apply HPC-style parallel processing to big data analytics. The catch is that as big data processing in enterprises grows, so do expectations.
In the x86 hardware architecture, big data can be processed in parallel, but on only two threads per processor core. By comparison, RISC (reduced instruction set computing) chips, customarily found on computers running Unix, can parallel-process four threads per core, a 2x improvement.
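The practical effect of hardware thread counts can be sketched in a few lines of Python (an illustrative toy, not vendor software — `process_chunk` is a hypothetical placeholder for a real analytic step): however many worker processes you spawn, throughput is ultimately capped by the number of threads the hardware can actually run in parallel.

```python
import os
from multiprocessing import Pool

def process_chunk(chunk):
    # Placeholder analytic step: sum of squares over one data partition.
    return sum(x * x for x in chunk)

def parallel_analyze(data, workers=None):
    # Size the worker pool to the hardware's logical thread count; spawning
    # more workers than available hardware threads only adds scheduling
    # overhead rather than extra parallelism.
    workers = workers or os.cpu_count()
    chunk_size = max(1, len(data) // workers)
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with Pool(workers) as pool:
        return sum(pool.map(process_chunk, chunks))

if __name__ == "__main__":
    print(parallel_analyze(list(range(1000))))  # 332833500
```

On a server exposing four hardware threads per core instead of two, `os.cpu_count()` reports twice as many logical CPUs per socket, and a pool sized this way can keep twice as many partitions in flight.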
How seriously are big data solution providers looking at this?
IBM rolled out its RISC-based Power Systems line several years ago with the ability to scale and run both Linux and Unix (AIX) clusters for big data HPC. Not to be outdone, Oracle launched its SPARC T5 processors late in the first quarter of 2013.
The interesting background here is that the Unix RISC-based computing market has actually been in decline, which understandably has some in the industry wondering why vendors would choose to make major investments in it.
The answer could be as simple as recognizing that the future of big data processing is likely to surpass what x86-based computing platforms can offer. Today’s RISC-based machines can run Linux as well as Unix. This is helpful, because IT typically has resident staff skills in Linux, but not necessarily in Unix.
So, what do you do as you continue your big data analytics rollout in the data center?
#1-Rethink your asset planning
Many sites have begun implementing big data processing clusters on x86-grade machines. But it isn’t too early to start thinking about RISC-based systems, since they could be the future. This thinking needs to be in dollars-and-cents terms as well as in IT infrastructure integration and implementation terms, because big data processing (unless you are outsourcing your big data work to the cloud) requires "real" hardware.
#2-Evaluate staff skills
RISC-based platforms can run Linux, but they still represent a different hardware architecture that even the best automation cannot render entirely transparent. New IT system and administrative skills might be needed.
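One small but concrete example of where the architecture shows through: any tooling that fetches or builds platform-specific binaries needs to know what it is running on. A minimal sketch in Python (the architecture strings shown are typical values, not an exhaustive list):

```python
import platform

def describe_host():
    # platform.machine() reports the hardware architecture string:
    # typically "x86_64" on Intel/AMD servers, "ppc64"/"ppc64le" on
    # IBM Power systems, or "sparc64" on SPARC.
    arch = platform.machine()
    # platform.system() reports the OS kernel, e.g. "Linux" or "AIX".
    system = platform.system()
    return f"{system} on {arch}"

print(describe_host())
```

The same Linux admin scripts may run unchanged on Power or SPARC, but package repositories, monitoring agents, and performance-tuning assumptions all hinge on that architecture string.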
#3-Check with your vendors
Always maintain an active dialogue with your big data vendors. This means more than understanding the products they are selling today; you should also be in tune with their technology roadmaps and where their products are going. If their intent is to move to RISC and the product they have sold you is x86-based, it’s time to talk. You have an investment at stake. If they want to keep you happy, it will be incumbent on them to offer you a migration path, training, and discounts.