CoreOS has announced a new, open source distributed storage system named "Torus," which is primarily intended to be used as part of Tectonic, a commercial packaging of the open-source CoreOS project with Google's Kubernetes container cluster management system. The software, currently in prototype phase, provides a scalable storage solution to handle the complex data I/O demands of containers.
According to the release announcement, "The problem of reliable distributed storage is arguably even more historically challenging than distributed consensus. In the algorithms required to implement distributed storage correctly, mistakes can have serious consequences. Data sets in distributed storage systems are often extremely large, and storage errors may propagate alarmingly while remaining difficult to detect."
Why this project feels familiar
The niche that Torus is intended to fill, a distributed storage system providing block storage, is already occupied. The Ceph project, which reached its first stable release in 2012 after five years of development and saw the first stable release of its native filesystem (CephFS) this April, is effectively the open source standard for distributed storage.
There is also a wide variety of alternatives to Ceph, including OrangeFS (itself a fork of PVFS2), GlusterFS, the Apache Hadoop Distributed File System (HDFS), BeeGFS, and OpenStack Swift. On the proprietary side, Microsoft provides the generically named "DFS" component in Windows Server, EMC offers Isilon OneFS, and ObjectiveFS is another option. Like Torus, these are all developed independently with differing goals in mind, and do not necessarily have 100% feature overlap with Ceph (or each other).
Should I deploy Torus for my organization?
The short answer is no. At least, certainly not in production: the initial release is only a prototype, and it lacks features and utilities that would be necessary in a production environment. Beyond that, there is no particular value proposition for adopting Torus if you are not already using CoreOS or Tectonic in your organization.
The way in which Torus is written is also a source of moderate concern. While GlusterFS is written in C, and Ceph primarily in C++ with supporting tooling in Python, Torus is written in Go, a language developed at Google.
While the design of the language itself is not a concern (it does not inherently imply the performance degradation relative to C that Java does), Google has a long history of introducing and unceremoniously killing projects. The main CoreOS product and Docker, its primary competitor, are both written in Go, as are a variety of other projects. Even so, the use of Go for something as long-lived as a storage platform raises concerns about long-term maintainability.
So, why another competing standard?
Using web comics to make an argument about a specific technology always carries the risk of being heavy-handed, though this xkcd strip immediately comes to mind. Torus is not positioned as a one-size-fits-all panacea for every distributed storage need, but the project (even in its infancy) provides little justification for its existence.
To the extent that this can be characterized as a problem, it is not an isolated one. Consider the history of the CoreOS project itself: when it was introduced in 2013, it relied exclusively on Docker as the container runtime it managed.
In December 2014, CoreOS introduced a competing standard, Rocket, as a result of justifiable philosophical differences over the direction in which Docker was progressing. Much of the justification for Torus rests on its use of etcd for synchronization, a component also written by the CoreOS team in Go. This lends itself to the argument that CoreOS is falling victim to "not invented here" syndrome, building new competing standards when existing projects could be configured or extended to cover the needs of CoreOS developers and users with less effort.
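To make the etcd dependency concrete: etcd is a consistent key-value store, and a storage system can use its transactional compare-then-put semantics to make cluster nodes agree on metadata such as which node owns which block. The sketch below illustrates only that coordination pattern; `mockKV` and `compareAndSwap` are illustrative names standing in for etcd's real Go client, not Torus or etcd APIs.

```go
package main

import (
	"fmt"
	"sync"
)

// mockKV is an in-memory stand-in for a consistent key-value
// store such as etcd. This is a sketch of the coordination
// pattern only, not how Torus actually talks to etcd.
type mockKV struct {
	mu   sync.Mutex
	data map[string]string
}

func newMockKV() *mockKV {
	return &mockKV{data: make(map[string]string)}
}

// compareAndSwap mimics etcd's transactional compare-then-put:
// the write succeeds only if the key still holds the expected value.
func (kv *mockKV) compareAndSwap(key, expect, next string) bool {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	if kv.data[key] != expect {
		return false
	}
	kv.data[key] = next
	return true
}

func main() {
	kv := newMockKV()
	// Two storage nodes race to claim ownership of block 42.
	// Only one compare-and-swap can succeed against the empty
	// key, so every node ends up agreeing on the owner.
	nodeA := kv.compareAndSwap("block/42/owner", "", "node-a")
	nodeB := kv.compareAndSwap("block/42/owner", "", "node-b")
	fmt.Println(nodeA, nodeB)
}
```

Delegating this agreement problem to etcd is what lets Torus avoid implementing its own consensus protocol, which is the crux of the "not invented here" critique: the consensus layer, at least, was not reinvented.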
What's your view?
Does your organization currently use CoreOS for container management? Or, does your organization use Ceph for storage management? Share your software stack strategies in the comments.
James Sanders is a Java programmer specializing in software as a service and thin client design, and virtualizing legacy programs for modern hardware.