Microsoft created its Distributed File System (DFS) technology to solve many different data storage problems. DFS is designed to present users with a unified view of files and folders, regardless of the server on which the individual folders are located. Furthermore, DFS technology is designed to boost scalability and availability while offering features like load balancing and fault tolerance.
In the end, DFS turned out to be a wonderful technology. However, at least one company claims to have improved upon it. NuView has released a new product, StorageX, designed to improve upon the existing DFS components. In this article, I’ll take a look at StorageX to see what all the hype’s about.
Why use StorageX?
Like DFS, StorageX is designed to make both end users’ and administrators’ lives easier. The main difference is that StorageX picks up where native DFS leaves off. StorageX enhances many preexisting DFS features and builds on the normal DFS capabilities.
StorageX is designed to provide scalable, enterprise-class file virtualization. So what does that mean in English? File virtualization is the mapping of a file’s physical location to a logical location for the purpose of making a file appear to exist in a place where it doesn’t actually reside. Obviously, the DFS file system already provides file virtualization by making a Windows-based share point appear as a subdirectory beneath the DFS root. However, StorageX takes this concept further. With StorageX, there’s no requirement that the files have to exist in the same geographic location or even on the same server platform. You’re free to mix and match Windows NT, Windows 2000, and Windows .NET servers. Furthermore, these servers can exist anywhere in the world, so long as there is a network path between them. Gone is the requirement for the data servers to exist within a common site.
Another of StorageX’s benefits is that it’s completely independent of network drive letters and server names. The concepts of server names and drive letters are so fundamental to computers that it’s easy to overlook the problems they can cause. For example, suppose that a server fills up and you need to move some of the data to another server. In a normal environment, this means educating the users and explaining to them that the data’s location has changed. You must then provide them with the new server name, share point name, and possibly even a new drive letter. Then you’ll have to field questions about why some of their data is in one place and some is in another. This end-user confusion drives up costs, because your help desk will be bombarded with calls from unproductive users who can’t find their data.
StorageX takes care of that problem. After implementing StorageX, all of the data is mapped to a single logical location. It doesn’t matter where the data actually resides; to users, all of the data will appear to exist in one large directory. This does away with the need for drive letters, server names (from an end user standpoint), and share point names.
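To make the idea of file virtualization concrete, here is a minimal sketch of a namespace: a table that maps the logical paths users see to the physical locations where the data actually lives. The server and share names are hypothetical, and this is an illustration of the concept, not NuView's implementation.

```python
# A namespace maps logical paths (what users see) to physical
# locations (where the data actually resides). Servers can live
# anywhere, on any Windows platform -- the user never sees them.
namespace = {
    "/corp/hr/policies.doc":  r"\\miami-nt4\hrshare\policies.doc",
    "/corp/eng/specs.doc":    r"\\vegas-w2k\engineering\specs.doc",
    "/corp/sales/leads.xls":  r"\\nola-net\salesdata\leads.xls",
}

def resolve(logical_path):
    """Return the physical location behind a logical path."""
    return namespace[logical_path]

print(resolve("/corp/hr/policies.doc"))
# prints \\miami-nt4\hrshare\policies.doc
```

Moving a file to another server then only means updating one row of the table; the logical path the users know never changes.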
You can also apply the concept of file virtualization to data migration and server consolidation. Normally, when you need to move data or consolidate servers, the affected servers will be temporarily unavailable to end users during the migration. After the migration completes, you’ll have to show them where the data is stored.
With StorageX, server consolidations and data migrations are completely transparent to the users. Users can access data before, after, and even during the migration without having to redirect their search manually. Aside from a possible system slowdown, the users may never even know that the migration or server consolidation is happening. Of course, this is good for administrators as well. Not only will you no longer have to deal with the headaches of explaining all the changes to users after the migration, but you’ll also be able to do the migrations during normal business hours. No more lost nights and weekends!
How StorageX works
I’ve already explained that StorageX works by combining all of your data into a single virtual file system. This virtual file system is known as a namespace. The namespace is nothing more than a database that maps each file and folder’s physical location to its logical location. Whenever a client needs to access a file, the client looks at the namespace to determine the file’s location, and then retrieves the file. There are some interesting things about the way that this process works, though.
First of all, the namespace can exist on a single server, but it doesn’t have to. The namespace can be replicated to other servers to avoid having a single point of failure. Another unique feature of the StorageX system is that the clients don’t have to be reconfigured or run any special software to take advantage of the namespace. In fact, the entire StorageX system is designed in a way that avoids using any proprietary technology, avoids system-level software on the servers, and doesn’t require any client software.
StorageX was also designed to provide optimum performance. Although the clients must perform a namespace lookup, they still have the same direct path to the data that they’ve always had. I’ve seen some storage systems in which the namespace server acts as a proxy on behalf of the client. When the client requests a file, the namespace server then retrieves the file and passes the file to the client. As such, all data flows through the namespace server. Obviously, such a configuration is really bad from a performance standpoint. StorageX doesn’t use this method. Instead, when a client needs access to a file, the client looks up the file’s location and then retrieves the file itself.
Another way that StorageX works to enhance performance is to provide load balancing. As I’ll explain in greater detail later, StorageX provides many data replication options. You can use the replicated data for fault tolerance, but you can also use it for load balancing.
For example, suppose that your company had offices in Miami, Las Vegas, and New Orleans. You could replicate your data to each location. By doing so, the employees in each office could access the data from a local server rather than congesting WAN links by accessing a server halfway across the country. The namespace is designed so that users won’t know if the data is coming from the next room or the next continent. The entire StorageX system is designed to be invisible to the end user.
Robust disaster recovery with automatic failover
If you’ve ever managed large volumes of information, you know about some of the challenges involved when disaster strikes. For example, what do you do if you have so much data that you can’t back it all up in a 24-hour period, or, worse yet, if you can’t restore it all within 24 hours?
Up to now, one of the only solutions to this problem has been to replicate the data to another server or group of servers. However, depending on which replication scheme you use, switching to the replica after a disaster might take as long as simply trying to restore all of the data from tape. For example, you may have to rename the replica, reconfigure the clients, and do who knows what else before the data will be available once again.
In contrast, StorageX has a unique spin on replicated data. For example, suppose that you had a data center in Miami with 100 TB of data stored using DAS and SAN storage, and another data center in New Orleans, but the New Orleans data center only uses NAS, not SAN or DAS. Even with the differences in storage architectures and the geographic distance, it would still be possible to replicate the data in Miami to the New Orleans data center.
Now, imagine that your data is replicated and a massive hurricane wipes out the Miami office. Normally, the solution would be to reroute the network in a way that would allow the clients to access the data from the New Orleans office until the damage in Miami is taken care of. However, with StorageX, this isn’t necessary. The software is smart enough to sense the Miami failure and automatically reroute the clients to the replica. The entire process isn’t instantaneous, but the clients will be back in business in a matter of minutes rather than hours or even days.
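The failover behavior amounts to trying replicas in priority order and silently falling back when the primary is unreachable. A rough sketch, with hypothetical server names and a stand-in for the real network health check:

```python
replicas = [r"\\miami-fs\data", r"\\nola-fs\data"]
down = {r"\\miami-fs\data"}      # simulate the Miami outage

def is_reachable(server):
    # Stand-in for a real network probe of the file server.
    return server not in down

def resolve(replica_list):
    # Walk the replicas in priority order; hand the client the
    # first one that responds.
    for server in replica_list:
        if is_reachable(server):
            return server
    raise IOError("no replica of the data is reachable")

print(resolve(replicas))          # prints \\nola-fs\data
```

Because clients address the namespace rather than a server name, the rerouting needs no client-side reconfiguration at all.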
Replication has been around for a long time, and it’s used for a variety of purposes. Essentially, there are two different types of replication: block replication and file-based replication.
Block replication tends to be used for the block storage system that’s used by many databases. Block replication is usually hardware based. File-based replication, on the other hand, focuses on individual files rather than blocks of information and is controlled by the operating system rather than the hardware.
Both types of replication get the job done, but each has some problems. First, replication tends to work well between identical devices or identical file systems, but difficulties arise when replicating between heterogeneous (different) environments. StorageX solves this problem by supporting heterogeneous environments. Therefore, you can replicate data between any Windows-controlled storage systems without having to worry about compatibility.
Replication scheduling and bandwidth throttling
Another problem with replication is that it can eat up a lot of bandwidth. For example, suppose that you needed to copy 100 GB of data between two network servers. If you replicated the data in the normal manner, your network’s available bandwidth and overall performance may suffer from the amount of traffic the operation generates. Of course, if you’re using switches rather than traditional hubs, or some other form of VLAN segmentation, the replication traffic won’t congest the rest of the network. However, any clients that still need to communicate with the servers involved in the operation will feel the loss of bandwidth.
StorageX solves this problem by incorporating replication scheduling and bandwidth throttling. This means that you can schedule replication to take place at a time of day (or night) when it will have the least impact on users. You can also use throttling to limit bandwidth consumption so that the replication process doesn’t consume all of the server’s available bandwidth.
One other problem with traditional data replication is that, depending on your environment, it can be difficult to tell whether replication was completely successful, partially successful, or if it never occurred at all. StorageX solves this problem by providing a comprehensive but easy-to-use management interface. This lets administrators not only schedule replication, but also see replication progress reports.
System requirements and pricing
StorageX has relatively few system requirements. The server that’s running the main StorageX console must be running Windows 2000 with Service Pack 2 or higher, and it must have at least 20 MB of free hard disk space. You’ll also need Internet Explorer 5.5 or later, and a video resolution of no less than 800x600 with 256 colors. However, NuView strongly recommends greater color depth.
StorageX is licensed on a per-root-node basis. This means that you must purchase a license for each server that you intend to manage. Unfortunately, the NuView Web site doesn’t list the price of these licenses. Instead, you must contact NuView to speak with a sales representative, which you can arrange through the Specifics section of their Web site. You can also contact a NuView reseller by phone or by snail mail. The contact information is as follows:
- StorNet Inc.
11104 West Airport Blvd.
Stafford, TX 77477
- Total Tec Systems, Inc.
2 Gourmet Lane
Edison, NJ 08837
If you’re the type who’d rather try before you buy, NuView offers a 30-day evaluation version of the software that you can download. Unfortunately, this isn’t a hassle-free download. You must fill out a long registration form, and a sales person will contact you and give you a temporary license key to use with the trial software.