I’ve written a lot about iSCSI in these
storage columns, and I believe it to have a very bright future, especially
with 10-Gb Ethernet on the way and the significant difference in acquisition
costs between a complete iSCSI installation and a similar fibre channel setup.
In general, storage technologies have tended to be complex and
somewhat misunderstood. iSCSI, while simpler, still retains some of the
complexity and adds some new concepts to which you will need to become accustomed
if you implement it. In this article, I’ll provide some information about iSCSI
for those of you who may be considering this quickly growing technology.
Some people consider iSCSI to be the melding of a dedicated storage
area network (SAN), like fibre channel, with network attached storage (NAS).
This is probably because iSCSI uses Ethernet as its transport mechanism. That
said, on a scale with NAS at one end and SAN at the other, iSCSI is definitely
much closer to the SAN end. Whereas NAS devices work at the file
level (that is, the server asks the device for files), iSCSI works
at the block level, exactly like a locally connected disk. This means that
iSCSI can be used in situations inappropriate for NAS, such as for some database
applications and Exchange. (Although Exchange can use NAS file-level devices, block-level storage is definitely
the preferred mechanism.) When you attach a server to an iSCSI array, the storage
looks to the server just like a local disk, thus providing a seamless storage
experience.
Many iSCSI arrays offer the following features:
Multiple path capability: Like other storage technologies, iSCSI offers the capability
for multiple data paths to provide redundancy and greater throughput. Instead
of a single-point-of-failure, gigabit Ethernet connection, you can install
multiple gigabit Ethernet adapters in your servers, and provide a fairly
inexpensive, fully-meshed storage architecture that, when everything is up and
running, also offers aggregated bandwidth to the iSCSI target for improved
performance.
Ethernet jumbo frame support: This isn’t really an iSCSI technology, but it does make iSCSI
perform better. Jumbo frames carry more data per frame than the standard 1,500-byte
Ethernet payload, which reduces the per-frame overhead on both your
servers and iSCSI targets. Jumbo frames are generally 9K in size, but some NICs
and switches support 16K frames as well.
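To put rough numbers on that overhead, here is a minimal Python sketch that compares a standard frame with a jumbo frame. It rests on assumptions that are mine, not any vendor's: an MTU of 9,000 bytes for the "9K" frame, plain IPv4 and TCP headers with no options, no VLAN tag, and no accounting for iSCSI's own headers. The function names are made up for illustration.

```python
import math

# Assumed per-frame costs (no VLAN tag, no IP or TCP options):
ETHERNET_WIRE_OVERHEAD = 38  # preamble 8 + header 14 + FCS 4 + inter-frame gap 12
IP_TCP_HEADERS = 40          # IPv4 header 20 + TCP header 20


def payload_per_frame(mtu):
    """Bytes of TCP payload that fit in one frame at the given MTU."""
    return mtu - IP_TCP_HEADERS


def frames_needed(data_bytes, mtu):
    """Number of frames required to carry data_bytes of payload."""
    return math.ceil(data_bytes / payload_per_frame(mtu))


def wire_efficiency(mtu):
    """Fraction of the bytes on the wire that are payload (full frames)."""
    return payload_per_frame(mtu) / (mtu + ETHERNET_WIRE_OVERHEAD)


if __name__ == "__main__":
    one_mib = 1024 * 1024
    for mtu in (1500, 9000):
        print(mtu, frames_needed(one_mib, mtu), round(wire_efficiency(mtu), 4))
```

Under these assumptions, moving 1 MiB takes 719 standard frames but only 118 jumbo frames, so the server and the target each have roughly one-sixth as many frames to process. The gain in raw wire efficiency is smaller, from about 95 percent to about 99 percent.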
Snapshots: Generally available for an additional (and sometimes huge) cost on fibre channel gear, snapshots
are the primary reason that many companies opt for a centralized storage
architecture. In most iSCSI equipment, the snapshot feature is included in the
base price of the product. Snapshots provide significant data protection in
that they can protect your data between backups. Without snapshot capability,
many companies have a “window of risk” of 24 hours or more, meaning
that, between backups, data loss can occur. For example, if you have database
corruption in your ERP system at 5 P.M. and you restore from the previous
evening’s backup, you could lose 17 hours of data, or more. Using snapshots, if
your database blew up at 5 P.M., you could remount a snapshot from 4 P.M. and,
while there would still be some data loss, it would be much more limited. Snapshots
significantly reduce your window of risk.
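The arithmetic behind that window of risk can be sketched in a few lines of Python. The dates, the midnight backup time, and the hourly snapshot schedule are assumptions chosen to match the 5 P.M. example, and `data_loss_window` is a name invented for this illustration rather than part of any product.

```python
from datetime import datetime


def data_loss_window(failure, recovery_points):
    """Time between a failure and the latest recovery point at or before it."""
    usable = [point for point in recovery_points if point <= failure]
    if not usable:
        raise ValueError("no recovery point precedes the failure")
    return failure - max(usable)


if __name__ == "__main__":
    # Hypothetical day: corruption is discovered at 5 P.M.
    failure = datetime(2024, 1, 15, 17, 0)

    # Assumed schedule: one backup at midnight, snapshots hourly from 1 A.M. to 4 P.M.
    nightly_backup = [datetime(2024, 1, 15, 0, 0)]
    hourly_snapshots = [datetime(2024, 1, 15, hour, 0) for hour in range(1, 17)]

    print(data_loss_window(failure, nightly_backup))
    print(data_loss_window(failure, nightly_backup + hourly_snapshots))
```

With only the midnight backup, the window is 17 hours. Adding the hourly snapshots shrinks it to one hour, which is the reduction described above.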
Replication: Again, this feature is available on fibre channel SANs, often
at a hefty cost, but is usually bundled with iSCSI SANs. Replication is an
important disaster recovery element that can automatically copy data from your
primary data center to another similar SAN array in a backup data center. With
iSCSI’s comparatively inexpensive price tag, true hot-site disaster recovery
becomes a real possibility for small- and medium-size businesses. Both
synchronous and asynchronous replication are generally supported. Synchronous
replication is considered “real-time”: each write is copied to the second array
as it happens, so it is best suited to fast links. Asynchronous replication is
usually scheduled and copies changes in batches, which makes it more suitable
for remote offices on slower links.
Scalability: Enterprise-class iSCSI SAN arrays can scale in both storage and performance when you
add units to the storage cluster. For example, in my data center,
we have a single EqualLogic PS200E array with multiple gigabit Ethernet connections.
When I eventually add a unit to the SAN cluster (even a single unit goes into a
cluster), the two arrays will automatically restripe all data across both units,
and the available bandwidth to the overall cluster will double since I will be
doubling the number of gigabit Ethernet connections.
Ethernet: This is the simplest part of iSCSI. It runs on Ethernet. The chances are pretty
good that you and your coworkers are intimately familiar with Ethernet, both
its good points and its bad points. With iSCSI, you don’t need to learn yet
another transport mechanism. You can rely on inexpensive, commodity hardware
and can get your storage network running quickly and easily.
iSCSI is beginning to make inroads into companies that may not
have considered this technology before. With its growing popularity, it’s
important for IT people to understand some of its features and benefits.