I think it is worth pointing out that you calculate the latency time from the RPM value:
10000/60 = 166.67 revolutions per second
1/166.67 = 0.006 s per revolution
so 1/2 a turn = 0.003 s
As the seeking by the head and the spinning of the disk occur concurrently, I would just take the slower of the two.
I remember reading Jeff Bonwick at Sun saying you can get the seek to keep pace with the platter spinning, but it's the linear velocity of the outside edge of 3.5" HDs that stops you spinning much faster (prompting a move to 2.5" enterprise disks).
http://blogs.smugmug.com/don/2007/10/08/hdd-iops-limiting-factor-seek-or-rpm/ (first comment)
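For anyone who wants to repeat the arithmetic above for other spindle speeds, here is a minimal Python sketch. The function name is made up for illustration, and it assumes a constant rotational speed.

```python
def rotational_latency_s(rpm: float) -> float:
    """Average rotational latency in seconds: the time for half a revolution."""
    revs_per_second = rpm / 60.0          # e.g. 10000 RPM -> 166.67 rev/s
    full_turn_s = 1.0 / revs_per_second   # time for one full revolution
    return full_turn_s / 2.0              # on average the sector is half a turn away


for rpm in (7200, 10000, 15000):
    print(f"{rpm} RPM: {rotational_latency_s(rpm) * 1000:.2f} ms")
```

For 10000 RPM this gives the same 3 ms figure as the hand calculation.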
This article is profoundly simplistic, and does not take into consideration numerous factors, including:
1) Degraded performance - How does a SATA vs. a SAS array perform when a bad block is encountered, or when there is degradation due to a failed disk?
2) Data corruption - SAS disks typically have 10X or more ECC bits than SATA drives. With multi-TB arrays, you pick up ECC errors quite frequently.
3) Drive controllers matter. More intelligent controllers with larger caches will generate fewer I/Os to disk, as they coalesce I/O requests. Furthermore, the added intelligence that the Cache control mode page (08h) provides in disks that use the SCSI protocol allows for tuning that one cannot do with SATA disks.
As such, for almost any given load, one can tune SAS disks to do fewer I/Os.
This paper considers I/Os to be absolute and not affected by the drive technology. That is simply not the case.
Finally, IOMETER does NOT analyze I/Os that are actually performed by the disk drive. While it is a nice tool, it cannot be used to measure physical disk I/Os. It only measures I/O requests sent by high-level read/write requests. It is also incapable of measuring actual I/Os performed by the disk drives behind a RAID controller.
You're right to point out that reassignments take a huge amount of time, but they just don't happen frequently enough to affect overall performance. The same goes for ECC errors. A lot of small ECC errors can even be corrected on the fly by the read logic without any extra action by the drive's microprocessor.
Smart controllers with cache do have an impact. In particular, faster controller microprocessors with faster data paths are real advantages. However, most storage systems have layers of cache (disk drive, controller, system, etc.) and there are diminishing returns to be had from cache. And when manufacturers publish their own benchmarks, they often arrange the tests so that as many of the IOs as possible are served out of cache. Thus, while cache does help in the real world, it often contributes to manufacturers overselling the performance of their storage subsystems.
You seem to have made the implicit assumption that while the head is seeking the proper sector is always rotating closer to the head. The truth is that about half the time the sector is rotating *away* from the head (i.e., the head comes on track but just misses the sector). The two latencies are additive, but there's a random component to just how additive. And even if the disk controller has perfect knowledge of where the head and sector are at all times, it can't do away with all rotational latency even though it can optimize the order in which IOs are performed to minimize overall latency. You first have to move the head to the proper track and then you still have to wait for the proper sector to come under the head. This is what makes modeling disk IOs so difficult.
Mr. Bonwick is correct about seek speeds versus rotational speeds. It comes down to moving a relatively light head stack versus spinning a relatively heavy platter stack.
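To put rough numbers on the "additive" point, here is a small Python sketch. It assumes the simplest possible model (average seek time plus half a revolution, no queuing or caching), and the 4.5 ms seek time is a made-up figure for illustration, not a measurement of any particular drive.

```python
def half_turn_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds (half a revolution)."""
    return 0.5 * 60_000.0 / rpm


def iops_additive(avg_seek_ms: float, rpm: float) -> float:
    """IOPS estimate when seek and rotational latency add up."""
    return 1000.0 / (avg_seek_ms + half_turn_ms(rpm))


def iops_slower_only(avg_seek_ms: float, rpm: float) -> float:
    """IOPS estimate if only the slower of the two latencies counted."""
    return 1000.0 / max(avg_seek_ms, half_turn_ms(rpm))


seek_ms = 4.5  # hypothetical average seek time
print(f"additive:    {iops_additive(seek_ms, 10000):.0f} IOPS")
print(f"slower only: {iops_slower_only(seek_ms, 10000):.0f} IOPS")
```

Under these assumptions, counting only the slower latency gives a noticeably more optimistic figure than adding the two.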
You are right! At the point the seek is done, you need to wait for the platter to spin into place, on average half a turn. I got all fired up, convinced I was right, damn hidden assumptions.
I have tried a couple of free add-ins that will remove all attachments from the Inbox plus subfolders to an archive destination while inserting a note that the attachment has been moved to "destination". This makes a major reduction in your PST file.
They work fairly well, except they all choke and die when they run into an e-mail that is digitally signed. Is there one out there that will ignore the signature and strip the attachment?
This is yet another article written for the database administrator which claims to have universal importance to all server administrators.
In our environment, random I/O is not the primary concern. Working with photographic images ranging in size from hundreds of megabytes to a few gigabytes, sequential read/write performance is key.
For sequential read/write performance, server system memory and network transport (we use bonded gigabit Ethernet at server and workstation) become highly critical to serving files to workstations. Indeed, workstation disk performance can be a limiting factor, as the server can push files faster over the network than the workstation can save them.
Other factors are clearly important as well. I mention these as they are ignored in the article.
While you're right that the type of IOs from the application level affects overall storage performance, what goes on at the actual storage device may not resemble the application's view. Writes on RAID5, for instance, require reads and calculation of parity as well as the actual writes. What you think is a sequential write at the application level is a whole lot of IOs at the storage device that involve factors such as extra rotational latency, etc. Virtual storage systems do all kinds of slicing and dicing of what the application thinks is a sequential write. The reason for this is that storage subsystems have to worry about space management, backup, overall performance, etc., things that the application programmer seldom worries about.
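As a rough illustration of why a small RAID5 write turns into several device IOs, here is a Python sketch of the read-modify-write parity update. It assumes plain XOR parity over equal-sized blocks; the function names are hypothetical, and real controllers add caching and full-stripe optimizations that are not modeled here.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together byte by byte (full-stripe parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Return the new parity for a single-block update.

    The device-level cost is four IOs: read old data, read old parity,
    write new data, write new parity.
    """
    if not (len(old_data) == len(old_parity) == len(new_data)):
        raise ValueError("blocks must be the same length")
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
parity = xor_blocks(d0, d1, d2)
new_d1 = b"\xaa\xbb"
new_parity = raid5_small_write(d1, parity, new_d1)
print(new_parity == xor_blocks(d0, new_d1, d2))
```

The final comparison checks that the incrementally updated parity matches the parity recomputed from the whole stripe.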
Most enterprise storage subsystems can be designed so that the data paths are not the limiting factor. While historically it has happened, it's rare that the data transfer interface to a disk drive cannot keep up with the fastest media transfer speed of the disk. It's then a question of how much saturation of the common data paths further upstream you want to tolerate versus how much you want to spend.
In general, it's been harder to speed up the media transfer speeds of disks than their data interfaces. The transition to vertical (perpendicular) disk recording, for example, took about a decade of R&D (roughly 1995 to 2005). In that time, faster parallel SCSI and then Fibre Channel interfaces went through several iterations and easily kept up.
In order to keep the article sane, I did choose to focus on random workloads since that's what most of the IO is in our data center. We don't do massive amounts of work with sequential workloads.
I didn't intend for the work to be considered as a general "all purpose" piece targeted at every administrator out there; I apologize if that wasn't clearly written.
Scott
You have fallen victim to the common misunderstanding about MTBF, which is perpetuated by the industry.
The error is partly due to the fact that the units are incorrectly stated. The units of MTBF are something like "unit-hours".
MTBF is a measure of how many units will fail in a given amount of time. Thus, an MTBF of 100,000 hours says, "out of 100,000 units, one unit will fail per hour." This is a useful measure for planning the number of replacement parts necessary (in rather large environments, perhaps).
Some examples: a battery which might only last a couple of hours might still have an MTBF of 100,000 hours. Of 100,000 batteries, one battery will "fail". All of the other batteries will still only last a couple of hours.
Another example: men aged 30 have an MTBF of roughly 1000 years. Out of 1000 men, one man will die per year. However, to the best of my knowledge, no man will live to be 1000 years old. As stated, MTBF is a useful measure for stocking purposes, or in this case with men, women need to keep approximately 1000 men "on hand."
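Read this way, MTBF turns into a spares-planning calculation. The Python sketch below assumes a constant failure rate across the population (no infant mortality or wear-out), and the fleet size and MTBF in the second example are hypothetical numbers.

```python
def expected_failures(units: int, hours: float, mtbf_hours: float) -> float:
    """Expected number of failures in a population over a period of time.

    Assumes a constant failure rate, so failures = unit-hours / MTBF.
    """
    return units * hours / mtbf_hours


# The example above: 100,000 units with an MTBF of 100,000 hours, over one hour.
print(expected_failures(100_000, 1, 100_000))

# Hypothetical fleet: 1,000 drives rated at 1,000,000 hours, over one year.
print(expected_failures(1_000, 24 * 365, 1_000_000))
```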
Would it have been more accurate if I had described it as the time until an unrecoverable read error occurs? That's what I was going for...
Scott
This is great information, but there are VERY few tools that help you understand how many IOPS your system is actually using and whether you are hitting that "wall". Hardware vendors always give you IOPS numbers and software vendors almost NEVER do. Even high-end enterprise arrays don't always provide a direct "iop-o-meter", especially when the LUN is virtualized across many drives (and shared). This is helpful, but not the holy grail of performance.
There is no mention of the size of the IOP in this equation. A 2k IOP is a lot different than a 32k, 128k, or 256k IOP. 2000 2k IOPS (4000k of data) is a lot less data than 2000 256k IOPS (512,000k of data).
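The relationship between IOPS, IO size and data moved is just a multiplication; a short Python sketch (function name chosen for illustration) makes the gap explicit:

```python
def throughput_kb_per_s(iops: int, io_size_kb: int) -> int:
    """Data moved per second, in KB, for a given IOPS rate and IO size."""
    return iops * io_size_kb


for size_kb in (2, 32, 128, 256):
    print(f"2000 IOPS at {size_kb}k = {throughput_kb_per_s(2000, size_kb)}k per second")
```

The same 2000 IOPS spans two orders of magnitude in throughput across these IO sizes.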
The IOPS formula also ignores the benefits of OS Optimization. There's no reason why the OS can't handle I/O operations in "disk order" or "availability order" instead of "request order."
With a little planning and foresight, the average request time can be improved substantially from the "average" performance in the article.
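As a sketch of what "disk order" instead of "request order" can buy, the Python below compares head travel for requests served in arrival order against a simple elevator-style (LOOK) sweep. It uses the distance between block addresses as a stand-in for seek cost and ignores rotational position, both of which are simplifying assumptions; the request list is invented for the example.

```python
def total_travel(start: int, order: list[int]) -> int:
    """Total head movement, in block-address units, for a given service order."""
    position, travel = start, 0
    for lba in order:
        travel += abs(lba - position)
        position = lba
    return travel


def look_order(start: int, requests: list[int]) -> list[int]:
    """Serve everything at or above the head on the way up, then sweep back down."""
    upward = sorted(r for r in requests if r >= start)
    downward = sorted((r for r in requests if r < start), reverse=True)
    return upward + downward


head = 50
pending = [95, 10, 60, 20, 80]  # hypothetical request queue, in arrival order
print("request order:", total_travel(head, pending))
print("disk order:   ", total_travel(head, look_order(head, pending)))
```

For this queue the sweep cuts head travel from 280 to 130 address units.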
Shouldn't the formula read (total workload * % Read workload) + (total workload * % Write workload * RAID Write Penalty)?
The formula above doesn't reflect the % write workload.
good information
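The formula as proposed in this comment can be sketched in Python as follows. The function and parameter names are illustrative, and the write penalty of 4 in the example is the commonly cited value for RAID 5, used here as an assumption rather than taken from the article.

```python
def backend_iops(total_iops: float, read_fraction: float, raid_write_penalty: float) -> float:
    """Disk-level IOPS needed to serve a front-end workload.

    (total * %read) + (total * %write * RAID write penalty)
    """
    write_fraction = 1.0 - read_fraction
    reads = total_iops * read_fraction
    writes = total_iops * write_fraction * raid_write_penalty
    return reads + writes


# Example: 1000 front-end IOPS, 70% reads, assumed write penalty of 4.
print(round(backend_iops(1000, 0.70, 4)))
```

With these inputs the 300 writes alone account for about 1200 of the roughly 1900 back-end IOs.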
I wrote a small application in flash that calculates IOPS overhead on array based on formula provided in article:
http://www.syshalt.net/pub/flash/iocalc.swf
Disclosure - NetApp Employee -
It's a personal pet peeve of mine, but most of the latency rules of thumb are at best only marginally informative, and at worst, downright misleading. I'm not having a go at the author or his sources, but this kind of stuff has been so often repeated that it has become a self supporting "truth".
These "one figure" IOPS figures ignore the effect of the advanced queuing and elevator algorithms that have been present in disk drives for the last 10 odd years, which means a SATA drive has about 10 IOPS at a 13ms latency and 130 IOPS at 100ms latency.
I covered that and the impact of various raid and caching configurations in one of my blog posts here
http://storagewithoutborders.com/2010/07/19/data-storage-for-vdi-part-2-disk-latencies/
Also, it should be noted that in enterprise arrays, because of SATA's unchangeable 512-byte block sizes, the usual checksum mechanisms used on FC disks (which use 520-byte blocks that include an 8-byte checksum) won't work. The checksum mechanisms used instead (e.g. slip mask on EMC, Zoned Checksum on NetApp) all impose some extra level of IO, which impacts both reads and writes.
Rules of thumb are all well and good, but they're no substitute for consultation between the people who truly understand the performance characteristics of your workload and the array architecture you are going to implement.
Regards
John Martin
There are also plenty of other points of contention within the host->array stack which make this whole subject more complicated. However, when comparing different drive speeds, calculating a worst-case scenario gives a good indication of how differing drives will perform.
Hi there,
We are looking for a high-IOPS device, specifically a hybrid device as outlined in the article, with options for HBAs and a 10 Gb/s switch fabric. Has anyone tested Open-E SAN storage and got any IOPS results?
http://www.sentralsystems.com/open-e-dss-storage-servers/
IBM Storwize V7000 Unified Storage does provide higher IOPS and I would like to compare the performance of Open-E system with the IBM. Ideally, performance of 100k IOPS or higher will be good.