The cost of disk access
While there is a general rule here about efficiency vs. cleanliness, this is also an example of how most programmers don't understand just how much repeated disk access costs them. Had Mr. Perrin understood this originally, he would have recognized the potential problem when he first wrote the code, and would at least have checked its impact as the database grew larger.

This is a classic case of efficiency versus cleanliness. Many OSes, such as Linux, use I/O APIs that deliberately hide the underlying peripheral device. There's good reason for this abstraction, but at some point the ugliness of the real world kicks in. I once worked on an FTP server written by a Linux guru that treated disk accesses just like memory accesses. He implicitly assumed that the server was processor- and memory-bound rather than disk-bound, and repeatedly reread the database from the disk. When I rewrote the code to minimize disk accesses, its performance improved by 40%. I don't think my code was any more complex than the original; it just reflected a better understanding of which resources were the most limiting.
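
To make the difference concrete, here is a minimal sketch of the two approaches in Python. It is not the FTP server's code: the flat `key=value` file format and the names `lookup_rereading`, `CachedTable`, and `disk_reads` are assumptions made up for illustration. The first function rereads the file on every lookup; the class parses it once and reloads only when the file's modification time or size changes.

```python
import os


def lookup_rereading(path, key):
    """Naive version: scan the file from disk on every single lookup."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            k, sep, v = line.rstrip("\n").partition("=")
            if sep and k == key:
                return v
    return None


class CachedTable:
    """Hypothetical cache: parse the file once, reload only when it changes."""

    def __init__(self, path):
        self.path = path
        self._stamp = None    # (mtime_ns, size) of the copy held in memory
        self._rows = {}
        self.disk_reads = 0   # counts full reads of the file, for illustration

    def _load(self):
        rows = {}
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                k, sep, v = line.rstrip("\n").partition("=")
                if sep:
                    rows[k] = v
        self._rows = rows
        self.disk_reads += 1

    def lookup(self, key):
        # A stat call is far cheaper than rereading and reparsing the data.
        st = os.stat(self.path)
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp != self._stamp:
            self._load()
            self._stamp = stamp
        return self._rows.get(key)
```

With `CachedTable`, a thousand lookups against an unchanged file cost one full read instead of a thousand. The change check is deliberately simple and can miss an edit that keeps the same size within the filesystem's timestamp resolution, so treat it as a starting point rather than a complete invalidation scheme.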
Posted by wilback
15th Jun 2011