Anyone who uses Microsoft Access is likely to have had problems with database corruption. While developers typically use more advanced database tools for large projects, Access is still used for smaller projects and also serves as a back-end data store by some third-party products. This article, the first in a three-part series on working with Access, looks at how Access is used and why it is so prone to corruption.
Access is a “file system” relational database, storing every part of the database—tables, queries, forms, reports, etc.—inside one MDB file. It’s very rare to see corruption on a local database, where a single copy of the MDB file is stored on the hard disk and only worked on from that machine. This solution is unacceptable, however, because data applications are most effective only if they can be shared over a network. You’ll most often find Access being put to work in a multiuser environment.
Access and the multiuser environment
The problem with a multiuser environment is that it is distributed, and the MDB file has to be shared between many users in different locations on the network. As an example, I want to run a query on a table (let’s call it tblExample) that contains 20,000 records. The MDB file has been stored on a server and shared across the network, so anyone can access it at any time. I run the query, which realizes that it’s dealing with tblExample and pulls the table containing 20,000 records across the network to my workstation.
Unfortunately, that’s how Access works. Even if your query is only returning three records from tblExample, the entire 20,000 records have to be transferred to the workstation, which then works on these records and returns the relevant result set. (I should mention here for Access purists that things are not always this simple, such as when indexes are involved. In those cases, the database engine has to move far less data across the network. That does assume, however, that you have a lot of well-designed indexes and that you never want data that hasn’t been indexed, and it doesn’t even touch on the performance hit of rebuilding the index each time data is altered. For the purposes of this article, it’s enough to know that the all the data—not just the results—have to travel across the network.)
Database corruption is often the result of a user improperly disconnecting from a database. When tables are frequently flying around the network, you’re leaving a pretty massive window for this kind of error. And when Access gets mad at somebody disconnecting without first closing the connection properly, it takes it out on everyone. When one user makes a “mistake,” everybody else is either kicked out or, worse, begins receiving corrupted data.
Corruption happens when a user unexpectedly quits the application without closing the connection “gracefully.” This can be caused by a power outage or a crash that forces a reboot of the PC (such as a Blue Screen of Death). It can also be caused by having to force Access out of the PCs’ memory, when it crashes or freezes. And Access, even in its most recent version, is not 100 percent reliable (especially for those who like to match version numbers and run it under Windows 2000 it seems). Even more likely to cause corruption, however, is when your server bites the dust. If a user leaves a connection to the database open, then we’re only talking in terms of one connection; if multiple users are working on the database and the server crashes, then multiply your chances of getting a corrupt database by the number of users. You would not expect a server to crash as regularly as a workstation, but if you use Windows, a server crash is probably on the horizon. If the server doesn’t crash, you’ve still got failed disks, controllers, and other hardware components inside the server—although Access corruption will be the least of your woes if your server fails.
Check out your network
In the real world, the network is the most likely cause of a dropped connection. Old network cables, overheating and overworked switches or hubs, creaky old network cards, and badly written device drivers are the most common problem areas—and should be addressed, since these failures are probably corrupting not only Access but also other areas. You should also look out for human error on the network. Make sure that cleaners and maintenance personnel are aware that hubs and switches (let alone servers) must not be unplugged at night and that patch leads have to remain in the computer, even if they do get tangled in the broom.
You also need to take a close look at your database for sources of corruption, especially if it’s complicated. Access uses Visual Basic for Applications as its programming language, which is an extremely powerful solution allowing you to build fully featured applications from a user-friendly shell. In some ways, this is the downfall of Access. It is not really robust enough to be pulled around in this way. While VB is pretty forgiving of small mistakes, the JET database engine frankly isn’t. If you open an ADO (or DAO) recordset object, then you must close it afterwards; otherwise, you will leave the connection open just as if the machine had crashed. You could see frequent corruption problems if a function is used several times an hour by 50 users.
All of the above factors and more can combine to cause problems. If your network is shaky in the first place, and you’re leaving ADO recordsets open left and right, then you’re more likely to see corruption than if only one of the above scenarios is true. Corruption is not just something to think about when it’s happening. If you have mission-critical systems running Access, now is the time to be taking precautions.
What is your experience with Microsoft Access?