Editor’s note: This article was originally published March 9, 2006.
It’s been said that the database administrator (DBA) has three basic tasks. In decreasing order of importance, they are: protect the data, protect the data, and protect the data.
Although data integrity is clearly the #1 job (who cares if the database is available or fast if the data isn’t good), the DBA has many other jobs as well. Here’s a list of the actual tasks that a DBA performs. (Some duties are common to all DBAs, and others are only required in some database environments.)
Installation, configuration, upgrade, and migration
Although system administrators are generally responsible for the hardware and operating system on a given server, installation of the database software is typically up to the DBA. This job role requires knowledge of the hardware prerequisites for an efficient database server, and communicating those requirements to the system administrator. The DBA then installs the database software and selects from various options in the product to configure it for the purpose it is being deployed. As new releases and patches are developed, it’s the DBA’s job to decide which are appropriate and to install them. If the server is a replacement for an existing one, it’s the DBA’s job to get the data from the old server to the new one.
Backup and recovery
DBAs are responsible for developing, implementing, and periodically testing a backup and recovery plan for the databases they manage. Even in large shops where a separate system administrator performs server backups, the DBA has final responsibility for making sure that the backups are being done as scheduled and that they include all the files needed to make database recovery possible after a failure. When failures do occur, the DBA needs to know how to use the backups to return the database to operational status as quickly as possible, without losing any transactions that were committed. There are several ways the database can fail, and the DBA must have a strategy to recover from each. From a business standpoint, there is a cost to doing backups, and the DBA makes management aware of the cost/risk tradeoffs of various backup methods.
Because databases centralize the storage of data, they are attractive targets for hackers and even curious employees. The DBA must understand the particular security model that the database product uses and how to use it effectively to control access to the data. The three basic security tasks are authentication (setting up user accounts to control logins to the database), authorization (setting permissions on various parts of the database), and auditing (tracking who did what with the database). The auditing task is particularly important currently, as regulatory laws like Sarbanes-Oxley and HIPAA have reporting requirements that must be met.
Storage and capacity planning
The primary purpose of a database is to store and retrieve data, so planning how much disk storage will be required and monitoring available disk space are key DBA responsibilities. Watching growth trends are important so that the DBA can advise management on long-term capacity plans.
Performance monitoring and tuning
The DBA is responsible for monitoring the database server on a regular basis to identify bottlenecks (parts of the system that are slowing down processing) and remedy them. Tuning a database server is done on multiple levels. The capacity of the server hardware and the way the operating system is configured can become limiting factors, as can the database software configuration. The way the database is physically laid out on the disk drives and the types of indexing chosen also have an effect. The way queries against the database are coded can dramatically change how fast results are returned. A DBA needs to understand which monitoring tools are available at each of these levels and how to use them to tune the system. Proactive tuning is an attitude of designing performance into an application from the start, rather than waiting for problems to occur and fixing them. It requires working closely with developers of applications that run against the database to make sure that best practices are followed so good performance will result.
When things do go wrong with the database server, the DBA needs to know how to quickly ascertain the problem and to correct it without losing data or making the situation worse.
In addition to these basic responsibilities, some DBAs need special skills because of how the database is being used.
With the advent of the Internet, many databases that could have been available only during the day are now required to be available 24 hours a day, 7 days a week. Web sites have changed from static, pre-defined content to dynamically created content, using a database to create the page layout at the time a page is requested. If the Web site is available 24×7, so must the underlying database. Managing a database in this environment requires an understanding of which types of maintenance operations can be done online (with the database available to users) and which must be scheduled for a maintenance “window” when the database may be shut down. It also requires planning for redundant hardware and/or software components, so that when one fails, others will keep the total system available to its users. Techniques like online backups, clustering, replication, and standby databases are all tools the DBA can use to ensure higher availability.
Very Large Databases (VLDBs)
As companies have found more and more uses for database technology, they tend to save more data. Also, the type of data stored in databases has changed, from structured data in neat rows and columns to unstructured data such as documents, images, sound files, and even fingerprints. Both trends have the same result: larger databases. Managing a VLDB requires special skills of the DBA. The time required to do simple operations like copying a table can be prohibitive unless done correctly. The DBA needs to understand techniques like table partitioning (Oracle), federated databases (SQL Server), or replication (MySQL) to enable a database to scale to large sizes while still being manageable.
Data Extraction, Transformation, and Loading (ETL)
In data warehouse environments, a key task is efficiently loading the data warehouse or data mart with large volumes of data extracted from multiple existing production systems. Often these production systems have different formats than the standardized definitions in the data warehouse, so data must be transformed (or “cleansed”) before loading. Extracting the data may or may not be the DBA’s responsibility in a given company, but making sure what is extracted is useful is, and the DBA is a key part of the team.
The bottom line for IT leaders
Increasing feature sets in popular database products have steadily increased the number of things a DBA has to master. IT leaders who check for skills in the above key areas can make sure the DBA they hire is up to the task.