One IT team, when faced with high-priced options from vendors, decided to develop its own high-speed backup network. Here's how they did it.
Mainstream storage packages are too expensive for many companies, in terms of both cost and effort. At a recent Dell CIO briefing (concerning Dell’s recent agreement with EMC to sell storage), I was reminded of my own past struggles to find affordable, high-performance storage options. One instance in particular was especially rewarding. If Storage Area Network (SAN) technology hadn’t been too expensive and Network Attached Storage (NAS) had had higher performance levels, my team would never have had the incentive to create the high-capacity backup solution that we did. Here is how we did it.
A little background
As the IT manager for a global software company, I was always under pressure to improve performance and reduce costs. Since my company had standardized on leasing equipment, I was always looking for three-year solutions that matched the lease terms of our equipment and solutions that provided vendor support for the same period. The existing lease was ending and I needed to look for our next three-year solution.
Our file storage requirements were large, growing, and critical, especially for functional areas like Product Development, which needed a flexible system for storing multiple product releases and work in progress. The file system had to be 100 percent available and secure, and restores had to be quick. The existing solution had been Novell 4.11 on two HP servers with manual failover and a cabinet full of disk arrays using RAID 5, with about 200 gigabytes of storage that was projected to double every two years. Application servers usually required 20-100 gigabytes, which was stored locally.
We did backups using Veritas across a 100-megabit network with two servers and a host of DLT drives. We ran them at night, but on some days the backup times started to exceed the backup windows, which affected performance for users. The environment called for full backups because they made the kinds of restores we needed much easier. I learned that the key to staying within the backup window was keeping the tape drive running at full speed without stopping to wait for data. The drives had to be fed data as fast as they could accept it. This is where our then-current solution fell short.
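To put the backup-window problem in rough numbers, here is a back-of-the-envelope sketch of how long a full backup of the roughly 200 gigabytes mentioned above would take at line rate. It assumes decimal gigabytes and a network that is the only bottleneck. The `efficiency` parameter is a made-up knob for protocol overhead and competing traffic, not a measured figure, and real DLT drive speeds are not modeled.

```python
def transfer_hours(data_gb: float, link_mbit_per_s: float, efficiency: float = 1.0) -> float:
    """Best-case hours needed to move data_gb gigabytes over a network link."""
    megabits = data_gb * 1000 * 8                        # decimal GB -> megabits
    seconds = megabits / (link_mbit_per_s * efficiency)  # efficiency is an assumed fudge factor
    return seconds / 3600


full_backup_gb = 200  # file storage size quoted in the article

fast_ethernet = transfer_hours(full_backup_gb, 100)   # the existing 100-megabit network
gigabit = transfer_hours(full_backup_gb, 1000)        # copper gigabit

print(f"100 Mbit/s: {fast_ethernet:.2f} h, 1 Gbit/s: {gigabit:.2f} h")
# prints: 100 Mbit/s: 4.44 h, 1 Gbit/s: 0.44 h
```

These are theoretical ceilings. In practice the tape drives, disks, and backup software set their own limits, so actual gains will be smaller than the tenfold difference in link speed suggests.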
Deciding on and building the solution
Before I started the storage and backup project, I talked to several vendors. Their solutions were attractive (I really liked their backup solutions and Fibre Channel speeds), but they were just too expensive. I asked my team why we couldn’t develop our own high-speed backup network and rely on more conventional storage from Dell, especially since Windows 2000 clusters were available, fast, reliable, and administratively strong with Active Directory. It seemed to me that we could also use low-cost disk arrays to handle incremental storage requirements, along with low-cost copper gigabit networking. I knew that even if continuing with decentralized disk storage was less efficient, we could meet our performance needs at a fraction of the cost of any SAN available at the time. My team went to work and researched various solutions with our partners.
As the team was building the solution, we faced several obstacles. Isolating the backup network required some tweaking, and we found that the solution worked for everything except the Exchange server and the Active Directory domain controllers. The Exchange 2000 cluster was set up as “active failover,” and we had to back it up on the front-end network. It also required its own active restore server in the domain with its own dedicated DLT tape drive. The domain controllers also had to be backed up on the front-end network because the multiple IP addresses caused problems on the public network.
Figure A shows a high-level view of the architecture used for the separate backup network. Each server had two network interface cards (NICs), one for the backup network and a second for the front-end network. Figure A shows how the backup server could also access those special servers, like Exchange, that needed to be backed up over the front-end network and also the placement of the Exchange restore server. It also shows the relationship of the backup and front-end networks.
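The dual-NIC arrangement can be pictured as a small lookup: each server has a front-end address and a backup-network address, and a flag marks the exceptions that must be backed up over the front-end network. This is only an illustrative sketch. The server names, IP addresses, and the `frontend_only` flag are all hypothetical and do not come from the actual environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    frontend_ip: str             # NIC on the front-end (user) network
    backup_ip: str               # NIC on the isolated backup network
    frontend_only: bool = False  # True for exceptions such as Exchange and domain controllers


def backup_address(server: Server) -> str:
    """Return the address the backup server should use to reach this server."""
    return server.frontend_ip if server.frontend_only else server.backup_ip


# Hypothetical inventory for illustration only.
servers = [
    Server("fileserver", "10.0.0.10", "192.168.50.10"),
    Server("exchange", "10.0.0.20", "192.168.50.20", frontend_only=True),
    Server("dc1", "10.0.0.5", "192.168.50.5", frontend_only=True),
]

for s in servers:
    print(f"{s.name}: back up via {backup_address(s)}")
```

Keeping the exceptions as data rather than special cases in the backup jobs makes it easy to see at a glance which servers still put load on the front-end network.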
The hardware and software components of the solution included:
- Dell Servers
- Dell SCSI Disk Arrays
- Dell Tape Library
- Veritas Backup Software
- Microsoft Windows 2000 Servers and Clustering
- Foundry Networks Copper Gigabit Switch
I wanted to make sure that my company had a master list of all required backups and that reporting would show the success of these backups over time and by application. I assigned a team member the responsibility of making sure performance standards were met. He took backups seriously because I’d put an incentive plan in place.
My team was required to take the information from the backup logs and import it into a Microsoft SQL Server database, to define the correct status codes, assign application codes consistent with other reporting I needed, get my approval on performance standards, then set up Microsoft Excel pivot tables that would let me monitor performance over time.
The hard part was converting the Veritas log files into a format that could be loaded into the database. I must say this is also a problem with other “enterprise solution vendors.” They just don’t have built-in options for exporting their logs in a database-friendly format. The value of the database, of course, is that it tracks information over time and can be combined with other sources for complete, integrated IT performance reporting.
Our project included these steps:
- Setting up an automated procedure for copying the backup logs from the backup server to a central location on the front-end network
- Writing programs to read the logs and load pertinent information into the SQL tables
- Establishing success values (Verified, Backed Up, Failed, and Not Started)
- Establishing percent completed (number of files selected vs. number of files actually backed up)
- Establishing major IT service/application linkage (linking to the IT reporting system and showing backup and restore performance by major services, e.g., file server by location, Sales Intranet, etc.)
- Updating the master reporting database weekly, so the information would be available to me on Monday mornings
- Integrating the information into the master IT reporting system
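The load-and-report steps above can be sketched roughly as follows. This is not the original Access program. It uses Python with an in-memory SQLite database as a stand-in for SQL Server 2000, and the pipe-delimited log format, table name, column names, and sample rows are all invented for illustration (real Veritas logs look different and need their own parser). Only the four status values come from the list above.

```python
import sqlite3

# Success values from the project; everything else here is hypothetical.
VALID_STATUSES = {"Verified", "Backed Up", "Failed", "Not Started"}

# Invented, simplified log format:
# date|server|application code|status|files selected|files backed up
SAMPLE_LOG = """\
2002-03-04|fs01|FILE-HQ|Verified|12000|12000
2002-03-04|web01|SALES-INTRANET|Backed Up|800|760
2002-03-04|app01|ERP|Failed|500|0
"""


def parse_line(line):
    """Split one log line into typed fields and validate the status code."""
    date, server, app_code, status, selected, backed_up = line.strip().split("|")
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return date, server, app_code, status, int(selected), int(backed_up)


def load(conn, log_text):
    """Create the table if needed and insert one row per log line."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS backup_job (
               run_date TEXT, server TEXT, app_code TEXT, status TEXT,
               files_selected INTEGER, files_backed_up INTEGER)"""
    )
    rows = [parse_line(line) for line in log_text.splitlines() if line.strip()]
    conn.executemany("INSERT INTO backup_job VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def percent_complete_by_app(conn):
    """Percent of selected files actually backed up, per application code."""
    cur = conn.execute(
        """SELECT app_code,
                  100.0 * SUM(files_backed_up) / SUM(files_selected)
           FROM backup_job GROUP BY app_code ORDER BY app_code"""
    )
    # SQLite yields NULL (None) when nothing was selected; pass that through.
    return {app: (round(pct, 1) if pct is not None else None) for app, pct in cur}


conn = sqlite3.connect(":memory:")
loaded = load(conn, SAMPLE_LOG)
report = percent_complete_by_app(conn)
print(loaded, report)
# prints: 3 {'ERP': 0.0, 'FILE-HQ': 100.0, 'SALES-INTRANET': 95.0}
```

Once the rows are in a table keyed by date and application code, a pivot table or a simple GROUP BY query is enough to track success rates over time.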
Basically, the backup reporting system used the following software components:
- Microsoft SQL Server 2000
- Microsoft Access Programming Language
- Microsoft Excel Pivot Tables
- Server scheduling and copying of the Veritas log files (to move copies of the daily logs from the backup network to the front-end network so that the Access program could decode them and load them into the database)
- Veritas Enterprise Backup Software
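The scheduled log-copy component might look something like the sketch below, which copies any log files not yet present in the central location. The directory names, the `*.log` pattern, and the file name in the demo are assumptions. In practice a function like this would be run from the server's task scheduler against real network shares rather than a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def copy_new_logs(source_dir: Path, dest_dir: Path, pattern: str = "*.log") -> list[str]:
    """Copy log files that are not already in dest_dir; return the names copied."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(source_dir.glob(pattern)):
        dst = dest_dir / src.name
        if not dst.exists():          # skip logs copied on an earlier run
            shutil.copy2(src, dst)    # copy2 also preserves file timestamps
            copied.append(src.name)
    return copied


# Demo against a throwaway directory tree (hypothetical paths and file name).
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "backup_server_logs"
    dest = Path(tmp) / "central_logs"
    source.mkdir()
    (source / "job_0001.log").write_text("sample log contents\n")

    first_run = copy_new_logs(source, dest)
    second_run = copy_new_logs(source, dest)  # nothing new, so nothing is copied

print(first_run, second_run)
# prints: ['job_0001.log'] []
```

Skipping files that already exist keeps the job safe to rerun, which matters when a scheduled copy fails partway through.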
So what happened?
I was very pleased with the results. The file server in a Windows 2000 server cluster configuration was rock solid, and Active Directory met our administrative needs. The backup network worked better than anticipated, reduced overtime expense, and let us keep staffing at the same level. We implemented the solution for about $25,000, as opposed to more than $100,000 for comparable Fibre Channel SAN solutions. Backup times improved by about 40 percent, but more importantly, we gained the ability to back up servers at any time, eliminating the traditional off-hours backup window.
The master list of required backups was completed. The new reporting system provided the needed information, and problems were quickly identified and taken care of by the person in charge.