Your main production server’s hard drive, which happens to hold the SYS volume, just crashed. Worried about how you can quickly restore the server?
If you have some simple documentation and a recent backup tape, you are well on your way to recovering from the potential disaster. While every situation is different, the following steps will help you restore your server to production service.
Verify that network time is synchronized. Checking this first keeps you from causing yourself more problems later. There will probably be few problems, since time should have been synchronized before the crash. However, if the crashed server was a time provider, you will need to designate a new one.
Verify that a Master replica is present for each partition. If Master replicas were stored on the crashed server, you must change a Read/Write replica from another server into a Master replica. Documenting replica placement before you have problems will make this decision easier.
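If your replica placement is documented in a form a script can read, working out which partitions have lost their Master takes only a few lines. The following is a minimal sketch, not a NetWare or NDS tool: the `Replica` record, the `plan_master_promotions` function, and the sample server and partition names are all hypothetical, and the script only reads your own documentation. The replica type change itself is still made with your NDS management tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One row of (hypothetical) replica placement documentation."""
    partition: str
    server: str
    replica_type: str  # "Master", "Read/Write", or "Read-Only"


def plan_master_promotions(replicas, crashed_server):
    """Map each partition whose Master was on the crashed server to a
    surviving server holding a Read/Write replica, or None if there is none."""
    plan = {}
    for lost in replicas:
        if lost.server != crashed_server or lost.replica_type != "Master":
            continue
        # Candidates: Read/Write replicas of the same partition on other servers.
        candidates = sorted(
            r.server
            for r in replicas
            if r.partition == lost.partition
            and r.server != crashed_server
            and r.replica_type == "Read/Write"
        )
        # Taking the first name alphabetically is an arbitrary illustrative choice.
        plan[lost.partition] = candidates[0] if candidates else None
    return plan


# Sample documentation; every name here is made up for illustration.
placement = [
    Replica("[Root]", "FS1", "Master"),
    Replica("[Root]", "FS2", "Read/Write"),
    Replica("OU=Sales", "FS2", "Master"),
    Replica("OU=Sales", "FS1", "Read/Write"),
    Replica("OU=Lab", "FS1", "Master"),
]

print(plan_master_promotions(placement, "FS1"))
```

With this sample data, `[Root]` maps to `FS2`, while `OU=Lab` maps to `None`, which flags a partition with no Read/Write replica left to promote.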
Delete the Server object. Using NDS Manager, delete the Server object of the crashed server. This method will remove the server and its replicas from all of the servers in the replica ring.
Delete the Volume objects. Using NetWare Administrator, delete all of the crashed server’s Volume objects.
Verify that there are no NDS errors. Check the partition continuity using NDS Manager. If there are errors, wait a little while before doing anything. Very often NDS errors will resolve themselves if they are given ample time to do so. If you receive –625 errors, NDS could be attempting to communicate with the deleted server. If the error does not resolve itself, you can follow these steps to remove the deleted server from the replica ring.
- Using NDS Manager, highlight the partition.
- Right-click, and select Partition Continuity.
- Highlight the deleted server.
- From the Repair menu select Remove Server.
- Select Yes to confirm your intentions.
- Perform these steps for each replica that the server held.
Install the new hard drive. The new hard drive must be the same size as, or larger than, the previous drive.
Install NetWare. Install NetWare using the same parameters that were used during the initial installation. At the very least, you must use the same server and volume names. Documenting the parameters used for all server installations will be invaluable in a situation like this.
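If you keep the installation parameters in a simple record, you can compare the rebuilt server against them before placing replicas back on it. The sketch below assumes a hypothetical record layout (`server_name`, `volumes`, `drive_mb`) and a hypothetical `check_rebuild` function; it checks only the points named above (the same server name, the same volume names, and a drive at least as large as the old one) and does not query the server itself.

```python
def check_rebuild(documented, rebuilt):
    """Return a list of problems found when comparing the rebuilt server's
    parameters with the documented ones. An empty list means no mismatch."""
    problems = []

    # Names are compared case-insensitively; that is an assumption of this sketch.
    if rebuilt["server_name"].upper() != documented["server_name"].upper():
        problems.append(
            "server name is %s, expected %s"
            % (rebuilt["server_name"], documented["server_name"])
        )

    # Every documented volume name must exist on the rebuilt server.
    rebuilt_volumes = {v.upper() for v in rebuilt["volumes"]}
    missing = sorted(
        v.upper() for v in documented["volumes"] if v.upper() not in rebuilt_volumes
    )
    if missing:
        problems.append("missing volumes: " + ", ".join(missing))

    # The new drive must be the same size as or larger than the old one.
    if rebuilt["drive_mb"] < documented["drive_mb"]:
        problems.append(
            "drive is %d MB, need at least %d MB"
            % (rebuilt["drive_mb"], documented["drive_mb"])
        )

    return problems


# Sample values; all of them are made up for illustration.
documented = {"server_name": "FS1", "volumes": ["SYS", "DATA"], "drive_mb": 4096}
rebuilt = {"server_name": "FS1", "volumes": ["SYS"], "drive_mb": 2048}

for problem in check_rebuild(documented, rebuilt):
    print(problem)
```

For the sample values, the check reports the missing `DATA` volume and the undersized drive.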
Place the replicas back onto the server. Depending on how many replicas are stored on the server and the speed of the LAN, this step may take some time. You can speed things up by placing only Read/Write replicas that are used for authentication. Restoring the original replica placement can be scheduled for non-business hours.
Restore time provider settings. This step is optional and can be performed at a later time, during non-business hours.
Restore the file system data from a backup tape. Using the last known good backup tape(s), restore the file system data. If the tape backup system does not restore trustee rights, they must be restored manually. You must also reassign home directories and recreate print queues.
Confirm the bindery context. If you have NetWare 3.x servers on the network, confirm that the bindery context is correct.
The network should be somewhat back to normal at this point, although it may take a day or two for you to resolve all of the residual issues. I would pay a little extra attention to the server and the network for a few days, just to ensure that everything is running normally. Hopefully, this is one procedure that you will never need to use.
Steve Pittsley is a Desktop Analyst for a Milwaukee, WI hospital. He enjoys playing drums, bowling, and most sports.