Disaster recovery (DR) and business continuation are every CIO's concern, but IT culture and risk mitigation/avoidance practices differ distinctly between the legacy computing staff (mainframes, mini-computers) and, across the aisle, the staff who run Windows/x86 distributed computing networks and servers.
"A major reason we decided to move to a mainframe in our data center was that we had service level commitments guaranteeing uptime and performance to our clients," said one Software as a Service (SaaS) provider. "To achieve these metrics, we made the move from distributed x86 servers to mainframe computing, and we noticed an immediate difference in the quality of vendor response to performance issues."
I have found this to be true in visits to IT shops using distributed computing. When it comes time to evaluate the currency and even the existence of DR and business continuation plans for Windows/x86 platforms, these plans are often missing or incomplete.
One reason is that Windows/x86 computing has historically hosted office systems like word processing and email, which haven't been regarded as being as mission-critical as the systems of record that run on mainframes and mini-computers. Consequently, businesses have skipped making DR plans for x86 servers that run Windows systems.
In other cases, distributed computing IT staffs arrive with little or no knowledge of some of the DR and risk management practices and disciplines that characterize the mainframe and mini-computer worlds, so they don't appreciate the importance of risk management and disaster avoidance.
In still other cases, vendors for distributed platforms in these environments have product and market approaches based on constant innovation that comes at the expense of delivering high-quality products that consistently perform and seldom fail. Unsurprisingly, they also fail to provide robust solutions and toolsets for the more mundane areas of computer management, like DR planning and execution.
Michael Tweddle, senior director of product management for Dell, talked about this a couple of weeks ago in the context of Microsoft Active Directory, which runs on x86 servers.
"In a recent survey, when we asked Active Directory customers about testing and orchestrating DR for Active Directory servers, many of them responded that they don't even test the software because they are fearful of possible impacts to their production environment once they move out of test," said Tweddle. "In some cases, these fears are preventing them from moving to newer versions of the software."
The fears are not unfounded. Tweddle cited one example in which an Active Directory service patch ran without problems in a test environment, but when it was moved to production, it inexplicably wiped out an entire series of email addresses.
"This is a real DR and risk management concern," Tweddle said, "because we are finding from survey results that 87 percent of production interruption or DR incidents for Active Directory occur at the Forest or Domain levels, where there is the most potential for adverse impact. When these incidents happen, we find that sites don't have a robust DR plan, or a reliable way of testing new changes or software versions in a 'lab' environment that accurately mirrors what they have in production."
To address the Active Directory challenge, Dell offers a solution that automates much of the testing process for Active Directory and can also mirror production, wiping out fears (and not data and infrastructure). Dell's Recovery Manager for Active Directory Forest Edition comes with built-in rules for DR testing and management that leverage many of the established best practices found in mainframe and mini-computer environments.
Solutions like this couldn't be more opportune, because businesses are steadily moving more mission-critical IT to x86 computing environments that include Windows.
"It is important for sites to understand the criticality of keeping systems up and running," said Tweddle. "In some cases, like airline reservations, one minute of downtime can mean one million dollars of revenue loss."
Mary E. Shacklett is president of Transworld Data, a technology research and market development firm. Prior to founding the company, Mary was Senior Vice President of Marketing and Technology at TCCU, Inc., a financial services firm; Vice President of Product Research and Software Development for Summit Information Systems, a computer software company; and Vice President of Strategic Planning and Technology at FSI International, a multinational manufacturing company in the semiconductor industry. Mary is a keynote speaker and has more than 1,000 articles, research studies, and technology publications in print.