The H1N1 virus (aka swine flu) epidemic had much less business impact than many had predicted. The mass exodus of employees due to a co-worker confirmed to be sick from the virus didn’t happen at my place of business, or at any of our 500 locations. This was good news for the business, the employees, and the patients in our facilities. It was bad news, however, for business continuity planning teams, even those who “do it right.” They had realized there were gaps in their plans, but the significantly scaled-back impact of H1N1 on business, together with the limited scope of related remediation projects, helped put those remediation plans on hold. Our company was no exception.
When the media initially began reporting the possibility of a near-term pandemic, senior management began asking questions about our readiness. Most managers responsible for risk quickly assured them that our disaster recovery (DR) plan was tested and ready. This, of course, is the standard answer when asked about business continuity event readiness. However, there was a big gap in our plan. The threat of absences of key personnel across the enterprise was something we hadn’t considered.
Our tested and up-to-date DR plan allowed us to recover critical systems within the constraints of maximum allowable downtime. Further, our secondary office space allowed corporate personnel to actually access the temporary data center. Finally, we could stand up any server in the data center with current build documentation. But what if critical personnel were unable to get to the office or to the secondary office space?
H1N1 doesn’t attack servers and other infrastructure. Rather, it comes at business continuity from a completely different vector, attacking the human component of critical business processes. It would seem an easy fix. Just give everyone remote access and get on with the important stuff. But it wasn’t that easy. (Is it ever…?)
We had an SSL VPN solution in place for a limited number of users. Our infrastructure could handle over 2000 concurrent connections. However, company and security policy limited access to only those employees requiring remote access to perform their job functions (i.e., mobile workers). The problems began when, as the director of IS security, I proposed we plan for expanding remote access to a larger user population.
Our SSL solution was designed to allow access to email, home folders, and our intranet. Although there was limited access to a very small number of applications, access to core processes like financial and payroll systems was not implemented. So we began looking for a way to implement reasonably secure (and inexpensive) remote access for key personnel who stayed home, either to care for a family member or because the corporate office was closed due to the spread of H1N1.
It took months to come up with a technical plan. The delay was not caused by skill set shortcomings in our network engineering team. Instead, it happened because H1N1 didn’t seem as important to project planners as other activities. So work related to expanded remote access was given a low priority. Eventually, our engineers came up with a tested method of allowing access. It was elegant in its simplicity.
Figure A is a simple concept diagram of our solution. In Step 1, a home user enters the URL for our SSL appliance and uses his or her network login for authentication. In Step 2, the SSL appliance uses a script to determine the machine name of the authenticated employee. And in Step 3, the employee is connected to his or her office desktop. The desktop connection provides remote control, giving the user access to all authorized applications just as if he or she were sitting in the office.
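To make Step 2 concrete, here is a minimal sketch of the kind of lookup such a script might perform. It is not our production script. The function name `resolve_desktop`, the sample logins, and the machine names are all hypothetical, and the in-memory dictionary merely stands in for whatever data store actually holds the mapping.

```python
from typing import Optional

# Hypothetical sample data: network login -> office desktop machine name.
# A real deployment would read this mapping from a managed data store.
USER_TO_DESKTOP = {
    "jsmith": "HQ-DESK-0142",
    "mlopez": "HQ-DESK-0207",
}


def resolve_desktop(user_id: str, authenticated: bool) -> Optional[str]:
    """Return the desktop assigned to an authenticated user, else None."""
    # Step 1 must succeed first: never resolve a machine for an
    # unauthenticated session.
    if not authenticated:
        return None
    # Step 2: normalize the login and look up the assigned machine name.
    return USER_TO_DESKTOP.get(user_id.strip().lower())


if __name__ == "__main__":
    # Step 3 (opening the remote-control session to this machine) would be
    # handled by the SSL appliance and is not shown here.
    print(resolve_desktop("JSmith", authenticated=True))
```

Returning `None` for an unknown or unauthenticated user keeps the failure case explicit, so the caller can refuse the connection rather than fall back to some default machine.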
This is not a good permanent solution for allowing remote access. For one thing, allowing users to access desktops requires strong physical security controls at the office. Although our floors are accessed via locked doors, and although you have to pass a security desk to get to an elevator, physical access controls for a healthcare company are not the same as those implemented for a national defense facility. Consequently, enhanced security guard and employee awareness regarding their roles in preventing unauthorized physical access to office spaces had to be part of the modified business continuity plan.
Everything was moving forward until we needed application development to build the production database that would contain the table cross-referencing user IDs with desktop machine names. The director of application development and his boss decided that there wasn’t enough need. So they tabled the entire project.
Yes, we had enough documentation to turn on access within 24 to 48 hours. However, none of the users were trained, and no user documentation was available to provide assistance. The Help Desk would be very, very busy if the proposed plan were implemented.
The problem I have with this scenario is the loss of an opportunity. We lost the opportunity to implement a solution that goes beyond H1N1. Because the project focused only on a potential pandemic, a very narrow scope, management lost interest. But what if the value of the project had been increased by expanding the scope? Is the problem of absences related only to pandemics? Hardly.
This company, like many others, had one person and a backup trained to perform each business critical task. But what if both individuals were out? The process we designed for H1N1 also provided a solution for this scenario. And what about a case in which the data center is still operational, but users are unable to make it to the office? Again, this solution meets the challenge.
The missed opportunities arose because we began this project as H1N1 planning, not key employee absence planning. Although we tried to recover when we eventually saw our mistake, management had already moved on. They saw this only as an H1N1 issue. The scope was too narrow to provide sufficient business value to keep it moving forward.
The takeaway from this story is simple. Increasing the scope of business continuity planning activities to include a wider range of possibilities helps management see the risk mitigation value of the effort. This is important if you don’t want to see your project slide into hold status, or worse.