SOCs take the guesswork out of fighting digital bad guys

A Security Operations Center (SOC) is becoming an essential part of IT departments. Find out why, and get a roadmap for implementing one.

Image courtesy of Alissa Torres and SANS

If you're familiar with the name Alissa Torres, that means you're interested in computer forensics and incident response, which is her specialty as a certified SANS instructor. At Shmoocon 2014, Torres teamed up with Jake Williams, another scary-smart SANS instructor, and the two gave a presentation outlining a concept tool that would allow cybercriminals to cover their tracks by altering the contents of a computer's memory.

Torres just published a SANS white paper discussing another security subject -- Building a World-Class Security Operations Center: A Roadmap -- that I would also like to share with you.

If detection isn't working, response had better

She starts out by noting two things people in charge of security are coming to grips with:

  • If a security incident can't be prevented, those responsible had better be able to detect the incident and respond quickly.
  • Increased spending does not guarantee more or better security.

"Achieving the goal of better security depends on how a budget is allocated; what people, procedures, and infrastructure are put into place; and how the security program is managed and optimized over the long term," writes Torres. One way to get better security, according to Torres, is to implement a Security Operations Center (SOC).

A SOC roadmap

Torres admits that building a SOC from scratch may seem like a formidable task, but it does not have to be. To start, Torres suggests laying out a roadmap, mentioning, "The goal of planning should be to execute regular incremental improvements based on your completed gap analysis and to establish a series of prioritized milestones that lead the organization toward optimized security and improved incident detection and response."

I have heard gap analysis mentioned in reference to business planning. Margaret Rouse clarifies how gap analysis applies to IT, defining it as:

"A method of assessing the differences in performance between a business' information systems or applications to determine whether the system requirements are being met and, if not, the steps needed to do so. Gap refers to the space between 'where we are' (the company's present state) and 'where we want to be' (its target state)."

The Triad of Security operations

With the gaps now known, Torres recommends considering the removal of each gap as an incremental improvement. Eliminating gaps may require one or more elements from what Torres calls the Triad of Security: people, processes, and technology.
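
To make the idea concrete, here is a minimal Python sketch of a gap analysis turned into prioritized milestones. It is not from Torres' white paper: the Gap class, the 0-to-5 maturity scale, the sample entries, and the "largest gap first" ordering are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Gap:
    """One line of a gap analysis (hypothetical structure)."""
    name: str
    element: str   # "people", "processes", or "technology"
    current: int   # present state, on an assumed 0-5 maturity scale
    target: int    # target state, on the same assumed scale

    @property
    def size(self) -> int:
        # Distance between "where we are" and "where we want to be".
        return max(self.target - self.current, 0)


def prioritized_milestones(gaps):
    """Return only the open gaps, largest first.

    Python's sort is stable, so gaps of equal size keep their input order.
    """
    open_gaps = [g for g in gaps if g.size > 0]
    return sorted(open_gaps, key=lambda g: g.size, reverse=True)


if __name__ == "__main__":
    # Sample data is invented for the example.
    gaps = [
        Gap("24x7 alert monitoring", "people", current=1, target=3),
        Gap("Central log collection", "technology", current=0, target=4),
        Gap("Escalation workflow", "processes", current=3, target=3),
    ]
    for step, gap in enumerate(prioritized_milestones(gaps), start=1):
        print(f"Milestone {step}: {gap.name} ({gap.element}), gap of {gap.size}")
```

Each milestone closes one gap, which matches the incremental approach described above. A real roadmap would likely weigh cost and risk as well, not just the size of the gap.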


For the foreseeable future, people will be an important part of any response to a digital incident. The complexity of today's incidents most likely will require help from: an in-house security team; a Managed Security Service Provider (MSSP); and specialists who provide surge incident response support.

In August 2014, Torres published a white paper titled Incident Response: How to Fight Back. Part of the research for the paper included the survey question: What resources does your organization utilize in responding to incidents? The results were telling -- over 50% of the participants employed a dedicated response team (in-house), but also relied on a surge staff (third-party consultants) to help with critical incidents.

SOC job descriptions

Next, Torres looks at the kinds of expertise required to staff a SOC.

  • Alert Analyst (Tier 1): Continuously monitors the alert queue; triages security alerts; monitors the health of security sensors and endpoints; collects data and context necessary to initiate Tier 2 work.
  • Incident Responder (Tier 2): Performs deep-dive incident analysis by correlating data from various sources; determines if a critical system or data set has been impacted; advises on remediation; provides support for new analytic methods for detecting threats.
  • Subject Matter Expert/Hunter (Tier 3): Possesses in-depth knowledge on network, endpoint, threat intelligence, forensics, and malware reverse engineering, and the functioning of specific applications or underlying IT infrastructure; acts as an incident "hunter," not waiting for escalated incidents; closely involved in developing, tuning, and implementing threat detection analytics.
  • SOC Manager (Tier 4): Manages resources to include staff budget, shift scheduling, and technology strategy to meet SLAs; communicates with management; serves as organizational point person for business-critical incidents; provides overall direction for the SOC and input to the overall security strategy.


The importance of having team members understand their duties as they apply to the incident triage and investigative processes cannot be overstated. "By creating repeatable incident-management workflows, team members' responsibilities and actions from the creation of an alert and initial Tier 1 evaluation to escalation to Tier 2 or Tier 3 personnel are defined," writes Torres.
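
A repeatable workflow like the one Torres describes could be sketched in Python as follows. The Alert fields and the escalation rules are hypothetical; they only illustrate how the hand-off from Tier 1 to Tier 2 or Tier 3 might be written down so that every team member follows the same path.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    ALERT_ANALYST = 1       # Tier 1: triages the alert queue
    INCIDENT_RESPONDER = 2  # Tier 2: deep-dive incident analysis
    HUNTER = 3              # Tier 3: subject matter expert


@dataclass
class Alert:
    summary: str
    # Hypothetical triage findings recorded during Tier 1 evaluation.
    critical_asset: bool = False
    needs_forensics: bool = False


def escalation_path(alert: Alert) -> list:
    """Return the tiers that handle an alert, in order.

    Assumed rules: every alert starts at Tier 1; it goes to Tier 2 if a
    critical asset may be affected or forensics is needed; it continues to
    Tier 3 only when forensics or reverse engineering is needed.
    """
    path = [Tier.ALERT_ANALYST]
    if alert.critical_asset or alert.needs_forensics:
        path.append(Tier.INCIDENT_RESPONDER)
    if alert.needs_forensics:
        path.append(Tier.HUNTER)
    return path


if __name__ == "__main__":
    alert = Alert("Unknown binary on file server", critical_asset=True)
    print([tier.name for tier in escalation_path(alert)])
```

Writing the rules as code is only one option; the same decisions could live in a runbook or a ticketing system. What matters is that the path from alert creation to escalation is defined in advance.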

For suggestions on what works process-wise, Torres recommends looking at the DOE/CIAC model and NIST SP 800-61 Revision 2, "Computer Security Incident Handling Guide."


For incident response, Torres believes more data is better, suggesting, "An enterprise-wide data collection, aggregation, detection, analytic and management solution is the core technology of a successful SOC."

Torres continues:

"With the benefit of network, log, and endpoint data gathered prior to and during the incident, SOC analysts can immediately pivot from using the security monitoring system as a detective tool to using it as an investigative tool, reviewing suspicious activities that make up the present incident, and even as a tool to manage the response to an incident or breach."
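
The pivot from detection to investigation can be illustrated with a small Python sketch. Everything here is assumed for the example: the in-memory EVENTS list stands in for an aggregated data store, and the host names, messages, and keyword-matching rule are invented.

```python
from datetime import datetime, timedelta

# Hypothetical events already collected from network, log, and endpoint sources.
EVENTS = [
    {"time": datetime(2014, 9, 1, 8, 40), "host": "ws-12", "source": "network",
     "msg": "connection to unfamiliar external address"},
    {"time": datetime(2014, 9, 1, 8, 55), "host": "ws-12", "source": "endpoint",
     "msg": "new service installed"},
    {"time": datetime(2014, 9, 1, 9, 0), "host": "ws-12", "source": "log",
     "msg": "repeated failed logins"},
    {"time": datetime(2014, 9, 1, 9, 0), "host": "ws-30", "source": "log",
     "msg": "user logged in"},
]


def detect(events, keyword):
    """Detective use: flag events whose message contains a keyword."""
    return [e for e in events if keyword in e["msg"]]


def investigate(events, alert, lookback=timedelta(hours=1)):
    """Investigative use: pull earlier activity on the same host.

    Returns events from the alert's host inside the lookback window,
    up to and including the alert itself, oldest first.
    """
    start = alert["time"] - lookback
    related = [
        e for e in events
        if e["host"] == alert["host"] and start <= e["time"] <= alert["time"]
    ]
    return sorted(related, key=lambda e: e["time"])


if __name__ == "__main__":
    for alert in detect(EVENTS, "failed logins"):
        for event in investigate(EVENTS, alert):
            print(event["time"], event["source"], event["msg"])
```

The same stored data serves both functions, which is the point of the quote: the investigation works only because the earlier network and endpoint events were collected before the alert fired.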

The benefits

As I read the white paper, I was reminded of my time as a volunteer firefighter and EMT. Everyone knew their job. The technology was in place. We followed procedures. Otherwise, people got hurt or worse. This type of incident command and control works and is flexible enough to help fight fires -- real and digital.
