In this Help Net Security interview, Ken Gramley, CEO of Stamus Networks, explains the main causes of alert fatigue in cybersecurity and DevOps environments. Alert fatigue is caused by the sheer volume of event data generated by security tools, the prevalence of false positives, and the lack of clear event prioritization and actionable guidance.
What are the main causes of alert fatigue in cybersecurity and DevOps environments?
Alert fatigue is the result of several interrelated factors.
First, today's security tools generate vast amounts of event data, making it difficult for security personnel to distinguish serious threats from background noise.
Second, many systems are prone to false positives triggered by benign activity or overly sensitive anomaly thresholds, which can desensitize defenders and cause them to miss important attack signals.
Third, there is often a lack of clear prioritization: the systems that generate these alerts frequently provide no mechanism to triage and rank events, leaving personnel unsure of where to start and prone to inaction.
Finally, when alert records and logs don't contain sufficient evidence or response guidance, defenders don't know what actionable next steps to take. This confusion wastes valuable time and leads to frustration and fatigue.
Mitigating alert fatigue is a major challenge for organizations. How can you optimize your security tech stack to overcome this challenge?
This is really hard to do. Unfortunately, we have seen organizations decide to log all alerts and only inspect them if an incident is detected by a more reliable system. Oftentimes, logged alert data contains traces of evidence that could be important in an incident investigation, but this “save and ignore” approach is not an ideal solution.
The three most critical components of a modern Security Operations Center (SOC) are a Network Detection and Response (NDR) system, an Endpoint Detection and Response (EDR) system, and a central analysis engine (typically a Security Information and Event Management (SIEM) system). Each of these elements of the so-called “SOC visibility triad” plays a key role in reducing alert fatigue.
NDR and EDR systems need reliable mechanisms to identify severe and imminent threats in their respective domains with a very high degree of accuracy – that is, near-zero false positives – which builds confidence in the toolset and gives security analysts a clear starting point for their investigations. They should also provide some form of automated event triage or prioritization, allowing SOC teams to surface the events that most urgently need to be investigated.
Finally, NDR and EDR should collect all relevant artifacts related to a particular security event and, if possible, correlate and organize them into an incident timeline to expedite investigations and enable defenders to eradicate threats before they cause damage.
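To make the triage idea concrete, here is a minimal sketch in Python of how an NDR or EDR pipeline might score and rank events. The field names, categories, and weights are hypothetical, not taken from any specific product.

```python
from dataclasses import dataclass, field

# Hypothetical severity weights; real products derive priority from detection
# confidence, threat intelligence, and where the activity sits in the kill chain.
SEVERITY_WEIGHTS = {"informational": 1, "suspicious": 5, "imminent_threat": 10}

@dataclass
class SecurityEvent:
    asset: str                       # hostname of the affected asset
    category: str                    # e.g. "informational", "suspicious", "imminent_threat"
    asset_is_critical: bool = False  # from an asset inventory, if available
    artifacts: list = field(default_factory=list)  # PCAPs, protocol logs, file hashes

def triage_score(event: SecurityEvent) -> int:
    """Rank events so analysts start with the most serious, best-evidenced ones."""
    score = SEVERITY_WEIGHTS.get(event.category, 0)
    if event.asset_is_critical:
        score += 5                          # attacks on critical assets jump the queue
    score += min(len(event.artifacts), 3)   # more supporting evidence, higher priority
    return score

events = [
    SecurityEvent("laptop-042", "suspicious"),
    SecurityEvent("db-server-01", "imminent_threat", asset_is_critical=True,
                  artifacts=["session.pcap", "http.log"]),
]
for e in sorted(events, key=triage_score, reverse=True):
    print(e.asset, e.category, triage_score(e))
```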
Because NDR and EDR are critical sources of security telemetry for the SIEM, this is where the next level of alert-fatigue mitigation comes in. The individual event records or logs that NDR and EDR send to the SIEM must contain enough metadata to provide the SIEM analysis engine and its users with all the evidence and context required for incident response efforts. These detailed event records can also feed an additional layer of correlated threat detection in the SIEM itself.
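As an illustration of what "enough metadata" might mean, the sketch below shows a hypothetical enriched NDR event record. The field names and values are illustrative only, not a documented schema for any particular NDR or SIEM.

```python
import json

# Hypothetical enriched event record: alongside the detection itself, it carries
# the asset context, protocol transactions, and artifact references an analyst
# would otherwise have to look up in other systems.
event_record = {
    "timestamp": "2024-05-21T14:03:11Z",
    "alert": {"signature": "Possible beaconing to rare external host",
              "severity": "imminent_threat"},
    "src": {"ip": "10.20.1.57", "hostname": "finance-ws-12", "owner": "finance"},
    "dst": {"ip": "203.0.113.40", "domain": "cdn-update.example"},
    "evidence": {
        "http": {"method": "GET", "uri": "/pixel.gif", "user_agent": "Mozilla/5.0"},
        "pcap_ref": "artifacts/2024/05/21/flow-88213.pcap",
    },
    "recommended_action": "Isolate the host and review the beacon interval before eradication.",
}

# In practice a record like this would be shipped to the SIEM as structured JSON
# (for example over syslog or a log forwarder); here we simply serialize it.
print(json.dumps(event_record, indent=2))
```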
How can organizations leverage contextual information to enrich alerts and make them more actionable?
This is important. There are several types of context that can help here. Organization-specific information, such as hostnames or well-known network names, makes it much easier to identify assets that are under attack or being used to spread malware than IP addresses alone do. If this context is not included in the alert record or logs, analysts have to pivot to another system to look it up.
Another form of context is associated metadata and artifacts – things like protocol transaction logs, file attachments, or even a full packet capture (PCAP) of the session in which the alert occurred.
This additional information has proven to help SOC personnel more quickly assess the severity, source, and cause of an incident, making these alerts more actionable.
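As a minimal sketch of this kind of enrichment – using a made-up asset inventory and field names rather than any vendor's API – an alert keyed only by IP address can be annotated with asset context before it ever reaches the analyst:

```python
# Hypothetical asset inventory; in practice this context might come from a
# CMDB, DHCP leases, or a directory service.
ASSET_INVENTORY = {
    "10.20.1.57": {"hostname": "finance-ws-12", "owner": "finance", "critical": False},
    "10.20.2.10": {"hostname": "dc-01", "owner": "it-infrastructure", "critical": True},
}

def enrich_alert(alert: dict) -> dict:
    """Attach hostname and ownership context so analysts don't have to pivot to another system."""
    context = ASSET_INVENTORY.get(alert.get("src_ip"), {})
    return {**alert, "asset_context": context}

raw_alert = {"src_ip": "10.20.2.10", "signature": "Possible lateral movement via SMB"}
print(enrich_alert(raw_alert))
```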
How can organizations balance the need for transparency with the potential risk of leaking sensitive information?
This topic is very close to my heart. At Stamus Networks, we are very committed to radical transparency and data sovereignty, which are elements of this question. That said, balancing transparency with information security is a tricky tightrope for organizations to walk. There are many strategies that organizations can employ, but here are a few that stand out to me and that I commonly see in the practices of successful security leaders:
First, build your control program based on a recognized security framework such as NIST or ISO 27001. This not only ensures that you create a defensible program, but also ensures that you consider the big picture and don't forget important controls.
Second, these organizations incorporate extensive security monitoring into their systems and networks, which helps them spot serious threats and malicious activity early in the kill chain.
Additionally, these organizations develop clear, transparent communication plans outlining what information can and cannot be shared, which builds trust and avoids confusion within the organization and with stakeholders.
Finally, these organizations practice what I call “extreme data sovereignty” – being particularly careful about, and tightly controlling, where their data is stored and processed.
What role do regulatory requirements and industry standards play in promoting transparency and accountability in cybersecurity?
Regulatory requirements and industry standards play a key role in promoting transparency and accountability by encouraging both breach disclosure and the implementation of strong cybersecurity controls. Regulations such as SEC Form 8-K filings in the United States and GDPR in the European Union mandate reporting of data breaches to authorities and, in some cases, to affected individuals. This encourages organizations to openly report security incidents, raising public awareness and preventing potential cover-ups.
SEC 10-K filing requirements compel public companies to disclose details of their cybersecurity programs. Similarly, the EU's NIS Directive, which focuses on critical service providers, requires risk management measures to be implemented. Having visibility into these controls allows stakeholders (and shareholders) to assess an organization's cybersecurity posture and hold it accountable for maintaining strong defenses.
How can organizations leverage new technologies and frameworks to improve transparency and accountability?
The previously mentioned “SOC visibility triad” – NDR, EDR, and SIEM – is one key set of technologies that can help. These systems continuously monitor the network and endpoints for suspicious activity, allowing threats to be identified and mitigated more quickly. Real-time threat detection increases transparency by enabling organizations to communicate about ongoing threats and the actions being taken.
We've already mentioned the importance of a cybersecurity framework. It helps organizations identify, protect, detect, respond to, and recover from cyberattacks. By publicly outlining a framework-based approach, organizations can demonstrate their commitment to cybersecurity and hold themselves accountable for following established processes.