Detection Engineering Maturity Matrix

Kyle Bailey (@kylebailey22)

Detection engineering has long been a function of the incident response team. However, over the last several years it has gained momentum, becoming a dedicated and more well-defined role within many security operations teams. Many great articles and presentations exist (see Medium) on the purpose of detection engineering and how it fits into a broader Security Operations team. This matrix aims to help the community better measure the capabilities and maturity of their detection function and provide a high-level roadmap for organizations looking to either build a team or expand an existing one. (https://kyle-bailey.medium.com/detection-engineering-maturity-matrix-f4f3181a5cc7)

Additional Resources:
- 2021 SANS Blue Team Summit: Measuring Detection Engineering Teams
- 2022 BSidesSF: Detection-as-Code: Why it works and where to start

You can find V1 of this matrix on GitHub.
Each subcategory below is assessed across three maturity levels: Defined, Managed, and Optimized.
👩‍💻 People: Team

Defined:
- Ad-hoc team building/managing detection content (i.e., an as-needed task owned by incident response or other security individuals).
- SMEs in none or very few detection domains (e.g., network, host, cloud).

Managed:
- Dedicated individuals performing detection work full time (this could be detection & response engineers on a rotation or a dedicated team).
- SMEs on some tools & log sources; domain ownership is informally defined.

Optimized:
- Dedicated team with defined SMEs for all detection domains (host, network, application, cloud, etc.).

👩‍💻 People: Leadership

Defined:
- Leadership has a basic understanding of detection processes and challenges, but limited resources or competing priorities may exist.

Managed:
- Leadership advocates for detection as a dedicated function.
- The size of the team and the resources needed may not be fully understood.

Optimized:
- Leadership advocates for involvement in detection processes across the business, as well as for the necessary tools, licensing, and staff.

🏭 Process: Detection Process

Defined:
- Detection strategy, workflow, and process do not exist.
- Detection quality depends greatly on the understanding of the individual performing the work.
- No backlog or prioritization of known gaps.
- Little or no active maintenance or monitoring of existing detection.

Managed:
- A detection strategy and workflow are defined and followed.
- Approval and handoff processes are loosely defined for releasing new detection.
- Work is prioritized in an ad-hoc way with little to no input from threat intel or others.
- Maintenance and monitoring are performed but are ad-hoc and generally reactive.

Optimized:
- Detection strategy is defined and continuously iterated on.
- Defined review and approval processes exist for new and updated detection, and incident response is heavily involved prior to deployment.
- Work is prioritized using input from threat intel and from threat modeling with technology SMEs.
- Maintenance and monitoring are continuous, and most SIEM and detection failures are identified proactively.

🏭 Process: Metrics

Defined:
- Little to no detection-related metrics exist.
- Metadata and aggregation methods or tools may not exist to collect or report on metric data.
- Detection is inconsistently mapped to MITRE ATT&CK or another control framework, but there is no formal tracking.

Managed:
- Some metrics exist for categories such as alert fidelity, mean time to detect, and automated resolutions.
- Alert metadata is consistently applied but may have room for improvement, or is not consistently aggregated into actionable data.
- Detection metadata includes MITRE ATT&CK TIDs, but aggregation of coverage may not occur or is manual.

Optimized:
- KPIs are well-defined and goals are well understood.
- Automated methods exist for collecting and reporting on most metrics.
- Applicable MITRE ATT&CK TID coverage is automatically tabulated and displayed across nearly all detection sources (a simple aggregation is sketched below).
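As an illustration only, here is a minimal sketch of automated TID coverage tabulation, assuming each detection rule lives in version control as a YAML file with an attack_tids metadata list (the directory layout and field name are assumptions, not part of the matrix):

```python
# Illustrative sketch: tally MITRE ATT&CK technique coverage from detection
# metadata. Assumes each detection is a YAML file exposing an "attack_tids"
# list; the layout and field name are hypothetical.
from collections import Counter
from pathlib import Path

import yaml  # pip install pyyaml


def tid_coverage(detections_dir: str) -> Counter:
    """Count how many detections reference each ATT&CK technique ID."""
    coverage = Counter()
    for rule_file in Path(detections_dir).glob("**/*.yml"):
        rule = yaml.safe_load(rule_file.read_text()) or {}
        for tid in rule.get("attack_tids", []):
            coverage[tid] += 1
    return coverage


if __name__ == "__main__":
    for tid, count in sorted(tid_coverage("detections/").items()):
        print(f"{tid}: {count} detection(s)")
```

A scheduled job that writes this output to a dashboard or an ATT&CK Navigator layer is one way to get the "automatically tabulated and displayed" coverage described above.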

📱 Technology: Visibility

Defined:
- Visibility across the environment is not well understood.
- Visibility is inconsistent, and some sources critical for custom detection may be missing.
- Tool/SIEM limitations prevent some log sources from being ingested.

Managed:
- Log sources are cataloged and prioritized.
- Most critical log sources are available in the SIEM.

Optimized:
- The detection team defines critical log sources and ensures they are present in the SIEM.
- Log providers agree to SLAs on outages and delivery latency (a sample catalog entry is sketched below).
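To make "cataloged and prioritized" concrete, here is a hypothetical shape for a log source catalog; the field names and SLA values are illustrative, not a required schema:

```python
# Hypothetical log source catalog entry; field names and SLA values are
# illustrative only.
from dataclasses import dataclass


@dataclass
class LogSource:
    name: str
    owner: str                 # team responsible for delivery
    criticality: str           # e.g., "critical", "high", "medium"
    in_siem: bool              # currently ingested into the SIEM?
    latency_sla_minutes: int   # agreed maximum delivery delay


catalog = [
    LogSource("aws_cloudtrail", "Cloud Platform", "critical", True, 30),
    LogSource("okta_system_log", "Identity", "critical", True, 15),
    LogSource("saas_audit_logs", "IT", "medium", False, 120),
]

# Known gaps: critical sources that are not yet in the SIEM.
gaps = [s.name for s in catalog if s.criticality == "critical" and not s.in_siem]
print("Missing critical sources:", gaps)
```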

📱 Technology: SIEM

Defined:
- Log outages and detection errors/failures are not alerted on or well known.
- Log ingest delays are not tracked.

Managed:
- Log source health alerting exists for critical sources.
- Most log sources are delivered to the SIEM within minutes, but overall log latency may not be well understood.

Optimized:
- The SIEM allows for an expressive language to enable complex detection logic.
- Most detection logic is run in real time.
- Robust log health alerting exists for all sources (a latency check is sketched below).
- Most log sources are delivered to the SIEM within minutes; those that are not are due to provider limitations.
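One way to approach "robust log health alerting" is a periodic latency check per source. The sketch below assumes a helper that can query the SIEM for the newest event timestamp per source, which will differ by platform:

```python
# Minimal sketch of log health monitoring: flag sources whose newest event in
# the SIEM is older than its agreed latency threshold. latest_event_time() is
# a placeholder for whatever search API the SIEM exposes.
from datetime import datetime, timedelta, timezone

LATENCY_THRESHOLDS = {
    "aws_cloudtrail": timedelta(minutes=30),
    "edr_telemetry": timedelta(minutes=5),
}


def latest_event_time(source: str) -> datetime:
    """Placeholder: return the timestamp of the newest event for a source."""
    raise NotImplementedError("replace with a SIEM search API call")


def stale_sources() -> list[str]:
    """Return every source whose ingest delay exceeds its threshold."""
    now = datetime.now(timezone.utc)
    return [
        source
        for source, threshold in LATENCY_THRESHOLDS.items()
        if now - latest_event_time(source) > threshold
    ]
```

Paging on critical sources and ticketing the rest is what turns this from reactive cleanup into proactive detection of SIEM and pipeline failures.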

📱 Technology: Detection-as-Code

Defined:
- Detection-as-code principles are not followed or prioritized.

Managed:
- Detection-as-code principles are used as a north star, but technical components (e.g., testing) may not be fully built out or utilized.
- Detection is stored in version control and deployed to production using a CI/CD pipeline.
- Linting & testing in CI occur in a very limited way or not at all.

Optimized:
- Detection-as-code is ingrained in the team culture.
- Version control, review and approval, linting, and testing are baked into the deployment pipeline (a rule-plus-test sketch follows below).
- Some detection logic is continuously tested end-to-end in an automated way.
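As a sketch of what the version control, review, linting, and testing loop can look like, here is a hypothetical Python rule and the kind of unit test a CI job would run before deployment. The event shape and the rule() interface are assumptions; platforms such as Panther or a Sigma-based pipeline each define their own conventions:

```python
# suspicious_console_login.py -- hypothetical detection-as-code rule.
def rule(event: dict) -> bool:
    """Alert on console logins without MFA from outside the corporate network."""
    return (
        event.get("event_type") == "console_login"
        and not event.get("mfa_used", False)
        and event.get("src_network") != "corporate"
    )


# test_suspicious_console_login.py -- run by CI (e.g., pytest) on every commit,
# alongside linting, before the pipeline deploys the rule to the SIEM.
def test_login_without_mfa_fires():
    assert rule({"event_type": "console_login", "mfa_used": False,
                 "src_network": "home"})


def test_mfa_login_does_not_fire():
    assert not rule({"event_type": "console_login", "mfa_used": True,
                     "src_network": "home"})
```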

🧙‍♂️ Detection: Threat Operations

Defined:
- Threats are not simulated for building or testing detection; historical data is used to build new detection.
- Red or purple teaming does not occur outside of mandated pentests.

Managed:
- Detection creation is prioritized loosely using threat data (known, likely threats).
- Purple team exercises are run ad-hoc, potentially loosely driven by threat intel or incidents.
- Red team testing occurs on a regular basis.

Optimized:
- Detection creation is prioritized based on known and active threats to the org as identified by threat intel, with risk input from other teams (e.g., security engineering, architecture, risk).
- Purple team exercises are constantly run to validate and improve detection and response capabilities.
- Mature red team capabilities work closely with detection and response daily.

🧙‍♂️ Detection: Detection Content

Defined:
- Primarily relies on out-of-the-box detection content; little to no customization or custom rules are built.
- Detection is primarily IOC-focused; few behavioral TTP detections exist.

Managed:
- Detection content is tuned to the environment where applicable.
- More behavior-based detections exist; new detection is mostly TTP-focused (where possible).

Optimized:
- Focused primarily on behavioral/TTP detection logic. ML and statistical detection models are applied where applicable.
- There is an intense focus on risk-based alerting, leveraging as much lateral (additional events) and external context (enrichment) as possible to estimate the appropriate risk of an alert (a scoring sketch follows below).
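To illustrate risk-based alerting, here is a minimal scoring sketch that folds enrichment signals into a single risk value; the signal names, weights, and threshold are made up for illustration and would be tuned per environment:

```python
# Illustrative risk scoring: combine enrichment signals into one value instead
# of firing on each raw event. Signals, weights, and the threshold are
# hypothetical.
RISK_WEIGHTS = {
    "asset_is_crown_jewel": 40,
    "user_recently_phished": 25,
    "source_ip_on_threat_list": 20,
    "first_seen_process_hash": 15,
}


def risk_score(enrichment: dict) -> int:
    """Sum the weights of every enrichment signal that evaluated to true."""
    return sum(weight for signal, weight in RISK_WEIGHTS.items()
               if enrichment.get(signal))


enrichment = {"asset_is_crown_jewel": True, "source_ip_on_threat_list": True}
if risk_score(enrichment) >= 50:  # escalation threshold, tuned per environment
    print("escalate to incident response")
```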

🧙‍♂️ Detection: Response Experience

Defined:
- All detection logic is treated equally in priority.
- All alerts must be manually reviewed and interacted with by the incident response team or on-call.
- Little to no enrichment occurs in alerts.
- More alerts are generated daily than can be handled by incident response.

Managed:
- The priority of an alert is clearly displayed for responders.
- Alert enrichment occurs from most data sources and is actively being expanded.
- Limited workflows that can be used for automated remediation exist but are expanding.

Optimized:
- Alerts are enriched with the relevant context in order to provide incident response with an accurate risk picture.
- New detection must have automation and enrichment before being released to production (an enrichment-and-triage sketch follows below).
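For illustration, here is a minimal sketch of the enrichment-and-triage step that would run before an alert reaches a responder; the lookup functions are stubs standing in for real integrations (CMDB, threat intel, identity provider, SOAR), and the routing rules are assumptions:

```python
# Hypothetical enrichment and automated triage ahead of human review.
def lookup_asset(hostname: str) -> dict:
    return {"criticality": "standard"}   # stub: CMDB / asset inventory lookup


def lookup_indicator(ip: str) -> dict:
    return {"allowlisted": False}        # stub: threat intel lookup


def enrich(alert: dict) -> dict:
    """Attach asset and indicator context to the raw alert."""
    alert["asset"] = lookup_asset(alert.get("hostname", ""))
    alert["intel"] = lookup_indicator(alert.get("src_ip", ""))
    return alert


def triage(alert: dict) -> str:
    """Auto-resolve known-benign alerts, page on crown-jewel assets,
    and queue everything else for review."""
    if alert["intel"]["allowlisted"]:
        return "auto_resolved"
    if alert["asset"]["criticality"] == "crown_jewel":
        return "page_on_call"
    return "queue_for_review"


print(triage(enrich({"hostname": "web-01", "src_ip": "203.0.113.7"})))
```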