Detection & Response

Detection-as-Code: Treating Your Detections Like Software, Not Configurations

April 26, 2026

A SIEM rule edited through a vendor console at 2 a.m. by an analyst who left the company eighteen months ago. No commit history, no test, no documentation of why the threshold is >15 instead of >20. When it starts firing on a new application’s healthy traffic, nobody knows whether to tune it, suppress it, or rip it out. This is the default state of most detection programs, and it scales about as well as you’d expect.

Detection-as-Code (DaC) is the discipline of treating that rule — and every other piece of detection logic in your environment — the way engineering teams treat application code. Rules live in Git. Changes go through pull requests. Tests run in CI. Deployment is automated. The goal isn’t just tidiness; it’s making detection a reproducible engineering practice instead of an oral tradition. According to Splunk’s 2025 State of Security data, 63% of security professionals say they want to frequently or always use Detection-as-Code, but only 35% actually do — a 28-point gap that defines the current frontier of detection engineering.

What Detection-as-Code Actually Means

DaC borrows directly from Infrastructure-as-Code and GitOps. Instead of building rules through a security platform’s user interface, security teams write detection rules in structured formats like YAML or Python, store them in version-controlled repositories, and test and deploy them using CI/CD pipelines with tools like GitHub Actions or GitLab CI. The detection becomes a file, not a console setting.

That distinction has consequences. A rule in Git has a commit author, a review history, and an audit trail. It can be diffed against last quarter’s version. It can be unit-tested before it ever touches production telemetry. It can be promoted from dev to staging to prod through the same pipeline that promotes the rest of your infrastructure. Detection logic stops being a side artifact maintained by whoever happened to be on shift and starts being a product the security team owns and ships.

The practice maps cleanly onto the Detection Development Life Cycle (DDLC), a six-phase loop borrowed from the SDLC: requirements, design, development, testing and deployment, monitoring, and continuous testing. Each phase has explicit artifacts and exit criteria. Rules don’t graduate to production because someone clicked “save”; they graduate because tests pass and a reviewer approved the merge.

Why Console-Driven Detection Falls Apart

The case for DaC is mostly a case against the alternative. In traditional workflows, changes made through UIs get lost or overwritten with no clear traceability, and the failure modes compound: rules that silently break when a log schema changes, duplicate detections written by different analysts who didn’t know the original existed, no way to roll back a regression, no way to know what coverage you actually have.
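One of these failure modes, a rule silently breaking when a log schema changes, is cheap to catch once rules are files. A minimal sketch, assuming each rule declares the fields it reads; the `required_fields` key, the rule IDs, and the schema contents are hypothetical names chosen for illustration:

```python
from typing import Dict, List, Set


def find_schema_drift(rules: List[dict], schema_fields: Set[str]) -> Dict[str, List[str]]:
    """Return {rule_id: [missing fields]} for rules that reference fields
    the current log schema no longer provides."""
    drift = {}
    for rule in rules:
        missing = sorted(set(rule["required_fields"]) - schema_fields)
        if missing:
            drift[rule["id"]] = missing
    return drift


# Hypothetical rules and schema, for illustration only.
RULES = [
    {"id": "ps_encoded_command", "required_fields": ["process_name", "command_line"]},
    {"id": "failed_login_burst", "required_fields": ["user", "src_ip", "outcome"]},
]
# Suppose a parser update renamed `src_ip` to `source_ip`.
SCHEMA = {"process_name", "command_line", "user", "source_ip", "outcome"}

drift = find_schema_drift(RULES, SCHEMA)
print(drift)
```

Run as a CI step, a non-empty result would block the merge instead of letting the rule go quiet in production.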

The coverage problem is particularly acute. Without a process to codify a SOC’s detections, leaders are often forced to resort to rough estimates when asked about coverage. An EDR vendor’s marketing slide claiming 95% MITRE ATT&CK coverage doesn’t tell you whether your deployment, with your log sources and your exclusions, actually fires on T1059.001 in your environment. DaC makes coverage a queryable property of the repository — every rule has metadata mapping it to ATT&CK techniques, and a CI job can produce the matrix on demand.
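As a rough sketch of what "queryable" can mean here, the snippet below builds a technique-to-rules map from rule metadata. The `attack` and `enabled` keys and the rule IDs are assumptions made for this example, not a standard schema:

```python
from collections import defaultdict
from typing import Dict, List


def attack_coverage(rules: List[dict]) -> Dict[str, List[str]]:
    """Map each ATT&CK technique ID to the sorted IDs of enabled rules that claim it."""
    coverage = defaultdict(list)
    for rule in rules:
        if not rule.get("enabled", True):
            continue  # disabled rules do not count toward coverage
        for technique in rule.get("attack", []):
            coverage[technique].append(rule["id"])
    return {t: sorted(ids) for t, ids in sorted(coverage.items())}


# Hypothetical rule metadata, as it might be loaded from files in the repo.
RULES = [
    {"id": "ps_encoded_command", "attack": ["T1059.001"]},
    {"id": "ps_download_cradle", "attack": ["T1059.001", "T1105"]},
    {"id": "legacy_rule", "attack": ["T1003.001"], "enabled": False},
]

coverage = attack_coverage(RULES)
for technique, rule_ids in coverage.items():
    print(technique, rule_ids)
```

This reports claimed coverage only. Whether those rules actually fire on the technique is a separate question that testing has to answer.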

Then there’s the team-knowledge problem. “Detections provided by security vendors are valuable but must cater to a broad customer base. CISOs and SOC managers are recognizing the need to develop targeted detections aligned with their specific threat models, fine-tuned to their unique environments.” Custom detections accumulate fast, into the thousands at mature programs, and without code-management discipline they become a liability that outweighs their value.

The Pipeline: From Idea to Production

A working DaC pipeline has six structural phases. Each is enforced by code, not by good intentions.

Detection Development Life Cycle: Six Phases, Each a Pipeline Gate

01. Requirements (Threat Modeling): ATT&CK technique selected. Data sources identified. Success criteria defined. False-positive tolerance set.

02. Design (Logic Sketch): Query approach drafted. Field mappings checked. Edge cases enumerated before any rule code is written.

03. Development (Rule + Tests): Detection written in YAML, Python, SPL, or KQL. Positive and negative test cases committed alongside.

04. Test & Deploy (CI Pipeline): Syntax validated. Tests run. Peer review on PR. Merge triggers automated deploy to SIEM.

05. Monitoring (Live Telemetry): FP/FN rates tracked. Tuning issues filed in the same repo. Decommission stale rules with deprecation tags.

06. Continuous Test (Adversary Emulation): Scheduled attack simulations verify rules still fire. Detection drift caught before adversaries find it.

The detail that matters most is phase 4. Before a detection reaches production, it can be syntax-checked, unit-tested, and even validated against real or simulated attack data, so that nothing breaks and its false-positive and false-negative rates stay within tolerance. A rule that hasn’t been tested against both a known-malicious event and a known-benign event isn’t ready for production, no matter how clever the logic looks. The pipeline gate is what makes that non-negotiable.
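The gate itself can be a very small check. The sketch below assumes rules carry their test cases under a `tests` key, each with an `expect_alert` flag; both names are hypothetical. It only verifies that the two required kinds of test exist, and running them is a separate step:

```python
from typing import List


def gate_violations(rule: dict) -> List[str]:
    """Return reasons a rule may not be promoted; an empty list means the gate passes."""
    tests = rule.get("tests", [])
    problems = []
    if not any(t["expect_alert"] for t in tests):
        problems.append("no known-malicious test case")
    if not any(not t["expect_alert"] for t in tests):
        problems.append("no known-benign test case")
    return problems


# Hypothetical rule that ships with a positive test only.
RULE = {
    "id": "ps_encoded_command",
    "tests": [{"name": "encoded powershell", "expect_alert": True}],
}

violations = gate_violations(RULE)
print(violations)
```

A CI job would fail the build whenever the returned list is non-empty, which is what turns the two-test requirement from a convention into a rule.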

The Tooling Landscape

Two threads dominate current DaC tooling: vendor-agnostic rule formats and platform-native frameworks. They aren’t mutually exclusive — most mature programs use both.

Sigma is the standard for portable detection logic. It is for log files what Snort is for network traffic and YARA is for files: a vendor-neutral YAML format that converts to platform-specific queries. The pySigma library and sigma-cli toolchain compile a single rule into SPL, KQL, ElastAlert, QRadar AQL, and dozens of other targets. In 2023, IBM QRadar announced that it would natively support Sigma rules, joining a list that includes Security Onion, Hayabusa, and Chainsaw. The community-maintained SigmaHQ repository is the largest open detection corpus in the world.

The Sigma argument is leverage: write once, deploy to whatever SIEM you happen to have this year. That makes DaC invaluable during SIEM migrations. With detection logic codified in a portable format like Sigma and converted with sigma-cli, you avoid starting from scratch when changing platforms.
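To make the write-once idea concrete, here is a toy sketch that renders one field-equals-value selection into two query dialects. It is an illustration only: it is not pySigma, it handles nothing beyond exact matches, it does no escaping, and the field names are hypothetical. Real conversions should go through sigma-cli and its backends.

```python
from typing import Dict


def to_spl(selection: Dict[str, str]) -> str:
    """Render equality matches as an SPL-style search fragment (terms are ANDed implicitly)."""
    return " ".join(f'{field}="{value}"' for field, value in selection.items())


def to_kql(selection: Dict[str, str]) -> str:
    """Render the same matches as a KQL-style where clause."""
    return "| where " + " and ".join(
        f'{field} == "{value}"' for field, value in selection.items()
    )


# One platform-neutral selection (hypothetical field names).
SELECTION = {"EventType": "logon_failed", "LogonType": "network"}

spl_query = to_spl(SELECTION)
kql_query = to_kql(SELECTION)
print(spl_query)
print(kql_query)
```

The detection logic lives in `SELECTION`; only the rendering differs per platform, which is why a migration becomes a change of backend instead of a rewrite of every rule.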

Panther takes the opposite approach with Python-native detections. Rules are Python functions that receive a parsed event and return a boolean. The pypanther framework, the evolution of Panther’s panther-analysis and panther_analysis_tool repositories, brings modern Python practices to detection engineering and supports inheritance, overrides, and unit-testable helpers. The trade-off: Python rules are expressive enough to model anything, but they don’t port to other platforms without a rewrite.
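The shape of such a rule is simple enough to sketch in plain Python. The example below is an illustration of the function-over-event idea, not pypanther’s actual classes, and the event field names (`event_name`, `outcome`, `mfa_used`) are hypothetical:

```python
def rule(event: dict) -> bool:
    """Fire on a console login that succeeded without MFA."""
    return (
        event.get("event_name") == "ConsoleLogin"
        and event.get("outcome") == "Success"
        and event.get("mfa_used") is False  # a missing field does not fire the rule
    )


# Because the rule is an ordinary function, it can be exercised directly.
no_mfa_login = {"event_name": "ConsoleLogin", "outcome": "Success", "mfa_used": False}
mfa_login = {"event_name": "ConsoleLogin", "outcome": "Success", "mfa_used": True}

print(rule(no_mfa_login), rule(mfa_login))
```

The same property that makes this easy to test is what ties it to one platform: the logic is Python, so another SIEM cannot consume it without a rewrite.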

Most established SIEMs now support some form of git-backed rule management. Splunk ships content through TA packages built in CI; Microsoft Sentinel supports KQL detection rules deployed via ARM templates and GitHub Actions; Elastic ships detection rules as YAML in the official detection-rules repository on GitHub. The pipeline shape rarely changes: rules in a repo, tests in CI, deployment via API.

A common architecture combines Sigma as the authoring layer for portable detections with platform-native rules where Sigma can’t express the logic — correlations, stateful aggregations, ML-augmented detections. The pipeline merges both into a single deployment gate.

Testing: The Hard Part

Most teams underestimate how much testing detection rules requires. A SIEM query that “looks right” can match nothing in production, match everything, or match the wrong things in ways that take weeks to surface.

The minimum viable test set per rule is two cases: a synthetic event that should fire the rule, and a similar-looking event that should not. Panther’s framework, Sigma’s testing pipeline, and most SIEM-native frameworks accept these as YAML or JSON fixtures stored next to the rule. CI runs the rule against the fixtures and fails the build on mismatches. This catches the obvious regressions — a renamed field, a flipped operator, a typo in a value list — before they reach production.
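A minimal sketch of that two-case setup follows. The fixture layout (`name`, `expect_alert`, `event`) and the rule are assumptions for illustration; real frameworks define their own fixture formats:

```python
import json
from typing import Callable, List

# Fixtures as they might be stored next to the rule (inlined here to stay self-contained).
FIXTURES_JSON = """
[
  {"name": "encoded powershell", "expect_alert": true,
   "event": {"process_name": "powershell.exe",
             "command_line": "powershell.exe -enc SQBFAFgA"}},
  {"name": "plain powershell", "expect_alert": false,
   "event": {"process_name": "powershell.exe",
             "command_line": "powershell.exe Get-Date"}}
]
"""


def rule(event: dict) -> bool:
    """Hypothetical rule: PowerShell launched with an encoded command."""
    return (
        event.get("process_name", "").lower() == "powershell.exe"
        and "-enc" in event.get("command_line", "").lower()
    )


def run_fixtures(rule_fn: Callable[[dict], bool], fixtures: List[dict]) -> List[str]:
    """Return the names of fixtures whose outcome differs from the expectation."""
    return [f["name"] for f in fixtures if rule_fn(f["event"]) != f["expect_alert"]]


failures = run_fixtures(rule, json.loads(FIXTURES_JSON))
print("failures:", failures)
# In CI, a non-empty list would fail the build, e.g. raise SystemExit(1).
```

If someone later renames `command_line` in the rule or flips the comparison, the first fixture stops matching and the build fails before deployment.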

The harder problem is realism. Many SOCs rely on attack ranges or static log datasets to validate new detections. While useful, these methods fall short in one critical way: they don’t reflect the real behavior of your environment. Static logs don’t capture system nuances, log normalization quirks, or the variability of real user activity. A rule that passes against a sanitized fixture can still drown an analyst when it meets your actual log volume. The mature answer is adversary emulation in pre-production: tools like Atomic Red Team, Caldera, or Stratus Red Team execute real ATT&CK techniques against a controlled environment, and the resulting telemetry validates that rules fire end-to-end.

There’s also the inverse failure: tests that pass for the wrong reasons. In software testing, a false positive not only creates a false sense of security but also provides a breeding ground for other bugs. A detection unit test that passes because the fixture happens to satisfy the rule’s first clause — but would never satisfy the second clause in production — is worse than no test at all. Mutation testing, where you deliberately break the rule and confirm the test fails, is the cheapest way to catch this class of error.

Where Teams Get Stuck

DaC adoption is bounded less by tooling than by organizational mechanics. The 28-point gap between aspiration and adoption shows up consistently in three places.

Skills mismatch. Software development life cycle principles (47%) and detection-as-code (46%) — these skills go hand in hand. As detection engineering evolves, teams are increasingly adopting version control, structured workflows, and reusable detection logic to keep rules efficient, consistent, and adaptable when managing hundreds or thousands of detections across diverse data platforms and security tools. Most SOC analysts learned to write detections in a console, not in Git. The pull-request workflow, the discipline of writing tests, the patience for code review — these are cultural changes, not just tool changes.

Telemetry access. Fewer than half of detection teams have the data access they need and even fewer can act on it fast enough. A pipeline can’t validate rules it can’t query. Teams that adopt DaC without first solving log normalization and access end up with beautiful Git repos full of rules that never get tested against real data.

Over-engineering. Be cautious of over-engineering, as it can drain your resources without adding significant value. A two-person detection team does not need the same pipeline as a managed security service provider supporting fifty clients. The right starting point is one repo, one CI job, one deployment script. Everything beyond that is earned by demonstrated need.

What Good Looks Like

Two case studies surface repeatedly in the DaC literature. Bitstamp, a global crypto exchange, replaced legacy detection logic with Python-based DaC using Panther. They defined rules in Git, wrote tests, and automated deployments. This reduced false positives, improved visibility, and allowed rapid iteration on emerging threats. Fastly took a different angle: their security team built a simulation pipeline around their WAF. Instead of deploying detection rules and hoping for the best, they simulate both real and false positive cases before releasing new rules.

Both teams ended up with the same artifacts — versioned rules, tested pipelines, automated deployment — but they got there by chasing different pain points. Bitstamp wanted iteration speed on emerging threats. Fastly wanted to stop shipping noisy rules. Pick the pain that hurts most and let the pipeline grow around it.

FAQ

Do I need to use Sigma to do Detection-as-Code? No. Sigma is the most popular portable format, but DaC is fundamentally about Git, CI, and tests — not about a specific rule language. Native SPL, KQL, or Python rules in version control with a tested deployment pipeline qualify as DaC.

How does this work with vendor-managed detection content? Most mature programs treat vendor rules as a starting baseline and customize them in code. Panther supports the ability to map rules, policies, and scheduled rules to compliance frameworks (including MITRE ATT&CK®) to track coverage against that framework, and similar capabilities exist in other platforms. The vendor content becomes a dependency you pin and override, not a black box you hope is correctly tuned.
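The pin-and-override idea can be sketched with ordinary class inheritance. The classes, version string, thresholds, and account name below are hypothetical stand-ins, not any vendor’s actual content or API:

```python
class VendorBruteForceRule:
    """Stand-in for a vendor-supplied rule, pinned at a known version."""

    version = "1.4.2"
    threshold = 15
    excluded_users = frozenset()

    def matches(self, user: str, failed_logins: int) -> bool:
        return user not in self.excluded_users and failed_logins > self.threshold


class TunedBruteForceRule(VendorBruteForceRule):
    """Local override: same logic, with environment-specific tuning kept under review."""

    threshold = 20
    excluded_users = frozenset({"svc-healthcheck"})


vendor, tuned = VendorBruteForceRule(), TunedBruteForceRule()
print(vendor.matches("alice", 18), tuned.matches("alice", 18))
print(tuned.matches("svc-healthcheck", 50), tuned.matches("alice", 25))
```

The override records what was changed and, through its commit history, why. When the vendor ships a new version, the diff between the pinned base and the update is reviewable like any other dependency bump.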

What’s the minimum team size for DaC to be worth it? Two engineers and roughly a hundred custom detections is a defensible threshold. Below that, a shared rule repository with manual deployment and informal review captures most of the benefit. The pipeline overhead pays off when the cost of a missed regression exceeds the cost of running CI.

Where does AI fit in? 88% of security teams believe AI will have a massive impact on detection engineering by 2028, but only 45% are actually using it today. Current useful applications are narrow: drafting rule scaffolds from threat reports, summarizing alert clusters, and suggesting tuning candidates. Autonomous detection engineering is not a 2026 reality.

The Bet

Detection-as-Code is, in the end, a bet that detection logic deserves the same engineering discipline as the applications it protects. The bet pays off because the failure modes are the same: silent regressions, lost institutional knowledge, untested edge cases, and changes nobody can audit. Software solved these problems with version control, CI, and code review decades ago. Security has spent that time editing rules in consoles and hoping the alert count looked reasonable on Monday.

The teams that have crossed over describe the same shift in how they spend their time: less firefighting old rules, more building new ones with confidence they’ll work. That’s the actual prize. The pipeline isn’t the point — the pipeline is what lets you stop being afraid of your own detections.