Engineering High-Fidelity Splunk Alerts for AWS Threat Detection

Detection Engineering
January 2026 | 10 Min Read

The Objective: To transition from "noisy" signature-based alerts to behavioral detections capable of identifying a multi-stage cloud breach across the entire Kill Chain—from reconnaissance to data destruction.

Cloud environments generate a massive volume of logs. For a Security Operations Center (SOC), the challenge isn't finding data; it's finding signal within the noise. In this project, I used the Invictus IR AWS Attack Dataset to engineer a suite of custom detection rules in Splunk Enterprise, mapped to the MITRE ATT&CK Cloud Matrix.

The Detection Strategy

Most "out-of-the-box" AWS alerts trigger on single events, such as a failed login. However, sophisticated attackers like the one profiled in this dataset (bert-jan) use automated scripts that mimic administrative behavior. My strategy focused on Statistical Thresholding and Cardinality Analysis.

6 Custom Rules | 100% Detection Rate | MITRE Aligned

Technical Deep Dive: Key Logic

1. API Reconnaissance Spike T1595.002

Instead of counting total API calls, this rule counts distinct API names (dc(eventName)). This filters out noisy scripts that retry the same unauthorized action and highlights an actor actively mapping the breadth of the environment.

index=invictus (eventName=List* OR eventName=Describe* OR eventName=Get*)
| eval _time=strptime(eventTime,"%Y-%m-%dT%H:%M:%SZ")
| bin _time span=10m
| stats dc(eventName) as unique_api_calls, 
        values(eventName) as attempted_calls, 
        count as total_hits 
  by _time, sourceIPAddress, userIdentity.arn
| where unique_api_calls > 10
| sort - unique_api_calls

Validation: Detected bert-jan performing 47 unique API calls from 10.8.8.10 within a 10-minute window—clear reconnaissance behavior.
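
To see why distinct counts separate the two behaviors, the same idea can be sketched offline in Python. This is a minimal illustration, not the Splunk implementation: the event dictionaries, the flattened `arn` key, the example IP addresses and ARNs, and the `DescribeResourceN` API names are all made up for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 600  # 10-minute buckets, mirroring `bin _time span=10m`


def bucket_start(event_time):
    """Parse a CloudTrail-style eventTime and floor it to its 10-minute window (epoch seconds)."""
    parsed = datetime.strptime(event_time, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    epoch = int(parsed.timestamp())
    return epoch - epoch % WINDOW_SECONDS


def recon_spikes(events, threshold=10):
    """Group by (window, source IP, ARN) and keep groups with more than `threshold` distinct API names."""
    distinct_names = defaultdict(set)
    for event in events:
        key = (bucket_start(event["eventTime"]), event["sourceIPAddress"], event["arn"])
        distinct_names[key].add(event["eventName"])
    return {key: names for key, names in distinct_names.items() if len(names) > threshold}


# Hypothetical sample data: a script retrying one call 50 times, and an
# actor touching 12 different (made-up) API names in the same window.
retry_script = [
    {"eventTime": "2026-01-01T12:00:05Z", "sourceIPAddress": "192.0.2.10",
     "arn": "arn:aws:iam::111111111111:user/retry-script", "eventName": "DescribeInstances"}
] * 50
scanner = [
    {"eventTime": "2026-01-01T12:03:00Z", "sourceIPAddress": "192.0.2.20",
     "arn": "arn:aws:iam::111111111111:user/scanner", "eventName": f"DescribeResource{i}"}
    for i in range(12)
]

spikes = recon_spikes(retry_script + scanner)
for (window, ip, arn), names in spikes.items():
    print(window, ip, arn, len(names))
```

Only the scanner group is returned. The retry script produces 50 hits but a single distinct name, so a raw count threshold would have flagged it while the cardinality threshold does not.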

2. SSM Parameter Enumeration T1555.006

AWS Systems Manager Parameter Store often holds secrets: API keys, database credentials, and environment variables. This rule targets bulk retrieval patterns that indicate credential harvesting.

index=invictus eventSource="ssm.amazonaws.com" 
       eventName IN ("DescribeParameters","GetParameters","GetParameter")
| eval event_time=strptime(eventTime,"%Y-%m-%dT%H:%M:%SZ")
| bin event_time span=10m
| stats count as api_call_count,
        dc(requestParameters.name) as unique_parameters,
        values(userAgent) AS user_agents 
  by userIdentity.arn, event_time
| where api_call_count > 10

Validation: Captured the Stratus Red Team automated tool extracting 16 unique secrets using the user agent stratus-red-team_11a6ef34...

3. Failed Privilege Escalation T1078 / T1068

Attackers often "fuzz" permissions to see what their stolen credentials can access. I implemented Dynamic Severity logic using Splunk's eval case() function to prioritize alerts based on the volume of AccessDenied errors.

index=invictus (errorCode="AccessDenied" OR errorCode="Client.UnauthorizedOperation")
| eval event_time=strptime(eventTime,"%Y-%m-%dT%H:%M:%SZ")
| bin event_time span=5m
| stats count as failed_attempts, 
        values(eventName) as attempted_events,
        values(errorMessage) as error_details 
  by event_time, sourceIPAddress, userIdentity.arn
| where failed_attempts > 3
| eval severity=case(failed_attempts > 10, "CRITICAL", 
                     failed_attempts > 5, "HIGH", 
                     1=1, "MEDIUM")
| sort - failed_attempts

Validation: Flagged 15 failed DescribeInstanceAttribute attempts by stratus-red-team-get-usr-data-role—a clear EC2 credential theft pattern.
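
The tier boundaries are easy to get wrong by one, so here is a small Python sketch of the same first-match-wins logic. The function name `severity` is hypothetical; the thresholds are copied from the `where` and `case()` clauses in the rule.

```python
def severity(failed_attempts):
    """Return the alert tier for one 5-minute group, or None if the group does not alert."""
    if failed_attempts <= 3:      # mirrors `where failed_attempts > 3`
        return None
    if failed_attempts > 10:      # first matching branch wins, as in case()
        return "CRITICAL"
    if failed_attempts > 5:
        return "HIGH"
    return "MEDIUM"               # the `1=1` catch-all branch


for attempts in (3, 4, 5, 6, 10, 11, 15):
    print(attempts, severity(attempts))
```

Because the comparisons are strict, exactly 10 failures is still HIGH and exactly 5 is still MEDIUM. The 15 failed attempts from the validation run land in CRITICAL.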

4. CloudTrail Tampering T1562.008

This critical, high-fidelity alert monitors for attempts to disable or modify CloudTrail logging, along with GuardDuty tampering (DeleteDetector, ArchiveFindings). Attackers use these calls to "blind" security teams before performing destructive actions. This rule has zero tolerance: any match is CRITICAL.

index=invictus 
       eventName IN ("StopLogging","DeleteTrail","UpdateTrail",
                     "PutEventSelectors","DeleteDetector","ArchiveFindings")
| table eventTime, eventName, awsRegion, userIdentity.arn, 
        sourceIPAddress, userAgent, requestID

Validation: Detected bert-jan executing StopLogging at 12:01 PM, immediately followed by DeleteTrail during the cleanup phase.

5. Lateral Movement via AssumeRole T1550.001

This rule detects role chaining, where an attacker pivots from one IAM role to another to escalate privileges or hide their origin. It alerts on spikes in AssumeRole calls within a 10-minute window.

index=invictus eventName="AssumeRole"
| eval event_time=strptime(eventTime,"%Y-%m-%dT%H:%M:%SZ")
| bin event_time span=10m
| stats count as total_attempts,
        values(errorMessage) as error_message,
        values(errorCode) as errors
  by userIdentity.arn, event_time, sourceIPAddress
| where total_attempts > 3
| table event_time, userIdentity.arn, sourceIPAddress, 
        total_attempts, errors, error_message

Validation: Identified 17 AssumeRole attempts (4 failed) from bert-jan, indicating active privilege escalation via role chaining.
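
The split between total and failed attempts reported above can be sketched in Python as follows. This is an illustration under simplifying assumptions: it ignores the 10-minute windowing, uses a flattened `arn` key, and runs on made-up events, so the identities, IP addresses, and counts are hypothetical.

```python
from collections import Counter


def summarize_assume_role(events, threshold=3):
    """Count AssumeRole calls per (ARN, source IP) and how many of them carried an errorCode."""
    totals, failures = Counter(), Counter()
    for event in events:
        if event.get("eventName") != "AssumeRole":
            continue
        key = (event["arn"], event["sourceIPAddress"])
        totals[key] += 1
        if event.get("errorCode"):  # present only on failed calls
            failures[key] += 1
    return {
        key: {"total_attempts": total, "failed_attempts": failures[key]}
        for key, total in totals.items()
        if total > threshold        # mirrors `where total_attempts > 3`
    }


def call(arn, ip, error=None):
    """Build one hypothetical AssumeRole event."""
    event = {"eventName": "AssumeRole", "arn": arn, "sourceIPAddress": ip}
    if error:
        event["errorCode"] = error
    return event


noisy = "arn:aws:iam::111111111111:user/example-pivot"
quiet = "arn:aws:iam::111111111111:user/example-admin"
sample = (
    [call(noisy, "192.0.2.20") for _ in range(3)]
    + [call(noisy, "192.0.2.20", "AccessDenied") for _ in range(2)]
    + [call(quiet, "192.0.2.30") for _ in range(2)]
)

summary = summarize_assume_role(sample)
print(summary)
```

Keeping the failure count next to the total makes it easier to distinguish an identity that is probing for assumable roles from one that simply assumes roles often.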

6. Mass Deletion / Destruction T1485 / T1490

The final safety net. This rule detects bulk deletion of assets (>20 events in 5 minutes) OR the deletion of specific critical infrastructure (KMS keys, CloudTrail trails) regardless of volume.

index=invictus (eventName="Delete*" OR eventName="Terminate*" OR eventName="Drop*"
                OR eventName="ScheduleKeyDeletion")
| eval event_time=strptime(eventTime,"%Y-%m-%dT%H:%M:%SZ")
| bin event_time span=5m
| stats count as deletion_count,
        values(eventName) as actions_taken 
  by userIdentity.arn, sourceIPAddress, event_time
| where deletion_count > 20 
   OR isnotnull(mvfind(actions_taken, "^(DeleteTrail|DeleteKey|ScheduleKeyDeletion)$"))

Validation: Cross-correlated with CloudTrail tampering activity, catching the final DeleteTrail event at 12:35 PM.
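
The two-part trigger (volume OR critical asset) can be sketched in Python as below. The function name and the contents of `CRITICAL_ACTIONS` are assumptions for illustration. `ScheduleKeyDeletion` is included because it is the AWS KMS call that queues a key for deletion.

```python
# Assumed list of "always alert" actions; tune to the environment.
CRITICAL_ACTIONS = frozenset({"DeleteTrail", "DeleteKey", "ScheduleKeyDeletion"})


def is_destructive(actions, bulk_threshold=20, critical_actions=CRITICAL_ACTIONS):
    """Decide whether one (identity, source IP, 5-minute window) group should alert.

    `actions` is the list of eventName values observed in that group.
    """
    if len(actions) > bulk_threshold:                           # volume-based trigger
        return True
    return any(name in critical_actions for name in actions)   # critical-asset trigger


print(is_destructive(["DeleteTrail"]))         # single critical action
print(is_destructive(["DeleteObject"] * 21))   # bulk deletion
print(is_destructive(["DeleteObject"] * 20))   # at the threshold, not above it
```

The critical-asset branch is what keeps a single quiet `DeleteTrail` from hiding below the volume threshold.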

Threat Actor Profile: bert-jan

Understanding the adversary is critical for tuning detections. The primary actor in this simulation exhibited clear behavioral patterns:

The use of the Stratus Red Team Framework was evident through automated role names (stratus-red-team-get-usr-data-role) and high-frequency API calls that triggered throttling exceptions—a key indicator of scripted tooling.

Incident Reconstruction (The 55-Minute Attack)

By correlating the custom alerts, I was able to reconstruct a precise timeline of the breach. This timeline demonstrates how behavioral detections catch attackers even when they attempt to hide their tracks.

Key Insight: The attack event rate spiked +221% during the escalation phase (6.75 → 21.7 events/min). This velocity change alone is a powerful behavioral indicator of automated attack tooling.
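
As a quick check of the arithmetic, the percentage follows from the two rates quoted above. A minimal Python sketch, with a hypothetical helper name:

```python
def percent_change(baseline, observed):
    """Relative change from baseline to observed, expressed in percent."""
    if baseline <= 0:
        raise ValueError("baseline rate must be positive")
    return (observed - baseline) / baseline * 100


change = percent_change(6.75, 21.7)  # events per minute, before and during escalation
print(f"{change:+.0f}%")
```

The result is (21.7 - 6.75) / 6.75, or roughly a 3.2x jump in event rate.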

MITRE ATT&CK Cloud Matrix Coverage

Each detection rule was explicitly mapped to adversary techniques to ensure comprehensive coverage across the attack lifecycle:

T1595.002 Active Scanning: Vulnerability Scanning
T1555.006 Credentials from Password Stores: Cloud Secrets Management Stores
T1078 / T1068 Valid Accounts / Exploitation for Privilege Escalation
T1562.008 Impair Defenses: Disable or Modify Cloud Logs
T1550.001 Use Alternate Authentication Material: Application Access Token
T1485 / T1490 Data Destruction / Inhibit System Recovery

Lessons Learned

Conclusion

This project shows that effective AWS detection requires moving beyond simple signatures. By focusing on behavioral deviations (spikes in unique API calls) and intent-based errors (UnauthorizedOperation), I built a detection suite that identified the adversary at every stage of the lifecycle, from initial footprinting to final evidence destruction.

The next step? Operationalizing these rules into a production Splunk environment with automated response playbooks. Because in cloud security, detection is only half the battle—response is where breaches are stopped.