Cloud environments generate a massive volume of logs. For a Security Operations Center (SOC), the challenge isn't finding data—it's finding signal within the noise. In this project, I utilized the Invictus IR AWS Attack Dataset to engineer a suite of custom detection rules in Splunk Enterprise, mapped to the MITRE ATT&CK Cloud Matrix.
The Detection Strategy
Most "out-of-the-box" AWS alerts trigger on single events, such as a failed login. However, sophisticated attackers like the one profiled in this dataset (bert-jan) use automated scripts that mimic administrative behavior. My strategy focused on Statistical Thresholding and Cardinality Analysis.
Technical Deep Dive: Key Logic
1. API Reconnaissance Spike (T1595.002)
Instead of counting total API calls, this rule counts distinct API names (dc(eventName)). This filters out noisy scripts that retry the same unauthorized action and highlights an actor actively mapping the breadth of the environment.
index=invictus (eventName=List* OR eventName=Describe* OR eventName=Get*)
| eval _time=strptime(eventTime,"%Y-%m-%dT%H:%M:%SZ")
| bin _time span=10m
| stats dc(eventName) as unique_api_calls,
values(eventName) as attempted_calls,
count as total_hits
by _time, sourceIPAddress, userIdentity.arn
| where unique_api_calls > 10
| sort - unique_api_calls
Validation: Detected bert-jan performing 47 unique API calls from 10.8.8.10 within a 10-minute window—clear reconnaissance behavior.
2. SSM Parameter Enumeration (T1555.006)
AWS Systems Manager Parameter Store often holds secrets: API keys, database credentials, and environment variables. This rule targets bulk retrieval patterns that indicate credential harvesting.
index=invictus eventSource="ssm.amazonaws.com"
eventName IN ("DescribeParameters","GetParameters","GetParameter")
| eval event_time=strptime(eventTime,"%Y-%m-%dT%H:%M:%SZ")
| bin event_time span=10m
| stats count as api_call_count,
dc(requestParameters.name) as unique_parameters,
values(userAgent) AS user_agents
by userIdentity.arn, event_time
| where api_call_count > 10
Validation: Captured the Stratus Red Team automated tool extracting 16 unique secrets using the user agent stratus-red-team_11a6ef34...
3. Failed Privilege Escalation (T1078 / T1068)
Attackers often "fuzz" permissions to see what their stolen credentials can access. I implemented Dynamic Severity logic using Splunk's eval case() function to prioritize alerts based on the volume of AccessDenied errors.
index=invictus (errorCode="AccessDenied" OR errorCode="Client.UnauthorizedOperation")
| eval event_time=strptime(eventTime,"%Y-%m-%dT%H:%M:%SZ")
| bin event_time span=5m
| stats count as failed_attempts,
values(eventName) as attempted_events,
values(errorMessage) as error_details
by event_time, sourceIPAddress, userIdentity.arn
| where failed_attempts > 3
| eval severity=case(failed_attempts > 10, "CRITICAL",
failed_attempts > 5, "HIGH",
1=1, "MEDIUM")
| sort - failed_attempts
Validation: Flagged 15 failed DescribeInstanceAttribute attempts by stratus-red-team-get-usr-data-role—a clear EC2 credential theft pattern.
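The case()-style tiering is easy to reason about outside SPL as well. A minimal Python equivalent of the severity logic, using the same thresholds as the rule above (recall that rows only reach this step when failed_attempts > 3):

```python
def severity(failed_attempts: int) -> str:
    """Mirror of the SPL case() tiering: the first matching condition wins."""
    if failed_attempts > 10:
        return "CRITICAL"
    if failed_attempts > 5:
        return "HIGH"
    return "MEDIUM"  # default branch, like case(..., 1=1, "MEDIUM")

print(severity(15), severity(7), severity(4))  # → CRITICAL HIGH MEDIUM
```

The tiering means a credential-fuzzing burst of 15 denials (as in the validation above) lands on top of the analyst queue, while a handful of denials files in as MEDIUM instead of paging anyone.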
4. CloudTrail Tampering (T1562.008)
A critical, high-fidelity alert that monitors for attempts to disable logging. Attackers use this to "blind" security teams before performing destructive actions. This rule has zero tolerance—any match is CRITICAL.
index=invictus
eventName IN ("StopLogging","DeleteTrail","UpdateTrail",
"PutEventSelectors","DeleteDetector","ArchiveFindings")
| table eventTime, eventName, awsRegion, userIdentity.arn,
sourceIPAddress, userAgent, requestID
Validation: Detected bert-jan executing StopLogging at 12:01 PM, immediately followed by DeleteTrail during the cleanup phase.
5. Lateral Movement via AssumeRole (T1550.001)
Detects Role Chaining—where an attacker pivots from one IAM role to another to escalate privileges or hide their origin. The rule alerts on spikes in AssumeRole calls within a time window.
index=invictus eventName="AssumeRole"
| eval event_time=strptime(eventTime,"%Y-%m-%dT%H:%M:%SZ")
| bin event_time span=10m
| stats count as total_attempts,
values(errorMessage) as error_message,
values(errorCode) as errors
by userIdentity.arn, event_time, sourceIPAddress
| where total_attempts > 3
| table event_time, userIdentity.arn, sourceIPAddress,
total_attempts, errors, error_message
Validation: Identified 17 AssumeRole attempts (4 failed) from bert-jan, indicating active privilege escalation via role chaining.
6. Mass Deletion / Destruction (T1485 / T1490)
The final safety net. This rule detects bulk deletion of assets (>20 events in 5 minutes) OR the deletion of specific critical infrastructure (KMS Keys, CloudTrail logs) regardless of volume.
index=invictus (eventName="Delete*" OR eventName="Terminate*" OR eventName="Drop*")
| eval event_time=strptime(eventTime,"%Y-%m-%dT%H:%M:%SZ")
| bin event_time span=5m
| stats count as deletion_count,
values(eventName) as actions_taken
by userIdentity.arn, sourceIPAddress, event_time
| where deletion_count > 20
    OR mvfind(actions_taken, "DeleteTrail") >= 0
    OR mvfind(actions_taken, "ScheduleKeyDeletion") >= 0
Validation: Cross-correlated with CloudTrail tampering activity, catching the final DeleteTrail event at 12:35 PM.
Threat Actor Profile: bert-jan
Understanding the adversary is critical for tuning detections. The primary actor in this simulation exhibited clear behavioral patterns:
- 1,037 events (83.8% of all activity)
- 187 unique API actions across EC2, S3, SSM, IAM, and KMS
- 6 distinct source IPs, with 192.168.10.20 as the primary orchestration IP
- Attack capabilities: Infrastructure manipulation, IAM privilege escalation, secrets exfiltration, and logging tampering
The use of the Stratus Red Team Framework was evident through automated role names (stratus-red-team-get-usr-data-role) and high-frequency API calls that triggered throttling exceptions—a key indicator of scripted tooling.
Incident Reconstruction (The 55-Minute Attack)
By correlating the custom alerts, I was able to reconstruct a precise timeline of the breach. This timeline demonstrates how behavioral detections catch attackers even when they attempt to hide their tracks.
- 11:42 AM — Attack Initiation: First AssumeRole and secrets access events detected.
- 11:50 AM — Reconnaissance: Attacker performed mass enumeration of VPCs, S3 buckets, and SSM parameters (detected via Rules 1 & 2).
- 11:55 AM — Privilege Escalation: Automated scripts attempted to extract EC2 metadata, triggering multiple AccessDenied events (Rule 3).
- 12:00 PM — Lateral Movement: Attacker attempted role chaining via AssumeRole to gain higher privileges (Rule 5).
- 12:01 PM — Defense Evasion: Attacker executed StopLogging to disable CloudTrail and "blind" the security team (Rule 4).
- 12:20 PM — Data Exfiltration: Secrets retrieved via GetParameter and Decrypt; database snapshots prepared for external sharing.
- 12:35 PM — Impact & Cleanup: Attacker deleted the trail configuration (DeleteTrail) and attempted infrastructure destruction (Rule 6).
Key Insight: The attack event rate spiked +221% during the escalation phase (6.75 → 21.7 events/min). This velocity change alone is a powerful behavioral indicator of automated attack tooling.
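The percentage figure follows directly from the two rates; as a quick arithmetic check:

```python
# Event-rate spike from the escalation phase (values from the alert data).
baseline, peak = 6.75, 21.7               # events per minute
increase = (peak - baseline) / baseline * 100
print(f"+{increase:.0f}%")                # → +221%
```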
MITRE ATT&CK Cloud Matrix Coverage
Each detection rule was explicitly mapped to adversary techniques to ensure comprehensive coverage across the attack lifecycle:

- Rule 1, API Reconnaissance Spike: T1595.002 (Reconnaissance)
- Rule 2, SSM Parameter Enumeration: T1555.006 (Credential Access)
- Rule 3, Failed Privilege Escalation: T1078 / T1068 (Privilege Escalation)
- Rule 4, CloudTrail Tampering: T1562.008 (Defense Evasion)
- Rule 5, Lateral Movement via AssumeRole: T1550.001 (Lateral Movement)
- Rule 6, Mass Deletion / Destruction: T1485 / T1490 (Impact)
Lessons Learned
- Behavior > Signatures: Counting distinct API calls (dc()) is far more effective than raw volume for detecting reconnaissance.
- Context Matters: Correlating AccessDenied errors with specific API names (DescribeInstanceAttribute) reduces false positives.
- Time-Binning is Critical: Using bin span=10m enables detection of burst activity without alert fatigue.
- Dynamic Severity Saves Time: Routing alerts by failure volume helps analysts prioritize true threats.
- Test Against Real Data: Validating rules against the Invictus dataset confirmed 100% coverage of the simulated attack chain.
Conclusion
This project proves that effective AWS detection requires moving beyond simple signatures. By focusing on behavioral deviations (spikes in unique API calls) and intent-based errors (UnauthorizedOperation), I built a detection suite that identified the adversary at every stage of the lifecycle—from initial footprinting to final evidence destruction.
The next step? Operationalizing these rules into a production Splunk environment with automated response playbooks. Because in cloud security, detection is only half the battle—response is where breaches are stopped.