Security · Topic 13 of 23 · Part II — Catalog of testing methodologies

Blue teaming (and purple)

Blue teaming is the set of capabilities by which an organization actually defends itself — what makes a successful attack noisy, contained, and short-lived.

Syllabus: § Testing Methodologies and Tools of the Trade (2, 4) → Blue teaming

Defence is what testing measures

By the end of this topic you can:

define blue teaming and explain how it relates to and differs from red teaming
describe the major functional capabilities of a blue team: prevention, detection, response, recovery, and the analytical work connecting them
explain what detection engineering is and why it has become a distinct discipline
place blue-team practice in the context of the testing catalog
articulate the value of purple teaming as the bridge between offensive testing and defensive capability

What blue teaming is

Blue teaming is the operational defensive practice of an organization — the work that determines whether an attack succeeds silently or gets seen and stopped.

Prevention & detection

Design and operate preventive controls (firewalls, EDR, identity hardening)
Build logging and detection capability (SIEM, EDR, IDS/IPS, anomaly detection)
Engineer detections — write the rules that turn raw logs into alerts
Perform security monitoring and alert triage

Response & improvement

Respond to incidents — containment, eradication, recovery, lessons-learned
Hunt proactively beyond what alerts fire
Improve the program based on incidents and exercises
In large organizations these are specialized roles; in small ones, one person may do all of this

Why blue belongs in the testing catalog

1. Testing measures blue capability. A pentest measures whether prevention failed; a red team measures whether the SOC can see and respond; purple teaming measures specific detection coverage. Without blue, these measurements have no referent.

2. Test output flows into blue work. Pentest findings become prevention tickets; red team findings become detection-engineering backlog. Testers who cannot write for a blue audience produce reports nobody acts on.

3. Most students will work blue. The job market for security operations, detection engineering, and incident response is several times larger than the market for offensive testers.

Central claim: Testing exists to make defence better, not to prove that attack is possible.

Defensive strategies and security controls

Network security. Firewalls, segmentation, zero-trust network access, egress and DNS filtering.
Endpoint protection. EDR on every device, application allowlisting, OS hardening, patch management, full-disk encryption.
Identity and access management. MFA (preferably FIDO2/passkeys), least-privilege, PAM, just-in-time access, periodic access reviews.
Email and web filtering. Anti-phishing, attachment sandboxing, DMARC/SPF/DKIM enforcement.
Vulnerability management. Continuous scanning, prioritized remediation, verified closure.
Backup and recovery. Tested backups, immutable copies, recovery rehearsals.

Governance point: These controls are owned by engineering teams. The blue team typically sets the security expectations for them and checks those expectations through testing.

Identity is the new perimeter: Modern attacks frequently bypass network controls entirely via stolen credentials and session-token theft.

Detection: from logs to alerts to action

Detection is what distinguishes "we got hacked" from "we got hacked and we know."

1. Telemetry generation. Process logs, network flows, auth events, DNS queries, file integrity, cloud audit logs.

2. Collection. Logs flow to SIEM, data lake, or security analytics platform.

3. Detection rules. Patterns of attacker activity encoded in Sigma, SPL, KQL, EQL.

4. Alert & triage. Matching events produce alerts; analysts (or automation) decide: real, false-positive, or duplicate.

5. Investigate & respond. Confirmed alerts are scoped; containment, eradication, and recovery follow.

The dominant reference for what to detect is MITRE ATT&CK — coverage is increasingly described in ATT&CK terms.
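
The five stages above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular SIEM works: the flat event schema (`host`, `image`, `command_line`), the `Rule` and `Alert` types, and the sample rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

Event = dict[str, str]  # hypothetical flat event schema


@dataclass(frozen=True)
class Rule:
    name: str
    attack_id: str                     # ATT&CK technique the rule targets
    matches: Callable[[Event], bool]   # the detection logic itself


@dataclass(frozen=True)
class Alert:
    rule: str
    attack_id: str
    host: str


def detect(events: list[Event], rules: list[Rule]) -> list[Alert]:
    """Stages 3 and 4: apply every rule to every event and emit alerts."""
    return [
        Alert(rule.name, rule.attack_id, event.get("host", "unknown"))
        for event in events
        for rule in rules
        if rule.matches(event)
    ]


def deduplicate(alerts: list[Alert]) -> list[Alert]:
    """Stage 4: collapse duplicates so triage sees each (rule, host) once."""
    return list(dict.fromkeys(alerts))  # keeps first-seen order


# Illustrative rule: encoded PowerShell command lines (ATT&CK T1059.001).
encoded_powershell = Rule(
    name="powershell_encoded_command",
    attack_id="T1059.001",
    matches=lambda e: e.get("image", "").lower().endswith("powershell.exe")
    and "-enc" in e.get("command_line", "").lower(),
)

ps_path = r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"
events = [
    {"host": "ws-01", "image": r"C:\Windows\System32\notepad.exe", "command_line": "notepad.exe"},
    {"host": "ws-02", "image": ps_path, "command_line": "powershell.exe -enc SQBFAFgA"},
    {"host": "ws-02", "image": ps_path, "command_line": "powershell.exe -enc SQBFAFgA"},
]

alerts = deduplicate(detect(events, [encoded_powershell]))
for alert in alerts:
    print(alert)
```

Tagging each rule with an ATT&CK technique ID at definition time is what later makes coverage reporting possible: the alert already carries the vocabulary the coverage view needs.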

Detection engineering as a discipline

What a detection engineer does

Reads threat intelligence and attacker tradecraft (or red team output)
Translates TTPs into detection rules in the SIEM/EDR
Tests detections against attack simulations (Atomic Red Team, CALDERA)
Tunes rules to reduce false positives
Versions detections in source control with peer review
Measures coverage and reports to leadership

The maturity shift: Detection-as-code applies software-engineering discipline to detection rules: version control, code review, CI testing, deprecation. It transforms "we have a SIEM" into "we know what we detect, why, and how well."

Sigma: An open-source, portable rule language with converters to most SIEM/EDR query languages. A detection written in Sigma can move with the analyst across employers.
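
To make "CI testing" of a detection concrete, here is a minimal sketch of a rule kept as data together with the test cases that guard it. The rule layout is only loosely modelled on a Sigma selection with `contains` and `endswith` value modifiers; the matcher, the field names, and the sample events are assumptions for illustration, not an implementation of Sigma.

```python
# A rule kept as data, loosely modelled on a Sigma "selection".
# Only two value modifiers are supported here; real Sigma has many more.
RULE = {
    "title": "LSASS memory dump via procdump",
    "attack_id": "T1003.001",
    "selection": {
        "image|endswith": "\\procdump.exe",
        "command_line|contains": "lsass",
    },
}

MODIFIERS = {
    "contains": lambda value, needle: needle in value,
    "endswith": lambda value, needle: value.endswith(needle),
}


def rule_matches(rule: dict, event: dict) -> bool:
    """True when every selection entry matches (case-insensitive AND)."""
    for key, expected in rule["selection"].items():
        field, _, modifier = key.partition("|")
        actual = str(event.get(field, "")).lower()
        # No modifier means exact (case-insensitive) equality.
        compare = MODIFIERS.get(modifier, lambda value, needle: value == needle)
        if not compare(actual, expected.lower()):
            return False
    return True


# Test cases live next to the rule and run in CI on every change.
SHOULD_FIRE = [
    {"image": "C:\\Tools\\procdump.exe", "command_line": "procdump.exe -ma lsass.exe out.dmp"},
]
SHOULD_NOT_FIRE = [
    {"image": "C:\\Tools\\procdump.exe", "command_line": "procdump.exe -ma notepad.exe out.dmp"},
    {"image": "C:\\Windows\\System32\\lsass.exe", "command_line": "lsass.exe"},
]


def test_rule() -> None:
    assert all(rule_matches(RULE, e) for e in SHOULD_FIRE)
    assert not any(rule_matches(RULE, e) for e in SHOULD_NOT_FIRE)


test_rule()
```

The negative cases matter as much as the positive one: a tuning change that silences a noisy rule should fail the build if it also silences the true positive.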

The modern blue-team toolstack

Tool class	Purpose	Examples
SIEM	Aggregate logs, apply detection rules	Splunk ES, Microsoft Sentinel, Elastic Security, Google Security Operations (formerly Chronicle)
EDR	Endpoint behavioural detection & response	CrowdStrike Falcon, Defender for Endpoint, SentinelOne
NDR	Network-traffic-focused detection	Vectra, Darktrace, Corelight (Zeek-based)
XDR	Unified endpoint + network + identity + cloud telemetry	Palo Alto Cortex XDR, Microsoft Defender XDR
SOAR	Workflow automation: enrich, triage, respond	Tines, Splunk SOAR, Cortex XSOAR, Sentinel automation
TIP	Ingest and cross-reference IOC feeds	MISP, Recorded Future, OpenCTI

The boundary between SIEM and XDR is blurring. The key takeaway: modern defence is extensively tooled, and operating within this stack is a core blue-team competency.

Incident response process

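When detection confirms an incident, the IR process activates. NIST SP 800-61 (Rev. 2) describes a four-phase life cycle: preparation; detection and analysis; containment, eradication, and recovery; and post-incident activity. The six steps below follow that life cycle, with the combined third phase split into three: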

1. Preparation. Playbooks, contacts, tools, training, exercises: all done before the incident.

2. Detection & analysis. Confirm the incident, scope it, understand what is affected.

3. Containment. Stop the spread: isolate hosts, disable accounts, block IOCs.

4. Eradication. Remove attacker presence: uninstall malware, close footholds.

5. Recovery. Restore systems; monitor for re-compromise.

6. Post-incident. Lessons learned, documentation, control improvements.

Context matters: Regulatory breach notification (FADP, GDPR Art. 33–34, NIS2) runs in parallel, with its own timelines independent of technical containment.
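
The parallel clock can be made visible with a small deadline calculator. The sketch below assumes the commonly cited windows: 72 hours to the supervisory authority under GDPR Art. 33, and a 24-hour early warning plus a 72-hour incident notification under NIS2. FADP is left out because it does not fit a simple hour count. Treat the values as assumptions to confirm against the current legal texts and with counsel; the function and dictionary names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed windows, counted from when the organization becomes aware.
# Confirm against the current legal texts before relying on them.
NOTIFICATION_WINDOWS = {
    "GDPR Art. 33 (supervisory authority)": timedelta(hours=72),
    "NIS2 early warning": timedelta(hours=24),
    "NIS2 incident notification": timedelta(hours=72),
}


def notification_deadlines(aware_at: datetime) -> dict[str, datetime]:
    """Return each regime's deadline, earliest first."""
    if aware_at.tzinfo is None:
        raise ValueError("aware_at must be timezone-aware")
    deadlines = {
        name: aware_at + window for name, window in NOTIFICATION_WINDOWS.items()
    }
    return dict(sorted(deadlines.items(), key=lambda item: item[1]))


aware = datetime(2025, 3, 3, 9, 30, tzinfo=timezone.utc)
deadlines = notification_deadlines(aware)
for name, due in deadlines.items():
    print(f"{due:%Y-%m-%d %H:%M} UTC  {name}")
```

Requiring a timezone-aware timestamp is deliberate: an incident handled across sites in different time zones is exactly where a naive local time produces a missed deadline.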

Threat intelligence and threat hunting

Threat intelligence

Blue teams ingest intel to:

Identify which adversaries target organizations like theirs
Stay current on TTPs so detections can be updated
Get specific IOCs (IPs, domains, hashes) to feed into detection
Inform risk decisions ("ransomware actor X is targeting our sector")

Sources: commercial feeds (Mandiant, CrowdStrike), ISACs, open-source (MISP, OTX), government (NCSC, CISA).

Intelligence is a force multiplier when integrated; a waste when it is a feed nobody reads.
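
"Integrated" here means that indicators are checked against telemetry automatically. A minimal sketch of that matching step follows; the `Indicator` type, the event field names in `FIELD_KINDS`, and the feed names are hypothetical, and the addresses come from ranges reserved for documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    kind: str    # "ip", "domain", or "sha256"
    value: str
    source: str  # which feed supplied it


def build_index(indicators: list[Indicator]) -> dict[tuple[str, str], Indicator]:
    """Index indicators by (kind, normalized value) for constant-time lookups."""
    return {(i.kind, i.value.lower()): i for i in indicators}


# Hypothetical mapping from event field names to indicator kinds.
FIELD_KINDS = {"dest_ip": "ip", "query": "domain", "file_hash": "sha256"}


def match_event(
    event: dict[str, str], index: dict[tuple[str, str], Indicator]
) -> list[Indicator]:
    """Return every indicator that appears in one of the event's mapped fields."""
    hits = []
    for field, kind in FIELD_KINDS.items():
        value = event.get(field)
        if value and (kind, value.lower()) in index:
            hits.append(index[(kind, value.lower())])
    return hits


feed = [
    Indicator("ip", "203.0.113.7", "example-isac"),
    Indicator("domain", "bad.example", "example-osint"),
]
index = build_index(feed)

dns_event = {"host": "ws-07", "query": "BAD.example"}
hits = match_event(dns_event, index)
for hit in hits:
    print(f"{dns_event['host']} matched {hit.kind} {hit.value} from {hit.source}")
```

Keeping the `source` on each indicator lets the team see which feeds ever produce a hit, which is one way to tell a useful feed from one nobody reads.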

Threat hunting

Proactive search for adversary activity that detections have not yet flagged:

Form a hypothesis: "if an attacker did X, we would see Y"
Query available telemetry to test it
Result: confirm an active incident, inform a new detection, or invalidate the hypothesis

Maturity step: Hunt programs operate on a sprint or quarterly cadence and produce both incidents found and detection improvements, going beyond purely alert-driven operations.
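
A hunt hypothesis can be written directly as a query over telemetry. The sketch below tests one illustrative hypothesis: if an attacker is reusing stolen credentials, the account will log in to more distinct hosts than a normal user. The event fields (`user`, `dest_host`, `outcome`) and the threshold are assumptions; a real hunt would baseline the threshold per account type.

```python
from collections import defaultdict


def hunt_credential_reuse(
    auth_events: list[dict[str, str]], max_hosts: int = 3
) -> dict[str, set[str]]:
    """Return accounts that logged in successfully to more than max_hosts hosts."""
    hosts_by_user: dict[str, set[str]] = defaultdict(set)
    for event in auth_events:
        if event.get("outcome") == "success":
            hosts_by_user[event["user"]].add(event["dest_host"])
    return {
        user: hosts for user, hosts in hosts_by_user.items() if len(hosts) > max_hosts
    }


# Synthetic telemetry: one ordinary user, one account touching many hosts.
auth_events = [
    {"user": "alice", "dest_host": "ws-01", "outcome": "success"},
    {"user": "alice", "dest_host": "fileserver", "outcome": "success"},
    {"user": "alice", "dest_host": "db-01", "outcome": "failure"},
] + [
    {"user": "svc-backup", "dest_host": f"srv-{n:02d}", "outcome": "success"}
    for n in range(1, 6)
]

findings = hunt_credential_reuse(auth_events)
if findings:
    for user, hosts in findings.items():
        print(f"investigate {user}: {len(hosts)} hosts {sorted(hosts)}")
else:
    print("hypothesis not supported by this data set")
```

Either branch is a result: a finding is scoped as a possible incident or explained and turned into a tuned detection, while an empty result records that the hypothesis was tested against this telemetry.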

Purple teaming: the bridge

Purple teaming is red and blue working together to systematically improve detection and prevention — a collaboration mode, not a separate methodology.

A session in practice

Choose a specific ATT&CK technique (e.g. T1003.001 — LSASS Memory dumping)
Red executes the technique; blue watches its own tooling in real time
If detection fails, detection engineers write a rule on the spot
Re-run the technique; validate the rule; tune for false positives
Document: technique X is detected at this confidence, by this rule
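
The documentation step lends itself to a structured record rather than free text. The following sketch shows one possible shape; the `Outcome` categories, the `TechniqueResult` type, and the rule names are hypothetical, while the technique IDs are real ATT&CK identifiers used purely as examples.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    DETECTED = "alert fired"
    LOGGED_ONLY = "telemetry present, no alert"
    MISSED = "no telemetry"


@dataclass(frozen=True)
class TechniqueResult:
    attack_id: str
    outcome: Outcome
    rule: Optional[str] = None  # name of the rule that fired, if any


def summarize(results: list[TechniqueResult]) -> tuple[float, list[str]]:
    """Return (share of techniques detected, ATT&CK IDs still lacking an alert)."""
    if not results:
        return 0.0, []
    gaps = [r.attack_id for r in results if r.outcome is not Outcome.DETECTED]
    return 1 - len(gaps) / len(results), gaps


# Final state after a hypothetical session. T1003.001 was missed on the
# first run, then detected on the re-run once a rule had been written.
session = [
    TechniqueResult("T1003.001", Outcome.DETECTED, rule="lsass_memory_access"),
    TechniqueResult("T1059.001", Outcome.DETECTED, rule="powershell_encoded_command"),
    TechniqueResult("T1021.002", Outcome.LOGGED_ONLY),
]

coverage, gaps = summarize(session)
print(f"coverage {coverage:.0%}, gaps: {gaps}")
```

Separating "logged only" from "missed" is a design choice worth keeping: the first needs a rule, the second needs telemetry before any rule can exist.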

Why it is often more cost-effective than red teaming

Red team produces findings; purple produces built and verified detections
Gaps identified are filled the same day, by the people who own detection
Costs less and builds skill on both sides
Appropriate for organizations not yet ready for full covert red teaming

Frequent purple teaming + occasional red teaming as an integrity check is the right model for most organizations.

Common blue-team failure patterns

Alert overload. 200,000 alerts/day, with the real ones buried. Caused by under-tuned rules and absent detection engineering.
Tool sprawl without integration. SIEM + EDR + NDR + SOAR + TIP unbridged = a billing problem, not a defence.
The SIEM as expensive logging. Logs collected, no meaningful detections built.
No detection coverage knowledge. "Do we detect this?" → "Probably?" — ATT&CK-mapped coverage is the fix.

IR plans untested. A documented plan never exercised is fiction.
Tabletops without follow-through. Action items that never get done.
No baseline / no metrics. "How is our blue team doing?" → no answer.
Compliance-driven, not threat-driven. Only chasing audit findings will fail against real attackers.

Tester's perspective: A pentest or red team engagement that surfaces any of these delivers real value to the blue program, even if no classical vulnerability was found.

Check — alert volume and detection maturity

Reflection

A SOC reports 200,000 alerts per day. What does this tell you about the program, and what would you suggest?

Answer

The detections are under-tuned and detection engineering is absent. Recommendations: invest in detection engineering to tune and deprecate noisy rules; add SOAR to automate tier-1 enrichment and triage; measure signal-to-noise by rule; build an ATT&CK coverage view so the team knows what they detect deliberately versus accidentally.
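
"Measure signal-to-noise by rule" can start from nothing more than the triage verdicts analysts already record. A minimal sketch, assuming a triage log of `(rule_name, verdict)` pairs with the three verdicts named earlier; the rule names, counts, and the 10% threshold are invented for illustration.

```python
from collections import Counter


def rule_stats(triage_log: list[tuple[str, str]]) -> list[tuple[str, int, float]]:
    """Return (rule, alert count, precision) rows, noisiest rule first.

    Precision is the share of a rule's alerts triaged as true positives.
    """
    totals = Counter(rule for rule, _ in triage_log)
    true_pos = Counter(
        rule for rule, verdict in triage_log if verdict == "true_positive"
    )
    rows = [(rule, count, true_pos[rule] / count) for rule, count in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


triage_log = (
    [("any_powershell", "false_positive")] * 7
    + [("any_powershell", "duplicate")]
    + [("lsass_access", "true_positive"), ("lsass_access", "false_positive")]
)

stats = rule_stats(triage_log)
for rule, count, precision in stats:
    print(f"{rule:<16} alerts={count:<3} precision={precision:.0%}")

# Rules below an agreed precision floor become tuning or deprecation candidates.
noisy = [rule for rule, _, precision in stats if precision < 0.10]
```

Sorting by volume first puts the rules that consume the most analyst time at the top, so tuning effort goes where it removes the most noise.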

What you take home

Blue teaming is the set of capabilities that makes testing meaningful — without it, findings are shelf-art.
The blue team has five major functions: prevention, detection, response, recovery, and the analytical work connecting them.
Detection engineering applies software-engineering discipline to rules: source control, CI, coverage metrics.
MITRE ATT&CK is the standard vocabulary for both offensive and defensive work; coverage mapping is the maturity dashboard.
Purple teaming produces built and verified detections in real time — often the most cost-effective improvement investment.
Alert overload, tool sprawl, and untested IR plans are the most common blue-team failure modes testers will encounter.
Identity-centric defence is now arguably the most important blue-team capability: many modern attacks bypass the network perimeter entirely by using stolen credentials and session tokens.

Next: Topic 14 — Reconnaissance and enumeration. The offensive techniques students will practice are exactly what blue must detect.

Detect. Respond. Improve.

Review the NIST SP 800-61 IR phases and map one blue-team failure pattern you have observed (or can imagine) in a real organization.