Security · Topic 13 of 23 · Part II — Catalog of testing methodologies

Blue teaming (and purple)

Blue teaming is the set of capabilities by which an organization actually defends itself — what makes a successful attack noisy, contained, and short-lived.

Syllabus: § Testing Methodologies and Tools of the Trade (2, 4) → Blue teaming

Defence is what testing measures

By the end of this topic you can:
  • define blue teaming and explain how it relates to and differs from red teaming
  • describe the major functional capabilities of a blue team: prevention, detection, response, recovery, and the analytical work connecting them
  • explain what detection engineering is and why it has become a distinct discipline
  • place blue-team practice in the context of the testing catalog
  • articulate the value of purple teaming as the bridge between offensive testing and defensive capability

What blue teaming is

Blue teaming is the operational defensive practice of an organization — the work that determines whether an attack succeeds silently or gets seen and stopped.

Prevention & detection

  • Design and operate preventive controls (firewalls, EDR, identity hardening)
  • Build logging and detection capability (SIEM, EDR, IDS/IPS, anomaly detection)
  • Engineer detections — write the rules that turn raw logs into alerts
  • Perform security monitoring and alert triage

Response & improvement

  • Respond to incidents — containment, eradication, recovery, lessons-learned
  • Hunt proactively beyond what alerts fire
  • Improve the program based on incidents and exercises
  • In large organizations: specialized roles; in small ones: one person does all of this

Why blue belongs in the testing catalog

  1. Testing measures blue capability. A pentest measures whether prevention failed; a red team measures whether the SOC can see and respond; purple teaming measures specific detection coverage. Without blue, these measurements have no referent.
  2. Test output flows into blue work. Pentest findings become prevention tickets; red team findings become detection-engineering backlog. Testers who cannot write for a blue audience produce reports nobody acts on.
  3. Most students will work blue. The job market for security operations, detection engineering, and incident response is several times larger than the market for offensive testers.

Central claim: Testing exists to make defence better, not to prove that attack is possible.

Defensive strategies and security controls

  • Network security. Firewalls, segmentation, zero-trust network access, egress and DNS filtering.
  • Endpoint protection. EDR on every device, application allowlisting, OS hardening, patch management, full-disk encryption.
  • Identity and access management. MFA (preferably FIDO2/passkeys), least-privilege, PAM, just-in-time access, periodic access reviews.
  • Email and web filtering. Anti-phishing, attachment sandboxing, DMARC/SPF/DKIM enforcement.
  • Vulnerability management. Continuous scanning, prioritized remediation, verified closure.
  • Backup and recovery. Tested backups, immutable copies, recovery rehearsals.
Governance point: These controls are owned by engineering teams. The blue team typically governs the security expectations and exercises them through testing.
Identity is the new perimeter: Modern attacks frequently bypass network controls entirely via stolen credentials and session-token theft.

Detection: from logs to alerts to action

Detection is what distinguishes "we got hacked" from "we got hacked and we know."

  1. Telemetry generation. Process logs, network flows, auth events, DNS queries, file integrity, cloud audit logs.
  2. Collection. Logs flow to SIEM, data lake, or security analytics platform.
  3. Detection rules. Patterns of attacker activity encoded in Sigma, SPL, KQL, EQL.
  4. Alert & triage. Matching events produce alerts; analysts (or automation) decide: real, false-positive, or duplicate.
  5. Investigate & respond. Confirmed alerts are scoped; containment, eradication, and recovery follow.
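
The rule and alert steps of this pipeline can be sketched in a few lines of Python. This is a minimal illustration, not how any particular SIEM works: the event fields (`image`, `command_line`, `host`), the `Rule` class, and the rule name are hypothetical. The rule itself looks for `rundll32.exe` calling `comsvcs.dll` with `MiniDump`, one well-known way to dump LSASS memory (ATT&CK T1003.001).

```python
from dataclasses import dataclass
from typing import Callable, Iterable

Event = dict  # one normalized log record (hypothetical schema)


@dataclass(frozen=True)
class Rule:
    name: str
    technique: str                    # ATT&CK technique the rule is meant to cover
    matches: Callable[[Event], bool]  # predicate over a single event


def is_comsvcs_minidump(event: Event) -> bool:
    """True for rundll32.exe invoking comsvcs.dll with MiniDump."""
    image = event.get("image", "").lower()
    cmd = event.get("command_line", "").lower()
    return image.endswith("rundll32.exe") and "comsvcs" in cmd and "minidump" in cmd


def run_rules(events: Iterable[Event], rules: list[Rule]) -> list[dict]:
    """Apply every rule to every event and emit one alert per match."""
    alerts = []
    for event in events:
        for rule in rules:
            if rule.matches(event):
                alerts.append(
                    {"rule": rule.name, "technique": rule.technique, "event": event}
                )
    return alerts


lsass_dump = Rule("rundll32_comsvcs_minidump", "T1003.001", is_comsvcs_minidump)

# Two sample process-creation events: one malicious, one benign.
events = [
    {
        "host": "ws-01",
        "image": r"C:\Windows\System32\rundll32.exe",
        "command_line": r"rundll32.exe C:\Windows\System32\comsvcs.dll, MiniDump 624 C:\temp\out.dmp full",
    },
    {
        "host": "ws-02",
        "image": r"C:\Windows\System32\notepad.exe",
        "command_line": "notepad.exe notes.txt",
    },
]

alerts = run_rules(events, [lsass_dump])
for alert in alerts:
    print(alert["rule"], alert["technique"], alert["event"]["host"])
```

Only the first event produces an alert. In a real deployment the predicate would be written in the platform's query language and the triage decision would follow as a separate step.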

The dominant reference for what to detect is MITRE ATT&CK — coverage is increasingly described in ATT&CK terms.

Detection engineering as a discipline

What a detection engineer does

  • Reads threat intelligence and attacker tradecraft (or red team output)
  • Translates TTPs into detection rules in the SIEM/EDR
  • Tests detections against attack simulations (Atomic Red Team, CALDERA)
  • Tunes rules to reduce false positives
  • Versions detections in source control with peer review
  • Measures coverage and reports to leadership
The maturity shift: Detection-as-code applies software-engineering discipline to detection rules: version control, code review, CI testing, deprecation. It transforms "we have a SIEM" into "we know what we detect, why, and how well."
Sigma: Open-source portable rule language with translators to most SIEM/EDR query languages. A detection written in Sigma can move with the analyst across employers.
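
To make "CI testing" concrete, here is a hedged sketch of a rule that ships with its own positive and negative test events. The rule is only loosely Sigma-shaped: the dictionary keys under `detection`, the matcher `rule_matches`, and the test harness are all invented for illustration and are not the Sigma specification or any real tool's API. The rule targets encoded PowerShell command lines (ATT&CK T1059.001).

```python
# Hypothetical, loosely Sigma-shaped rule kept in source control next to its tests.
RULE = {
    "title": "Encoded PowerShell command line",
    "status": "experimental",
    "tags": ["attack.t1059.001"],
    "detection": {
        "image_endswith": "powershell.exe",
        "command_line_contains_any": ["-enc ", "-encodedcommand "],
    },
}

PS = r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"

# Each case: (description, event, should_alert)
TEST_CASES = [
    ("encoded short flag",
     {"image": PS, "command_line": "powershell.exe -enc SQBFAFgA"}, True),
    ("encoded long flag, mixed case",
     {"image": PS, "command_line": "powershell.exe -EncodedCommand SQBFAFgA"}, True),
    ("plain script run",
     {"image": PS, "command_line": "powershell.exe -File backup.ps1"}, False),
    ("other binary with similar flag",
     {"image": r"C:\tools\encoder.exe", "command_line": "encoder.exe -enc data.bin"}, False),
]


def rule_matches(rule: dict, event: dict) -> bool:
    """Case-insensitive match of the two conditions in rule['detection']."""
    detection = rule["detection"]
    image = event.get("image", "").lower()
    cmd = event.get("command_line", "").lower()
    return image.endswith(detection["image_endswith"]) and any(
        needle in cmd for needle in detection["command_line_contains_any"]
    )


def run_rule_tests(rule: dict, cases: list) -> list[str]:
    """Return the descriptions of failing cases; an empty list means CI may pass."""
    return [
        description
        for description, event, expected in cases
        if rule_matches(rule, event) != expected
    ]


failures = run_rule_tests(RULE, TEST_CASES)
print("FAIL: " + ", ".join(failures) if failures else "all rule tests passed")
```

The design point is that negative cases matter as much as positive ones: a tuning change that silences a false positive should not be able to silently break detection of the true positive.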

The modern blue-team toolstack

Tool class | Purpose | Examples
SIEM | Aggregate logs, apply detection rules | Splunk ES, Microsoft Sentinel, Elastic Security, Chronicle
EDR | Endpoint behavioural detection & response | CrowdStrike Falcon, Defender for Endpoint, SentinelOne
NDR | Network-traffic-focused detection | Vectra, Darktrace, Corelight (Zeek-based)
XDR | Unified endpoint + network + identity + cloud telemetry | Palo Alto Cortex XDR, Microsoft Defender XDR
SOAR | Workflow automation: enrich, triage, respond | Tines, Splunk SOAR, Cortex XSOAR, Sentinel automation
TIP | Ingest and cross-reference IOC feeds | MISP, Recorded Future, OpenCTI

The boundary between SIEM and XDR is blurring. The key takeaway: modern defence is extensively tooled, and operating within this stack is a core blue-team competency.

Incident response process

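When detection confirms an incident, the IR process activates. NIST SP 800-61 Rev. 2 defines four phases (preparation; detection and analysis; containment, eradication, and recovery; post-incident activity). Splitting the third phase gives the six steps used here, which also line up with the SANS six-step model: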

  1. Preparation. Playbooks, contacts, tools, training, exercises: all done before the incident.
  2. Detection & analysis. Confirm the incident, scope it, understand what is affected.
  3. Containment. Stop the spread: isolate hosts, disable accounts, block IOCs.
  4. Eradication. Remove attacker presence: uninstall malware, close footholds.
  5. Recovery. Restore systems; monitor for re-compromise.
  6. Post-incident. Lessons learned, documentation, control improvements.

Context matters: Regulatory breach notification (FADP/GDPR Art. 33–34, NIS2) runs in parallel; it has its own timelines independent of technical containment.
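
Because the regulatory clock starts when the organization becomes aware of the incident, not when containment finishes, it helps to compute the deadlines at the start of the response. The sketch below is a simplification for teaching and not legal guidance. It assumes the commonly cited windows: 72 hours to notify the supervisory authority under GDPR Art. 33, and under NIS2 a 24-hour early warning followed by a 72-hour incident notification. FADP is left out because it is generally described as requiring notification "as soon as possible" rather than within a fixed number of hours. The function name and dictionary labels are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumed, simplified notification windows measured from the moment of awareness.
DEADLINES = {
    "GDPR Art. 33 (supervisory authority)": timedelta(hours=72),
    "NIS2 early warning": timedelta(hours=24),
    "NIS2 incident notification": timedelta(hours=72),
}


def notification_deadlines(aware_at: datetime) -> dict[str, datetime]:
    """Return the latest notification time per obligation for a given awareness time."""
    return {name: aware_at + window for name, window in DEADLINES.items()}


# Example: the SOC confirms the incident on 1 March 2024 at 09:00 UTC.
aware = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)
for name, due in sorted(notification_deadlines(aware).items(), key=lambda kv: kv[1]):
    print(f"{due.isoformat()}  {name}")
```

Which obligations actually apply depends on the organization and the incident; the point of the sketch is only that these timers run independently of the technical phases.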

Threat intelligence and threat hunting

Threat intelligence

Blue teams ingest intel to:

  • Identify which adversaries target organizations like theirs
  • Stay current on TTPs so detections can be updated
  • Get specific IOCs (IPs, domains, hashes) to feed into detection
  • Inform risk decisions ("ransomware actor X is targeting our sector")

Sources: commercial feeds (Mandiant, CrowdStrike), ISACs, open-source (MISP, OTX), government (NCSC, CISA).
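
Feeding IOCs into detection can be as simple as checking telemetry against indicator sets. The following is a minimal sketch under assumed field names (`dst_ip`, `dns_query`, `sha256`, `host`); the indicator values are placeholders from documentation ranges, not real threat data. Domain indicators are matched against the queried name and all of its parent domains, so `cdn.bad.example` hits an indicator for `bad.example`.

```python
def domain_matches(queried: str, ioc_domains: set[str]) -> bool:
    """True if the queried name or any parent domain is a listed indicator."""
    labels = queried.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in ioc_domains for i in range(len(labels)))


def match_iocs(events: list[dict], iocs: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (host, indicator_type) for every event that matches an indicator."""
    hits = []
    for event in events:
        if event.get("dst_ip") in iocs["ips"]:
            hits.append((event["host"], "ip"))
        elif "dns_query" in event and domain_matches(event["dns_query"], iocs["domains"]):
            hits.append((event["host"], "domain"))
        elif event.get("sha256") in iocs["hashes"]:
            hits.append((event["host"], "hash"))
    return hits


# Placeholder indicators (documentation IP range, reserved .example domain).
iocs = {
    "ips": {"203.0.113.7"},
    "domains": {"bad.example"},
    "hashes": {"a" * 64},
}

events = [
    {"host": "ws-01", "dst_ip": "203.0.113.7"},
    {"host": "ws-02", "dns_query": "cdn.bad.example."},
    {"host": "ws-03", "dns_query": "notbad.example"},
    {"host": "ws-04", "sha256": "a" * 64},
    {"host": "ws-05", "dst_ip": "198.51.100.1"},
]

hits = match_iocs(events, iocs)
print(hits)
```

Matching on parent domains rather than substrings avoids the false positive on `notbad.example`. IOC matches are short-lived signals, which is one reason the TTP-level use of intelligence listed above tends to age better.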

Intelligence is a force multiplier when integrated; a waste when it is a feed nobody reads.

Threat hunting

Proactive search for adversary activity that detections have not yet flagged:

  • Form a hypothesis: "if an attacker did X, we would see Y"
  • Query available telemetry to test it
  • Result: confirm an active incident, inform a new detection, or invalidate the hypothesis
Maturity step: Hunt programs operate on a sprint or quarterly cadence and produce both incidents found and detection improvements, going beyond pure alert-driven operations.
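
A hunt hypothesis can be tested with very little code once telemetry is queryable. The sketch below assumes a hypothesis of the form "if an attacker used a malicious document to launch a shell, we would see an unusual parent-child process pair" and applies least-frequency analysis (often called stack counting) to process-creation events. The field names `parent` and `child` and the sample data are hypothetical.

```python
from collections import Counter


def rare_parent_child_pairs(events: list[dict], max_count: int = 1) -> list[tuple[str, str]]:
    """Return parent/child process pairs seen at most max_count times, sorted."""
    pairs = Counter((e["parent"].lower(), e["child"].lower()) for e in events)
    return sorted(pair for pair, seen in pairs.items() if seen <= max_count)


# Synthetic telemetry: two very common pairs and one outlier.
telemetry = (
    [{"parent": "explorer.exe", "child": "chrome.exe"}] * 50
    + [{"parent": "services.exe", "child": "svchost.exe"}] * 30
    + [{"parent": "WINWORD.EXE", "child": "powershell.exe"}]
)

leads = rare_parent_child_pairs(telemetry)
print(leads)
```

A rare pair is a lead, not a verdict: the analyst still has to decide whether it is an incident, a candidate for a new detection, or benign and therefore evidence against the hypothesis.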

Purple teaming: the bridge

Purple teaming is red and blue working together to systematically improve detection and prevention — a collaboration mode, not a separate methodology.

A session in practice

  1. Choose a specific ATT&CK technique (e.g. T1003.001 — LSASS Memory dumping)
  2. Red executes the technique; blue watches its own tooling in real time
  3. If detection fails, detection engineers write a rule on the spot
  4. Re-run the technique; validate the rule; tune for false positives
  5. Document: technique X is detected at this confidence, by this rule
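
The documentation step is what turns a session into a coverage view. As a sketch only, the results of a session could be recorded per technique and summarized as below. The `PurpleResult` fields, the rule names, and the outcomes are invented for illustration; the technique IDs are real ATT&CK identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PurpleResult:
    technique: str               # ATT&CK technique ID
    name: str
    detected: bool
    rule: Optional[str] = None   # rule that fired, if any
    confidence: str = "none"     # e.g. "high", "medium", "low", "none"


def coverage_summary(results: list[PurpleResult]) -> dict:
    """Summarize how many tested techniques were detected and which were not."""
    detected = [r for r in results if r.detected]
    return {
        "tested": len(results),
        "detected": len(detected),
        "coverage": len(detected) / len(results) if results else 0.0,
        "gaps": [r.technique for r in results if not r.detected],
    }


# Hypothetical outcome of one purple-team session.
session = [
    PurpleResult("T1003.001", "LSASS Memory", True, "rundll32_comsvcs_minidump", "high"),
    PurpleResult("T1059.001", "PowerShell", True, "encoded_powershell", "medium"),
    PurpleResult("T1566.001", "Spearphishing Attachment", True, "office_child_process", "medium"),
    PurpleResult("T1021.001", "Remote Desktop Protocol", False),
]

summary = coverage_summary(session)
print(summary)
```

The `gaps` list is the detection-engineering backlog for the next session. Note that the ratio only covers techniques that were actually tested, so it says nothing about the rest of the ATT&CK matrix.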

Why it is often more cost-effective than red teaming

  • Red team produces findings; purple produces built and verified detections
  • Gaps identified are filled the same day, by the people who own detection
  • Costs less and builds skill on both sides
  • Appropriate for organizations not yet ready for full covert red teaming

Frequent purple teaming, with occasional red teaming as an integrity check, is a sound default model for most organizations.

Common blue-team failure patterns

  • Alert overload. 200,000 alerts/day, with the real ones buried. Caused by under-tuned rules and absent detection engineering.
  • Tool sprawl without integration. SIEM + EDR + NDR + SOAR + TIP unbridged = a billing problem, not a defence.
  • The SIEM as expensive logging. Logs collected, no meaningful detections built.
  • No detection coverage knowledge. "Do we detect this?" → "Probably?" — ATT&CK-mapped coverage is the fix.
  • IR plans untested. A documented plan never exercised is fiction.
  • Tabletops without follow-through. Action items that never get done.
  • No baseline / no metrics. "How is our blue team doing?" → no answer.
  • Compliance-driven, not threat-driven. Only chasing audit findings will fail against real attackers.
Tester's perspective: A pentest or red team that surfaces any of these produces useful blue-program value, even if no classical vulnerability was found.

Check — alert volume and detection maturity

Reflection

A SOC reports 200,000 alerts per day. What does this tell you about the program, and what would you suggest?

Answer

The detections are under-tuned and detection engineering is absent. Recommendations: invest in detection engineering to tune and deprecate noisy rules; add SOAR to automate tier-1 enrichment and triage; measure signal-to-noise by rule; build an ATT&CK coverage view so the team knows what they detect deliberately versus accidentally.
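
Measuring signal-to-noise by rule needs only the triage verdicts the SOC already records. The sketch below assumes a triage log of `(rule, verdict)` pairs with verdicts `"true_positive"` and `"false_positive"`; other verdicts such as duplicates are ignored. The rule names, the sample log, and the 10% threshold are arbitrary choices for illustration.

```python
from collections import defaultdict


def rule_precision(triage_log: list[tuple[str, str]]) -> dict[str, float]:
    """Per rule: true positives divided by all alerts with a TP or FP verdict."""
    counts = defaultdict(lambda: {"true_positive": 0, "false_positive": 0})
    for rule, verdict in triage_log:
        if verdict in ("true_positive", "false_positive"):
            counts[rule][verdict] += 1
    return {
        rule: c["true_positive"] / (c["true_positive"] + c["false_positive"])
        for rule, c in counts.items()
    }


def tuning_candidates(precision: dict[str, float], threshold: float = 0.1) -> list[str]:
    """Rules whose precision falls below the threshold, sorted by name."""
    return sorted(rule for rule, p in precision.items() if p < threshold)


# Hypothetical triage outcomes for two rules.
triage_log = (
    [("any_powershell_execution", "false_positive")] * 19
    + [("any_powershell_execution", "true_positive")]
    + [("rundll32_comsvcs_minidump", "true_positive")] * 3
    + [("rundll32_comsvcs_minidump", "false_positive")]
    + [("rundll32_comsvcs_minidump", "duplicate")]
)

precision = rule_precision(triage_log)
print(precision)
print(tuning_candidates(precision))
```

Here the broad PowerShell rule has 5% precision and is flagged for tuning or deprecation, while the narrow LSASS rule at 75% is left alone. Precision alone does not show what a rule misses, which is why it is paired with the coverage view in the recommendations.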

What you take home

  • Blue teaming is the set of capabilities that makes testing meaningful — without it, findings are shelf-art.
  • The blue team has five major functions: prevention, detection, response, recovery, and the analytical work connecting them.
  • Detection engineering applies software-engineering discipline to rules: source control, CI, coverage metrics.
  • MITRE ATT&CK is the standard vocabulary for both offensive and defensive work; coverage mapping is the maturity dashboard.
  • Purple teaming produces built and verified detections in real time — often the most cost-effective improvement investment.
  • Alert overload, tool sprawl, and untested IR plans are the most common blue-team failure modes testers will encounter.
  • Identity-centric defence is now arguably the most important blue-team capability: many modern attacks bypass the network perimeter entirely via stolen credentials and session tokens.

Next: Topic 14 — Reconnaissance and enumeration. The offensive techniques students will practice are exactly what blue must detect.

END · TOPIC 13

Detect. Respond. Improve.

Review the NIST SP 800-61 IR phases and map one blue-team failure pattern you have observed (or can imagine) in a real organization.