Security · Topic 20 of 23 · Part IV — From findings to value

Vulnerability evaluation & scoring

Once a finding exists, the real question is: how bad — and how bad compared to the other twenty findings?

Syllabus: § Understanding Vulnerabilities (6.1–6.3) → Vulnerability evaluation criteria and industry-standard scoring systems

Topic 20 · Scoring frameworks

One number does not tell the whole story

By the end of this topic you can:

distinguish severity from risk and explain why they differ
apply CVSS v3.1, CVSS v4.0, OWASP Risk Rating, and BSI classification — and state what each misses
score a finding consistently and defend every metric setting
adjust raw severity for environmental and contextual factors
combine CVSS with KEV, EPSS, and asset criticality for defensible prioritization

Severity, risk, priority — three different things

Severity A property of the vulnerability itself: how bad is it if exploited under typical conditions? CVSS Base scores this. Largely independent of context.

Risk Severity × likelihood, adjusted for context. A Critical finding on an isolated test system carries less risk than the same finding on a production customer-data server.

Priority What to fix first — a function of risk, cost-to-fix, remediation timeline, and organizational capacity.

Scoring frameworks produce severity primarily. Risk requires environmental context. Priority requires program context.

Reports that hand a CVSS list and call it prioritization have confused severity with priority.

CVSS — the dominant framework

The Common Vulnerability Scoring System (CVSS), maintained by FIRST, is the industry's lingua franca. Most scanners, CVE entries, and vendor advisories use it.

Base Properties of the vulnerability itself — context-independent, stable over time. Most published scores are Base-only.

Temporal / Threat Properties that change as threats evolve. In v3.1 the Temporal group covers exploit code maturity, remediation level, and report confidence; v4.0 renames the group Threat and keeps only Exploit Maturity.

Environmental Properties of this target environment — modified Base metrics plus impact weightings based on local data sensitivity.

Key insight Base-only scoring is the dominant source of mis-prioritization. A 9.8 on a component not exposed in your environment may be a non-issue locally.

CVSS v3.1 Base metrics

Metric	Options	Key question
Attack Vector (AV)	Network · Adjacent · Local · Physical	Where must the attacker be?
Attack Complexity (AC)	Low · High	How reliably can it be exploited?
Privileges Required (PR)	None · Low · High	What privileges are needed?
User Interaction (UI)	None · Required	Does the victim have to act?
Scope (S)	Unchanged · Changed	Does impact cross a trust boundary?
C / I / A Impact	None · Low · High (each)	How bad if successful?

Example vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H → 9.8 Critical.
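
The score follows mechanically from the vector. As a minimal sketch, assuming the Base equations and metric weights published in the CVSS v3.1 specification, the following recomputes the score from a vector string. The names `base_score` and `roundup` are illustrative and not part of any library.

```python
import math

# Metric weights as published in the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
PR = {  # Privileges Required weighs differently when Scope is Changed
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal using integer arithmetic to avoid float drift."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 Base score for a vector such as 'CVSS:3.1/AV:N/...'."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("this sketch only handles CVSS v3.1 vectors")
    m = dict(part.split(":") for part in parts)
    scope = m["S"]

    # Impact sub-score
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    # Exploitability sub-score
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * PR[scope][m["PR"]] * UI[m["UI"]]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
```

Recomputing from the vector is a quick way to check a score that a scanner or report states without showing its work.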

CVSS v4.0 and severity bands

CVSS v4.0 (2023) introduced notable changes:

Supplemental metrics: Safety, Automatable, Recovery, Value Density, Vulnerability Response Effort, Provider Urgency (context only; they do not change the score)
Threat metrics replace Temporal: Exploit Maturity is the only metric that remains
Multiple scores: CVSS-B, CVSS-BT, CVSS-BE, CVSS-BTE instead of a single number
Safety is explicit: critical for OT/ICS/medical contexts

v3.1 remains dominant in 2026; a tester must read both. Always label which version a score is from — never mix them.

Band	Range
None	0.0
Low	0.1 – 3.9
Medium	4.0 – 6.9
High	7.0 – 8.9
Critical	9.0 – 10.0

Caution Bands are conventional, not analytical. A 6.9 is not meaningfully different from a 7.0. Never let the band override the underlying analysis.
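
A short sketch of the band lookup shows how thin the boundary is. The function name `severity_band` is illustrative; the thresholds are the ones in the table above.

```python
def severity_band(score: float) -> str:
    """Map a CVSS score to its conventional qualitative band."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# One tenth of a point separates two bands.
for s in (6.9, 7.0):
    print(s, severity_band(s))
# 6.9 Medium
# 7.0 High
```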

CVSS critiques and limits

Base is context-blind. Critical on an internet-facing server = Critical on an air-gapped lab — even though local risk differs vastly.
Poor at chained findings. Two Mediums that chain into a Critical attack path are still scored as two Mediums unless the tester documents the chain separately.
Authorization flaws score awkwardly. An IDOR may produce massive impact but maps poorly to the metric set.
Environmental metrics are rarely set. Most published scores stop at Base, leading to systematic mis-prioritization downstream.
Severity is not exploit likelihood. A 9.8 with no working exploit is severe in principle, not actively risky right now.
Impact dimensions treated equivalently. Business consequences of C vs. I vs. A loss vary widely; CVSS does not reflect that.

These are reasons to use CVSS as one input, not as the prioritization output.

OWASP Risk Rating & BSI classification

OWASP Risk Rating

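Likelihood = mean of the eight likelihood factors, each rated 0–9: Threat Agent (Skill level, Motive, Opportunity, Size) and Vulnerability (Ease of discovery, Ease of exploit, Awareness, Intrusion detection)
Impact = mean of the Technical factors (loss of Confidentiality, Integrity, Availability, Accountability) or, where the business context is known, mean of the Business factors (Financial damage, Reputation damage, Non-compliance, Privacy violation)
Risk = Likelihood level and Impact level (Low / Medium / High) combined in a severity matrix, not multiplied numerically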
More granular and risk-aware than CVSS; more subjective — scorer variance is higher
Well-suited to web-application assessments
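
The calculation can be sketched as follows, assuming the published OWASP Risk Rating Methodology: factors rated 0–9, likelihood taken as the mean of all eight likelihood factors, impact taken from the business factors when they are known and from the technical factors otherwise, and a Low/Medium/High matrix for the overall result. The function names and the factor values in the example are hypothetical.

```python
from statistics import mean


def level(score: float) -> str:
    """OWASP levels: 0 to <3 Low, 3 to <6 Medium, 6 to 9 High."""
    if score < 3:
        return "Low"
    if score < 6:
        return "Medium"
    return "High"


# Overall severity matrix, indexed as SEVERITY[impact level][likelihood level].
SEVERITY = {
    "High":   {"Low": "Medium", "Medium": "High",   "High": "Critical"},
    "Medium": {"Low": "Low",    "Medium": "Medium", "High": "High"},
    "Low":    {"Low": "Note",   "Medium": "Low",    "High": "Medium"},
}


def owasp_risk(threat_agent, vulnerability, technical, business=None):
    """Return (likelihood, impact, overall severity) from 0-9 factor ratings."""
    likelihood = mean(list(threat_agent.values()) + list(vulnerability.values()))
    # Prefer business impact when the scorer has that information.
    impact = mean(list((business or technical).values()))
    return likelihood, impact, SEVERITY[level(impact)][level(likelihood)]


# Hypothetical ratings for a web-application finding.
threat_agent = {"skill": 5, "motive": 4, "opportunity": 7, "size": 9}
vulnerability = {"discovery": 7, "exploit": 5, "awareness": 6, "detection": 8}
technical = {"confidentiality": 7, "integrity": 3, "availability": 1, "accountability": 7}
business = {"financial": 7, "reputation": 9, "compliance": 7, "privacy": 7}

print(owasp_risk(threat_agent, vulnerability, technical))            # (6.375, 4.5, 'High')
print(owasp_risk(threat_agent, vulnerability, technical, business))  # (6.375, 7.5, 'Critical')
```

The same finding moves from High to Critical once business impact is known, which is the kind of scorer-dependent variance noted above.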

BSI classification

Qualitative: Informational → Low → Medium → High → Critical
Used in regulated contexts in Germany and Switzerland
Mapping CVSS bands to BSI nomenclature is sometimes a contractual requirement for regulated Swiss/German clients

Principle Use the scoring framework that fits the finding type. Don't force every finding into CVSS.

Exploit-availability signals: KEV & EPSS

CISA KEV (Known Exploited Vulnerabilities) Curated catalog of CVEs with reliable evidence of exploitation in the wild. Inclusion means exploitation has been observed, not merely that an exploit exists. US federal civilian agencies must remediate each KEV entry by its assigned due date.

FIRST EPSS (Exploit Prediction Scoring System) Probability (0.0–1.0) that a CVE will see exploitation activity in the next 30 days, estimated from vulnerability characteristics, public exploit availability, and observed activity. An EPSS of 0.95 means exploitation in that window is very likely; 0.001 means it is very unlikely, not impossible.

Risk-based vulnerability management (RBVM) combines all signals:

CVSS Base (severity) + KEV listing (exploited in the wild?) + EPSS probability (likely to be exploited soon?) + asset criticality + exposure

CVSS 9.8 + KEV + EPSS 0.99 + internet-facing customer data is a different priority from CVSS 9.8 + no KEV + EPSS 0.005 + internal sandbox.
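
One way to combine these signals is a simple tiering rule. The sketch below is illustrative only: the `Finding` fields, the EPSS threshold of 0.5, the 1–3 criticality scale, and the P1–P4 tiers are all assumptions, not part of any standard.

```python
from dataclasses import dataclass

EPSS_THRESHOLD = 0.5  # hypothetical cut-off; tune to your own program


@dataclass
class Finding:
    name: str
    cvss_base: float
    in_kev: bool
    epss: float             # probability, 0.0 to 1.0
    asset_criticality: int  # hypothetical scale: 1 (low) to 3 (high)
    internet_facing: bool


def priority(f: Finding) -> str:
    """Combine severity, threat, and context signals into an illustrative tier."""
    severe = f.cvss_base >= 7.0
    threatened = f.in_kev or f.epss >= EPSS_THRESHOLD
    exposed = f.internet_facing or f.asset_criticality >= 3
    if threatened and exposed:
        return "P1" if severe else "P2"
    if severe and (threatened or exposed):
        return "P2"
    return "P3" if f.cvss_base >= 4.0 else "P4"


portal = Finding("customer portal", 9.8, in_kev=True, epss=0.99,
                 asset_criticality=3, internet_facing=True)
sandbox = Finding("internal sandbox", 9.8, in_kev=False, epss=0.005,
                  asset_criticality=1, internet_facing=False)

for f in (portal, sandbox):
    print(f.name, priority(f))
# customer portal P1
# internal sandbox P3
```

Both findings carry the same Base score; only the threat and context signals separate them.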

Environmental adjustment & chained findings

Environmental adjustment

Even when CVSS-Environmental is not formally computed, articulate the adjustment in the finding narrative:

Modified AV: published score assumes Network; this system is internal-only → adjust down
Modified PR: published score assumes None; environment requires SSO + MFA → adjust down
Security Requirements (CR / IR / AR): data on this asset is non-sensitive → adjust down; highly sensitive → adjust up
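
When a formal number is wanted, the adjustment can be computed. The sketch below assumes the CVSS v3.1 Environmental equations with all Temporal metrics left at Not Defined; Modified metrics (MAV, MPR, and so on) and Security Requirements (CR, IR, AR) are appended to the Base vector. The function names are illustrative.

```python
import math

# CVSS v3.1 weights; Modified metrics reuse the Base weights.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
PR = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
REQUIREMENT = {"X": 1.0, "L": 0.5, "M": 1.0, "H": 1.5}  # CR / IR / AR


def roundup(value: float) -> float:
    """Round up to one decimal using integer arithmetic to avoid float drift."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def environmental_score(vector: str) -> float:
    """CVSS v3.1 Environmental score, Temporal metrics left at Not Defined."""
    m = dict(part.split(":") for part in vector.split("/")[1:])

    def modified(metric: str) -> str:
        # A Modified metric that is absent or 'X' falls back to the Base value.
        value = m.get("M" + metric, "X")
        return m[metric] if value == "X" else value

    scope = modified("S")
    miss = min(
        1
        - (1 - REQUIREMENT[m.get("CR", "X")] * CIA[modified("C")])
        * (1 - REQUIREMENT[m.get("IR", "X")] * CIA[modified("I")])
        * (1 - REQUIREMENT[m.get("AR", "X")] * CIA[modified("A")]),
        0.915,
    )
    if scope == "U":
        impact = 6.42 * miss
    else:
        impact = 7.52 * (miss - 0.029) - 3.25 * (miss * 0.9731 - 0.02) ** 13
    exploitability = (
        8.22 * AV[modified("AV")] * AC[modified("AC")]
        * PR[scope][modified("PR")] * UI[modified("UI")]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


published = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
print(environmental_score(published))                   # 9.8, no adjustment
print(environmental_score(published + "/MAV:A"))        # 8.8, internal-only
print(environmental_score(published + "/MAV:A/MPR:L"))  # 8.0, plus authentication
```

Even a single modified metric moves the 9.8 by a full point, and the extended vector records exactly which local assumption caused the change.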

Chained findings

CVSS scores individual vulnerabilities and offers little support for chains. Conventional approaches:

Score independently, then describe the chain in the attack narrative noting impact exceeds the sum of parts
Score the highest-impact finding at the chain's overall impact — most common; note that the score reflects the chained scenario
Document the chain as a "scenario cluster" with its own qualitative severity — most useful for red-team deliverables
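
The third approach can be captured in a small record that keeps the individual scores and the chain's qualitative severity side by side. This is a hypothetical structure for illustration: the class names, finding identifiers, titles, and the chosen chain severity are all invented.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    ident: str
    title: str
    vector: str
    score: float


@dataclass
class ScenarioCluster:
    title: str
    findings: list
    chain_severity: str  # qualitative, assigned and justified by the tester
    narrative: str

    def summary(self) -> str:
        """One-line summary that never hides the individual scores."""
        parts = ", ".join(f"{f.ident} ({f.score})" for f in self.findings)
        return f"{self.title}: {self.chain_severity} as a chain; individually {parts}"


# Two hypothetical Medium findings that combine into a Critical path.
leak = Finding("F-01", "Verbose error discloses internal hostname",
               "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N", 5.3)
ssrf = Finding("F-02", "Authenticated SSRF in report export",
               "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:N", 5.4)

cluster = ScenarioCluster(
    title="Internal admin takeover",
    findings=[leak, ssrf],
    chain_severity="Critical",
    narrative="F-01 reveals the admin host; F-02 reaches it from the internet.",
)
print(cluster.summary())
# Internal admin takeover: Critical as a chain; individually F-01 (5.3), F-02 (5.4)
```

Keeping the per-finding vectors intact means the chain severity is an addition to the CVSS record, not a replacement for it.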

Severity inflation & deflation

Inflation Testers or scanner vendors inflate severity to make findings — or the engagement — look more impactful. A High labeled Critical erodes client trust when discovered.

Deflation by the client "That's only Medium because we have a compensating control." Sometimes legitimate; sometimes avoidance. The tester's role: articulate what compensates and how reliably — not let it silently vanish.

Defensible habit Pick the score you would defend in front of an independent reviewer, with the reasoning written down. If you cannot write the justification, reconsider the score.

Check — score this finding

Exercise

Score with CVSS v3.1 and defend each metric: unauthenticated SQL injection on a public-facing customer portal, yielding read access to a database containing personal data.

Answer

Reasonable vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N → 7.5 High.

Metric	Setting	Justification
AV	N	Endpoint is internet-facing
AC	L	No special conditions
PR	N	Unauthenticated
UI	N	No user action needed
S	U	Impact stays in the application
C	H	Full database read
I	N	Read-only
A	N	No availability impact

The score rises if integrity or availability is also affected, or if Scope is Changed (e.g., the read crosses a tenant boundary). Justify each setting explicitly in the report.
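
A report pipeline can enforce the habit of justifying every setting. The sketch below is a hypothetical check, not an existing tool: `unjustified_metrics` lists any Base metric in a v3.1 vector that has no written justification.

```python
BASE_METRICS = ("AV", "AC", "PR", "UI", "S", "C", "I", "A")


def unjustified_metrics(vector: str, justifications: dict) -> list:
    """Return the Base metrics that lack a non-empty written justification."""
    settings = dict(part.split(":") for part in vector.split("/")[1:])
    absent = [m for m in BASE_METRICS if m not in settings]
    if absent:
        raise ValueError(f"vector is missing Base metrics: {absent}")
    return [m for m in BASE_METRICS if not justifications.get(m, "").strip()]


vector = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"
notes = {
    "AV": "Endpoint is internet-facing",
    "AC": "No special conditions",
    "PR": "Unauthenticated",
    "UI": "No user action needed",
    "C": "Full database read",
    "I": "Read-only",
}
print(unjustified_metrics(vector, notes))  # ['S', 'A']
```

An empty result is a reasonable precondition for releasing the finding to the client.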

What you take home

Severity, risk, and priority are distinct — CVSS produces severity; risk and priority require additional context.
CVSS Base is one input, not the answer — Base-only scoring is the dominant cause of mis-prioritization.
Document the vector, not just the score — the vector lets anyone reproduce the reasoning.
Supplement with KEV and EPSS — active exploitation and likelihood signals change urgency dramatically.
Adjust for environment — articulate local context even in prose, not only in formal Environmental scores.
Match framework to finding type — OWASP for web-app and design findings; BSI bands for regulated contexts; qualitative for process flaws.
Defensible scoring = score + vector + written justification for non-obvious metric choices.

Next: Topic 21 — Remediation & prioritization. Scored findings become action plans.

END · TOPIC 20

CVSS Base score is not risk

Combine it with context, KEV, EPSS, and asset criticality — then you have something actionable.