Expected Shortfall for the Vasicek Distribution

Compare credit VaR and Expected Shortfall under the Vasicek portfolio loss distribution, and see why ES is monotone in ρ while VaR can be non-monotonic in ρ for low PD.

Expected Shortfall (ES) at confidence level \(1-p\), where \(p\) is the tail probability, is the average loss conditional on the loss being at or above the VaR threshold:

\[ ES^{1-p} \;=\; \mathbb{E}\!\left[L \,\middle|\, L \geq VaR^{1-p}\right] \]

For the Vasicek loss distribution, ES has a useful closed form involving the bivariate normal CDF (Gordy 2003):

\[ ES^{1-p} \;=\; \frac{1}{p}\,\Phi_{2}\!\left(\Phi^{-1}(PD),\; -\Phi^{-1}(1 - p);\; \sqrt{\rho}\right) \]

where \(\Phi_2(\cdot, \cdot; r)\) is the bivariate standard normal CDF with correlation \(r\). We compute \(\Phi_2\) numerically by 20-point Gauss–Legendre quadrature of the Drezner–Wesolowsky integral representation, the approach later refined by Genz (2004).
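A short derivation sketch shows where the bivariate normal comes from. It assumes the usual large-portfolio single-factor setup, with the common factor \(F \sim N(0,1)\) signed so that negative \(F\) means bad states. The symbols \(\varepsilon\) (an independent standard normal), \(X\), and \(\varphi\) (the standard normal density) are notation introduced here only for the argument. The conditional loss rate is decreasing in \(F\), so the tail event is a lower tail of the factor:

\[ L(F) \;=\; \Phi\!\left(\frac{\Phi^{-1}(PD) - \sqrt{\rho}\,F}{\sqrt{1-\rho}}\right), \qquad \left\{L \geq VaR^{1-p}\right\} \;=\; \left\{F \leq \Phi^{-1}(p)\right\} \]

Averaging the conditional loss over that tail gives

\[ ES^{1-p} \;=\; \frac{1}{p}\int_{-\infty}^{\Phi^{-1}(p)} \Phi\!\left(\frac{\Phi^{-1}(PD) - \sqrt{\rho}\,f}{\sqrt{1-\rho}}\right)\varphi(f)\,df \;=\; \frac{1}{p}\,\mathbb{P}\!\left(X \leq \Phi^{-1}(PD),\; F \leq \Phi^{-1}(p)\right), \qquad X = \sqrt{\rho}\,F + \sqrt{1-\rho}\,\varepsilon \]

Because \(X\) and \(F\) are standard normal with correlation \(\sqrt{\rho}\), and \(\Phi^{-1}(p) = -\Phi^{-1}(1-p)\), the probability on the right is the \(\Phi_2\) term in the closed form.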

Three reasons to care about ES in credit portfolios:

Coherent (subadditive). Under ES, the measured risk of two merged loan books cannot exceed the sum of the parts; VaR carries no such guarantee.
Tail-sensitive. Captures the magnitude of losses beyond VaR, not just where the tail begins, which matters for the heavily right-skewed Vasicek distribution.
Monotone in \(\rho\). ES rises with correlation everywhere, whereas VaR can turn down at high \(\rho\) when \(PD\) is below the tail probability \(p\).
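The subadditivity point can be made concrete with a toy example that sits outside the Vasicek model: two independent loans, each losing 100 with a hypothetical 4% probability, measured at a 95% confidence level. The sketch is plain JavaScript rather than notebook cells, and every name in it (`discreteVaR`, `discreteES`, `loanPD`, `tailProb`) is illustrative. It uses the quantile-average definition of ES, which coincides with the conditional-expectation definition for continuous distributions such as Vasicek.

```javascript
// Value-at-Risk of a discrete loss distribution:
// the smallest loss level l with P(L <= l) >= 1 - p.
const discreteVaR = (dist, p) => {
  const sorted = [...dist].sort((a, b) => a.loss - b.loss);
  let cum = 0;
  for (const { loss, prob } of sorted) {
    cum += prob;
    if (cum >= 1 - p - 1e-12) return loss; // small tolerance for rounding
  }
  return sorted[sorted.length - 1].loss;
};

// Expected Shortfall as the average loss over the worst p of probability mass.
const discreteES = (dist, p) => {
  const sorted = [...dist].sort((a, b) => b.loss - a.loss);
  let remaining = p, total = 0;
  for (const { loss, prob } of sorted) {
    const take = Math.min(prob, remaining); // mass taken from this outcome
    total += take * loss;
    remaining -= take;
    if (remaining <= 0) break;
  }
  return total / p;
};

const loanPD = 0.04;   // hypothetical default probability of each loan
const tailProb = 0.05; // p, i.e. a 95% confidence level

// One loan on its own, and the merged book of two independent loans.
const single = [
  { loss: 0, prob: 1 - loanPD },
  { loss: 100, prob: loanPD }
];
const merged = [
  { loss: 0, prob: (1 - loanPD) ** 2 },
  { loss: 100, prob: 2 * loanPD * (1 - loanPD) },
  { loss: 200, prob: loanPD ** 2 }
];

// VaR: 0 for each loan but 100 for the merged book, so 100 > 0 + 0.
console.log(discreteVaR(single, tailProb), discreteVaR(merged, tailProb));
// ES: about 80 for each loan and about 103.2 merged, below 80 + 80.
console.log(discreteES(single, tailProb), discreteES(merged, tailProb));
```

Each loan alone defaults too rarely to register at the 95% quantile, so its VaR is zero, yet the merged book has a one-or-more-defaults probability above 5% and a VaR of 100. ES sees the default losses in both cases and stays subadditive.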

// Standard normal CDF via the Abramowitz-Stegun erf approximation (absolute error about 1.5e-7)
normalCDF = x => {
  const a1 = 0.254829592, a2 = -0.284496736, a3 = 1.421413741
  const a4 = -1.453152027, a5 = 1.061405429, p = 0.3275911
  const sign = x < 0 ? -1 : 1
  const z = Math.abs(x) / Math.sqrt(2)
  const t = 1.0 / (1.0 + p * z)
  const y = 1 - (((((a5 * t + a4) * t) + a3) * t + a2) * t + a1) * t * Math.exp(-z * z)
  return 0.5 * (1 + sign * y)
}

normalPDF = x => Math.exp(-x * x / 2) / Math.sqrt(2 * Math.PI)

// Inverse standard normal CDF: Acklam's rational approximation (relative error about 1.15e-9)
qnorm = {
  const a1 = -3.969683028665376e+01, a2 =  2.209460984245205e+02
  const a3 = -2.759285104469687e+02, a4 =  1.383577518672690e+02
  const a5 = -3.066479806614716e+01, a6 =  2.506628277459239e+00
  const b1 = -5.447609879822406e+01, b2 =  1.615858368580409e+02
  const b3 = -1.556989798598866e+02, b4 =  6.680131188771972e+01
  const b5 = -1.328068155288572e+01
  const c1 = -7.784894002430293e-03, c2 = -3.223964580411365e-01
  const c3 = -2.400758277161838e+00, c4 = -2.549732539343734e+00
  const c5 =  4.374664141464968e+00, c6 =  2.938163982698783e+00
  const d1 =  7.784695709041462e-03, d2 =  3.224671290700398e-01
  const d3 =  2.445134137142996e+00, d4 =  3.754408661907416e+00
  const pLow = 0.02425, pHigh = 1 - pLow
  return p => {
    if (p <= 0) return -Infinity
    if (p >= 1) return Infinity
    if (p < pLow) {
      const q = Math.sqrt(-2 * Math.log(p))
      return (((((c1*q+c2)*q+c3)*q+c4)*q+c5)*q+c6) / ((((d1*q+d2)*q+d3)*q+d4)*q+1)
    }
    if (p <= pHigh) {
      const q = p - 0.5, r = q * q
      return (((((a1*r+a2)*r+a3)*r+a4)*r+a5)*r+a6)*q / (((((b1*r+b2)*r+b3)*r+b4)*r+b5)*r+1)
    }
    const q = Math.sqrt(-2 * Math.log(1 - p))
    return -(((((c1*q+c2)*q+c3)*q+c4)*q+c5)*q+c6) / ((((d1*q+d2)*q+d3)*q+d4)*q+1)
  }
}

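// Bivariate normal CDF via 20-point Gauss-Legendre quadrature of the
// Drezner-Wesolowsky arcsine integral (the approach refined by Genz 2004).
// Overall accuracy is capped near 1e-7 by normalCDF, and the quadrature
// can lose accuracy as |r| approaches 1.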
bvnCDF = (h, k, r) => {
  // P(X <= h, Y <= k) with corr r
  if (Math.abs(r) < 1e-12) return normalCDF(h) * normalCDF(k)
  if (Math.abs(r) > 0.999999) {
    if (r > 0) return normalCDF(Math.min(h, k))
    return Math.max(0, normalCDF(h) + normalCDF(k) - 1)
  }
  // Gauss-Legendre 20-point on [0, arcsin(r)]
  const w = [
    0.017614007139152,0.040601429800387,0.062672048334109,0.083276741576705,
    0.101930119817240,0.118194531961518,0.131688638449177,0.142096109318382,
    0.149172986472604,0.152753387130726,0.152753387130726,0.149172986472604,
    0.142096109318382,0.131688638449177,0.118194531961518,0.101930119817240,
    0.083276741576705,0.062672048334109,0.040601429800387,0.017614007139152
  ]
  const x = [
    -0.993128599185094,-0.963971927277913,-0.912234428251325,-0.839116971822218,
    -0.746331906460150,-0.636053680726515,-0.510867001950827,-0.373706088715419,
    -0.227785851141645,-0.076526521133497,0.076526521133497,0.227785851141645,
    0.373706088715419,0.510867001950827,0.636053680726515,0.746331906460150,
    0.839116971822218,0.912234428251325,0.963971927277913,0.993128599185094
  ]
  const asr = Math.asin(r) / 2
  let bvn = 0
  for (let i = 0; i < 20; i++) {
    const sn = Math.sin(asr * (1 + x[i]))
    bvn += w[i] * Math.exp((sn * h * k - (h * h + k * k) / 2) / (1 - sn * sn))
  }
  bvn *= asr / (2 * Math.PI)
  bvn += normalCDF(h) * normalCDF(k)
  return bvn
}

// VaR^{1-p} = Φ((√ρ · Φ⁻¹(1−p) + Φ⁻¹(PD)) / √(1−ρ))
vasicekVaR = (p, PD, rho) =>
  normalCDF((Math.sqrt(rho) * qnorm(1 - p) + qnorm(PD)) / Math.sqrt(1 - rho))

vasicekES = (p, PD, rho) => {
  // ES^{1-p} = (1/p) * Phi_2(Φ⁻¹(PD), -Φ⁻¹(1-p); √ρ)
  if (p <= 0) return 1
  if (rho < 1e-10) return PD
  const qPD = qnorm(PD)
  const qp = qnorm(1 - p)
  return bvnCDF(qPD, -qp, Math.sqrt(rho)) / p
}

fmt = (x, d) => Number.isFinite(x) ? x.toFixed(d) : "N/A"

pctFmt = (x, d = 2) => (x * 100).toFixed(d) + "%"

Inputs

Tip

How to experiment

Start with PD = 0.5% and the confidence level at 99%. Sweep ρ from 0 toward 1: the VaR curve in the sweep tab rises, peaks near ρ ≈ 0.8, and then falls. This is the low-PD non-monotonicity, which appears whenever PD is below the tail probability p. ES, by contrast, rises all the way. Now push PD up to 5%: both measures rise monotonically, and the gap between them (ES − VaR), which measures how much loss sits beyond the VaR threshold, first widens with ρ and then closes again at high ρ as both measures approach 100%.

viewof esPD = Inputs.range([0.001, 0.20], {
  label: "Unconditional PD",
  step: 0.001,
  value: 0.02
})

viewof esRho = Inputs.range([0.005, 0.95], {
  label: "Asset correlation ρ",
  step: 0.005,
  value: 0.15
})

viewof esAlpha = Inputs.range([0.90, 0.999], {
  label: "Confidence level 1 − p",
  step: 0.001,
  value: 0.99
})

esCalc = {
  const p = 1 - esAlpha
  const VaR = vasicekVaR(p, esPD, esRho)
  const ES = vasicekES(p, esPD, esRho)
  return { p, VaR, ES, tailGap: ES - VaR, ratio: ES / VaR }
}

{
  const r = esCalc
  return html`<div style="display:grid;grid-template-columns:repeat(auto-fit,minmax(220px,1fr));gap:12px;margin-top:8px;">
  <div style="padding:12px 16px;border-radius:8px;background:#e3f2fd;">
    <div style="font-size:0.8rem;color:#666;">${pctFmt(esAlpha, 1)} loss-rate VaR</div>
    <div style="font-size:1.4em;font-weight:700;color:#2f71d5;">${pctFmt(r.VaR, 3)}</div>
  </div>
  <div style="padding:12px 16px;border-radius:8px;background:#fbe9e7;">
    <div style="font-size:0.8rem;color:#666;">${pctFmt(esAlpha, 1)} Expected Shortfall</div>
    <div style="font-size:1.4em;font-weight:700;color:#d62728;">${pctFmt(r.ES, 3)}</div>
  </div>
  <div style="padding:12px 16px;border-radius:8px;background:#fff3e0;">
    <div style="font-size:0.8rem;color:#666;">ES − VaR</div>
    <div style="font-size:1.4em;font-weight:700;color:#e67e22;">${pctFmt(r.tailGap, 3)}</div>
    <div style="font-size:0.8rem;color:#666;">Tail magnitude beyond VaR</div>
  </div>
  <div style="padding:12px 16px;border-radius:8px;background:#ede7f6;">
    <div style="font-size:0.8rem;color:#666;">ES / VaR</div>
    <div style="font-size:1.4em;font-weight:700;color:#5e35b1;">${fmt(r.ratio, 3)}</div>
    <div style="font-size:0.8rem;color:#666;">Under normality at 99%: ≈ 1.15</div>
  </div>
</div>`
}

esSweepRho = {
  const p = 1 - esAlpha
  // Build the grid from integer indices so accumulated rounding cannot drop the 0.95 endpoint
  const rhos = Array.from({ length: 190 }, (_, i) => (i + 1) / 200)
  const rows = rhos.map(r => ({
    rho: r,
    VaR: vasicekVaR(p, esPD, r),
    ES: vasicekES(p, esPD, r)
  }))
  return rows
}

esSweepPD = {
  const p = 1 - esAlpha
  // Index-based grid from 0.1% to 20% so the curves span the full slider range
  const PDs = Array.from({ length: 200 }, (_, i) => (i + 1) / 1000)
  return PDs.map(pd => ({
    PD: pd,
    VaR: vasicekVaR(p, pd, esRho),
    ES: vasicekES(p, pd, esRho)
  }))
}

{
  return Plot.plot({
    height: 360, marginLeft: 60, marginRight: 20,
    x: { label: "Asset correlation ρ", domain: [0, 1], grid: true },
    y: { label: `${pctFmt(esAlpha, 1)} loss rate`, grid: true, tickFormat: d => (d * 100).toFixed(1) + "%" },
    marks: [
      Plot.ruleY([0]),
      Plot.line(esSweepRho, { x: "rho", y: "VaR", stroke: "#2f71d5", strokeWidth: 2.5 }),
      Plot.line(esSweepRho, { x: "rho", y: "ES", stroke: "#d62728", strokeWidth: 2.5 }),
      Plot.ruleX([esRho], { stroke: "#888", strokeDasharray: "3 3" }),
      Plot.dot([
        { x: esRho, y: esCalc.VaR, color: "#2f71d5" },
        { x: esRho, y: esCalc.ES, color: "#d62728" }
      ], { x: "x", y: "y", fill: "color", r: 5, stroke: "white", strokeWidth: 2 }),
      Plot.text([
        { x: 0.95, y: esSweepRho[esSweepRho.length - 1].VaR, text: "VaR", fill: "#2f71d5" },
        { x: 0.95, y: esSweepRho[esSweepRho.length - 1].ES, text: "ES", fill: "#d62728" }
      ], { x: "x", y: "y", text: "text", fill: "fill", fontWeight: 700, textAnchor: "end", dy: -8 })
    ]
  })
}

html`<p style="color:#666;font-size:0.85rem;">VaR and ES as functions of ρ at fixed PD = ${pctFmt(esPD, 2)}. When PD is below the tail probability p, the VaR curve bends back and decreases at high ρ: the loss distribution turns bimodal, and the "all-default" mass is too small to reach the quantile of interest, so the quantile falls back toward the "no-default" mode. ES rises monotonically because it averages the whole tail instead of reading a single quantile.</p>`
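The turning point of the VaR curve can be located in closed form. The sketch below uses the shorthand \(a = \Phi^{-1}(1-p)\) and \(c = \Phi^{-1}(PD)\), introduced here for brevity, and assumes \(p < 0.5\) so that \(a > 0\):

\[ VaR^{1-p}(\rho) \;=\; \Phi\big(g(\rho)\big), \qquad g(\rho) \;=\; \frac{c + a\sqrt{\rho}}{\sqrt{1-\rho}}, \qquad g'(\rho) \;=\; \frac{a/\sqrt{\rho} + c}{2\,(1-\rho)^{3/2}} \]

If \(PD \geq p\), then \(c \geq -a\) and the numerator \(a/\sqrt{\rho} + c\) stays positive on \((0,1)\), so VaR is monotone in \(\rho\). If \(PD < p\), the numerator changes sign and VaR peaks at

\[ \rho^{*} \;=\; \left(\frac{\Phi^{-1}(1-p)}{\Phi^{-1}(1-PD)}\right)^{2} \]

For PD = 0.5% at the 99% level this gives \(\rho^{*} \approx (2.326 / 2.576)^2 \approx 0.82\).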

{
  return Plot.plot({
    height: 360, marginLeft: 60, marginRight: 20,
    x: { label: "Unconditional PD", grid: true, tickFormat: d => (d * 100).toFixed(1) + "%" },
    y: { label: `${pctFmt(esAlpha, 1)} loss rate`, grid: true, tickFormat: d => (d * 100).toFixed(1) + "%" },
    marks: [
      Plot.ruleY([0]),
      Plot.line(esSweepPD, { x: "PD", y: "VaR", stroke: "#2f71d5", strokeWidth: 2.5 }),
      Plot.line(esSweepPD, { x: "PD", y: "ES", stroke: "#d62728", strokeWidth: 2.5 }),
      Plot.ruleX([esPD], { stroke: "#888", strokeDasharray: "3 3" }),
      Plot.dot([
        { x: esPD, y: esCalc.VaR, color: "#2f71d5" },
        { x: esPD, y: esCalc.ES, color: "#d62728" }
      ], { x: "x", y: "y", fill: "color", r: 5, stroke: "white", strokeWidth: 2 })
    ]
  })
}

html`<p style="color:#666;font-size:0.85rem;">VaR and ES as functions of PD at fixed ρ = ${fmt(esRho, 2)}. Both rise monotonically with PD. Relative to VaR, the gap narrows as PD grows (ES/VaR falls toward 1) because the tail becomes less skewed relative to the mean; the absolute gap ES − VaR need not narrow and can widen with PD at moderate ρ.</p>`

esGrid = {
  const p = 1 - esAlpha
  const dPD = 0.005, dRho = 0.025
  const PDs = []
  for (let pd = 0.005; pd <= 0.20 + 1e-9; pd += dPD) PDs.push(pd)
  const rhos = []
  for (let r = 0.025; r <= 0.95 + 1e-9; r += dRho) rhos.push(r)
  const rows = []
  for (const pd of PDs) for (const r of rhos) {
    rows.push({
      PD: pd, rho: r,
      PD1: pd - dPD / 2, PD2: pd + dPD / 2,
      rho1: r - dRho / 2, rho2: r + dRho / 2,
      gap: vasicekES(p, pd, r) - vasicekVaR(p, pd, r)
    })
  }
  return rows
}

{
  return Plot.plot({
    height: 380, marginLeft: 60, marginRight: 20,
    x: {
      label: "PD",
      grid: true,
      domain: [0, 0.205],
      tickFormat: d => (d * 100).toFixed(1) + "%",
      ticks: [0.01, 0.025, 0.05, 0.075, 0.10, 0.125, 0.15, 0.175, 0.20]
    },
    y: { label: "ρ", grid: true, domain: [0, 1], ticks: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0] },
    color: { legend: true, label: "ES − VaR", scheme: "oranges" },
    marks: [
      Plot.rect(esGrid, {
        x1: "PD1", x2: "PD2", y1: "rho1", y2: "rho2",
        fill: "gap", inset: 0
      }),
      Plot.dot([{ x: esPD, y: esRho }], {
        x: "x", y: "y", r: 7, fill: "none", stroke: "black", strokeWidth: 2
      })
    ]
  })
}

html`<p style="color:#666;font-size:0.85rem;">Heat map of the tail magnitude ES − VaR across PD and ρ, with the current selection circled in black. At the default 99% confidence level the gap is largest toward the low-PD / high-ρ corner; see the note below for why.</p>`

Note

Why is the gap larger when PD is lower?

The Vasicek distribution becomes more right-skewed as PD shrinks at any given ρ:

Low PD means most of the probability mass sits near 0 (good states, where the common factor \(F\) is at or above its mean and almost no loans default). The tail consists of rare scenarios in which \(F\) is deeply negative and almost everyone defaults at once. The loss distribution is almost bimodal: mass near 0 plus a long, thin spike toward 1.
The 99% VaR marks just the start of that tail. Because the tail is sparse, VaR can land at a relatively modest loss level.
ES averages the entire tail beyond VaR, including those near-total-default scenarios. With low PD and high ρ, those extreme scenarios sit far above where VaR cuts in, so the average of the tail is much larger than its threshold. Hence ES − VaR widens.

As PD rises, the conditional default probability is already non-trivial in the bulk; the tail is not pulled as far above the body of the distribution, so VaR catches up to ES and the gap shrinks relative to VaR. Algebraically: as PD → 0, \(\Phi^{-1}(PD) \to -\infty\), and the integral defining ES is dominated by scenarios far beyond the VaR threshold, so the ratio ES/VaR diverges. As PD grows toward 1, both VaR and ES are squeezed against the upper bound of 1 and ES/VaR → 1.

Stylized takeaway. Investment-grade portfolios (low PD) with realistic correlations are exactly the regime where reading just VaR can badly understate tail exposure — which is why ES is favoured for credit and was adopted by Basel III FRTB for market risk.
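The PD dependence described in this note can be checked without the bivariate normal closed form, by averaging the conditional loss rate directly over the factor tail. The sketch below is plain JavaScript rather than notebook cells, and every name in it (`esVarGapTable`, `cdf`, `inv`, `fStar`) is illustrative. It assumes a 99% confidence level by default, a midpoint rule truncated 8 units below the tail cutoff, and a slow bisection inverse in place of the notebook's `qnorm`.

```javascript
// Tabulate VaR, ES, the gap ES - VaR and the ratio ES / VaR for several PDs
// at a fixed asset correlation rho and tail probability p.
const esVarGapTable = (rho, PDs, p = 0.01, n = 4000) => {
  // Standard normal CDF: Abramowitz-Stegun erf approximation (abs. error about 1e-7).
  const cdf = x => {
    const z = Math.abs(x) / Math.SQRT2;
    const t = 1 / (1 + 0.3275911 * z);
    const poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
      - 0.284496736) * t + 0.254829592) * t;
    return 0.5 * (1 + Math.sign(x) * (1 - poly * Math.exp(-z * z)));
  };
  // Inverse CDF by bisection on cdf: slow, but short and dependency-free.
  const inv = u => {
    let lo = -10, hi = 10;
    for (let i = 0; i < 80; i++) {
      const mid = (lo + hi) / 2;
      if (cdf(mid) < u) lo = mid; else hi = mid;
    }
    return (lo + hi) / 2;
  };
  const fStar = inv(p); // factor level at which the loss rate equals VaR
  const slope = Math.sqrt(rho), scale = Math.sqrt(1 - rho);
  return PDs.map(PD => {
    const c = inv(PD);
    const loss = f => cdf((c - slope * f) / scale); // loss rate given F = f
    // Midpoint rule over the tail F <= fStar, truncated 8 units below fStar.
    const h = 8 / n;
    let num = 0, den = 0;
    for (let i = 0; i < n; i++) {
      const f = fStar - (i + 0.5) * h;
      const w = Math.exp(-f * f / 2); // unnormalised density; the constant cancels
      num += w * loss(f);
      den += w;
    }
    const VaR = loss(fStar), ES = num / den;
    return { PD, VaR, ES, gap: ES - VaR, ratio: ES / VaR };
  });
};

const moderate = esVarGapTable(0.15, [0.001, 0.02, 0.2]);
const high = esVarGapTable(0.9, [0.01, 0.2]);
console.table(moderate);
console.table(high);
```

Under these assumptions the ρ = 0.15 rows should show ES/VaR falling as PD grows even though the absolute gap ES − VaR increases, while at ρ = 0.9 the absolute gap should be far larger at PD = 1% than at PD = 20%. That is consistent with reading the low-PD widening of the absolute gap as a high-ρ effect and the relative widening as a general one.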

Note

Regulatory trend. Basel III’s Fundamental Review of the Trading Book (FRTB) replaced 99% 10-day VaR with 97.5% 10-day ES for market risk capital. Credit risk in the banking book (IRB) is still anchored to a 99.9% Vasicek VaR, but economic-capital and stress-testing practice increasingly uses ES for credit portfolios precisely because it is tail-aware and coherent.

References

Gordy, Michael B. 2003. “A Risk-Factor Model Foundation for Ratings-Based Bank Capital Rules.” Journal of Financial Intermediation 12 (3): 199–232.