TEA Handbook

Concept

Benchmarking against incumbents

analytical, structural

Overview

Benchmarking compares a finished analysis’s headline outputs (levelized cost, carbon intensity, energy intensity, capex per tonne) against the known values of well-characterized incumbent or competitor processes, both to test whether the result is plausible and to see how the analyzed process stacks up. It is an output check on a built model, distinct from borrowing an incumbent’s data as an input anchor.

Body

What it is. Benchmarking takes the model’s key computed outputs and sets them beside the same metrics for real, well-characterized incumbents — the dominant existing route, a best-in-class plant, the market leader. It answers two questions at once: plausibility (does my number land where reality does?) and competitiveness (does the analyzed process beat, match, or lose to the incumbent on the metric that decides the case?).

Input anchor vs. output check. This is the distinction that separates benchmarking from a reference / comparable process. A comparable seeds the model — its measured data is borrowed as the starting figure a cost or performance estimate is built from. Benchmarking tests the finished model — placing its computed output against the incumbent’s. The same incumbent plant can serve both roles, but seeding a model from a datum and checking the model against it are different uses, and using one datum for both at once is circular (see Limits).

What gets benchmarked. The handful of headline metrics the analysis exists to produce: levelized cost in $/t, carbon intensity in CO₂e/t, energy intensity, capex per tonne of capacity, sometimes yield. Each is compared metric by metric against the incumbent’s value for the same quantity.

Like-for-like is the requirement. A benchmark means something only when both numbers share a basis: the same boundary, cost year, capacity definition, and scope. A cradle-to-gate cost set against a published gate-to-gate one, or a current estimate set against a decade-old incumbent figure, measures the basis difference rather than the process difference, just as two levelized costs computed on mismatched bases cannot be compared.
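The like-for-like requirement can be sketched as a basis check that runs before any numbers are compared. This is a minimal illustration, not a prescribed method: the `Basis` record, its field names, and the example labels are all hypothetical, chosen to mirror the four basis elements named above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Basis:
    """Hypothetical record of the basis a headline number was computed on."""
    boundary: str             # e.g. "cradle-to-gate" or "gate-to-gate"
    cost_year: int            # year the money values are expressed in
    capacity_definition: str  # e.g. "nameplate"
    scope: str                # e.g. "production cost"


def basis_mismatches(model: Basis, incumbent: Basis) -> list[str]:
    """Return the names of the basis elements on which the two numbers differ."""
    return [
        f.name
        for f in fields(Basis)
        if getattr(model, f.name) != getattr(incumbent, f.name)
    ]


# Illustrative labels only, not data from any real plant.
model_basis = Basis("cradle-to-gate", 2024, "nameplate", "production cost")
incumbent_basis = Basis("gate-to-gate", 2014, "nameplate", "production cost")

mismatches = basis_mismatches(model_basis, incumbent_basis)
```

An empty list means the two figures can be set side by side. A non-empty list names what must be reconciled first; here the boundary and the cost year differ, so any gap between the two numbers would partly reflect the basis rather than the process.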

One incumbent vs. a class. Benchmarking against a single plant gives one comparison point. Benchmarking against the spread of comparable plants is stronger, because incumbents of the “same” type vary widely: this is the reference-class view, which shows where the analyzed process sits in a distribution rather than against one exemplar.
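One way to express the reference-class view is to locate the model output inside the spread of incumbent values instead of against a single plant. The sketch below is an assumption about how that might be done; the function name and the class values are placeholders, not data from any real plants.

```python
from statistics import median


def position_in_class(model_value, class_values):
    """Summarize where a model output sits within a class of incumbent values."""
    if not class_values:
        raise ValueError("class_values must contain at least one incumbent")
    below = sum(1 for v in class_values if v < model_value)
    return {
        "min": min(class_values),
        "median": median(class_values),
        "max": max(class_values),
        # Fraction of incumbents with a strictly lower value than the model.
        "share_of_class_below": below / len(class_values),
    }


# Placeholder numbers for illustration only.
incumbent_costs = [200.0, 250.0, 300.0, 350.0, 400.0]
summary = position_in_class(320.0, incumbent_costs)
```

With these placeholder values the model output lies above three of the five incumbents and slightly above the class median, a statement that a single-plant comparison could not support.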

Two jobs at once. Benchmarking is both the anchor-comparison form of an order-of-magnitude gut check — does the output sit where real processes sit? — and a headline result in its own right, since “process X is competitive with the incumbent on $/t” is often the conclusion the whole analysis was built to support.

Limits & typical error

Mini-example

This example benchmarks green ammonia’s levelized cost ($800/t, from the running example) against the incumbent gray route. On a comparable production-cost basis, conventional gray ammonia is on the order of $200–400/t, gas-price-dependent (a round market band, not a sourced figure). Across the full band green sits at 2–4× the incumbent cost, and at roughly 2–3× at typical gas prices. That is both a plausibility pass (a ~2–3× green premium matches the widely reported gap, so ~$800/t is believable) and a competitiveness result (green is not yet cost-competitive on unsubsidized $/t; that gap is the space policy credits must close). On the other metric that can decide the case, though, the benchmark flips: green’s carbon intensity (~0 on renewables) decisively beats gray’s (~1.6–2.4 t CO₂e/t), so which process “wins” depends entirely on whether cost or carbon governs.
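The arithmetic behind this comparison can be written out as a short sketch. It uses only the figures quoted in the example; the helper names are hypothetical. Dividing $800/t by the two ends of the gray band gives ratios of 2× and 4×, so the roughly 2–3× figure corresponds to gray costs of about $270–400/t.

```python
# Figures quoted in the mini-example; the gray cost band is a round market
# band, not a sourced figure.
GREEN_COST = 800.0               # $/t
GRAY_COST_BAND = (200.0, 400.0)  # $/t, gas-price-dependent
GREEN_CI = 0.0                   # t CO2e/t, "~0 on renewables"
GRAY_CI_BAND = (1.6, 2.4)        # t CO2e/t


def ratio_band(value, incumbent_band):
    """Return (lowest, highest) ratio of value to an incumbent band."""
    low, high = incumbent_band
    if low <= 0 or high < low:
        raise ValueError("incumbent band must be positive and ordered")
    return value / high, value / low


def verdict(value, incumbent_band):
    """Compare against the band; lower is better for cost and carbon alike."""
    low, high = incumbent_band
    if value < low:
        return "beats"
    if value > high:
        return "loses to"
    return "within"


cost_ratios = ratio_band(GREEN_COST, GRAY_COST_BAND)  # 800/400 and 800/200
cost_verdict = verdict(GREEN_COST, GRAY_COST_BAND)
carbon_verdict = verdict(GREEN_CI, GRAY_CI_BAND)
```

The two verdicts point in opposite directions: green loses to the gray band on cost and beats it on carbon intensity, which is why the outcome depends on the governing metric.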

Separately, to show circularity: if the green plant’s capex had been anchored to a specific published gray-ammonia plant cost as its input comparable, then “benchmarking” green’s capex per tonne against that same plant only re-confirms the anchor — the model was built from it, so it cannot disagree. The benchmark must come from outside the model’s own inputs to test anything.
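A simple guard against this circularity is to track which sources seeded the model and to exclude them from the benchmark set. The sketch below assumes each datum carries a source label; the function name, the labels, and the capex values are hypothetical placeholders.

```python
def independent_benchmarks(
    input_sources: set[str],
    candidate_benchmarks: dict[str, float],
) -> dict[str, float]:
    """Keep only benchmark data whose source was not used as a model input."""
    return {
        source: value
        for source, value in candidate_benchmarks.items()
        if source not in input_sources
    }


# Placeholder labels and values for illustration only.
model_inputs = {"gray_plant_A"}  # plant used as the capex anchor
candidates = {"gray_plant_A": 1000.0, "gray_plant_B": 1200.0}

usable = independent_benchmarks(model_inputs, candidates)
```

Only `gray_plant_B` survives the filter, because the anchor plant can do no more than re-confirm the value the model was built from.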

And to show basis mismatch: setting green’s cradle-inclusive cost beside a gray gate-to-gate cost that excludes upstream gas production compares two different boundaries — the apparent gap is partly the scope difference, not the process difference, until both are put on one basis.

See also