Guide · Layer 4
Written for: seed / Series A scientist-founder
The first three layers built an engine: Layer 1 fixed the shape, Layer 2 put physical numbers on the flows, Layer 3 turned them into one headline figure — the levelized cost, ~$800/t for green ammonia. But a single number is not a decision. A point estimate with a ±30% band around it says nothing on its own about how much to trust it, what would change it, or what to go do next. The analytical layer is where the deterministic engine becomes a decision tool.
It does that by asking four questions of the number the economics produced: Is it even right? What moves it? How robust is it? And what should we do about it? The first three layers were about building the model correctly; this one is about interrogating it, and it’s where a TEA stops being a spreadsheet and starts being the thing it was built for. This is the layer that connects the economics back to the science: by finding which assumptions actually decide the answer, it tells you which experiment, which measurement, which R&D bet moves the commercial case. That feedback, from the cost model to the lab bench, is the whole point of doing a TEA this early.
The defining discipline is again the maturity anchor’s: at this stage the analysis stays deterministic. You explore the uncertainty by hand, deliberately, rather than by sampling distributions: move inputs one at a time, then bundle a few of them into coherent scenarios. The skill is not the arithmetic; it’s choosing honest ranges and plausible bundles, and then reading the result for the handful of drivers that matter out of the dozens of inputs that don’t.
This layer runs four moves — gut-check the number → sensitize to find the drivers → bundle into scenarios → feed the drivers back to the science. Each routes into the concept pages for the how; the reads below are about where to spend your effort.
Before any sensitivity, confirm the headline is even plausible — it’s the cheapest test in the whole TEA. Two independent checks: an order-of-magnitude gut-check — recompute the number roughly by a different route and see if it lands in the same place — and benchmarking against incumbents — set your output beside the same metric for real, characterized plants, both to test plausibility and to see how you stack up.
🧭 Coach’s Read. Don’t run a sensitivity on a number you haven’t gut-checked — a tornado chart on a 10×-wrong figure is just precise nonsense, and the most common source of a 10× error is a unit slip the model will never flag itself. Spend the one minute: rebuild the headline from a different direction (for an electrified route, “power cost × ~2” gets you most of the way) and make sure it agrees to within a factor of ~2. The discipline that makes it work is independence — recompute by a genuinely different path, not the same spreadsheet with one cell changed, or you’ll just reproduce your own error and feel reassured. On the benchmark, two traps: compare strictly like-for-like (same boundary, cost year, capacity definition — a cradle-to-gate cost beside a gate-to-gate one measures the basis, not the process), and don’t benchmark against your own input anchor — if the incumbent figure is the same one you seeded the model from, the check is circular and tests nothing. Used well, the benchmark frames the decision: it tells you which metric the case turns on before you’ve spent an hour on the wrong one.
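As a minimal sketch of that one-minute check, the snippet below rebuilds a headline by the "power cost × ~2" route and tests whether it agrees with the model to within a factor of ~2. The power price and energy intensity are hypothetical round numbers chosen for illustration, not values from the reference sheet, and `gut_check` is a made-up helper name.

```python
def gut_check(model_value, independent_estimate, tolerance_factor=2.0):
    """Return True if two estimates agree within a multiplicative factor."""
    high = max(model_value, independent_estimate)
    low = min(model_value, independent_estimate)
    return high / low <= tolerance_factor

# Hypothetical round inputs, for illustration only.
power_price_usd_per_mwh = 40.0     # assumed
energy_intensity_mwh_per_t = 10.0  # assumed

# Independent route: power cost per tonne, then the "x ~2" rule of thumb.
power_cost_per_t = power_price_usd_per_mwh * energy_intensity_mwh_per_t
rough_estimate = 2.0 * power_cost_per_t

model_headline = 800.0  # the guide's illustrative headline, $/t

print(gut_check(model_headline, rough_estimate))        # headline as modeled
print(gut_check(10 * model_headline, rough_estimate))   # same headline with a 10x unit slip
```

The check only means something if `rough_estimate` comes from a path that does not share cells with the model; here it is typed in by hand for exactly that reason.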
Now find what moves the answer. Run a one-way sensitivity on each input — flex it across its credible range, hold everything else fixed, record the output swing — then rank the swings in a tornado chart. The longest bars are your drivers: the few inputs the whole answer rides on.
🧭 Coach’s Read. The analysis is the range, not the wiggle. What makes an input a driver is leverage = how steeply the output responds × how far the input can plausibly move — so a blanket ±10% on every input is the signature beginner mistake: it ranks by raw sensitivity and buries the real drivers under well-known inputs that can’t actually move. Draw each input’s range from its own uncertainty — a price’s market history, a proxy’s error band, a comparable’s spread — and let the genuinely volatile inputs show genuinely long bars. The payoff is concentration: drivers cluster, and you’ll almost always find 2–3 inputs carry the answer (for green ammonia, the power price, the capacity factor, and the capital) while the rest is noise you can stop polishing. That’s exactly the handoff Layer 3 promised — the input price and the capacity factor that layer flagged usually top the chart. Two reading disciplines: a tornado is a display of one-way runs, so don’t add the bars (each one holds the others fixed) and don’t read it as a probability — it shows leverage, not likelihood. One diagnostic to run as you sweep: if you flex an input across its whole range and the output doesn’t move at all, don’t file it as a non-driver — first suspect a bug. A genuinely dead sensitivity usually means the cost isn’t actually wired through to the answer (a broken reference, a number hand-entered over the formula that should compute it), not that the input doesn’t matter. And aim the sweep at the right variable: the figure that decides a case is often not your absolute metric but the delta versus the reference case — and the two can move together. If a power-price spike lifts both your green route and the gray incumbent, the gap barely changes, and the gap is what you’re actually selling. Sensitize the difference, not just the level.
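The one-way sweep and ranking can be sketched in a few lines. Everything here is an assumption made for illustration: the toy `cost_per_tonne` model, the base values, and the ranges are round placeholders, not the guide's actual build. The sketch also applies the dead-sensitivity diagnostic by listing any input whose swing is exactly zero.

```python
# Toy cost model, NOT the guide's actual build. All numbers are illustrative.
ENERGY_INTENSITY_MWH_PER_T = 10.0  # assumed electricity use per tonne

def cost_per_tonne(power_price, capacity_factor, capex_charge, other_opex):
    """Levelized cost in $/t for a hypothetical electrified plant."""
    energy = power_price * ENERGY_INTENSITY_MWH_PER_T
    capital = capex_charge / capacity_factor  # fixed charge spread over actual output
    return energy + capital + other_opex

BASE = {"power_price": 40.0, "capacity_factor": 0.5,
        "capex_charge": 150.0, "other_opex": 100.0}

# Each range reflects that input's own (assumed) uncertainty, not a blanket +/-10%.
RANGES = {
    "power_price": (25.0, 60.0),
    "capacity_factor": (0.35, 0.70),
    "capex_charge": (110.0, 200.0),
    "other_opex": (90.0, 110.0),
}

def one_way_swings(model, base, ranges):
    """Flex one input at a time across its range, holding the rest at base."""
    swings = {}
    for name, (low, high) in ranges.items():
        at_low = model(**{**base, name: low})
        at_high = model(**{**base, name: high})
        swings[name] = abs(at_high - at_low)
    return swings

swings = one_way_swings(cost_per_tonne, BASE, RANGES)

# Tornado order: longest bar first.
ranked = sorted(swings.items(), key=lambda item: item[1], reverse=True)

# A swing of exactly zero is more likely broken wiring than a true non-driver.
dead = [name for name, swing in swings.items() if swing == 0.0]

for name, swing in ranked:
    print(f"{name:16s} {swing:7.1f} $/t")
print("suspect wiring:", dead)
```

To sensitize the delta rather than the level, one option is to pass `one_way_swings` a model that returns your cost minus the reference route's cost under the same inputs.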
One-at-a-time analysis misses combinations — the real downside is usually several inputs going wrong together. A scenario captures that: a coherent bundle of inputs moved at once to describe one internally consistent world. Two kinds matter here — the route scenarios (gray / blue / green, which differ in structure) and the condition scenarios (optimistic / base / pessimistic, the same plant under bundled prices, performance, and financing).
🧭 Coach’s Read. Coherence is the entire discipline. Build your downside by moving the drivers together in a way that could actually happen — a stressed grid plausibly hands you a low capacity factor and a high power price at the same time — not by stacking every input’s worst value independently. That independent-worst stack is the most common scenario error: it manufactures a world that can’t exist and a scare number nobody believes, because the conditions that hurt one input often help a correlated one. Keep the set small — three cases (low / base / high) communicate; thirty don’t — and remember scenarios carry no probabilities: the base case is not the “expected” value and the downside is not a P90, so don’t let a reader treat the spread as a confidence interval. Use the gray/blue/green routes as your headline scenario set, but only compare them on one shared basis — same boundary, same carbon-intensity accounting — or the route comparison is apples-to-oranges. One more habit, less about the math than about how you carry the results: it’s worth holding two ranges and not letting them blur. There’s the conservative “low-low” case you plan and manage the company against internally — the one that keeps you honest about runway if things break your way less often than you’d hoped — and there’s the legitimate upside case, the real prize, that you put in front of investors and partners. That’s not cherry-picking or double bookkeeping; it’s the difference between how you run the business and what opportunity you’re inviting someone to back. The discipline is just staying clear which is which — present the upside as upside, never quietly as the expected case (the honest-headline rules in Layer 5 still hold).
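A short sketch can make the difference between a coherent downside and an independent-worst stack concrete. The toy model and every value below are illustrative assumptions, not figures from the concept pages; the point is only the structure: each scenario is a named bundle of inputs moved together.

```python
ENERGY_INTENSITY_MWH_PER_T = 10.0  # assumed electricity use per tonne

def cost_per_tonne(power_price, capacity_factor, capex_charge, other_opex):
    """Toy levelized cost in $/t; a stand-in for the real model."""
    energy = power_price * ENERGY_INTENSITY_MWH_PER_T
    capital = capex_charge / capacity_factor
    return energy + capital + other_opex

BASE = {"power_price": 40.0, "capacity_factor": 0.5,
        "capex_charge": 150.0, "other_opex": 100.0}

# Coherent bundles: only the drivers that plausibly move together are changed.
SCENARIOS = {
    "optimistic": {**BASE, "power_price": 30.0, "capacity_factor": 0.60,
                   "capex_charge": 130.0},
    "base": dict(BASE),
    # Stressed grid: high power price AND low capacity factor at the same time.
    "pessimistic": {**BASE, "power_price": 55.0, "capacity_factor": 0.40},
}

results = {name: cost_per_tonne(**inputs) for name, inputs in SCENARIOS.items()}

# The error to avoid: every input at its own worst value, independently.
worst_stack = cost_per_tonne(power_price=60.0, capacity_factor=0.35,
                             capex_charge=200.0, other_opex=110.0)

for name, cost in results.items():
    print(f"{name:12s} {cost:7.0f} $/t")
print(f"{'worst stack':12s} {worst_stack:7.0f} $/t  (not a coherent world)")
```

None of these outputs carries a probability; the three cases are points, not a confidence interval.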
This is the payoff of the layer, and of the whole TEA. The drivers tell you where the economics are actually decided — and that points straight back at the science: the next experiment to run, the assumption to pin down, the parameter worth improving. The analysis you just did is the bridge between “what does it cost?” and “what should we go do in the lab?”
🧭 Coach’s Read. Sort your drivers on a second axis the tornado doesn’t show: controllability. Leverage tells you what moves the answer; controllability tells you what you can do about it — and the cross of the two is your action list. A high-leverage input you can’t control (the grid power price) is a risk to hedge or a market bet to state out loud — flag it and move on, because no experiment changes it. A high-leverage input you can control (an electrolyzer’s energy intensity, a reactor’s conversion, a catalyst cost) is your next experiment — that’s where R&D effort buys the most economic improvement, and the TEA just told you so before you spent a year on it. A high-leverage input you can’t yet pin down (a proxy with a wide band) is your next sourcing task, not a lab task. And the inverse rule matters just as much: don’t refine a non-driver, however shaky it feels — sharpening a parameter the answer is insensitive to improves your confidence, not your decision. Two framings worth keeping in hand here. First, your biggest lever should be one you control, not one the market hands you — if the longest bar is the grid price, your headline is really a bet on a market, and the honest move is to say so out loud and to go hunting for a controllable lever you can put R&D behind. Second, when a team asks “what should we target — is 95% good enough, or do we need 98%?”, let the model answer it. Push the parameter to each value and read what it does to the headline; the TEA is precisely the instrument for deciding whether the juice is worth the squeeze, and that’s usually better settled with a sensitivity than by gut. Run this loop and the TEA stops being a report and becomes a research roadmap.
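Both ideas in this read can be sketched in code: the leverage × controllability sort, and letting the model answer a "95% or 98%?" target question. The function names, the inverse-yield toy model, and the 760 $/t figure are all hypothetical, and checking "pinned down" before "controllable" is one reasonable reading of the text rather than a stated rule.

```python
def next_action(high_leverage, controllable, pinned_down):
    """Map one input's position on the two axes to what to do about it."""
    if not high_leverage:
        return "leave alone"             # refining a non-driver changes no decision
    if not pinned_down:
        return "sourcing task"           # narrow the range before anything else
    if controllable:
        return "next experiment"         # R&D effort buys the most here
    return "hedge or state the bet"      # for example, the grid power price

def headline_cost(yield_fraction, cost_at_full_yield=760.0):
    """Toy model: $/t of product scales inversely with a yield-like parameter."""
    return cost_at_full_yield / yield_fraction

# "Is 95% good enough, or do we need 98%?" Push the parameter and read the headline.
targets = (0.95, 0.98)
costs = {target: headline_cost(target) for target in targets}
gain = costs[0.95] - costs[0.98]

for target, cost in costs.items():
    print(f"target {target:.0%}: {cost:6.1f} $/t")
print(f"moving from 95% to 98% is worth about {gain:.1f} $/t in this toy model")
```

Whether that gain justifies the R&D effort is then a judgment you make against the size of the other bars, not against intuition.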
🪜 Leveling Up: Monte Carlo and probabilistic analysis. The hand analysis above explores uncertainty along spokes (one-way) and at a few coherent points (scenarios), deterministically. The deeper method is probabilistic: assign every input a distribution, then Monte Carlo-sample all of them at once to get a full probability distribution of the output (a P10/P50/P90 on your cost per tonne), capturing the interactions that one-way sweeps and a handful of discrete scenarios both miss. Climb to it when a counterparty’s diligence demands a quantified risk view, when input interactions genuinely dominate the answer, or when you’ve moved past go/no-go into formal risk analysis. But know what it asks of you: every input now needs a defended distribution, not just a range, and a Monte Carlo output looks rigorous whether or not the distributions behind it are real, which is where its false-precision risk lives. A P50 to four significant figures built on a dozen guessed distributions tells you less than a deterministic ~$800/t ±30% with three named drivers. Used well, with distributions you can genuinely source, it’s a real step up; the only caution is against reaching for it before you can feed it honestly. At the maturity anchor, deterministic sensitivities plus a few coherent scenarios are deliberately the right tool: they find the drivers and test the robustness without claiming a probability you can’t yet source.
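For orientation only, here is roughly what the probabilistic step looks like in code, using Python's standard library. The toy model and the triangular distributions are guessed placeholders, which is exactly the situation the caution above describes. The inputs are also sampled independently, a simplification: a real build would need to model the correlation between, say, power price and capacity factor.

```python
import random
import statistics

ENERGY_INTENSITY_MWH_PER_T = 10.0  # assumed electricity use per tonne

def cost_per_tonne(power_price, capacity_factor, capex_charge, other_opex):
    """Toy levelized cost in $/t; a stand-in for the real model."""
    energy = power_price * ENERGY_INTENSITY_MWH_PER_T
    capital = capex_charge / capacity_factor
    return energy + capital + other_opex

# (low, mode, high) for a triangular distribution per input. All guessed.
DISTRIBUTIONS = {
    "power_price": (25.0, 40.0, 60.0),
    "capacity_factor": (0.35, 0.50, 0.70),
    "capex_charge": (110.0, 150.0, 200.0),
    "other_opex": (90.0, 100.0, 110.0),
}

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_once():
    """Draw every input once (independently) and evaluate the model."""
    draw = {name: rng.triangular(low, high, mode)
            for name, (low, mode, high) in DISTRIBUTIONS.items()}
    return cost_per_tonne(**draw)

samples = [sample_once() for _ in range(10_000)]

# quantiles(n=10) returns the nine decile cut points; take the 1st, 5th, 9th.
deciles = statistics.quantiles(samples, n=10)
p10, p50, p90 = deciles[0], deciles[4], deciles[8]

print(f"P10 {p10:.0f}  P50 {p50:.0f}  P90 {p90:.0f}  ($/t)")
```

The output is deliberately rounded to whole dollars; quoting more digits would claim a precision the guessed distributions cannot support.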
The four moves, run once on green ammonia — every figure a round, illustrative anchor carried from the concept pages and the reference sheet, where the full build and its open validation items live:
The pattern, same as every layer: the magnitudes and mechanics stayed on the concept pages; this layer sequenced them — validate, then sensitize, then bundle, then act — and turned a cost number into a list of what to go do. A late-stage guide could route the same six pages into a probabilistic risk model instead; that swappability is the point.
Once you know what moves the answer and how robust it is, the last layer makes it travel: Layer 5 — Communication Layer, where the headline number and its few drivers go into a deck and in front of investors — without giving away the trade secret that produced them.