MTL

The Evaluation

The evaluation, plainly

Every conversation about a new tool eventually comes down to one question: what do I get, and what does it cost me? Here is the whole answer, up front.

What you get

An independent, structured evaluation of a throughput technology — run against a benchmark built from your own release history — that ends in a quantified ROI estimate and a clear decision. The benchmark, the gap analysis, and the ROI model stay with your team no matter what you decide.

What it costs

No cash. A bounded number of hours from your team, gate by gate, with a stop button at every one. The technology work in the proof-of-value stage is paid for through the program, not by you.

MTL is paid by the program, not by the performers. Our job is a fair read, not a sale.

How it works

Four gates, each with a known cost and a stop button

Gate A: Value Scope

~2 hours

We start by understanding you, not the technology: your needs and current context, and a deep dive into your preferences and constraints. It runs as a decision-elicitation exercise that is scientifically valid even when data is scarce — so the criteria that come out of it are ones your team can stand behind.

Output: A one-page evaluation plan with your own go/no-go criteria.

If nothing maps, we stop here — with goodwill intact.

Gate B: Benchmark

~4–8 hours of your team's time

An evaluation result only means something against your reality. So before any technology is tested, we establish that standard: an objective benchmark of where release time actually goes in your environment today, and the gap analysis that goes with it.

Output: An objective benchmark of your release process and the gap analysis that goes with it. This is the step a typical proof-of-concept skips, and the reason most POC results can't survive leadership scrutiny. Yours to keep, whatever follows.

Gate C: Proof of Value

~10–20 engineering hours — the program pays the performer

The candidate technology is measured against your benchmark to answer one question: what is it actually worth in your environment? The program pays the performer for the work; your side is bounded engineering access, scheduled at your convenience, only if you opted in at Gate B.

Output: A quantified ROI estimate your leadership can act on.

Gate D: Your decision

With the evidence in hand, your team decides: continue toward adoption, negotiate, build the capability internally, or walk away. Any of those four is a successful outcome of the evaluation — because each one is now a decision with evidence behind it rather than a hunch with a vendor attached.

What you keep, regardless

Yours to keep, at any exit

  • A throughput performance model of your software-release process.
  • A benchmark and gap analysis showing where the time actually goes — reusable for any future tooling or process decision.
  • An ROI model your leadership can act on, in dollars and release-cycle time.

Walking away at any gate leaves you with everything generated up to that point, and no obligation.

What it costs

The full cost, in plain terms

  • No cash outlay. No purchase. No exclusivity. Participation never obligates you to buy anything, from anyone.
  • Bounded hours. Roughly 2 at Gate A, 4–8 at Gate B, 10–20 engineering hours at Gate C (about 16–30 in total if you proceed through all three), at your convenience, and only as you choose to proceed.
  • One page of paperwork. Participation is expressed in a one-page, non-binding letter — an expression of interest, not a commitment. Anyone who can authorize a third-party pentest can sign it.
  • An estimated in-kind contribution. The letter includes a good-faith estimate of the personnel time and equipment you'd contribute — an estimate that scales with demonstrated ROI, never a hard commitment. It exists to help the program weigh R&D impact, not to bind you.
  • No proprietary disclosure. The letter references your throughput problem and your intent to engage — never your roadmap, your code, or your device internals. Anything deeper happens only under a mutual NDA, only if you choose.

Worth saying out loud

What this is not

  • Not a vendor pitch. No one at MTL earns a commission on any technology's adoption. We are paid by the program, not the performers.
  • Not a procurement process. There is nothing to buy. Gate D can end in “build it ourselves” or “no,” and the evaluation has still done its job.
  • Not a binding commitment. The letter that starts the process is non-binding by design, and every figure in it is a good-faith estimate.
  • Not a science project on your dime. Your hours are bounded and named in advance; the technical work is funded through the program.
  • Not an audit. The benchmark describes your throughput so it can be improved and valued — it is yours, not a report card filed somewhere else.

What’s currently under evaluation

Candidate technologies

MTL evaluates technologies funded as performers in federal health-security R&D. Two are currently in scope:

Karambit.AI

An ARPA-H-funded performer whose software-understanding technology reads a build or patch and produces a human-readable account of what actually changed, and with what confidence — so testing effort can concentrate on the true delta.

Galois

An ARPA-H-funded performer whose technology supports independent algorithm evaluation across the development lifecycle for embedded medical devices.

These are candidate technologies under independent evaluation — not MTL partners, and their presence here is not an endorsement. Whether they fit your environment is precisely the question the evaluation answers.

Why the cohort is three manufacturers

Cohort capacity

Each evaluation in this cohort gets the full attention of a bench (decision scientists, former FDA benefit-risk and device-tools leaders, regulatory and analytic experts) that isn’t assembled anywhere else. That depth doesn’t divide by ten. Program funding is also capped per participant, which sets the arithmetic honestly: three manufacturers, evaluated properly, rather than many evaluated thinly.

The first evaluation cohort is forming now; evaluations proceed under federal SBIR funding upon award.

The next step

See whether your release process fits the cohort

A 20–30 minute briefing covers how the evaluation works, what your team keeps regardless, and whether this cohort is a fit. No preparation needed.