Harness Engineering Series #6: Human-in-the-Loop as Runtime Design

Human-in-the-Loop as Runtime Design sounds abstract until it is tied to a decision, an owner, and a review loop. The operating question is what changes in the work, who can inspect it, and what happens when the system is wrong.

This post stays in one lane, the harness around the model: context, tools, state, checkpoints, orchestration, review, permissions, middleware, and observability. It avoids turning every AI conversation into the same strategy soup. The useful test is whether the idea changes a real workflow, not whether it sounds modern in a planning deck.

The operator problem

The operator problem is the gap between a good demo and a durable work system. Put human review where judgment is costly, not everywhere. The handoff should arrive with evidence, options, and the reason review is needed.
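
As a rough illustration of that placement rule, here is a minimal Python sketch of a review gate. Everything in it is an assumption made for the example: the names ProposedAction, Handoff, and review_gate, the two triggers (irreversibility and estimated cost of error), and the threshold value. It only shows the shape: cheap, reversible actions pass through, and anything else produces a handoff that carries evidence, options, and the reason review is needed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProposedAction:
    name: str
    reversible: bool       # can the action be undone cheaply?
    cost_if_wrong: float   # rough estimate, in whatever unit the team uses


@dataclass
class Handoff:
    action: str
    reason: str            # why review is needed
    evidence: list[str]    # what the agent saw before proposing the action
    options: list[str]     # choices offered to the reviewer


COST_THRESHOLD = 500.0     # assumed cutoff; a real team would set its own


def review_gate(action: ProposedAction, evidence: list[str]) -> Optional[Handoff]:
    """Return None when the action may proceed, or a Handoff when a human should decide."""
    reasons = []
    if not action.reversible:
        reasons.append("action cannot be undone")
    if action.cost_if_wrong >= COST_THRESHOLD:
        reasons.append(
            f"estimated cost of error {action.cost_if_wrong:.0f} "
            f">= threshold {COST_THRESHOLD:.0f}"
        )
    if not reasons:
        return None  # judgment is cheap here, so no review is requested
    return Handoff(
        action=action.name,
        reason="; ".join(reasons),
        evidence=list(evidence),
        options=["approve", "reject", "edit and retry"],
    )
```

The gate returns a value instead of prompting anyone directly, so the same decision can be logged, replayed, and challenged later.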

The model matters, but the surrounding operating choices matter more: owner, inputs, permissions, review capacity, escalation, logging, and the mechanism for learning from the next run. If those choices stay informal, the company depends on memory, heroics, and whatever the original builder happened to know.

What good looks like

Good design is usually plain:

Name the accountable owner before choosing the tool.
Write the rule where the work happens, not in a slide.
Define the stop condition before volume grows.
Keep evidence readable enough for a manager to challenge.

For this topic, the artifacts are concrete: a run spec, a context contract, a permission table, an escalation rule, and a run log. If those artifacts do not exist, the system is still mostly oral tradition.
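
To make two of those artifacts less abstract, a permission table and an escalation rule can be as small as a lookup. The sketch below is a hypothetical example, not a prescribed format: the tool names, the Decision values, and the "ops-lead" owner are all invented for illustration. The one deliberate design choice is that a tool missing from the table is denied rather than allowed.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"   # agent may call the tool without review
    ASK = "ask"       # agent must wait for a human decision
    DENY = "deny"     # agent may never call the tool


# Hypothetical permission table: tool name -> decision.
PERMISSIONS = {
    "search_docs": Decision.ALLOW,
    "send_email": Decision.ASK,
    "delete_record": Decision.DENY,
}

# Hypothetical accountable owner for anything marked ASK.
ESCALATION_OWNER = "ops-lead"


def check_permission(tool: str) -> Decision:
    """Look up a tool; tools missing from the table are denied by default."""
    return PERMISSIONS.get(tool, Decision.DENY)


def escalate_to(tool: str) -> Optional[str]:
    """Return who must review a call to this tool, or None if no review applies."""
    if check_permission(tool) is Decision.ASK:
        return ESCALATION_OWNER
    return None
```

Because the table is plain data, a manager can read it and challenge a row without reading the agent code.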

The design move

The design move is to pull judgment out of private habit and into the workflow. The review points, the evidence each one requires, and the reason for escalating are written into the run itself instead of being remembered by whoever built it.

A simple test helps: could someone competent join next month, run the workflow, understand the exceptions, and improve the next version without interviewing the one person who built it? If not, too much of the system still lives in people's heads.

Watch the failure mode

The trap is building a general platform before the run is understood. If the first workflow cannot be replayed, audited, and corrected, the platform is mostly furniture.

The fix is a tighter operating loop: state the rule, run it on real work, inspect misses, change the artifact, and repeat. Do not add governance theatre where a sharper rule would do.
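
The "inspect misses" step of that loop can be sketched as a replay over the run log. The code below assumes a log format that the post does not define: each entry is a dict with run_id, input, and reviewer_decision keys. The rule and the sample entries are made up for illustration. It replays a candidate rule against past runs and returns the entries where the rule disagrees with what the human reviewer decided.

```python
def auto_approve_small(item: dict) -> str:
    """Hypothetical rule under test: approve small amounts, escalate the rest."""
    return "approve" if item["amount"] < 100 else "escalate"


def find_misses(run_log: list[dict], rule) -> list[dict]:
    """Replay the rule over logged runs and collect disagreements with the reviewer."""
    misses = []
    for entry in run_log:
        predicted = rule(entry["input"])
        if predicted != entry["reviewer_decision"]:
            # Keep the original entry and record what the rule would have done.
            misses.append({**entry, "rule_said": predicted})
    return misses


# Made-up sample log, only to show the expected shape of an entry.
SAMPLE_LOG = [
    {"run_id": "r1", "input": {"amount": 40}, "reviewer_decision": "approve"},
    {"run_id": "r2", "input": {"amount": 80}, "reviewer_decision": "escalate"},
    {"run_id": "r3", "input": {"amount": 250}, "reviewer_decision": "escalate"},
]

misses = find_misses(SAMPLE_LOG, auto_approve_small)
```

With this sample data the only miss is run r2: the rule would have approved it and the reviewer escalated. That entry is the prompt to change the rule before the next run.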

A practical starting point

Take one agent workflow that already saves time. Write the run spec beside the prompt: required inputs, tools allowed, approval points, stop conditions, log fields, and who reviews exceptions.
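
One way to write that run spec is as a small typed record kept next to the prompt. The sketch below is an assumption about shape, not a standard: the RunSpec class, its field names, the single step-count stop condition, and the example values are all hypothetical. The fields map one-to-one onto the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSpec:
    required_inputs: tuple[str, ...]   # keys that must be present before a run starts
    tools_allowed: tuple[str, ...]     # tools the agent may call
    approval_points: tuple[str, ...]   # steps that wait for a human decision
    max_steps: int                     # stop condition: hard cap on agent steps
    log_fields: tuple[str, ...]        # fields every log entry must carry
    exception_reviewer: str            # who reviews exceptions

    def missing_inputs(self, inputs: dict) -> list[str]:
        """List required inputs that the caller did not supply."""
        return [key for key in self.required_inputs if key not in inputs]

    def needs_approval(self, step: str) -> bool:
        """True when the step is a declared approval point."""
        return step in self.approval_points

    def should_stop(self, steps_taken: int) -> bool:
        """True once the run has used up its step budget."""
        return steps_taken >= self.max_steps


# Hypothetical spec for an invoice-triage workflow.
SPEC = RunSpec(
    required_inputs=("invoice_id", "vendor"),
    tools_allowed=("lookup_invoice", "draft_reply"),
    approval_points=("send_reply",),
    max_steps=8,
    log_fields=("run_id", "step", "tool", "outcome"),
    exception_reviewer="ap-team-lead",
)
```

A frozen dataclass keeps the spec from drifting during a run. Changing the rule means editing the artifact, which is the point.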

Keep the first pass small enough to inspect by hand. The goal for Harness Engineering is to make agent work reproducible enough that another operator can run, inspect, and improve it.

Bottom line

Human-in-the-Loop as Runtime Design earns its keep only when it changes how work runs. The vocabulary is cheap. The operating artifact, the owner, and the review loop are the proof.

This is part 6 of 10 in Harness Engineering.

Previous: Orchestration Beyond Chat
Next: Sandboxes, Permissions, and Failure Boundaries