Agents will not become useful workers if every interruption turns into a restart. Real work pauses. A customer has not replied. A manager needs to approve. A data export is still running. A model call failed. The agent needs more context. The system rate-limited a batch. Someone changed the source record halfway through the job.

Human workers handle this with memory, notes, inboxes, calendars, and judgment. Software often handles it badly. It treats work as a request-response event. Either the thing finished or it failed. That is not enough for delegated work.

Agent-native tools need durable state and resumable work.

A durable job should have an identity, owner, status, inputs, current step, artifacts, pending decisions, approvals, retries, and a history of what happened. It should survive process restarts, model changes, network failures, and human delays. It should be inspectable by a person without requiring them to replay the whole conversation.
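As a rough illustration, the fields listed above could be held in one record that is written to disk and read back after a restart. This is a minimal sketch, not a prescribed schema: `JobRecord`, its field names, and the JSON-file storage are all assumptions made for the example, and a real product would more likely use a database.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class JobRecord:
    """Hypothetical durable job record; fields mirror the list in the text."""
    job_id: str
    owner: str
    status: str = "queued"
    inputs: dict = field(default_factory=dict)
    current_step: int = 0
    artifacts: list = field(default_factory=list)
    pending_decisions: list = field(default_factory=list)
    approvals: list = field(default_factory=list)
    retries: int = 0
    history: list = field(default_factory=list)

    def log(self, event: str) -> None:
        # Append a timestamped entry so a person can read what happened
        # without replaying the conversation.
        self.history.append(
            {"at": datetime.now(timezone.utc).isoformat(), "event": event}
        )

    def save(self, directory: Path) -> Path:
        # Write to a temporary file, then rename it over the real one, so a
        # crash mid-write does not leave a half-written record behind.
        path = directory / f"{self.job_id}.json"
        tmp = directory / f"{self.job_id}.json.tmp"
        tmp.write_text(json.dumps(asdict(self), indent=2))
        tmp.replace(path)
        return path

    @classmethod
    def load(cls, directory: Path, job_id: str) -> "JobRecord":
        # Rebuild the record from disk, e.g. after a process restart.
        data = json.loads((directory / f"{job_id}.json").read_text())
        return cls(**data)
```

Because the record lives outside the agent process, a new process (or a different model) can call `JobRecord.load` and pick up at `current_step` instead of starting over.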

Resumability changes the trust equation. If an agent can say, “I completed steps one through four, paused because approval is required before contacting customers, and here is the exact diff,” the human can make a decision quickly. If the agent says, “Something went wrong,” the human has to become a detective.

This is also where product teams need to stop treating agent work as a chat transcript. A transcript is a poor operating record. It is useful evidence, but it is not a job state model. The system needs structured progress: queued, planning, waiting on input, executing, verifying, blocked, completed, reverted.
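One way to make that progress structured rather than conversational is an explicit set of states with a table of allowed moves between them. The sketch below uses the eight states named above; the transition table itself is an assumption for illustration, since the text names the states but not the edges between them.

```python
from enum import Enum


class Status(str, Enum):
    QUEUED = "queued"
    PLANNING = "planning"
    WAITING_ON_INPUT = "waiting_on_input"
    EXECUTING = "executing"
    VERIFYING = "verifying"
    BLOCKED = "blocked"
    COMPLETED = "completed"
    REVERTED = "reverted"


# Assumed transition table: which states a job may move to from each state.
ALLOWED = {
    Status.QUEUED: {Status.PLANNING},
    Status.PLANNING: {Status.EXECUTING, Status.WAITING_ON_INPUT, Status.BLOCKED},
    Status.WAITING_ON_INPUT: {Status.PLANNING, Status.EXECUTING},
    Status.EXECUTING: {Status.VERIFYING, Status.WAITING_ON_INPUT, Status.BLOCKED},
    Status.VERIFYING: {Status.COMPLETED, Status.EXECUTING, Status.REVERTED},
    Status.BLOCKED: {Status.PLANNING, Status.EXECUTING, Status.REVERTED},
    Status.COMPLETED: {Status.REVERTED},
    Status.REVERTED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Rejecting illegal moves is the point: a job cannot jump from queued to completed without passing through the states a human would want to inspect.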

Durable work also protects against the most annoying failure mode in automation: duplicate action. If a job can resume idempotently, meaning a repeated step has the same effect as running it once, the agent does not resend the same email, reopen the same ticket, or apply the same update twice because it lost track.
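A common way to get this property is to derive a stable key for each side effect and check a ledger of completed effects before acting. The sketch below is a simplified illustration under stated assumptions: `idempotency_key` and `EffectLedger` are hypothetical names, and the ledger is an in-memory dictionary standing in for a durable table.

```python
import hashlib
import json


def idempotency_key(job_id: str, step: str, payload: dict) -> str:
    # The same job, step, and payload always produce the same key.
    raw = json.dumps(
        {"job": job_id, "step": step, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class EffectLedger:
    """In-memory stand-in for a durable record of completed side effects."""

    def __init__(self):
        self._done = {}

    def run_once(self, key, action):
        # If this effect was already recorded, return the stored result
        # instead of performing the action again.
        if key in self._done:
            return self._done[key]
        result = action()
        self._done[key] = result
        return result


# Usage: a resumed job retries the same step, but the email goes out once.
sent = []
ledger = EffectLedger()
key = idempotency_key("job-1", "notify", {"to": "customer@example.com"})
for _ in range(2):
    ledger.run_once(key, lambda: sent.append("email") or "sent")
```

A ledger like this still leaves a gap if the process dies after the action runs but before the result is recorded, so where the downstream service accepts an idempotency key of its own, passing the key along closes that gap.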

The lesson from reliable systems applies directly: state, checkpoints, retries, idempotency, and logs are not engineering garnish. They are what makes delegation safe.

If a product wants agents to do long-running work, it needs to give them somewhere to put the work while it is unfinished.


This is part 6 of 10 in Agent-Native Tools.