Token efficiency sounds like an infrastructure concern. It is actually product design.
Every unnecessary field, verbose response, unstable identifier, nested object, repeated explanation, and chatty API call becomes part of the cost of using the product through an agent. Humans tolerate clutter because vision is cheap and attention is flexible. Agents pay for clutter in context window, latency, model spend, and failure probability.
A product surface that works well for agents is compact without being cryptic. It returns the state needed for the next decision, not the entire database object because someone might need it someday. It supports filters that match real jobs. It offers summaries that preserve decision-relevant detail. It lets the agent ask for the next slice instead of dumping everything up front.
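As a rough illustration of what that can look like, here is a minimal sketch of a list call that filters by a real job ("show me open tickets"), returns only decision-relevant fields, and hands back a cursor for the next slice. The ticket data, the `list_tickets` function, and the field names are all hypothetical, invented for this example and not taken from any particular product.

```python
from typing import Optional

# Hypothetical in-memory store standing in for a product's ticket database.
TICKETS = [
    {"id": "tkt_001", "status": "open", "priority": "high",
     "title": "Login fails on SSO", "body": "long text", "history": ["created"]},
    {"id": "tkt_002", "status": "closed", "priority": "low",
     "title": "Typo in footer", "body": "long text", "history": ["created", "closed"]},
    {"id": "tkt_003", "status": "open", "priority": "low",
     "title": "Export is slow", "body": "long text", "history": ["created"]},
    {"id": "tkt_004", "status": "open", "priority": "high",
     "title": "Webhook retries loop", "body": "long text", "history": ["created"]},
]

# Only the fields an agent needs to decide what to do next (an assumption).
SUMMARY_FIELDS = ("id", "status", "priority", "title")


def list_tickets(status: Optional[str] = None, limit: int = 2,
                 cursor: Optional[str] = None) -> dict:
    """Return a compact page of tickets plus a cursor for the next slice."""
    matching = [t for t in TICKETS if status is None or t["status"] == status]

    # The cursor is the stable ID of the last item the agent already saw.
    start = 0
    if cursor is not None:
        ids = [t["id"] for t in matching]
        if cursor not in ids:
            raise ValueError(f"unknown cursor {cursor!r}; restart without a cursor")
        start = ids.index(cursor) + 1

    page = matching[start:start + limit]
    items = [{field: t[field] for field in SUMMARY_FIELDS} for t in page]
    has_more = start + limit < len(matching)
    return {"items": items, "next_cursor": page[-1]["id"] if has_more else None}


first = list_tickets(status="open", limit=2)
second = list_tickets(status="open", limit=2, cursor=first["next_cursor"])
```

The agent never sees `body` or `history` unless it asks for them, and it pulls a second page only if the first one did not settle the decision.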
The naive version of agent tooling says, “We have an API, so agents can use us.” Maybe. But if completing one job requires twenty calls, three joins, two ambiguous enums, and a long explanation of your domain model in the prompt, the product is expensive to operate. The cost is not only tokens. It is supervision. The noisier the interface, the more often a human has to check whether the agent understood it.
Token-efficient design has a few practical habits.
Use stable object names and IDs. Make common jobs first-class. Return deltas when the agent already has a local mirror. Separate diagnostic detail from default responses. Offer short and full modes. Design errors that say what to fix. Avoid prose where structure would do. Avoid structure where a compact summary is enough.
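Three of these habits are easy to sketch together: short and full modes, errors that say what to fix, and deltas against a copy the agent already holds. The invoice record, the `get_invoice` and `delta` functions, and the error shape below are hypothetical and illustrate one possible design. Serialized length is used only as a crude stand-in for token count.

```python
import json

# Hypothetical record; every field name here is illustrative.
INVOICE = {
    "id": "inv_1042",
    "status": "overdue",
    "amount_due": 1250,
    "currency": "USD",
    "customer_id": "cus_77",
    "line_items": [{"sku": "plan_pro", "qty": 5, "unit_price": 250}],
    "audit_log": ["created", "sent", "reminder_1", "reminder_2"],
}

# Default response: just enough state for the next decision (an assumption).
SHORT_FIELDS = ("id", "status", "amount_due", "currency")


def get_invoice(invoice_id: str, mode: str = "short") -> dict:
    """Return a short or full view, or a structured error naming the fix."""
    if mode not in ("short", "full"):
        return {"error": {"code": "invalid_mode", "field": "mode",
                          "allowed": ["short", "full"],
                          "fix": "Set mode to 'short' or 'full'."}}
    if invoice_id != INVOICE["id"]:
        return {"error": {"code": "not_found", "field": "invoice_id",
                          "fix": "Pass an ID from the invoice list, e.g. 'inv_1042'."}}
    if mode == "short":
        return {field: INVOICE[field] for field in SHORT_FIELDS}
    return dict(INVOICE)


def delta(previous: dict, current: dict) -> dict:
    """Return only what changed since the agent's last copy."""
    changed = {k: v for k, v in current.items() if previous.get(k) != v}
    removed = [k for k in previous if k not in current]
    return {"changed": changed, "removed": removed}


short_view = get_invoice("inv_1042")
full_view = get_invoice("inv_1042", mode="full")
bad_mode = get_invoice("inv_1042", mode="verbose")

# The agent mirrored the short view earlier, when the invoice was still open.
stale_copy = dict(short_view, status="open")
update = delta(stale_copy, short_view)

short_size = len(json.dumps(short_view))
full_size = len(json.dumps(full_view))
```

The diagnostic fields stay available in `full` mode, but the agent pays for them only when it needs them, and a bad request returns a fix it can apply without a human reading the docs.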
This is not about making everything terse. Agents need enough context to make good decisions. The goal is signal density: the maximum useful operating context per token.
The best products will treat context like scarce working capital. They will help agents spend it where judgment is needed instead of burning it on decoding product weirdness.
For human users, good UX reduces cognitive load. For agent users, good UX reduces context load. Same instinct, different medium.
If a product wants to be part of delegated work, token efficiency belongs in the product requirements. It is as real as latency, permissions, uptime, and price.
This is part 3 of 10 in Agent-Native Tools.
