Every automation has a boundary problem.
What should code decide? What should a model decide? What should a human decide?
If you do not answer those questions explicitly, the boundary gets decided accidentally, usually by whoever wrote the first prompt, connected the first API, or approved the first demo.
That is how teams end up with workflows that are impressive in a sandbox and frightening in production.
The boundary map
Before building, draw a simple map:
| Decision or action | Code | Model | Human | Notes |
|---|---:|---:|---:|---|
| Validate required fields | Owns | No | No | Schema and rules |
| Interpret messy message | Supports | Owns | Reviews exceptions | Classification and extraction |
| Apply hard policy threshold | Owns | No | Approves exceptions | Keep policy explicit |
| Draft customer response | Supports | Owns draft | Approves if sensitive | Use tone and policy checks |
| Send customer response | Owns mechanics | No | Owns if high risk | Auto-send only when low risk |
| Update CRM field | Owns | No | Reviews exceptions | Idempotent write |
| Approve refund | Enforces limits | Recommends | Owns final decision above threshold | Audit required |
The map is not bureaucracy. It is how you prevent responsibility from disappearing. Revisit it after incidents, policy changes, major model changes, or new external actions.
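The map can also live in the codebase instead of only in a document. A minimal sketch, assuming a small set of ownership labels; the action names and categories are illustrative, not a standard:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Boundary:
    action: str
    code: str    # e.g. "owns", "supports", "enforces", "none"
    model: str   # e.g. "owns", "recommends", "none"
    human: str   # e.g. "owns", "approves", "reviews_exceptions", "none"

# Explicit ownership per decision, mirroring a few rows of the table.
BOUNDARY_MAP = [
    Boundary("validate_required_fields", "owns", "none", "none"),
    Boundary("interpret_messy_message", "supports", "owns", "reviews_exceptions"),
    Boundary("approve_refund", "enforces", "recommends", "owns"),
]

def owner(action: str) -> Boundary:
    """Look up who owns a decision; fail loudly on unmapped actions."""
    for b in BOUNDARY_MAP:
        if b.action == action:
            return b
    raise KeyError(f"no boundary defined for {action!r}")
```

Failing loudly on unmapped actions is the point: a decision with no row in the map is a boundary nobody drew.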
Code owns exactness
Code should own anything that must be exact:
- required fields
- enum values
- permission checks
- rate limits
- policy thresholds
- idempotency keys
- dedupe logic
- retries
- durable state
- external writes
If a rule is important and expressible, do not outsource it to a prompt.
A model can suggest that a refund seems appropriate. Code should still check whether the amount is allowed, whether the user has permission, whether the ticket was already processed, and whether the action is reversible.
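Those checks can be sketched as a guard that runs regardless of what the model suggested. The threshold, permission name, and parameters here are assumptions for illustration:

```python
# Policy threshold lives in code, not in a prompt (value is illustrative).
REFUND_LIMIT = 100.00

def guard_refund(amount, user_permissions, already_processed, reversible):
    """Return (allowed, reason). Code decides; the model only recommended."""
    if amount > REFUND_LIMIT:
        return False, "amount above policy threshold"
    if "issue_refund" not in user_permissions:
        return False, "user lacks permission"
    if already_processed:
        return False, "ticket already processed"
    if not reversible:
        return False, "irreversible action requires human approval"
    return True, "ok"
```

The model's recommendation never reaches this function; it only determines whether the function gets called at all.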
Models own ambiguity
Models are useful when exact rules are too brittle because the input is language-heavy or judgment-heavy.
A model can decide whether a customer is angry, whether a ticket is about billing, whether a contract clause resembles a termination right, or whether a call summary contains a buying signal.
But the model should work inside constraints:
- fixed input contract
- fixed output schema
- allowed categories
- confidence score
- rationale
- review flag
- prompt and model version logging
The model can reason. It should not be allowed to silently invent new workflow states.
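Those constraints amount to a fixed output contract that code validates before anything downstream runs. A minimal sketch, assuming a small closed taxonomy; the field and category names are illustrative:

```python
from dataclasses import dataclass

# Closed category set: anything outside it is rejected, not absorbed.
ALLOWED_CATEGORIES = {"billing", "technical", "account", "other"}

@dataclass(frozen=True)
class Classification:
    category: str
    confidence: float      # 0.0 to 1.0
    rationale: str
    needs_review: bool
    prompt_version: str    # logged for every call
    model_version: str

def parse_model_output(raw: dict) -> Classification:
    """Validate the model's output against the contract or raise."""
    c = Classification(**raw)
    if c.category not in ALLOWED_CATEGORIES:
        raise ValueError(f"category {c.category!r} not in taxonomy")
    if not 0.0 <= c.confidence <= 1.0:
        raise ValueError("confidence out of range")
    return c
```

A rejected output becomes an exception for review, which is exactly where a novel case belongs.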
Humans own accountability
Humans belong where accountability matters.
That does not mean humans should click approve on everything; that is lazy design. It means placing humans where the system needs judgment, risk ownership, or exception handling.
Good human boundaries include:
- low-confidence model outputs
- irreversible actions
- sensitive customer communication
- legal, security, finance, or HR issues
- policy exceptions
- novel cases not covered by the taxonomy
- sampled review for quality control
Human review is not the opposite of automation. It is a control surface.
Use risk and reversibility
The simplest way to place the boundary is to score risk and reversibility.
| Risk | Reversibility | Suggested automation |
|---|---|---|
| Low | Easy to reverse | Automate with logging |
| Low | Hard to reverse | Add confirmation or delayed action |
| Medium | Easy to reverse | Automate with confidence gate and sampled review |
| Medium | Hard to reverse | Human approval or staged rollout |
| High | Easy to reverse | Human review plus audit trail |
| High | Hard to reverse | Human owns decision; automation assists only |
A wrong tag in a CRM is annoying. A wrong payroll change is serious. A wrong internal draft is recoverable. A wrong email to a regulator is not.
The architecture should reflect that.
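The table above is small enough to be executable. A sketch that makes the placement rule a lookup rather than tribal knowledge; the level names mirror the table and the exact strings are illustrative:

```python
# (risk, reversibility) -> suggested automation level, per the table.
AUTOMATION_MATRIX = {
    ("low", "easy"):     "automate_with_logging",
    ("low", "hard"):     "confirm_or_delay",
    ("medium", "easy"):  "automate_with_gate_and_sampling",
    ("medium", "hard"):  "human_approval_or_staged_rollout",
    ("high", "easy"):    "human_review_with_audit",
    ("high", "hard"):    "human_owns_decision",
}

def placement(risk: str, reversibility: str) -> str:
    """Return the suggested automation level; KeyError on unscored work."""
    return AUTOMATION_MATRIX[(risk, reversibility)]
```

A `KeyError` here is a feature: work that has not been scored for risk and reversibility should not be automated by default.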
Example: renewal risk workflow
A customer success team wants AI to identify renewal risk from call notes and support history.
Boundary design:
- Code gathers account data, open tickets, renewal date, ARR band, and meeting notes.
- Model summarizes risk signals and classifies renewal risk as low, medium, or high.
- Code checks whether the account is strategic, whether renewal is inside 90 days, and whether open escalations exist.
- Low-risk accounts get a CRM note automatically.
- Medium-risk accounts go to CSM review.
- High-risk strategic accounts create an escalation task for the account owner and manager.
- No customer-facing communication is sent without human approval.
The model identifies signals. Code enforces routing. Humans own the customer strategy.
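The routing half of that design can be sketched in a few lines. Field names are assumptions, and routing high-risk non-strategic accounts to CSM review is an assumption the workflow above leaves open:

```python
def route_renewal(account: dict, model_risk: str) -> str:
    """Code owns routing; the model only supplied the risk label."""
    if model_risk not in {"low", "medium", "high"}:
        raise ValueError(f"unknown risk label {model_risk!r}")
    if model_risk == "high" and account["strategic"]:
        return "escalation_task_for_owner_and_manager"
    if model_risk in {"medium", "high"}:
        return "csm_review"  # assumption: high non-strategic joins CSM review
    return "auto_crm_note"
```

Note what the model cannot do here: it cannot invent a fourth route, and it cannot send anything to the customer, because neither capability exists in the code path.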
The operator's rule
Draw the boundary before you build the workflow.
If you cannot say which decisions belong to code, model, and human, you are not designing automation. You are distributing risk randomly.
