Model choice is often treated as an engineering detail.

It is not.

Choosing between a frontier API, a smaller model, an open model, a fine-tune, retrieval, rules, agents, or human operations changes the product's cost, latency, privacy posture, reliability, roadmap, support model, and competitive durability.

That is strategy.

The best model is not always the biggest model

The strongest model may be the right choice for ambiguous reasoning, complex synthesis, or early product exploration.

It may be the wrong choice for high-volume classification, low-latency autocomplete, simple extraction, or workflows with strict data residency constraints.

Using the biggest model everywhere is easy. It is not always good product design.

A mature AI product often uses a mix: rules for deterministic constraints, retrieval for grounding, a smaller model for routine tasks, a frontier model for hard cases, and human review for high-risk exceptions.
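
That mix can be made concrete as a routing policy. The sketch below is illustrative, not a real API: the task fields, thresholds, and tier names are all assumptions, and a production router would be driven by measured signals rather than hand-set flags.

```python
# Sketch of a layered routing policy. Task fields and thresholds are
# hypothetical; the point is the ordering: deterministic checks first,
# humans for high risk, retrieval for grounding, frontier only when needed.

def route(task: dict) -> str:
    """Pick a handling tier for a task using simple, auditable checks."""
    if task.get("violates_policy"):           # deterministic constraint first
        return "rules: reject"
    if task.get("risk") == "high":            # high-risk exceptions go to people
        return "human review"
    if task.get("needs_grounding"):           # proprietary facts need retrieval
        return "retrieval + small model"
    if task.get("ambiguity", 0.0) > 0.7:      # hard reasoning earns the frontier model
        return "frontier model"
    return "small model"                      # routine default: cheap and fast

print(route({"risk": "high"}))           # human review
print(route({"ambiguity": 0.9}))         # frontier model
print(route({"needs_grounding": True}))  # retrieval + small model
```

The ordering is the design decision: cheap deterministic checks run first, and the expensive tier is the fallback for genuinely hard cases, not the default.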

The architecture should match the product promise.

Model strategy shapes UX

Latency is UX.

A model that takes fifteen seconds may be acceptable for a quarterly report synthesis. It is painful for inline writing assistance. A cheap batch process may be perfect for overnight analysis. A real-time workflow may need a smaller model or cached output.

Cost is also UX, indirectly. If every interaction is expensive, the product will add limits, delays, or pricing friction. Users will feel those constraints.
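
The arithmetic is worth doing early. A back-of-the-envelope sketch, with made-up prices (assume $3 per million input tokens and $15 per million output for a frontier model, $0.20 and $0.80 for a smaller one; real prices vary by provider and change often):

```python
# Unit economics per interaction. All prices here are illustrative
# assumptions, not any provider's actual rates.

def cost_per_interaction(in_tok, out_tok, in_price_per_m, out_price_per_m):
    return (in_tok * in_price_per_m + out_tok * out_price_per_m) / 1_000_000

frontier = cost_per_interaction(2_000, 500, 3.00, 15.00)   # $0.0135
small = cost_per_interaction(2_000, 500, 0.20, 0.80)       # $0.0008

# A heavy user: 50 interactions a day, 30 days a month.
print(f"frontier: ${frontier * 50 * 30:.2f}/user/month")   # $20.25
print(f"small:    ${small * 50 * 30:.2f}/user/month")      # $1.20
```

A 17x spread per user per month is the difference between a flat subscription and metered limits. That is the sense in which cost becomes UX.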

Privacy shapes UX too. If enterprise users cannot send certain data to an external API, the product may need deployment options, data controls, redaction, or local processing.

These are not backend concerns. They determine what the product can promise.

Artifact: model-choice decision matrix

```text

Model-Choice Decision Matrix

Option: Frontier API

Best for:

  • ambiguous reasoning
  • broad language tasks
  • rapid prototyping
  • high-value low-volume workflows

Watchouts:

  • cost
  • latency
  • vendor dependency
  • data processing concerns
  • behavior drift

Option: Smaller hosted model

Best for:

  • narrow tasks
  • high-volume classification/extraction
  • lower cost requirements
  • predictable latency

Watchouts:

  • weaker general reasoning
  • more task-specific tuning needed

Option: Open / self-hosted model

Best for:

  • privacy-sensitive deployments
  • control over infrastructure
  • custom optimization
  • data residency needs

Watchouts:

  • ops burden
  • serving cost
  • model maintenance
  • slower access to frontier capability

Option: Fine-tune

Best for:

  • consistent style or domain-specific behavior
  • repeated task patterns
  • reducing prompt complexity

Watchouts:

  • training data quality
  • stale behavior
  • evaluation burden
  • not a substitute for missing context

Option: RAG / retrieval

Best for:

  • grounding in proprietary or current data
  • citation requirements
  • enterprise knowledge workflows

Watchouts:

  • retrieval quality
  • permissions
  • stale or conflicting documents
  • source ranking

Option: Rules / deterministic logic

Best for:

  • compliance constraints
  • permissions
  • calculations
  • predictable business logic

Watchouts:

  • brittle for ambiguous inputs
  • maintenance overhead as rules expand

Option: Agentic workflow

Best for:

  • multi-step tasks with tool use
  • workflows where planning and execution matter

Watchouts:

  • observability
  • runaway cost
  • permissions
  • testing difficulty
  • user trust

Option: Human operations

Best for:

  • high-risk exceptions
  • domain judgment
  • quality bootstrapping
  • customer-specific edge cases

Watchouts:

  • scalability
  • cost
  • inconsistent review
  • hidden process debt

```

The decision is rarely one row. Good products combine rows deliberately.

Vendor drift is product drift

If your product depends on an external model, model updates can change user experience without your code changing.

That is not a reason to avoid APIs. It is a reason to design for drift.

Keep regression sets. Version prompts. Monitor output quality. Roll out model changes gradually where possible. Have fallback options for critical workflows. Communicate meaningful behavior changes to customers when they affect trust or compliance.
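
A regression set does not need to be elaborate to be useful. A minimal sketch, assuming you keep a pinned set of (input, expected-property) pairs and gate any model or prompt change on it; `model_fn` stands in for whatever calls your model, and the checks test properties of the output rather than exact strings:

```python
# Minimal regression gate for model/prompt changes. `model_fn` and the
# cases below are stand-ins; real checks would assert output properties
# (contains required facts, valid JSON, no policy violations, etc.).

def run_regressions(model_fn, cases, min_pass_rate=0.95):
    passed = sum(1 for prompt, check in cases if check(model_fn(prompt)))
    rate = passed / len(cases)
    return rate >= min_pass_rate, rate

# Toy example with a fake "model" so the gate itself is testable.
fake_model = lambda p: p.upper()
cases = [
    ("refund policy", lambda out: "REFUND" in out),
    ("shipping time", lambda out: "SHIPPING" in out),
]
ok, rate = run_regressions(fake_model, cases)
print(ok, rate)  # True 1.0
```

Run it on every candidate model version before promotion; a failing gate is the signal to pin, investigate, or roll out more slowly.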

A vendor change that breaks customer workflows is still your product problem.

Have an exit plan before you need it. If pricing changes, know which workflows can move to a smaller model, batching, caching, or paid limits. If behavior changes, keep a pinned fallback version or alternate provider for critical paths. If API terms change, know which customers, data types, and features are affected and what can be disabled safely. The plan does not need to be dramatic. It needs to exist.
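
For critical paths, the pinned fallback can be a small piece of code rather than a document. A sketch, assuming two callables: `primary` (the latest model) and `pinned` (a frozen version or alternate provider); the names and the acceptability signal are illustrative:

```python
# Pinned fallback for a critical workflow: try the current model, fall
# back to a frozen version or alternate provider on error or on an
# unacceptable answer. `is_acceptable` is whatever cheap check you trust.

def call_with_fallback(prompt, primary, pinned, is_acceptable):
    try:
        out = primary(prompt)
        if is_acceptable(out):
            return out, "primary"
    except Exception:
        pass                      # provider error: fall through to the pin
    return pinned(prompt), "pinned"

def flaky(prompt):                # stands in for a degraded provider
    raise TimeoutError("provider down")

def pinned_model(prompt):         # stands in for the frozen fallback
    return "ok: " + prompt

out, source = call_with_fallback("summarize", flaky, pinned_model, lambda o: True)
print(out, source)  # ok: summarize pinned
```

The `source` tag matters as much as the answer: logging which path served each request is how you notice the fallback quietly becoming the primary.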

RAG is not a magic grounding layer

Retrieval-augmented generation is useful. It is not pixie dust.

If retrieval brings back the wrong document, the model may produce a grounded-looking wrong answer. If permissions are wrong, the product may leak data. If documents conflict, the model may smooth over the conflict instead of escalating. If source quality is poor, citations create false confidence.

RAG needs evals, permissions, freshness checks, and UX that exposes sources honestly.

Agents raise the product bar

Agentic systems can be powerful because they can plan, use tools, and complete multi-step work.

They also increase the surface area for failure.

An agent needs clear permissions, bounded actions, observability, interruption, rollback, cost limits, and user control. The more autonomy it has, the more product work is required around it.

Do not ship an agent because it sounds modern. Ship one when the workflow benefits from delegated multi-step execution and the risk controls are real.
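
Those risk controls can be boiled down to a bounded loop: capped steps, capped spend, an allow-list of tools, and a log. The sketch below is a toy (tool names, costs, and the plan format are all made up), but the shape of the constraints is the point.

```python
# Bounded agent loop: step cap, budget cap, tool allow-list, and a log
# for observability. Tools return (cost, result); everything here is a
# hypothetical stand-in for real tool calls.

def run_agent(plan, tools, max_steps=5, budget=0.10):
    spent, log = 0.0, []
    for step, (tool, arg) in enumerate(plan):
        if step >= max_steps:
            log.append("halt: step limit"); break
        if tool not in tools:
            log.append(f"halt: tool '{tool}' not permitted"); break
        cost, result = tools[tool](arg)
        if spent + cost > budget:
            log.append("halt: budget exceeded"); break
        spent += cost
        log.append(f"{tool}({arg}) -> {result}")
    return spent, log

tools = {"search": lambda q: (0.01, f"results for {q}")}
plan = [("search", "pricing"), ("delete_db", "all")]   # second step is not allowed
spent, log = run_agent(plan, tools)
print(log)  # the disallowed tool halts the run, and the log says why
```

Every halt leaves a reason in the log. That is the minimum bar for observability, interruption, and user trust; a real system adds rollback and approval gates on top.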

The practical standard

Choose models the way you choose product architecture: based on the user promise, risk, economics, latency, data constraints, and durability.

The model is not just what powers the feature.

It shapes what the feature can be.