Traditional SaaS trained operators to think of product usage as mostly good. More seats, more activity, more workflows, more data, more stickiness. Infrastructure costs mattered, but the basic economic model was generous enough that usage usually meant progress.
AI changes that instinct. Every generation, retrieval, classification, transcription, evaluation, and agentic loop can carry direct cost. The product may be software, but the marginal act of serving the customer can look more like metered labor than static cloud hosting.
That reality leaves plenty of room for strong AI businesses. It also makes their customer economics less forgiving. A customer with heavy usage may be the best customer in the portfolio if their activity is priced well and the workflow scales cleanly. The same usage can be destructive if the contract is flat, the product retries constantly, the model choice is too expensive, or the customer needs manual review to trust the output.
The first AI COGS mistake is averaging everything. The company reports blended gross margin, but customer behavior is not blended. One segment uses the product lightly and gets strong value. Another segment uses the product intensively but pays enough to support it. A third segment uses the product unpredictably, triggers expensive paths, and opens support tickets whenever the model disappoints. The blended number hides the operating truth.
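A small arithmetic sketch makes the averaging problem concrete. The three segments and every dollar figure below are hypothetical, chosen only to mirror the light, intensive-but-well-priced, and unpredictable segments described above; `Segment`, `gross_margin`, and `blended_margin` are illustrative names, not part of any real reporting tool.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    revenue: float  # revenue for the period (hypothetical figures)
    cogs: float     # cost to serve: model, infra, support, review (hypothetical)


def gross_margin(revenue: float, cogs: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cogs) / revenue


def blended_margin(segments: list[Segment]) -> float:
    """Margin computed on pooled revenue and cost, as a company-level report would."""
    total_revenue = sum(s.revenue for s in segments)
    total_cogs = sum(s.cogs for s in segments)
    return gross_margin(total_revenue, total_cogs)


# Illustrative numbers only; not drawn from any real company.
segments = [
    Segment("light_usage", 500_000, 100_000),
    Segment("intensive_but_priced", 300_000, 120_000),
    Segment("unpredictable", 200_000, 230_000),
]

for s in segments:
    print(f"{s.name}: {gross_margin(s.revenue, s.cogs):.0%}")
print(f"blended: {blended_margin(segments):.0%}")
```

With these assumed inputs the blended figure is 55 percent, which looks acceptable, while the third segment loses money on every dollar of revenue.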
The second mistake is treating compute as the entire AI cost. Compute matters, but AI delivery also creates evaluation work, quality monitoring, prompt and workflow maintenance, data pipeline maintenance, compliance review, escalation handling, and human-in-the-loop operations. Some of those costs belong in product investment. Some belong in COGS. All of them belong in the customer profitability discussion.
The third mistake is ignoring variance. AI products can have spiky usage patterns. A customer may run a large batch, launch a new workflow, onboard many users, or discover a behavior that causes repeated calls. If the pricing model does not map to cost drivers, expansion can lower profitability.
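The effect of a pricing model that does not track cost drivers can be sketched with a few lines of arithmetic. This is a simplified illustration, assuming a single cost driver (a "unit" of model usage) with a constant unit cost; the prices, costs, and function names are hypothetical.

```python
def flat_contract_margin(flat_price: float, units: int, unit_cost: float) -> float:
    """Margin when revenue is fixed but cost grows with usage."""
    cost = units * unit_cost
    return (flat_price - cost) / flat_price


def metered_margin(unit_price: float, units: int, unit_cost: float) -> float:
    """Margin when revenue scales with the same driver as cost."""
    revenue = units * unit_price
    cost = units * unit_cost
    return (revenue - cost) / revenue


# Hypothetical account: $10,000 flat fee, $0.02 cost per unit of usage.
for units in (100_000, 400_000, 600_000):
    flat = flat_contract_margin(10_000, units, 0.02)
    metered = metered_margin(0.10, units, 0.02)
    print(f"{units:>7} units  flat: {flat:.0%}  metered: {metered:.0%}")
```

Under these assumptions the flat contract moves from a healthy margin to a negative one as usage grows sixfold, while the metered contract holds steady. Real cost curves are rarely this linear, so the sketch shows the direction of the problem rather than its size.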
This is where customer-level instrumentation matters. The company should know which accounts consume the most model cost, which workflows drive retries, which features require human QA, which integrations create support load, and which customers need custom reliability commitments. Without that, pricing and packaging become guesswork.
The operator test: can the team explain customer gross margin by workflow, not just by account?
Workflow is the real unit because AI cost follows behavior. A customer may be profitable in one use case and unprofitable in another. The same logo can contain a healthy workflow, a fragile workflow, and a future product opportunity. Account-level averages are useful, but they leave too much hidden.
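One way to run the operator test is to compute margin at two levels of grouping from the same records. The sketch below assumes revenue can be attributed to each workflow, which is itself a judgment call in practice; the account name, workflow names, figures, and the `margin_by` helper are all hypothetical.

```python
from collections import defaultdict


def margin_by(rows, *keys):
    """Group rows by the given keys and return gross margin per group.

    Groups with zero revenue map to None, since margin is undefined there.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # group -> [revenue, cost]
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group][0] += row["revenue"]
        totals[group][1] += row["cost"]
    return {
        group: ((revenue - cost) / revenue if revenue else None)
        for group, (revenue, cost) in totals.items()
    }


# Hypothetical account with two workflows; figures are illustrative only.
rows = [
    {"account": "acme", "workflow": "summarize", "revenue": 6_000, "cost": 1_200},
    {"account": "acme", "workflow": "contract_review", "revenue": 4_000, "cost": 5_000},
]

print(margin_by(rows, "account"))
print(margin_by(rows, "account", "workflow"))
```

In this made-up case the account shows a 38 percent margin overall, which hides one workflow at 80 percent and another that costs more than it earns.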
Pricing should respond to that reality. Flat pricing can work when usage is predictable and margins are strong. Usage-based pricing can work when customers understand the value metric. Outcome pricing can work when the company can measure value and control delivery cost. Hybrid pricing may be necessary when the product combines seats, usage, and premium human-supported workflows.
The important move is to stop pretending AI COGS is a finance cleanup item. It is a product design constraint, a packaging constraint, and a customer strategy constraint.
In AI, the customer buys the product and activates the cost structure. Customer profitability is the discipline of noticing which activations make the company stronger.
The product team should also know which model paths are economically different. A cheap classification, an expensive reasoning step, a long-context retrieval flow, and a human-reviewed output are not the same product event. If they sit behind the same package and the same customer promise, the company is letting users choose the cost structure without seeing the bill.
Not every feature needs a visible meter. Operators still need internal cost literacy. Product managers, designers, sales leaders, and CSMs should understand which workflows are cheap to scale and which require economic guardrails. Otherwise the company will keep celebrating adoption while the margin story gets worse.
The review should include cost per successful outcome alongside cost per call. If a workflow needs five retries, support intervention, or human review before the customer trusts it, the low price of a single model call understates what the outcome actually costs. AI margin improves when the product reduces wasteful loops, chooses the right model for the job, and prices the expensive paths deliberately.
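The gap between cost per call and cost per successful outcome can be shown with a short calculation. This is a minimal sketch assuming only two cost components (model calls and human review); support tickets and other costs would be added the same way. All quantities and the function name are hypothetical.

```python
def cost_per_successful_outcome(
    tasks: int,
    calls_per_task: float,
    cost_per_call: float,
    reviewed_tasks: int,
    cost_per_review: float,
    successful_tasks: int,
) -> float:
    """Total delivery cost divided by the number of outcomes the customer accepted."""
    if successful_tasks <= 0:
        raise ValueError("no successful outcomes; cost per outcome is undefined")
    model_cost = tasks * calls_per_task * cost_per_call
    review_cost = reviewed_tasks * cost_per_review
    return (model_cost + review_cost) / successful_tasks


# Hypothetical workflow: 100 tasks, 5 calls each after retries, $0.01 per call,
# 30 tasks sent to human review at $2.00 each, 80 outcomes accepted.
per_outcome = cost_per_successful_outcome(
    tasks=100,
    calls_per_task=5,
    cost_per_call=0.01,
    reviewed_tasks=30,
    cost_per_review=2.00,
    successful_tasks=80,
)
print(f"cost per call: $0.01, cost per successful outcome: ${per_outcome:.4f}")
```

With these assumed inputs a one-cent call turns into roughly 81 cents per accepted outcome, and most of that comes from review rather than compute.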
A good AI product review should include the costly path alongside the happy path. What happens when retrieval misses? What happens when the model produces an answer the user does not trust? What happens when the customer asks for more context, longer documents, larger batches, or stricter review? Those moments often define the true cost to serve.
This is where product quality and margin meet. Better UX can reduce wasteful retries. Better defaults can steer users toward cheaper workflows. Better evals can reduce human review. Better packaging can reserve expensive paths for customers who pay for them. Gross margin is not only negotiated in pricing. It is designed into the product.
The team should review expensive workflows with the same seriousness it gives conversion funnels. Where does cost spike? Where does trust fail? Which default sends users down an expensive path? Those questions turn margin from an accounting surprise into a design input.
Evidence note: this series draws on public AI gross-margin context, including Bessemer's State of AI 2025 (https://www.bvp.com/atlas/the-state-of-ai-2025) and an analysis of AI gross-margin variance (https://www.tanayj.com/p/the-gross-margin-debate-in-ai).
This is part 3 of 10 in Customer Profitability in the AI Era.