Daily Digest — 2026-03-18

1. Inference Engineering: AI's Trillion-Dollar Bet — Avid

Why read: Understand why the "model race" is over and the "inference monopoly" has begun.
Summary: Jensen Huang’s $1 trillion forecast for Blackwell systems signals a pivot from training potential to inference production. As intelligence becomes a commodity (1,000x price reduction in 3 years), value migrates to the scarce ability to serve that intelligence reliably at scale and low latency. The "Inference Engineering" layer—handling file reading, code execution, and sub-agent spawning—is now the primary theater of competition. Companies must stop focusing on building the biggest models and start building the most efficient token factories.
Link: https://twitter.com/Av1dlive/status/2034320739244159322/?rw_tt_thread=True

2. The Intention Layer — Simon Taylor

Why read: A strategic look at how the "Attention Economy" dies when agents start paying for their own resources.
Summary: AI agents collapse the space between human desire and action, rendering the ad-supported "attention economy" obsolete. The missing link is a native protocol for agents to transact, which would finally correct the "original sin" of the internet: the HTTP 402 (Payment Required) status code, which was reserved but never put into general use. In this new "Intention Economy," value is captured through direct fulfillment rather than manufactured desire. Publishers and services must shift from subscription/ad models to micro-payments for agentic access to data and tools.
Link: https://twitter.com/sytaylor/status/2034254522952957981/?rw_tt_thread=True
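
To make the HTTP 402 idea concrete, here is a minimal sketch of how a service might answer an agent that has not paid yet. The header names and the price are hypothetical and come from neither the article nor any standard; only the 402 status code itself is real.

```python
from http import HTTPStatus

# Hypothetical header names and price, chosen for illustration only.
PAYMENT_HEADER = "X-Payment-Token"
PRICE_HEADER = "X-Price-Cents"
PRICE_CENTS = 2


def handle_agent_request(headers: dict[str, str]) -> tuple[int, dict[str, str], str]:
    """Return (status, response_headers, body) for a paywalled resource."""
    token = headers.get(PAYMENT_HEADER)
    if not token:
        # No proof of payment: answer 402 and advertise the price.
        return (
            HTTPStatus.PAYMENT_REQUIRED.value,
            {PRICE_HEADER: str(PRICE_CENTS)},
            "Payment required",
        )
    # A real service would verify the token with a payment provider here.
    return (HTTPStatus.OK.value, {}, "Here is the requested data")
```

An agent would read the advertised price from the 402 response, pay, and retry with the token attached. How the payment itself is made and verified is exactly the protocol the article says is still missing.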

3. The Flawed Ephemeral Software Hypothesis — Andreas Kirsch

Why read: A critical reality check on the "disposable software" hype from Karpathy and Rauch.
Summary: While AI makes code generation cheap, it does not make the resulting software ephemeral because the bottleneck is not writing code—it is discovering correct behavior through edge-case resolution. Much like Amdahl's Law, the "irreducible sequential part" of software engineering is its collision with reality (integration, state, and UX). Discarding code resets the clock on these hard-won lessons, introducing massive production risk with every regeneration cycle. The future is not disposable apps, but faster iteration on persisted, high-integrity specifications.
Link: https://www.blackhc.net/essays/future_of_software/
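
The Amdahl's Law analogy can be worked through with a small calculation. This is a sketch under an invented assumption: the 30% share for code writing is illustrative only and does not come from the essay.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the work gets `factor` times faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    remaining = 1.0 - accelerated_fraction  # the part that does not speed up
    return 1.0 / (remaining + accelerated_fraction / factor)


# Illustrative assumption: writing code is 30% of total effort
# and AI makes that part 100x faster.
overall = amdahl_speedup(0.3, 100)
```

Under that assumption the overall speedup is only about 1.42x, and it can never exceed 1 / 0.7 (roughly 1.43x) no matter how fast code generation becomes. The remaining 70% stands in for the work of discovering correct behavior.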

4. Harness Engineering: Same Old Story — marv1nnnnn

Why read: A minimalist's argument that "less is more" in agentic orchestration.
Summary: The industry is over-engineering "harnesses"—the middleware layers of multi-agent workflows and reasoning sandwiches—that often provide marginal utility. The most effective coding agents (like Pi) skip complex sub-agent orchestration in favor of a simple model-shell-file interface. Most "novel" AI engineering concepts are actually just rebranded software best practices: tests, CI, and clear documentation. True gains come from model upgrades, not from adding more middleware "bloat" that risks obsolescence with every new frontier model release.
Link: https://twitter.com/marv1nnnnn1/status/2034262240422134053/?rw_tt_thread=True

5. Building Internal Agents for Your Company (Without Getting Fired) — Ben

Why read: Practical architectural patterns for deploying powerful autonomous agents in secure corporate environments.
Summary: Deploying "YOLO bots" in a company is a security nightmare, but a structured approach using tools like Inngest and Windmill can mitigate risk. The key is an "outbound-only" connection architecture (Inngest Connect) that avoids open ports and ngrok tunnels. Agents should be "Context-Heavy, Tool-Light," using TypeScript files to define team-specific domain expertise rather than generic prompts. This setup allows agents to navigate complex internal systems like Attio or Slack while remaining observable and self-reporting.
Link: https://twitter.com/bennyautomatic/status/2032563469208399910/?rw_tt_thread=True

6. The Robotic Tortoise & the Robotic Hare — Tomasz Tunguz

Why read: A head-to-head test suggesting that smaller local models can outperform cloud giants in agentic feedback loops.
Summary: A side-by-side race between a local Mac running Qwen 35B and Claude Code (Opus 4.5) showed the "tortoise" (local) finishing in 2 minutes vs. the "hare" (cloud) in 6 minutes. While Opus is "smarter" on benchmarks, the 3x speed advantage of the local model allowed for more rounds of critique and refinement. For many agentic workflows, the ability to run multiple iteration cycles in the time it takes a cloud model to "think" once leads to superior final outcomes. Speed is a feature that enables tighter, more effective human-AI collaboration loops.
Link: mailto:reader-forwarded-email/2c756c769430c92305902d3e2dfc3901
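
The iteration argument reduces to simple arithmetic. The sketch below assumes that every round costs the same as the single run reported above (2 minutes locally, 6 minutes in the cloud), which is a simplification rather than something the post measures.

```python
def rounds_in_budget(budget_minutes: float, minutes_per_round: float) -> int:
    """Number of complete critique-and-refine rounds that fit in a time budget."""
    if minutes_per_round <= 0:
        raise ValueError("minutes_per_round must be positive")
    return int(budget_minutes // minutes_per_round)


# Time budget: the 6 minutes the cloud model needed for one pass.
local_rounds = rounds_in_budget(6, 2)
cloud_rounds = rounds_in_budget(6, 6)
```

With these numbers the local model gets three rounds of feedback in the time the cloud model completes one. Whether three rounds from a weaker model beat one round from a stronger model depends on the task.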

7. Zeihan is wrong, China is strong — Balaji

Why read: A high-stakes geopolitical counter-argument to the thesis of American isolationism and Chinese collapse.
Summary: Balaji challenges Peter Zeihan’s "End of the World" thesis, arguing that the global economy is now fundamentally Eurasian and that a declining US is on the losing side of maritime and demographic battles. Zeihan identifies the correct axes (supply chains, fertility, navy) but misinterprets the outcome by underestimating Chinese resilience and the shift toward digital, land-based trade. For operators, the practical implication is preparing for a world where North American resource independence does not guarantee global dominance.
Link: https://twitter.com/balajis/status/2034384477934546972/?rw_tt_thread=True

8. Agentic SaaS is a Scam (And Here's the Fix) — Machina

Why read: A warning for founders on the pitfalls of "wrapper" startups and the importance of domain-specific "ceilings."
Summary: Most agentic SaaS products are just thin wrappers around generic LLM prompts, selling "convenience" that lacks deep domain expertise. When 10,000 companies use the same automated SEO or outreach agents, the "edge" disappears and the content becomes functionally identical and easily flagged by algorithms. The real value lies in building custom, internal systems that reflect a company’s unique "taste" and expertise. Avoid subscribing to someone else's "ceiling" and instead focus on agents that you control and iterate on yourself.
Link: https://twitter.com/EXM7777/status/2034265533340885318/?rw_tt_thread=True

9. How to Build a Company Run Entirely by AI Agents (Paperclip) — Nick Spisak

Why read: A look at the first open-source governance layer for multi-agent organizations.
Summary: Paperclip is an MIT-licensed tool that organizes disparate agents (Claude Code, Codex, Cursor) into a formal "company" structure with org charts, budgets, and "heartbeats." By assigning agents roles and bosses, it prevents the chaos of uncoordinated terminal windows and manages token costs through hard spending limits. The "heartbeat" feature allows agents to work on a schedule rather than burning tokens in 24/7 loops. This represents a shift from "agent-as-tool" to "agent-as-employee" with explicit management and oversight.
Link: https://twitter.com/NickSpisak_/status/2033518072724705437/?rw_tt_thread=True
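
The combination of a hard spending limit and a "heartbeat" schedule can be illustrated with a small sketch. This is not Paperclip's API; the class, field names, and numbers are all hypothetical and only show the general idea.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    budget_tokens: int        # hard spending limit
    heartbeat_seconds: int    # minimum gap between work cycles
    spent_tokens: int = 0
    next_wake: float = 0.0    # earliest time the agent may run again

    def run_once(self, now: float, cost_tokens: int) -> bool:
        """Run one work cycle if it is due and affordable; return True if it ran."""
        if now < self.next_wake:
            return False  # not time yet: wait for the next heartbeat
        if self.spent_tokens + cost_tokens > self.budget_tokens:
            return False  # hard limit: refuse rather than overspend
        self.spent_tokens += cost_tokens
        self.next_wake = now + self.heartbeat_seconds
        return True


writer = Agent("writer", budget_tokens=100, heartbeat_seconds=60)
ran_first = writer.run_once(now=0, cost_tokens=60)      # due and affordable
ran_early = writer.run_once(now=10, cost_tokens=10)     # before the next heartbeat
ran_overspend = writer.run_once(now=60, cost_tokens=60) # would exceed the budget
```

In this sketch the agent only works when its heartbeat fires and stops outright once its budget is exhausted, which is the opposite of a loop that burns tokens around the clock.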

10. Guide: AI-Native Design with Paper — TK Kong

Why read: A blueprint for the new "design-to-code" roundtrip workflow using MCP.
Summary: The design process is shifting from manual Figma layouts to AI-native canvases like Paper that talk directly to coding agents via MCP. Using "Paper Snapshot," designers can pull real UI patterns into editable layers, remix them with Claude Code, and push the results back to production code. This workflow collapses the "dev handoff" phase, as the canvas and the code share the same source of truth. Designers are evolving into "agent orchestrators" who manage the visual output of LLMs rather than drawing every frame.
Link: https://twitter.com/tkkong/status/2034368184036561160/?rw_tt_thread=True

Themes from yesterday

The "Harness" Over the Model: A consensus is forming that the environment, tools, and constraints (the harness) are more important for agent performance than the underlying LLM weights.
Economic Inversion: The shift from an "Attention Economy" (monetizing human eyeballs) to an "Intention Economy" (monetizing agentic task fulfillment) via micro-payments.
Local Speed vs. Cloud Smarts: Real-world use cases are showing that "fast enough" local models (Qwen/Llama) often beat "smarter" cloud models by enabling more rapid iteration loops.
The Ephemerality Debate: A sharp divide between those who believe software will become "disposable" and those who argue that real-world complexity requires persisted, high-integrity code artifacts.

Written by Antoine Buteau
