1. Demystifying AI Agent Loops: Schedules, Goals, and Subagents — Lenny's Newsletter
- Why read: A clear explanation of how AI agent loops apply standard automation patterns to language models instead of batch jobs.
- Summary: "Loop engineering" is mostly traditional software architecture (heartbeats, cron jobs, and webhooks) applied to AI. The most effective loops run until a specific goal is met, rather than stopping on a timer. You should treat these loops like new hires: define the job, the schedule, and what failure looks like. Loose success criteria cause infinite loops and waste tokens, so set exact validation thresholds. Getting these mechanics right lets you automate repetitive work like PR reviews.
- Read more
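The goal-based loop in this summary can be sketched in a few lines. This is a minimal illustration, not the newsletter's code: `run_until_goal`, `fake_step`, the 0.75 threshold, and the `max_iterations` budget are all assumed names and values. The budget is the guard against the runaway loops the summary warns about.

```python
from typing import Callable


def run_until_goal(
    step: Callable[[int], float],
    threshold: float,
    max_iterations: int = 10,
) -> tuple[bool, int, float]:
    """Repeat `step` until its score reaches `threshold` or the budget runs out.

    Returns (goal_met, iterations_used, last_score).
    """
    score = float("-inf")
    for i in range(1, max_iterations + 1):
        score = step(i)
        if score >= threshold:  # exact validation threshold, not a timer
            return True, i, score
    return False, max_iterations, score


# Hypothetical stand-in for one agent pass: the score improves by 0.25 each time.
def fake_step(iteration: int) -> float:
    return 0.25 * iteration


result = run_until_goal(fake_step, threshold=0.75, max_iterations=10)
exhausted = run_until_goal(fake_step, threshold=5.0, max_iterations=4)
```

Here `result` stops on the third pass because the threshold is met, while `exhausted` gives up after four passes and reports failure instead of looping forever.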
2. Goal-Based Loops and Subagent Architecture — Lenny's Newsletter
- Why read: How primary agent loops delegate work to specialized subagents to handle complex workflows.
- Summary: Agent loops become more useful when they can spawn subagents for specific tasks. A primary loop might find gaps in a codebase and deploy isolated subagents to manage individual pull requests. With this nested structure, the primary loop keeps re-checking multi-step work until all checks pass. You need strict stopping criteria and clear goals to prevent runaway API costs. This setup turns passive coding assistants into independent problem solvers.
- Read more
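A rough sketch of that primary-loop-plus-subagents shape, under stated assumptions: `orchestrate`, `find_gaps`, `subagent`, and the `max_rounds` cap are hypothetical names, and the subagents here run one after another in the same process rather than in isolation.

```python
from typing import Callable


def orchestrate(
    find_gaps: Callable[[], list[str]],
    subagent: Callable[[str], None],
    max_rounds: int = 3,
) -> tuple[bool, int]:
    """Dispatch one subagent per open gap until none remain or rounds run out.

    Returns (all_checks_pass, rounds_of_work_used).
    """
    for round_number in range(1, max_rounds + 1):
        gaps = find_gaps()
        if not gaps:
            return True, round_number - 1
        for gap in gaps:
            subagent(gap)
    # Budget spent: do one final check instead of assuming success.
    return not find_gaps(), max_rounds


# Hypothetical codebase state: passes of work still needed per gap.
work_left = {"missing tests": 1, "stale docs": 2}


def find_open_gaps() -> list[str]:
    return sorted(name for name, left in work_left.items() if left > 0)


def fix_one_pass(gap: str) -> None:
    work_left[gap] -= 1


outcome = orchestrate(find_open_gaps, fix_one_pass, max_rounds=3)
```

The round cap plays the role of the strict stopping criterion: if the gaps never close, the loop ends with a failed verdict instead of spending more API calls.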
3. Moving From Prompting to Goal Design in Autonomous Agents — elvis
- Why read: Why building long-running coding agents requires moving from chat-based prompting to upfront goal design.
- Summary: Developers are learning to steer agents using strict goals and artifacts instead of turn-by-turn chat. You specify the end state, metrics, and constraints before the agent starts. This goal is a contract. Weak goals let the model take shortcuts; strong goals force it to check its work against real-world rules. Good goal design includes benchmark scores, hold-out sets, or exact layout constraints. The model does the work, but the human decides exactly what "done" means.
- Read more
4. The Evaluator as a First-Class Component in Coding Agents — elvis
- Why read: Why automated evaluators are necessary for agents to self-correct during long tasks.
- Summary: Independent agents need reliable evaluation mechanisms, whether that is an LLM acting as a judge, a test suite, or a script. Use deterministic checks like unit tests and lint rules for binary success criteria. Use language-model evaluators for subjective outcomes like report coherence or design intent. Combining execution loops with strict evaluators lets the system plan, fix mistakes, and keep working without human intervention. The evaluator is now as important as the model it judges.
- Read more
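One way to picture the split between deterministic checks and a model-based judge is sketched below. Everything named here is an assumption for illustration: `evaluate`, `Verdict`, the two checks, and `stub_judge`, which stands in for a real language-model call that this sketch does not make.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Verdict:
    passed: bool
    failures: list[str] = field(default_factory=list)


def evaluate(
    output: str,
    checks: dict[str, Callable[[str], bool]],
    judge: Callable[[str], float],
    judge_threshold: float = 0.7,
) -> Verdict:
    """Run the binary checks first; consult the judge only if they all pass."""
    failures = [name for name, check in checks.items() if not check(output)]
    if failures:
        return Verdict(False, failures)
    if judge(output) < judge_threshold:
        return Verdict(False, ["judge_score_below_threshold"])
    return Verdict(True)


# Hypothetical deterministic checks on a generated report.
checks = {
    "non_empty": lambda text: bool(text.strip()),
    "has_summary": lambda text: "Summary:" in text,
}


# Hypothetical judge: a stub in place of a language-model scoring call.
def stub_judge(text: str) -> float:
    return 0.9 if len(text.split()) >= 5 else 0.2


good = evaluate("Summary: all tests pass on main", checks, stub_judge)
bad = evaluate("all tests pass", checks, stub_judge)
```

Running the cheap binary checks first means the subjective judge is only consulted for output that already clears the hard requirements.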
5. Agentic RL: Multi-turn Trajectories and Scalable Rollout Infrastructure — Cameron R. Wolfe, Ph.D.
- Why read: The technical requirements for training agentic models using reinforcement learning.
- Summary: Training models for multi-turn agent loops requires adapting reinforcement learning to handle complex environment interactions. An agent uses a model, tools, instructions, and a sandbox to reason and act repeatedly. This setup needs modular environments and stable learning methods to process continuous feedback. You also need infrastructure that scales rollouts and assigns rewards accurately over long time horizons. Getting these pieces right is necessary to build agents that solve open-ended problems.
- Read more
6. Redesigning Operations for AI Agents at Enterprise Scale — Will Grannis
- Why read: Google Cloud's perspective on reshaping enterprise operations to work with AI agents.
- Summary: Getting the most out of AI agents means redesigning business operations to suit them. This involves simplifying internal tooling to build agents safely, and restructuring external products so third-party agents can use them. Doing this lets companies process high volumes of inbound demand, cutting transaction times from weeks to minutes. It also requires working with regulators to build compliance rules for autonomous transactions. Adapting to this shift lowers customer acquisition costs and opens new revenue streams.
- Read more
7. The Agent Gym: Pairing AI Agents with Human Subject Matter Experts — Will Grannis
- Why read: How to turn back-office operations into continuous learning systems by pairing agents with human experts.
- Summary: Automating back-office processes like invoice-to-pay usually fails because they rely on undocumented human knowledge. The "Agent Gym" method fixes this by having agents do the well-documented work and pause to ask a human expert when they hit a gray area. The expert documents their reasoning, which feeds back into the system to improve the agent's future decisions. This creates a learning loop that scales the organization's capacity. It shows that the best AI deployments support human judgment instead of trying to replace it.
- Read more
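The pause-and-ask loop described in this summary might look roughly like the sketch below. It is an assumed illustration, not the author's implementation: `handle_case`, the `Playbook` structure, the case names, and `fake_expert` are all hypothetical.

```python
from typing import Callable

# case type -> (decision, documented reasoning)
Playbook = dict[str, tuple[str, str]]


def handle_case(
    case_type: str,
    playbook: Playbook,
    ask_expert: Callable[[str], tuple[str, str]],
) -> tuple[str, str]:
    """Return (decision, source), where source is "playbook" or "expert"."""
    if case_type in playbook:
        decision, _reasoning = playbook[case_type]
        return decision, "playbook"
    # Gray area: pause, ask the human, and record their reasoning for next time.
    decision, reasoning = ask_expert(case_type)
    playbook[case_type] = (decision, reasoning)
    return decision, "expert"


# Hypothetical starting knowledge and a stand-in for the human expert.
playbook: Playbook = {"standard_po": ("approve", "matches purchase order")}
expert_calls: list[str] = []


def fake_expert(case_type: str) -> tuple[str, str]:
    expert_calls.append(case_type)
    return "hold", "goods receipt is missing"


first = handle_case("invoice_without_po", playbook, fake_expert)
second = handle_case("invoice_without_po", playbook, fake_expert)
```

The expert is consulted only the first time the gray-area case appears; the second call is answered from the recorded decision.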
8. AI Security: Red-Teaming and Indirect Prompt Injections — Latent.Space
- Why read: How the cybersecurity community is testing advanced models for vulnerabilities like indirect prompt injections.
- Summary: Driven by new export controls, security teams are looking closely at prompt injections and jailbreaks. AI security differs from traditional cybersecurity and requires new tools. Companies now use automated red-teaming software to test their models in live coding environments. These tests map out complex vulnerability vectors to build modern AI safety toolkits. Securing the agent layer is now a hard requirement before launching autonomous systems.
- Read more
9. Cost-Plus vs Value-Based Pricing in the AI Inference Market — Tomasz Tunguz
- Why read: Why reselling raw AI inference yields zero margins, and how to capture value with alternative pricing.
- Summary: Fast-growing AI companies are abandoning cost-plus pricing, which caps margins based on raw compute costs. Instead, they charge per resolved ticket or completed task. This decouples revenue from inference costs and protects margins. When customers pay for completed work rather than tokens, they are less likely to bypass your platform to save money on API calls. Selling units of work instead of raw compute is how you build a defensible AI business.
- Read more
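A small worked example makes the margin argument concrete. All figures are invented for illustration (a 25% markup, a $2.00 price per resolved ticket, and inference cost per ticket falling from $0.40 to $0.20); none come from the article.

```python
def cost_plus_price(cost: float, markup: float) -> float:
    """Price set as cost plus a fixed percentage markup."""
    return cost * (1 + markup)


def gross_margin(price: float, cost: float) -> float:
    """Fraction of the price left after paying for inference."""
    return (price - cost) / price


MARKUP = 0.25                     # assumed cost-plus markup
PRICE_PER_RESOLVED_TICKET = 2.00  # assumed outcome-based price

margins: dict[str, dict[str, float]] = {}
for label, cost in [("before", 0.40), ("after", 0.20)]:
    margins[label] = {
        "cost_plus": round(gross_margin(cost_plus_price(cost, MARKUP), cost), 4),
        "per_outcome": round(gross_margin(PRICE_PER_RESOLVED_TICKET, cost), 4),
    }
```

Under these assumed numbers the cost-plus margin stays at 20% no matter how cheap inference gets, while the per-outcome margin rises from 80% to 90% as cost halves. That is the decoupling of revenue from inference cost the summary describes.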
10. The Agent Harness: Why the Model is Only 10% of the Equation — Addy Osmani
- Why read: Why the tools and logic surrounding an LLM matter more than the model itself.
- Summary: An agent requires a model and a harness—the tools, sandboxes, routing, and rules around it. The model accounts for roughly 10% of the system's effectiveness; the harness handles the rest. Most agent failures stem from missing tools, bad rules, or noisy context, not model limits. Successful teams build a solid harness and reuse it across projects. Improving the harness boosts performance immediately, without waiting for better models.
- Read more
11. Context Engineering: The Highest-Leverage Knob in Agent Design — Addy Osmani
- Why read: How controlling the information an agent sees improves reasoning and cuts API costs.
- Summary: Context engineering is the practice of structuring the instructions and data an agent sees during its loop. Because agents run continuously, bloated context windows degrade performance and drive up API bills. Separating static rules from dynamic, turn-by-turn state keeps the agent focused. Good context management maintains necessary guardrails without overwhelming the model with past data. It is the most direct way to manage both token costs and output quality.
- Read more
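Separating static rules from turn-by-turn state can be sketched as a context builder that always keeps the rules and trims old history. This is an assumed illustration: `build_context` and `budget_words` are hypothetical names, and whitespace-separated word counts are a crude stand-in for real token counts.

```python
def build_context(static_rules: str, turns: list[str], budget_words: int) -> str:
    """Keep the static rules in full, then add the newest turns that fit."""
    remaining = budget_words - len(static_rules.split())
    kept: list[str] = []
    for turn in reversed(turns):  # walk from newest to oldest
        cost = len(turn.split())
        if cost > remaining:
            break  # older turns are dropped once the budget is spent
        kept.append(turn)
        remaining -= cost
    # Restore chronological order before joining.
    return "\n".join([static_rules, *reversed(kept)])


rules = "Never push to main."
history = [
    "read failing test output",
    "patched parser module",
    "tests now pass",
]

context = build_context(rules, history, budget_words=10)
```

With a 10-word budget the guardrail line survives intact and only the two most recent turns are kept, so the oldest turn no longer costs tokens on every loop iteration.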
12. Automating GTM: Using Agents to Discover Hidden Customer Segments — Cannonball GTM
- Why read: A case study on building an agent to automate go-to-market research.
- Summary: Custom agents let growth teams automate heavy workflows, like finding and scoring customer segments using public data. In this case, the agent maps regulations, flags key data points, and ranks segments by urgency and fit. The process works like a collaborative session: the agent pauses at key steps so a human can adjust parameters or fix bad assumptions. Keeping a human in the loop ensures the resulting segments are useful for sales teams. Building these specialized agents will become standard practice for go-to-market planning.
- Read more
13. The Daytona Playbook: Turning Community Events into an AI GTM Engine — wearedevelopers.com
- Why read: How to use in-person events to build a developer ecosystem and drive go-to-market efforts.
- Summary: When Daytona shifted to AI infrastructure, they needed active users, not passive observers. They ran different types of events: intense hackathons, casual developer gatherings, and academic meetups. This mix connected builders and researchers, raising the company's profile in the Bay Area. After running 150 global events, community engagement became a core piece of their sales strategy. In-person connections are still an effective way to drive developer adoption.
- Read more
14. Brand Clarity: Understanding What You Are Really Selling — Shreyas Doshi
- Why read: A reminder that companies sell emotional or functional outcomes, not literal products.
- Summary: Good strategy requires knowing what value you actually provide. Apple sells taste, Amazon sells convenience, Disney sells nostalgia, and Stripe sells care. Understanding this core offering matters more than listing features. When leaders identify their true value proposition, it aligns the company's messaging and product development. This clarity helps founders and executives navigate competitive markets.
- Read more
15. Post-Training and Harness Optimization for Frontier Legal Agents — Harvey
- Why read: The full-stack process for training a legal AI model to handle long-horizon reasoning.
- Summary: Hitting top scores on the Legal Agent Benchmark meant post-training a frontier model with heavily optimized evaluators. Post-training, harness design, and grading accuracy have to be solved together. By grading in batches with smaller, aligned evaluator models, the team cut compute costs while keeping a reliable reward signal. They trained the model using full-parameter reinforcement learning in a sandbox, teaching it to use tools like grep on complex legal files. This integrated approach is required to train models for complex, multi-step workflows.
- Read more
Themes from yesterday
- The Shift to Agentic Workflows: The standard for autonomous systems is moving from chat prompts to goal-oriented, long-running loops.
- Harness Over Model: An agent's success relies more on its tools, evaluators, and context management than on the base model.
- Value-Based Business Models: AI companies are moving from token-based pricing to outcome-based pricing, charging for completed work instead of compute.
- Human-in-the-Loop Orchestration: The best enterprise deployments pair agents with human experts to create continuous learning loops.