1. Model strategy for @harvey: — Gabe Pereyra
- Why read: How Harvey builds specialized legal foundation models for law firms.
- Summary: Harvey is training legal models to manage multi-month cases. They evaluate open-weight models using synthetic and human data pipelines and open-source the benchmarks. Partnering with Baseten, FireworksAI, and NVIDIA, they show that open models, once post-trained, can rival top-tier alternatives. This gives law firms cheaper, secure ways to process large data rooms while owning the underlying models.
- Read more
2. The Agent Loop Architecture — Dan Farrelly | Inngest.com
- Why read: The infrastructure required to keep AI agent loops running without breaking.
- Summary: Basic agent loops fail during long tasks if a process restarts or crashes. The fix is durable execution: checkpointing each step and saving decisions. A functional loop operates as a scheduler paired with a decision-maker. It checks the current state and decides the next move without redoing finished work. This turns skills into composable workflows, stopping errors like duplicate messages and making the system reliable.
- Read more
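The scheduler-plus-decision-maker loop described in item 2 can be sketched in a few lines of Python. This is a minimal illustration, not Inngest's API: `DurableRun`, `decide`, `ACTIONS`, and `send_message` are hypothetical names, the checkpoint store is an in-memory dict, and the decision-maker is a stub standing in for a model call.

```python
from typing import Any, Callable, Dict, List, Optional


class DurableRun:
    """Hypothetical checkpoint store: maps step name -> saved result."""

    def __init__(self) -> None:
        self.checkpoints: Dict[str, Any] = {}

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        # A finished step returns its saved result instead of running again.
        if name in self.checkpoints:
            return self.checkpoints[name]
        result = fn()
        self.checkpoints[name] = result
        return result


sent: List[str] = []  # stands in for an external side effect (e.g. email)


def send_message(text: str) -> str:
    sent.append(text)
    return "sent"


ACTIONS: Dict[str, Callable[[], Any]] = {
    "research": lambda: "notes",
    "draft": lambda: "draft v1",
    "notify": lambda: send_message("draft ready"),
}


def decide(state: Dict[str, Any]) -> Optional[str]:
    """Decision-maker stub: pick the first step with no checkpoint yet."""
    for name in ACTIONS:
        if name not in state:
            return name
    return None


def agent_loop(run: DurableRun, crash_after: Optional[str] = None) -> None:
    """Scheduler: ask for the next move, run it as a checkpointed step."""
    while True:
        next_step = decide(run.checkpoints)
        if next_step is None:
            return
        run.step(next_step, ACTIONS[next_step])
        if next_step == crash_after:
            raise RuntimeError("simulated crash")


run = DurableRun()
try:
    agent_loop(run, crash_after="notify")  # crashes right after sending
except RuntimeError:
    pass
agent_loop(run)  # resumes: every step is checkpointed, nothing re-runs
print(sent)  # ['draft ready']
```

Because the loop consults saved state before acting, the resumed run does not send the message a second time.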
3. You don't need ten agents. You need two tracks. — Hugo Baraúna
- Why read: A framework for working with AI agents without drowning in parallel tasks.
- Summary: Running a dozen agents at once creates chaos. Instead, split the workflow into two tracks: specification and implementation. First, work with one agent to brainstorm requirements and draft a technical design. Then, hand the spec to the implementation track, where another agent writes the code autonomously. You review the code asynchronously while drafting the next spec, balancing your attention.
- Read more
4. How to answer "How are you different from Claude?" without sounding defensive — Arnie Gullov-Singh
- Why read: How to position your AI product against general-purpose models in sales conversations.
- Summary: Competing with Claude on features is a losing strategy. Shift the focus to end-to-end workflows. Claude handles one step in a manual process; your product manages the pipeline from data ingestion to delivery. Map the buyer's day to expose the manual work required before and after prompting. When you show how your product integrates with their CRM and existing processes, the Claude comparison stops mattering.
- Read more
5. Thinking Traces are Better for Understanding — Jeffrey Emanuel
- Why read: Why reviewing an AI's internal reasoning is more informative than its final output.
- Summary: Polished answers hide the work. Reading an LLM's "thinking trace" shows how it attacked the problem, which threads it pulled, and where it jumped. For complex topics like ML theory, watching the model weigh technical tradeoffs clarifies the underlying concepts. Seeing the model assemble the pieces helps humans update their own mental models, matching how we naturally learn.
- Read more
6. The Mom-and-Pop SaaS era has arrived — Lenny's Newsletter
- Why read: How cheaper software development allows niche experts to build their own SaaS products.
- Summary: AI's biggest economic impact is opening software creation to non-technical domain experts. As development costs drop, hyper-specific software businesses become viable. Teachers, accountants, and consultants will build tools for their own niches, creating "Mom-and-Pop SaaS" companies. Future software success will rely on domain expertise over coding ability.
- Read more
7. The Making of the [un]CFO: Building a Company in the Age of AI — Ali Esfahani
- Why read: The evolving role of finance leaders in early-stage AI companies.
- Summary: AI startup CFOs do more than accounting. At Unconventional AI, the finance lead manages cap tables while helping with recruiting, legal, and technical research. For fundraising, the strategy is to educate investors early, share a roadmap, and show execution before asking for money. This hybrid role allows finance leaders to build the company infrastructure while shaping the pitch for investors.
- Read more
8. Monitoring AI and Human Agents in 2026 — SaaStr
- Why read: Why monitoring automated and human agents is necessary to protect your brand.
- Summary: Deploying AI agents alongside humans increases the risk of brand damage. One bad email from an agent or PR firm can ruin a client relationship. Companies need auditing systems for anything speaking on their behalf. Automation doesn't replace the need for oversight; founders must review agent outputs to catch errors and keep messaging aligned.
- Read more
9. Today, agents execute isolated tasks — Tony Chen
- Why read: How new benchmarks evaluate AI agents on long-horizon business management.
- Summary: Current AI handles short tasks well. The next test is navigating complex systems over time. CEO-Bench evaluates this by making agents run a simulated startup with $1M over 500 days. The agents face a noisy market with delayed feedback and must write scripts and make business decisions. So far, only models like Claude Fable 5 and GPT-5.5 turn consistent profits, shifting the focus from basic tool use to strategic planning.
- Read more
10. 1/ We fine-tune a lot of customer models, so we... — Charlie O'Neill
- Why read: Data-backed best practices for supervised fine-tuning of dense and MoE models.
- Summary: Experiments on models from 0.6B to 235B parameters show that the optimal LoRA learning rate stays constant regardless of model size. This contradicts the assumption that learning rates should scale inversely with model width. The rate is flat across Qwen and Llama architectures. Practitioners can pick one learning rate and skip expensive hyperparameter sweeps, saving compute during fine-tuning.
- Read more
11. Why AI design looks so generic — Jordan Crawford
- Why read: A technique to stop AI from generating generic UI designs.
- Summary: If you ask an AI to copy a web page, it usually only grabs the text. Missing the CSS, it guesses the layout based on average training data, producing bland results. To fix this, use a headless browser to capture a screenshot and the computed CSS. Feeding the AI exact colors, typography, and layout data lets it reproduce the actual design rather than a generic template.
- Read more
12. The Professor of Outputmaxxing — Anjney Midha, AMP — Latent.Space
- Why read: Why hardware utilization matters more than raw GPU count for scaling AI.
- Summary: The focus on buying GPUs hides a bigger engineering problem: maximizing Model FLOPs Utilization (MFU). Some labs get under 10% MFU, while top teams hit 60-70%. Scaling requires optimizing scheduling, networking, data pipelines, and parallelism. Fixing this infrastructure ensures hardware spending actually accelerates training. Companies solving these systems problems will beat those just buying more chips.
- Read more
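For readers who want to see what an MFU figure like those in item 12 measures, here is a rough sketch. It assumes the common approximation that training a dense transformer costs about 6 FLOPs per parameter per token (attention FLOPs ignored); the function name and all example numbers, including the per-GPU peak, are hypothetical and not taken from the episode.

```python
def model_flops_utilization(
    n_params: float,
    tokens_per_second: float,
    n_gpus: int,
    peak_flops_per_gpu: float,
) -> float:
    """Estimate MFU as achieved model FLOP/s divided by hardware peak FLOP/s.

    Assumes ~6 FLOPs per parameter per token for forward plus backward pass.
    """
    achieved_flops_per_second = 6.0 * n_params * tokens_per_second
    peak_flops_per_second = n_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical run: 7B-parameter model, 64 GPUs, 1e15 peak FLOP/s per GPU,
# 600k training tokens processed per second across the cluster.
mfu = model_flops_utilization(
    n_params=7e9,
    tokens_per_second=6e5,
    n_gpus=64,
    peak_flops_per_gpu=1e15,
)
print(f"MFU: {mfu:.1%}")  # MFU: 39.4%
```

At a fixed GPU count, raising tokens per second through better scheduling, networking, or data loading is the only way to move this number, which is why the summary frames it as a systems problem.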
13. encoder free models and the bitter lesson — rumik
- Why read: How the shift toward simpler AI inputs illustrates Richard Sutton's "Bitter Lesson."
- Summary: Developers keep building complex encoders to process data for models, only to drop them as compute increases. Models like Gemma 4 12B show that scaled-up systems perform better on raw signals than processed inputs. This shifts the processing work from hand-crafted architectures to the model's learned weights. It supports the Bitter Lesson's claim that raw computation beats human-engineered heuristics, and future models will likely use simpler inputs.
- Read more
14. The Loop Layer: Checkpointing Agent State — Dan Farrelly | Inngest.com
- Why read: Why checkpointing AI agent state is necessary to prevent execution failures.
- Summary: Running agents on servers requires more than terminal scripts. Restarts or memory issues can wipe an agent's state, leading to duplicate work and lost data. The fix is checkpointing: saving every decision and completed step. If the system crashes, it resumes from the last checkpoint instead of starting over. This allows agents to handle long tasks reliably.
- Read more
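The resume-from-last-checkpoint behavior in item 14 can be illustrated with a small file-backed sketch. It assumes a JSON file as the checkpoint store and a fixed list of step names; `load_checkpoint`, `save_checkpoint`, `run_task`, and the `fail_at` parameter are hypothetical helpers for the demonstration, not part of any library.

```python
import json
import os
import tempfile
from typing import Dict, List, Optional


def load_checkpoint(path: str) -> Dict[str, List[str]]:
    """Read saved state, or start fresh if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"done": []}


def save_checkpoint(path: str, state: Dict[str, List[str]]) -> None:
    """Write to a temp file, then rename, so a crash never leaves a half-written file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def run_task(path: str, steps: List[str], fail_at: Optional[str] = None) -> List[str]:
    """Run each unfinished step once; return the steps executed in this call."""
    state = load_checkpoint(path)
    executed: List[str] = []
    for step in steps:
        if step in state["done"]:
            continue  # finished before the restart, skip it
        if step == fail_at:
            raise RuntimeError(f"simulated crash during {step!r}")
        executed.append(step)  # the real work for this step would happen here
        state["done"].append(step)
        save_checkpoint(path, state)
    return executed


steps = ["fetch", "parse", "summarize", "email"]
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "agent_state.json")
    try:
        run_task(path, steps, fail_at="summarize")  # first process dies mid-task
    except RuntimeError:
        pass
    resumed = run_task(path, steps)  # a new process picks up where it stopped
print(resumed)  # ['summarize', 'email']
```

Saving after every step trades a little I/O for the guarantee that a restart repeats at most the one step that was in flight.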
15. Harvey Labs and the Future of Legal AI — Gabe Pereyra
- Why read: How Harvey Labs is building autonomous AI for the legal industry.
- Summary: Harvey Labs formed a research group to build multi-agent systems for legal work. They combine reinforcement learning with legal expertise to train specialized models for law firms. Their goal is an ecosystem where enterprises and governments control their own fine-tuned models, guaranteeing security. These efforts aim to produce AI systems capable of acting like senior associates.
- Read more
Themes from yesterday
- Durable infrastructure: Agent loops are moving past basic prompts toward checkpointed systems that survive crashes.
- Simpler architectures: Complex encoders and hyperparameter sweeps are losing ground to raw inputs and stable learning rates as models scale.
- Niche software: Falling development costs are allowing non-technical domain experts to build their own SaaS products and custom models.