1. Three weeks ago there were rumors... Mythos — Andrew Curran

  • Why read: Get ahead of the rumors regarding Anthropic's next "step-change" model and its implications for the scaling law debate.
  • Summary: Rumors suggest Anthropic has completed a training run for a model (codenamed Mythos or Capybara) that reportedly performs twice as well as expected, potentially defying standard scaling laws. If accurate, this "step change" would suggest that architectural breakthroughs at scale are opening a large performance gap between frontier labs and the rest of the field. The thread reads this as the reason for OpenAI's pivot away from secondary projects like Sora to focus entirely on massive compute runs. For operators, it points to a future where frontier intelligence becomes significantly more expensive, with compute and energy as the ultimate bottlenecks.
  • Link: https://twitter.com/AndrewCurran_/status/2037967531630367218/?rw_tt_thread=True

2. How Kimi, Cursor, and Chroma Train Agentic Models with RL — Philipp Schmid

  • Why read: A technical deep dive into how leading AI companies are using Reinforcement Learning to move beyond simple chatbots into autonomous agents.
  • Summary: This report analyzes technical breakthroughs from Moonshot AI (Kimi), Cursor, and Chroma, highlighting a shift toward "Agent Swarms" and self-editing context. Key innovations include training models to decompose tasks into parallel sub-agents and using RL to prune retrieved documents to save context space. Unlike standard LLMs, these models are trained inside production harnesses with outcome-based rewards to ensure reliability in real-world environments. For developers, the takeaway is that the future of agency lies not just in better prompts but in fine-tuning models to orchestrate their own tools and parallelize workflows.
  • Link: https://twitter.com/_philschmid/status/2037924497563505058/?rw_tt_thread=True
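
The two ideas in this summary, parallel sub-agents and outcome-based rewards, can be illustrated with a toy sketch. Everything here is hypothetical: `run_sub_agent` stands in for a model call, the `;`-based task split is a placeholder for learned decomposition, and none of it reflects how Kimi, Cursor, or Chroma actually implement training.

```python
import asyncio


async def run_sub_agent(subtask: str) -> str:
    """Hypothetical sub-agent: stands in for a model call on one subtask."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return subtask.upper()


async def orchestrate(task: str) -> str:
    """Split a task on ';', run one sub-agent per piece concurrently, merge results."""
    subtasks = [part.strip() for part in task.split(";") if part.strip()]
    # gather() runs the coroutines concurrently and returns results in input order
    results = await asyncio.gather(*(run_sub_agent(s) for s in subtasks))
    return " | ".join(results)


def outcome_reward(final_answer: str, expected: str) -> float:
    """Outcome-based reward: score only the end result, not intermediate steps."""
    return 1.0 if final_answer == expected else 0.0


answer = asyncio.run(orchestrate("find docs; summarize; cite"))
reward = outcome_reward(answer, "FIND DOCS | SUMMARIZE | CITE")
```

In an actual RL setup the reward would feed a policy update; here it is only computed, to show that the signal depends on the final outcome alone.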

3. Lessons from six months in agentic payments — Kahlil Lalji

  • Why read: Essential perspective for builders looking to solve the "last mile" of AI autonomy: the ability for agents to actually spend money.
  • Summary: After raising $10M to build Natural, Kahlil argues that the infrastructure for agentic payments is being built quietly beneath a surface of "blank stares" from the market. He challenges the "stablecoin-only" narrative, noting that fiat rails are often superior for domestic use, and warns against building thin "point solutions" like simple wallets. The winning strategy is a full-stack approach that combines identity, authorization, and multi-bank settlement. Operators should prepare for a world where payments are routed through complex orchestration schemes rather than simple manual card entries.
  • Link: https://twitter.com/bykahlil/status/2037603888543723892/?rw_tt_thread=True

4. 4 automations every GTM engineer should build in their first 90 days — 🏍benyamin

  • Why read: Highly tactical advice for Go-To-Market operators on how to use AI to eliminate grunt work and drive revenue.
  • Summary: The author recommends focusing on automations that are "closest to revenue and farthest from a human's best use of time." Recommended plays include tracking champion job changes to trigger Slack alerts, automating executive LinkedIn "air cover" on active deals, and using LLMs to classify email bounces for better list hygiene. Finally, the author suggests using Claude Code to orchestrate entire campaigns, from pulling lists to validating emails. The goal for any GTM engineer is to find low-judgment, high-volume tasks and make them disappear so the team can focus on human-to-human relationships.
  • Link: https://twitter.com/BenyaminHolley/status/2037975410030391322/?rw_tt_thread=True
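
As a rough illustration of the bounce-classification play, the sketch below routes bounced addresses to a list-hygiene action. The keyword rule is a stand-in for the LLM call the author describes, and the labels (`hard`, `soft`), action names, and sample messages are all assumptions made for the example.

```python
from typing import Callable


def keyword_classifier(bounce_text: str) -> str:
    """Stand-in for an LLM call: label a bounce message as hard, soft, or unknown."""
    text = bounce_text.lower()
    if "does not exist" in text or "unknown user" in text:
        return "hard"
    if "mailbox full" in text or "try again later" in text:
        return "soft"
    return "unknown"


def clean_list(
    bounces: dict[str, str],
    classify: Callable[[str], str] = keyword_classifier,
) -> dict[str, list[str]]:
    """Map each bounced address to an action: remove, retry, or manual review."""
    actions: dict[str, list[str]] = {"remove": [], "retry": [], "review": []}
    route = {"hard": "remove", "soft": "retry"}
    for email, message in bounces.items():
        label = classify(message)
        # anything the classifier cannot place goes to a human
        actions[route.get(label, "review")].append(email)
    return actions


sample = {
    "a@example.com": "550 5.1.1 User does not exist",
    "b@example.com": "452 Mailbox full, try again later",
    "c@example.com": "Message delayed",
}
result = clean_list(sample)
```

Because `classify` is a parameter, the keyword rule can be swapped for a model-backed function without changing the routing logic.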

5. We Used Autoresearch on Our AI Skill, It Taught Us to Write Better Tests — Lotte

  • Why read: Learn how to apply Karpathy's "Autoresearch" loop to optimize prompt engineering and instruction sets at machine speed.
  • Summary: By using a minimal Python script to automate the "experiment-evaluate-optimize" loop, the Langfuse team improved their AI skill score from 0.35 to 0.82 overnight. The process involves an outer "optimizer" agent that generates hypotheses for better instructions and an inner loop that tests those instructions against static code checks. Moving from "vibe-based" prompting to automated, test-driven instruction optimization makes complex agent behaviors easier to maintain, because every change is measured against the same checks. It forces developers to define "correctness" through rigorous test repositories rather than manual spot-checking.
  • Link: https://twitter.com/lotte_verheyden/status/2037665098983190904/?rw_tt_thread=True
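
The outer/inner loop described above can be sketched in a few lines. This is not the Langfuse script: `run_agent` and `propose` are hypothetical, deterministic stand-ins for the agent under test and the LLM optimizer, and the two static checks are invented for illustration.

```python
from typing import Callable

Check = Callable[[str], bool]


def run_agent(instructions: str) -> str:
    """Hypothetical agent: emits code whose shape depends on its instructions."""
    code = "def add(a, b):\n    return a + b\n"
    if "type hints" in instructions:
        code = "def add(a: int, b: int) -> int:\n    return a + b\n"
    if "test" in instructions:
        code += "\ndef test_add():\n    assert add(1, 2) == 3\n"
    return code


# Inner-loop static checks on the generated code (assumed for this example)
CHECKS: list[Check] = [
    lambda code: "->" in code,         # has a return annotation
    lambda code: "def test_" in code,  # ships a test
]


def score(instructions: str, checks: list[Check]) -> float:
    """Inner loop: run the agent, return the fraction of checks that pass."""
    code = run_agent(instructions)
    return sum(1 for check in checks if check(code)) / len(checks)


def propose(current: str) -> list[str]:
    """Hypothetical optimizer: a real one would ask an LLM for hypotheses."""
    additions = ["Use type hints.", "Include a test."]
    return [f"{current} {a}" for a in additions if a not in current]


def optimize(seed: str, checks: list[Check], rounds: int = 3) -> tuple[str, float]:
    """Outer loop: keep a candidate only if it strictly improves the score."""
    best, best_score = seed, score(seed, checks)
    for _ in range(rounds):
        for candidate in propose(best):
            candidate_score = score(candidate, checks)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
    return best, best_score


best, best_score = optimize("Write an add function.", CHECKS)
```

The loop is only as good as its checks: with two string-matching checks it optimizes for exactly those two properties, which is the point the article makes about needing better tests.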

6. Your Agent Doesn’t Need More Context. It Needs a Model of You. — 11AM w/ Seed Club

  • Why read: Understand the architectural shift from "Memory as Storage" (RAG) to "Memory as Reasoning" (User Modeling).
  • Summary: Traditional AI memory relies on stuffing markdown files into a context window, leading to "context rot" and the "needle in a haystack" problem. Plastic Labs argues for a "continual learning system" where the agent synthesizes evidence to build a reasoning-based model of the user. This approach ensures that as user preferences shift over months, the agent's understanding evolves rather than staying stuck in static retrieval. For product builders, the message is that owning the "user state" is the primary moat; relying on third-party memory APIs creates a risk of being subsumed by incumbents.
  • Link: https://twitter.com/11AMdotclub/status/2037937883785617485/?rw_tt_thread=True
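
To make the storage-versus-modeling contrast concrete, here is a minimal sketch in which stored observations are synthesized into a current preference, with older evidence counting for less. The recency weighting is a simple heuristic swapped in for illustration; it is not Plastic Labs' reasoning-based method, and the `Observation` type and `half_life_days` parameter are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Observation:
    day: int     # day the evidence was recorded
    topic: str   # e.g. "tone"
    value: str   # e.g. "formal"


def infer_preferences(
    observations: list[Observation],
    today: int,
    half_life_days: float = 30.0,
) -> dict[str, str]:
    """Pick, per topic, the value with the most recency-weighted evidence."""
    weights: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for obs in observations:
        age = today - obs.day
        # evidence loses half its weight every half_life_days
        weights[obs.topic][obs.value] += 0.5 ** (age / half_life_days)
    return {topic: max(values, key=values.get) for topic, values in weights.items()}


history = [
    Observation(day=0, topic="tone", value="formal"),
    Observation(day=10, topic="tone", value="formal"),
    Observation(day=170, topic="tone", value="casual"),
]
current = infer_preferences(history, today=180)
```

A plain retrieval count over `history` would favor "formal" two to one; weighting by recency lets the single recent observation win, which is the kind of drift static retrieval misses.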

7. The non-developer's guide to Claude Cowork — Nick Spisak

  • Why read: A practical onboarding guide for maximizing the utility of Claude's new desktop "Cowork" capabilities.
  • Summary: The guide presents Claude Cowork as a shift from a chatbot to an employee that performs "real work" across email, calendars, and Slack. To avoid common pitfalls, non-technical users should prioritize three steps: importing memories from other AIs, setting explicit Global Instructions, and, most importantly, always using "Plan Mode." Plan Mode forces the AI to show its intended steps for approval before it touches files or sends messages, preventing "rogue" behavior. The guide provides the tactical scaffolding needed to turn a desktop app into a functioning second employee.
  • Link: https://twitter.com/NickSpisak_/status/2037535318614610191/?rw_tt_thread=True
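
The approve-before-acting idea behind Plan Mode is a general pattern that can be sketched generically. The code below is an illustration of that pattern only, not Claude Cowork's implementation; `run_with_plan`, `fake_execute`, and the sample plan are all hypothetical.

```python
from typing import Callable


def run_with_plan(
    steps: list[str],
    approve: Callable[[list[str]], bool],
    execute: Callable[[str], str],
) -> list[str]:
    """Show the full plan first; act only if the reviewer approves it."""
    if not approve(steps):
        return []  # nothing is executed on rejection
    return [execute(step) for step in steps]


log: list[str] = []


def fake_execute(step: str) -> str:
    """Stand-in for a side effect such as editing a file or sending a message."""
    log.append(step)
    return f"done: {step}"


plan = ["draft reply", "send email"]
rejected = run_with_plan(plan, approve=lambda s: False, execute=fake_execute)
approved = run_with_plan(plan, approve=lambda s: True, execute=fake_execute)
```

The key property is that `execute` is never called unless the whole plan has been approved, so a rejected plan has no side effects.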

8. Tech Has Lost Touch — Cam Fink

  • Why read: A sobering look at the growing populist backlash against AI and the political risks facing the industry.
  • Summary: Beyond the technical threats of AI lies an existential social one: middle America is turning anti-tech. The perception of AI has shifted from a tool for human productivity to a "technocratic behemoth" associated with elitism and looming unemployment. This resentment is becoming personal, fueled by AI-driven "dead internet" sycophancy and fear of a permanent underclass. For the tech community, the warning is clear: policy and politics will soon follow this populist sentiment, making anti-AI platforms an easy win for future elections.
  • Link: https://twitter.com/seekingtau/status/2037978230573932952/?rw_tt_thread=True

9. The belief that EPD is collapsing into a single role... — Kevin Yien

  • Why read: Reframes the "AI is killing roles" debate by focusing on the expansion of professional boundaries.
  • Summary: The idea that Product, Engineering, and Design (EPD) are collapsing into a single "Builder" role is a misunderstanding of traditional boundaries. Instead of roles disappearing, they are simultaneously expanding and deepening; designers are becoming more technical, while PMs are doing more growth and marketing. The "Builder" isn't a new bucket, but an acknowledgment that the old silos were always slightly artificial. Career success in the next decade will belong to those who embrace this fluidity rather than looking for a neat new title to hide behind.
  • Link: https://twitter.com/kevinyien/status/2037942789632106889/?rw_tt_thread=True

10. People Are Making $100,000 With Just Claude and a Laptop — NeilXbt

  • Why read: Proof-of-concept for the "Solo AI Operator" economy and the blueprint for building niche service businesses.
  • Summary: The author argues that AI has crossed the threshold where a single person can deliver professional-grade agency work that previously required a team. Successful solo operators are following a specific playbook: picking a narrow niche (e.g., "AI contracts for freelancers"), shipping before they are ready, and building in public. Demand from businesses needing AI implementation currently outstrips the supply of people who can actually do it, leaving a wide-open window for newcomers. This is not about building the next unicorn, but about leveraging tools like Claude to run high-margin, low-overhead service businesses.
  • Link: https://twitter.com/neil_xbt/status/2037986296493125809/?rw_tt_thread=True

Themes from yesterday

  • The Rise of the "Builder" Generalist: Traditional silos between design, product, and engineering are blurring as AI tools empower individuals to handle more of the production lifecycle, with roles expanding and overlapping rather than merging into one.
  • Agentic Infrastructure Maturity: Significant investment is shifting from LLM "chat" to the underlying plumbing of agency, including agentic payments, reasoning-based memory, and parallel RL training.
  • Automated Optimization Loops: Leading practitioners are moving away from manual prompt tweaking toward automated "Autoresearch" loops that use AI to optimize its own instructions against rigorous test suites.
  • The Populist Friction: A growing cultural and political backlash against "Anti-Human" AI and tech elitism is creating a new category of political risk for the industry.