1. Thoughts on LLMs, May 2026 — Nate Berkopec
- Why read: A reality check on current LLM capabilities that cuts through AGI hype to focus on practical workflows.
- Summary: Five months after Opus 4.5, day-to-day work with language models shows linear, not exponential, progress. Calling them artificial "intelligence" distracts from their actual utility as jagged, looped autocomplete engines. Successful implementations rely on verifiable outputs, automated tool loops, and tasks that are repetitive drudgery. Engineers should stop focusing on single-prompt chat and start building agentic loops. Treating models as workflow engines instead of synthetic colleagues produces better software.
- Read more
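To make "agentic loop" concrete, here is a minimal sketch of a propose, verify, retry cycle. It is an illustration under stated assumptions, not the author's method: `agent_loop`, `propose`, and `verify` are hypothetical names, and the model call is replaced by a stub.

```python
def agent_loop(propose, verify, max_steps=5):
    """Call propose, check the result with verify, and retry with feedback."""
    feedback = None
    for step in range(1, max_steps + 1):
        candidate = propose(feedback)
        ok, feedback = verify(candidate)
        if ok:
            return candidate, step
    return None, max_steps


# Hypothetical stand-in for a model call: it only "fixes" the code after feedback.
def propose(feedback):
    if feedback:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"


# Hypothetical stand-in for a test harness that makes the output verifiable.
def verify(source):
    scope = {}
    exec(source, scope)  # acceptable here only because the input is our own stub
    ok = scope["add"](2, 3) == 5
    return ok, None if ok else "add(2, 3) should equal 5"


result, steps = agent_loop(propose, verify)
```

The verifier, not the model, decides when the loop ends, which is the sense in which verifiability drives the workflow.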
2. The most important AI hardware idea right now is not... — dylan ツ
- Why read: Explores how separating inference phases changes cloud economics and extends the life of older hardware.
- Summary: In a training-focused environment, hardware strategy meant buying the newest GPUs and depreciating the old ones. The shift to inference changes this. Language model requests have two phases with different compute requirements. Prefill is compute-heavy and parallel. Decoding is latency-sensitive and memory-bandwidth bound. Companies now route these phases across specialized devices instead of using monolithic nodes. This disaggregation turns hardware obsolescence into a routing problem. Older GPUs can now serve as prefill engines in a redesigned serving line.
- Read more
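A rough sketch of how phase-aware routing might look, assuming a single bandwidth figure per device. The `Device` type, the device names, and the bandwidth numbers are illustrative assumptions, not vendor specs or anything taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    mem_bw_gbps: float  # illustrative figure, not a real spec


def assign_roles(pool, decode_slots):
    """Give the highest-bandwidth devices to decode; the rest serve prefill."""
    ranked = sorted(pool, key=lambda d: d.mem_bw_gbps, reverse=True)
    return {"decode": ranked[:decode_slots], "prefill": ranked[decode_slots:]}


def route(phase, roles):
    """Return a device for the given request phase."""
    devices = roles.get(phase)
    if not devices:
        raise ValueError(f"no devices available for phase {phase!r}")
    return devices[0]


pool = [Device("older-gpu", 900.0), Device("newer-gpu", 3000.0)]
roles = assign_roles(pool, decode_slots=1)
```

In this toy setup the older card is not retired; it lands in the prefill pool, which is the "obsolescence becomes routing" idea in miniature.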
3. When A.I. Comes for the Elites — Peter Turchin
- Why read: A historical perspective on how social power determines who benefits during technological shifts.
- Summary: History shows that technological displacement is dictated by the social power of the displaced. The industrial era's manual laborers suffered wage stagnation and declining life expectancy because they lacked collective bargaining power and political representation. The AI shift now targets cognitive labor, threatening a class of workers traditionally insulated from automation. The severity of this disruption depends on whether knowledge workers can exert enough social power to protect their economic interests. If they fail, they may face the structural marginalization previously seen in the industrial working class.
- Read more
4. Why Palantir Makes New Hires Read a Theater Improvisation Book — Tropical Value
- Why read: Explains why highly credentialed candidates often fail in ambiguous environments.
- Summary: Palantir asks new hires to read Keith Johnstone's "Impro" because formal education often suppresses creative responsiveness. The book contrasts "blockers," who kill ideas to protect their status, with "acceptors," who build momentum. In many tech companies, performing competence masks an inability to create. Talent often requires unlearning the status-driven behaviors ingrained by elite schooling. Leaders who recognize that status plays stem from insecurity can build teams that operate well without a script.
- Read more
5. Credential Brokering for AI Agents, Explained — Tony Dang
- Why read: A security primer on protecting automated systems from prompt injection and credential exfiltration.
- Summary: As agents gain autonomy, their non-deterministic nature exposes them to indirect prompt injections. Malicious instructions hidden in GitHub issues, tweets, or websites can trick an agent into leaking its task credentials. Credential brokering addresses this risk by letting agents use credentials for various systems without seeing or accessing the raw keys. Teams moving from chatbots to agentic workflows need to implement this pattern.
- Read more
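One way to picture the brokering pattern: the agent holds only an opaque handle, and a broker attaches the real key at call time. This is a minimal sketch under assumptions; `CredentialBroker`, the host allowlist, and `fake_transport` are hypothetical and do not describe any particular product.

```python
import secrets


class CredentialBroker:
    """Holds raw keys; agents only ever receive opaque handles."""

    def __init__(self, transport):
        self._transport = transport  # function that actually sends the request
        self._vault = {}             # handle -> (raw_key, allowed_host)

    def register(self, raw_key, allowed_host):
        handle = "cred_" + secrets.token_hex(8)
        self._vault[handle] = (raw_key, allowed_host)
        return handle

    def call(self, handle, host, path):
        raw_key, allowed_host = self._vault[handle]
        if host != allowed_host:
            # An injected instruction cannot redirect the key to another host.
            raise PermissionError(f"{host} is not allowed for this credential")
        headers = {"Authorization": f"Bearer {raw_key}"}
        return self._transport(f"https://{host}{path}", headers)


def fake_transport(url, headers):
    # Stand-in for an HTTP client; returns only what the agent may see.
    return {"status": 200, "url": url}


broker = CredentialBroker(fake_transport)
handle = broker.register("sk-example-secret", "api.example.com")
response = broker.call(handle, "api.example.com", "/v1/items")
```

Even if a prompt injection convinces the agent to print everything it knows, the most it can leak here is the handle, which is useless outside the broker.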
6. Three theses on AI value capture — Luis Garicano
- Why read: An economic argument challenging the consensus that frontier AI labs will capture the bulk of the value created by their models.
- Summary: Despite massive investment, frontier AI labs face intense competition and zero network effects. An API call creates no lock-in, and capable open-weight models cap pricing power. Therefore, most value generated by these models will flow to users, hardware suppliers, and implementation layers. The constraints on AI adoption are organizational, so real margins exist in applying models to business problems. For middle-tier companies, the best strategy is pooling resources into shared open models to keep the frontier competitive and commoditized.
- Read more
7. Deep Tech Companies Are Built Different — Leo Polovets
- Why read: How hardware-based startups break the standard software playbook regarding pivots, talent, and defensibility.
- Summary: As physical bottlenecks re-emerge in energy, defense, and manufacturing, deep tech firms operate on different foundations than SaaS. Pivots are constrained because resetting a hardware team is harder than changing code. Early architectural decisions matter heavily because mistakes cost months of physical prototyping instead of hours of debugging. The friction that makes these companies difficult to start (specialized talent, supply chains, complex engineering) forms a moat for the winners. Founders and investors must adapt to physical constraints rather than relying on the software playbook.
- Read more
8. The Pipeline Is Dead, Long Live the Agent Mesh — Sean Escriva
- Why read: How concurrent agent architectures are replacing sequential CI/CD pipelines.
- Summary: Traditional software delivery pipelines encoded organizational boundaries into infrastructure, causing wait times. With intelligent agents, the assumption that stages must be sequential breaks down. An agent mesh allows security, compliance, and implementation agents to operate concurrently against the same repository, sharing a typed data layer. A security agent can find a vulnerability and ask an implementation agent to adjust an import before a pull request opens. Moving from sequential handoffs to concurrent agent evaluations cuts the waiting queues that drive up software delivery costs.
- Read more
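As a loose illustration of concurrent evaluation against shared, typed data, the sketch below runs two stub agents at once with `asyncio.gather`. The `Finding` type, the agent functions, and their checks are assumptions made for the example; real agents would call models and tools.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    agent: str
    file: str
    message: str


async def security_agent(files):
    await asyncio.sleep(0)  # placeholder for real analysis work
    return [Finding("security", path, "imports pickle")
            for path, src in files.items() if "import pickle" in src]


async def compliance_agent(files):
    await asyncio.sleep(0)  # placeholder for real analysis work
    return [Finding("compliance", path, "missing license header")
            for path, src in files.items() if not src.startswith("# License")]


async def evaluate(files):
    """Run both agents concurrently on the same files and merge their findings."""
    batches = await asyncio.gather(security_agent(files), compliance_agent(files))
    return [finding for batch in batches for finding in batch]


files = {"loader.py": "import pickle\n"}
findings = asyncio.run(evaluate(files))
```

Neither agent waits for the other to finish a stage, and both report through the same `Finding` shape, so a downstream implementation agent could act on either.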
9. Kimi K2.6 replaced my entire dev team. Here's how I built an $80,000/month agency solo — Noisy
- Why read: A case study on how massive context windows and parallel agent swarms challenge the traditional agency business model.
- Summary: The traditional agency model relies on high headcount and sequential workflows. Systems like Kimi K2.6 allow single operators to deploy swarms of hundreds of parallel sub-agents to handle research, analysis, coding, and writing simultaneously. The author argues that the model's strong SWE-Bench scores let it solve complex engineering problems without human permission loops. This cuts delivery timelines from weeks to hours, substituting API tokens for salaries. Legacy agencies struggle to compete on price and speed against this economic advantage.
- Read more
10. Mica: AI Transformation at Cockroach Labs — Jordan Lewis
- Why read: How Cockroach Labs deployed an internal AI platform by focusing on federated identity and low-friction app creation.
- Summary: Cockroach Labs deployed an internal agent platform by eliminating configuration friction. Mica connects to workplace services through a federated identity system, ensuring agents act only with the user's exact permissions. This solves the "blank page" problem by starting every session with personalized context. Employees built thousands of connected applications and automated workflows using plain English, creating an internal marketplace of shared skills. Widespread adoption relies on providing secure infrastructure instead of just urging employees to use generic chatbots.
- Read more
11. The AI public offerings are coming; Karpathy is Anthropic bound — Matt Slotnick
- Why read: Details the reported OpenAI IPO preparations and the revenue growth of leading foundation model labs.
- Summary: OpenAI is reportedly preparing to confidentially file for its IPO, citing roughly $5.7 billion in Q1 revenue. Anthropic released financials showing revenues doubling quarter-over-quarter and achieving operating profitability. The core challenge behind these valuations is enterprise diffusion: moving organizations from human-attended copilots to background agents. Sustained economic transformation depends on companies rearchitecting to use intelligence as a baseline substrate. The ability of enterprises to consume inference capacity will determine if these valuations hold as foundation model labs go public.
- Read more
12. The Neocloud Boom — Jamin Ball
- Why read: A quantitative breakdown of the multi-trillion dollar data center buildout powering the next generation of AI.
- Summary: The expansion of AI infrastructure is a massive capital deployment, drawing comparisons to the 19th-century US railroad buildout. Estimates suggest over 150GW of new capacity will come online in the next 4.5 years, with data center capital expenditure potentially reaching $7.5 trillion. This spending drives a boom in "Neocloud" providers, who finance and operate these specialized facilities. The scale of the buildout means existing hyperscalers cannot serve it alone. An entirely new tier of independent infrastructure providers is emerging.
- Read more
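For a sense of scale, the two headline figures can be combined into implied per-unit numbers. This is back-of-the-envelope arithmetic on the summary's estimates ("over 150GW", "potentially" $7.5 trillion), so the results are rough and not figures stated in the article.

```python
capacity_gw = 150     # estimated new capacity, in gigawatts
years = 4.5           # time horizon for that capacity
capex_usd = 7.5e12    # potential data center capital expenditure

capex_per_gw = capex_usd / capacity_gw  # implied spend per gigawatt
gw_per_year = capacity_gw / years       # implied average build rate

print(f"~${capex_per_gw / 1e9:.0f}B per GW, ~{gw_per_year:.1f} GW per year")
```

On these inputs the buildout implies roughly $50 billion per gigawatt and about 33 GW of new capacity per year.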
13. A Framework for Agent Memory: Remember, Cite, Forget — Vox
- Why read: An architectural framework for designing agent memory systems that prevent context collapse and hallucination.
- Summary: Dumping more data into a context window degrades performance. Agent memory requires explicit mechanisms to remember, cite, and forget. Memory should be structured across distinct layers, from hot-session working memory to long-term policy files. Agents often fail by allowing temporary session summaries to overwrite durable project rules or direct user instructions. Vector databases are for candidate retrieval, not for establishing factual authority. Effective design separates semantic ingestion from retrieval ontology so models do not have to continuously rediscover company state.
- Read more
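A small sketch of layered memory in which authority, not recency, decides which entry wins. The layer names, their ranking, and the `MemoryEntry` fields are assumptions for illustration; in particular, ranking user instructions above project rules is a choice made here, not something the article specifies.

```python
from dataclasses import dataclass

# Higher number wins. The exact ordering is an assumption for this sketch.
AUTHORITY = {"user_instruction": 3, "project_rule": 2, "session_summary": 1}


@dataclass(frozen=True)
class MemoryEntry:
    key: str
    value: str
    layer: str
    source: str  # citation: where this entry came from


def resolve(entries, key):
    """Return the most authoritative entry for key, or None if absent."""
    candidates = [e for e in entries if e.key == key]
    if not candidates:
        return None
    return max(candidates, key=lambda e: AUTHORITY[e.layer])


def forget_session(entries):
    """Drop temporary session memory while keeping durable layers."""
    return [e for e in entries if e.layer != "session_summary"]


entries = [
    MemoryEntry("indent", "4 spaces", "project_rule", "CONTRIBUTING.md"),
    MemoryEntry("indent", "tabs", "session_summary", "session-42"),
]
winner = resolve(entries, "indent")
```

The session summary was written later, yet the project rule still wins, and every answer carries a `source` the agent can cite.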
14. Are We Learning the Wrong Bitter Lesson? — Ashwin Gopinath
- Why read: A critique of how the industry misinterprets Richard Sutton's "Bitter Lesson", leading to high token costs and inefficient architectures.
- Summary: The industry has misinterpreted Richard Sutton's "Bitter Lesson" as an excuse to abandon data organization. Instead of modeling facts, companies dump raw text into massive context windows and expect compute to reconstruct reality on every query. This turns routine tasks into expensive retrieval operations, raising API costs and introducing failure modes based on document formatting. The fix is to separate state from perspective by preserving semantic facts at ingestion and applying business ontology during retrieval. Companies can avoid burning frontier-model tokens just to parse their own internal documents.
- Read more
15. 1/ Some things I've learned recently running coding agents on... — Simon Last
- Why read: Tactical advice for managing autonomous coding agents on large software projects.
- Summary: The playbook for coding agents is shifting. Tasks should be scoped as multi-week engineering efforts instead of bite-sized tickets. Operators can use long-running implementer sessions that compact context, allowing the agent to internalize codebase conventions. The human's role moves from active supervision to feeding highly specified, verifiable tasks into a queue. Success requires investing in plan documents and using reviewer agents to independently verify the implementer's work. The goal is to exit the execution loop, intervening only to adjust the system when the testing harness fails.
- Read more
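The queue-plus-reviewer arrangement could be sketched as follows. This is a toy model under assumptions: `Task`, `run_queue`, and the stub implementer are hypothetical names, and the per-task `check` function stands in for a reviewer agent or test harness.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    spec: str
    check: Callable[[str], bool]  # independent verification of the output


def run_queue(tasks, implement):
    """Process tasks in order; anything the check rejects is escalated."""
    queue = deque(tasks)
    done, escalated = [], []
    while queue:
        task = queue.popleft()
        output = implement(task.spec)
        if task.check(output):
            done.append(task.spec)
        else:
            escalated.append(task.spec)  # the human steps in only here
    return done, escalated


def implement(spec):
    # Stand-in for an implementer agent session.
    return spec.upper()


tasks = [
    Task("rename foo", lambda out: out == "RENAME FOO"),
    Task("add cache", lambda out: "REDIS" in out),
]
done, escalated = run_queue(tasks, implement)
```

The operator's work sits at the edges: writing specs and checks precise enough to queue, then handling whatever ends up in `escalated`.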
Themes from yesterday
- The shift from single-prompt chat interactions to long-running, concurrent agent swarms capable of parallel execution.
- The physical limits of the AI boom, highlighted by capital flowing into hard infrastructure like power, cooling, and "Neoclouds" to support inference scaling.
- A pushback against unstructured context windows, prompting a return to structured memory frameworks and explicit data organization to manage costs and hallucinations.