Lessons from Simon Willison

Simon Willison is a software engineer who co-created the Django web framework and built Datasette, a tool for exploring and publishing SQLite databases. He is known for taking detailed technical notes in public and for coining the term "prompt injection" for a class of attacks against applications built on large language models. This collection covers his approach to coding, data architecture, and treating AI as a practical daily tool.

Part 1: The "Batteries Included" Web

On journalism deadlines: Django was originally forged to solve a specific problem in a newsroom, allowing developers to build complex, database-driven sites in hours instead of months. — Source: [Simon Willison's Weblog]
On the framework realization: The creators initially thought they were building a CMS; releasing it as an open-source framework was a secondary realization about where the true value lived. — Source: [Django Chat Podcast]
On sensible defaults: Providing "batteries included" features like admin interfaces and security prevents developers from constantly reinventing the wheel. — Source: [Simon Willison's Weblog]
On open source productivity: "Open source has been the single biggest productivity boost to software engineering of any in anyone’s lifetime." — Source: [Heavybit]
On bug reporting standards: "I increasingly want issue reports to be condensed to what the human actually observed: I ran this command. I expected this to happen. This happened instead." — Source: [Simon Willison's Weblog]
On LLMs and framework contributions: "If you use an LLM to contribute to Django, it needs to be as a complementary tool, not as your vehicle." — Source: [Teaching Python]
On the value of boring tech: "The thing I love about Django today is that Django qualifies as boring technology." — Source: [Django Chat Podcast]
On avoiding AI slop in open source: Plausible-looking but untested code generated by AI is a severe drain on the already limited time of project reviewers. — Source: [Servo Project Policy]
On the core of engineering: The hardest part of technology isn't the syntax beginners worry about, but rather wrangling systemic complexity in a sustainable way. — Source: [Simon Willison's Weblog]

Part 2: The Baked Data Architecture

On the baked data concept: The baked data pattern involves bundling a read-only SQLite database directly into a deployment package instead of querying a live external server. — Source: [Datasette Documentation]
On scaling to zero: Because the database is simply a file, baked data deployments can run on serverless platforms that cost nothing when traffic is idle. — Source: [Datasette Documentation]
On query speed: Querying a local SQLite file within the same container often outperforms network calls to remote managed databases. — Source: [Datasette Documentation]
On eliminating split-brain states: Deploying immutable data alongside the code keeps the schema and the application in lockstep, since both ship in the same release, which removes the need for complex database migrations. — Source: [Simon Willison's Weblog]
On dynamic flexibility: Baked data provides the performance of static site generators without requiring you to pre-render hundreds of thousands of individual HTML pages. — Source: [Simon Willison's Weblog]
On CI/CD integration: Data from Markdown files, CSVs, or APIs can be compiled into a SQLite database entirely within GitHub Actions before deployment. — Source: [GitHub Blog]
On read-heavy applications: This architecture suits read-heavy sites whose data changes daily or hourly, since each update is just another deploy; it is unsuitable for platforms requiring constant user writes. — Source: [Datasette Documentation]
On data democratization: Tools like Datasette allow anyone to take a SQLite database and instantly convert it into a browsable interface and API. — Source: [Simon Willison's Weblog]
On side project scope: "I know from past experience that if your side project has user accounts, that's not a side project—that's an unpaid job." — Source: [Simon Willison's Weblog]
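The baked data pattern described in this part can be illustrated with a minimal sketch using only Python's standard library. It is not Datasette's own implementation: the table name, the sample rows, and the helper names `build_database` and `open_read_only` are hypothetical, and a temporary directory stands in for the deployment package.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical sample rows; a real build step might read CSV or Markdown files.
ROWS = [("alpha", 1), ("beta", 2)]


def build_database(path: Path) -> None:
    """Build step: compile source data into a single SQLite file."""
    conn = sqlite3.connect(path)
    with conn:  # commits the transaction on success
        conn.execute("CREATE TABLE items (name TEXT PRIMARY KEY, value INTEGER)")
        conn.executemany("INSERT INTO items VALUES (?, ?)", ROWS)
    conn.close()


def open_read_only(path: Path) -> sqlite3.Connection:
    """Runtime: open the shipped file read-only through a SQLite URI."""
    return sqlite3.connect(f"{path.as_uri()}?mode=ro", uri=True)


# The temporary directory stands in for the deployed bundle.
db_path = Path(tempfile.mkdtemp()) / "baked.db"
build_database(db_path)

conn = open_read_only(db_path)
items = dict(conn.execute("SELECT name, value FROM items"))
print(items)

# Writes are rejected because the connection was opened with mode=ro.
try:
    conn.execute("INSERT INTO items VALUES ('gamma', 3)")
    write_allowed = True
except sqlite3.OperationalError:
    write_allowed = False
conn.close()
print("write allowed:", write_allowed)
```

In a real deployment the build function would run in CI and only the resulting file would ship with the code; updating the data means rebuilding and redeploying.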

Part 3: Vibe Coding & LLM Amplification

On the reality tether: "I feel like programmers have it easy... There's no way to automatically check a legal brief written by A.I. for hallucinations." — Source: [Simon Willison's Weblog]
On testing as verification: Code generated by LLMs is uniquely useful because it can be executed and tested, so its behavior can be checked right away rather than taken on trust. — Source: [Simon Willison's Weblog]
On expanding ambition: "AI makes me more ambitious," allowing developers to tackle projects they previously dismissed as too time-consuming. — Source: [Heavybit]
On vibe coding limits: Relying on the high-level intent of "vibe coding" works for small utilities but falls short for production-grade engineering. — Source: [Simon Willison's Weblog]
On true vibe engineering: If an AI writes your code but you review, test, and understand it, that is using an LLM as a typing assistant, not vibe coding. — Source: [Simon Willison's Weblog]
On punching above your weight: AI tools do not replace engineering expertise; they amplify it, allowing a senior developer to move significantly faster. — Source: [Simon Willison's Weblog]
On AI exhaustion: Using AI removes the natural pauses in coding, keeping the developer in a state of constant, high-intensity decision-making that can lead to burnout. — Source: [Simon Willison's Weblog]
On generating UIs: Asking LLMs to output raw HTML and JavaScript is an incredibly effective way to build interactive tools without complex build steps. — Source: [Simon Willison's Weblog]
On the Code Interpreter: ChatGPT's ability to write and execute its own Python code was a step change in AI utility because it allowed the model to verify its own logic. — Source: [Simon Willison's Weblog]
On maintenance costs: If AI allows you to double your code output, you must also find ways to halve your maintenance burden through rigorous automated testing. — Source: [Simon Willison's Weblog]
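The point about testing as verification can be sketched as a small harness that runs candidate code against cases written by the reviewer. Everything here is illustrative: `CANDIDATE_SOURCE` is a hand-written stand-in for model output, and `load_function` and `check` are hypothetical helper names. Passing cases are evidence of correct behavior, not proof of it.

```python
from typing import Callable

# Stand-in for source text returned by a model; hand-written, not real LLM output.
CANDIDATE_SOURCE = '''
def slugify(title):
    words = "".join(c.lower() if c.isalnum() else " " for c in title).split()
    return "-".join(words)
'''

# Expected behavior, written by the human reviewer.
CASES = [
    ("Hello World", "hello-world"),
    ("  Baked   Data!  ", "baked-data"),
    ("", ""),
]


def load_function(source: str, name: str) -> Callable[[str], str]:
    """Execute the candidate source and return the named function."""
    namespace: dict = {}
    exec(source, namespace)  # not a sandbox: only run code you are prepared to read
    return namespace[name]


def check(func: Callable[[str], str], cases: list) -> list:
    """Return the failing cases as (input, expected, actual) tuples."""
    failures = []
    for given, expected in cases:
        actual = func(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


slugify = load_function(CANDIDATE_SOURCE, "slugify")
failures = check(slugify, CASES)
print("all cases passed" if not failures else failures)
```

The cases encode the reviewer's intent, so a failure list points at exactly where the generated code and the human's expectations disagree.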

Part 4: The Reality of Prompt Injection

On coining the term: Prompt injection is a fundamental failure to separate developer instructions from untrusted user data, much like SQL injection. — Source: [Simon Willison's Weblog]
On architectural flaws: Prompt injection is currently an unsolved problem at the architectural level, not a minor bug. — Source: [Simon Willison's Weblog]
On percentage-based defenses: Any defense against prompt injection that is only 99% effective is a total failure in a security context. — Source: [RedMonk]
On security versus moderation: Filtering an AI to prevent it from saying bad words is a moderation issue; preventing attackers from hijacking the system is a security crisis. — Source: [Simon Willison's Weblog]
On the lethal trifecta: AI agents become highly dangerous when they combine access to private data, access to untrusted content, and the ability to perform external actions. — Source: [Simon Willison's Weblog]
On indirect prompt injection: Attackers can hide malicious instructions in external content, such as a webpage the AI is asked to summarize, which the AI then follows without the user's knowledge. — Source: [Palo Alto Networks]
On dual-LLM defenses: A quarantined model with no tool access should process untrusted data, while a privileged model with action permissions handles only trusted input and refers to the quarantined output through opaque variables, never seeing the raw text. — Source: [Simon Willison's Weblog]
On human approval: Any AI action with real-world consequences, such as sending an email or deleting data, must require manual human approval. — Source: [Simon Willison's Weblog]
On stop energy: The tech industry needs "stop energy" to recognize that many current AI agents are fundamentally unsafe due to unresolved injection vulnerabilities. — Source: [Simon Willison's Weblog]
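Two of the ideas in this part, the lethal trifecta and the human approval requirement, can be sketched as plain policy checks. This is an illustrative sketch, not a complete defense against prompt injection: the capability labels, the `Action` class, and the `execute` function are all hypothetical names, and the approval callback stands in for a real confirmation prompt.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

# Hypothetical labels for the three risk factors.
PRIVATE_DATA = "private_data"
UNTRUSTED_CONTENT = "untrusted_content"
EXTERNAL_ACTIONS = "external_actions"
TRIFECTA = frozenset({PRIVATE_DATA, UNTRUSTED_CONTENT, EXTERNAL_ACTIONS})


def has_lethal_trifecta(capabilities: set[str]) -> bool:
    """True when an agent configuration combines all three risk factors."""
    return TRIFECTA <= set(capabilities)


@dataclass
class Action:
    name: str
    consequential: bool  # e.g. sends an email or deletes data
    run: Callable[[], str]


def execute(action: Action, approve: Callable[[Action], bool]) -> str:
    """Run an action, asking a human first when it has real-world effects."""
    if action.consequential and not approve(action):
        return f"blocked: {action.name}"
    return action.run()


def deny_all(action: Action) -> bool:
    """Stand-in for a human reviewer who declines every request."""
    return False


summarize = Action("summarize", False, lambda: "summary ready")
send_email = Action("send_email", True, lambda: "email sent")

risky = has_lethal_trifecta({PRIVATE_DATA, UNTRUSTED_CONTENT, EXTERNAL_ACTIONS})
partial = has_lethal_trifecta({PRIVATE_DATA, UNTRUSTED_CONTENT})
results = [execute(summarize, deny_all), execute(send_email, deny_all)]
print(risky, partial, results)
```

The check is deliberately conservative: removing any one of the three capabilities makes `has_lethal_trifecta` return false, and a consequential action never runs unless the approval callback says yes.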

Part 5: Navigating the Jagged Frontier

On LLM unpredictability: The "jagged frontier" describes how LLMs can write difficult SQL queries effortlessly while failing at basic logic or arithmetic. — Source: [Simon Willison's Weblog]
On the difficulty of software: "I’m constantly reminded as I work with these tools how hard the thing that we do is, like producing software is a ferociously difficult thing to do." — Source: [Heavybit]
On AI-generated writing: "I have never knowingly finished reading an email signed by a human but written by AI. It feels like being lied to." — Source: [Simon Willison's Weblog]
On his AI journalism beat: "I decided that my beat in all of this, my journalistic beat, was going to be: What can it do?" — Source: [Simon Willison's Weblog]
On AI code smell: Unfiltered AI output has a distinct "digital smell" that experienced developers can instantly recognize when reviewing pull requests. — Source: [Simon Willison's Weblog]
On coding agents: "What’s left is everything else, and the more time I spend working with coding agents the larger that 'everything else' becomes." — Source: [Simon Willison's Weblog]
On evaluating quality: Accepting poor code from an AI is a choice; you can feed design patterns back into the agent to produce much better results. — Source: [Simon Willison's Weblog]
On search-based research: Newer models have dramatically improved at search-based research, fundamentally changing how engineers look up documentation and solve obscure bugs. — Source: [Simon Willison's Weblog]
On the prompt architecture: "Prompt injection is a new kind of software attack, and prompt design is a new kind of software architecture." — Source: [Simon Willison's Weblog]
On the difficulty of LLMs: "LLMs are much harder to use effectively than most people realize." — Source: [Teaching Python]

Part 6: Learning in Public & TILs

On publishing TILs: Documenting what you learn in short "Today I Learned" posts is one of the most effective ways to build a career and a personal knowledge base. — Source: [Simon Willison's Weblog]
On lower friction: TILs are liberating because they do not require the polish, formatting, or length of a formal tutorial. — Source: [Simon Willison's Weblog]
On GitHub as a notebook: GitHub Issues function as almost the best notebook in the world due to free storage, Markdown support, and excellent searchability. — Source: [Simon Willison's Weblog]
On building out in the open: Consistently sharing small discoveries helps demystify the development process for others while cementing your own understanding. — Source: [Simon Willison's Weblog]
On personal tracking: Writing weekly or monthly notes creates a transparent record of what you actually accomplished versus what you planned to do. — Source: [Simon Willison's Weblog]
On replacing Twitter: Hosting your own "Notes" feed on a personal domain reclaims short-form thoughts from walled-garden social media platforms. — Source: [Simon Willison's Weblog]
On utilizing LLMs for personal sites: Using AI to generate a refactoring plan can allow you to add new features to a decade-old blog engine in under an hour. — Source: [Simon Willison's Weblog]
On reinventing the wheel: "You should build your own wheel, because it'll teach you more about how they work than reading a thousand books on them ever will." — Source: [Simon Willison's Weblog]
On continuous learning: A developer's true value lies in the willingness to remain a beginner and document the journey from confusion to comprehension. — Source: [Simon Willison's Weblog]
On public accountability: Sharing raw, unpolished technical notes normalizes the fact that software engineering requires constant troubleshooting. — Source: [Simon Willison's Weblog]

Part 7: Small Tools & Open Source Ecosystems

On the LLM CLI: Providing a command-line interface for large language models makes them a first-class citizen of the Unix terminal alongside utilities like grep and sed. — Source: [Simon Willison's Weblog]
On automated pipelines: AI is most useful when treated as a pipeable utility within larger automation scripts rather than a chatbot tab in a browser. — Source: [Simon Willison's Weblog]
On plugin systems: Plugins are the ultimate solution for open-source maintenance, allowing projects to grow without bottlenecking the core maintainer. — Source: [Simon Willison's Weblog]
On feature growth: A well-designed plugin architecture means a piece of software can gain a major new feature overnight without the creator doing any work. — Source: [Simon Willison's Weblog]
On context loading: Tools that concatenate entire directories into a single prompt string are essential for feeding complex codebases into LLMs. — Source: [Simon Willison's Weblog]
On independent maintenance: A single developer can sustain a vast ecosystem of tools by relying heavily on CI/CD automation and modular design. — Source: [GitHub Next]
On component networks: Dividing complex systems into networks of replaceable components remains a primary goal of both traditional and agentic software architecture. — Source: [Simon Willison's Weblog]
On CSS: "CSS is hard because it’s solving a hard problem!" — Source: [Simon Willison's Weblog]
On data extraction: Command-line utilities that find and extract specific Python symbols help bridge the gap between a large codebase and a limited AI context window, since only the relevant code needs to be sent. — Source: [Simon Willison's Weblog]
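The context-loading idea from this part, concatenating a directory into a single prompt string, can be sketched in a few lines. This is a hypothetical helper written for illustration, not the interface of any existing tool: the function name `concatenate_directory`, the `---` separators, and the default suffix filter are all assumptions.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def concatenate_directory(root: Path, suffixes: tuple[str, ...] = (".py", ".md")) -> str:
    """Join matching text files under root, each preceded by its relative path."""
    sections = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            rel = path.relative_to(root).as_posix()
            body = path.read_text(encoding="utf-8")
            sections.append(f"{rel}\n---\n{body}\n---")
    return "\n\n".join(sections)


# Demo on a throwaway directory so the sketch runs anywhere.
root = Path(tempfile.mkdtemp())
(root / "pkg").mkdir()
(root / "pkg" / "app.py").write_text("print('hi')\n", encoding="utf-8")
(root / "notes.md").write_text("# Demo\n", encoding="utf-8")
(root / "data.bin").write_bytes(b"\x00\x01")  # skipped: suffix not in the filter

prompt = concatenate_directory(root)
print(prompt)
```

Because the result is a plain string, it can be written to standard output and piped into another command, which fits the pipeable-utility approach described above.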

Part 8: Software Engineering & The "Boring" Philosophy

On innovation tokens: Every project has a limited budget of "innovation tokens" to spend on new technologies before the complexity overwhelms the team. — Source: [Choose Boring Technology]
On prioritizing problems: If you spend your innovation tokens on your database and deployment strategy, you will lack the energy to build the feature that actually makes your product unique. — Source: [Simon Willison's Weblog]
On withered technology: Mature, "withered" technologies like PostgreSQL or SQLite can be used in radical new ways without inheriting the instability of brand-new tools. — Source: [Simon Willison's Weblog]
On early generalization: "Early generalizations are often wrong, and when a generalization is wrong it ossifies the structure of the code around it in a way that is harder to fix." — Source: [Simon Willison's Weblog]
On software brain: Developers must avoid "software brain," the urge to view the entire world strictly as something to be automated and modeled as data flows. — Source: [Simon Willison's Weblog]
On conceptual integrity: Maintaining conceptual integrity is the hardest challenge when using AI tools that can spit out new features faster than they can be thoughtfully integrated. — Source: [Simon Willison's Weblog]
On tech leverage: Engineering choices should always filter through the question: what tools will let me have more of an impact on the world and accelerate my development? — Source: [Simon Willison's Weblog]
On SQLite as a format: Utilizing SQLite as a primary distribution format, rather than just a local cache, unlocks new architectural patterns for data portability. — Source: [Datasette Documentation]
On avoiding the hype cycle: The most practical approach to technology is to stick to what actually works today, ignoring the theoretical promises of tomorrow. — Source: [Simon Willison's Weblog]
