# Lessons from Yejin Choi

Computer scientist and MacArthur Fellow Yejin Choi researches how to teach AI common sense and moral reasoning. She challenges the industry assumption that bigger language models will automatically fix their own reasoning flaws, focusing instead on algorithmic efficiency and pluralistic values. This profile gathers her perspectives on the unspoken rules of human intelligence and the limits of scale.

## Part 1: The Paradox of AI

On the AI paradox: "Current language models are incredibly smart and shockingly stupid at the same time." — Source: TED Talk
On jagged intelligence: "AI systems perform exceptionally well on professional exams but fail at basic common sense tasks that a child finds trivial." — Source: Stanford HAI
On generative capabilities: "There is a generative AI paradox: the ability to generate fluent text far exceeds the system's ability to actually understand what it is saying." — Source: GatesNotes
On exceptions: "For AI, exceptions are incredibly difficult to handle, whereas humans navigate exceptions effortlessly even when nobody explicitly taught us about them." — Source: TED Talk
On brittle logic: "When a model fails to understand that a heavier object falls at the same rate as a lighter one in a vacuum, it reveals that its knowledge is memorized rather than grounded in physical reality." — Source: Mindscape Podcast
On surface fluency: "We confuse a model's linguistic competence with true reasoning capability, mistaking good grammar for logical thought." — Source: Time Magazine
On missing foundations: "If you train a model exclusively on the internet, it learns everything about human opinions but very little about the physics of everyday life." — Source: Stanford HAI
On the illusion of understanding: "Just because an AI can write a beautiful poem about a sunset does not mean it possesses a functional concept of the sun, light, or time." — Source: MacArthur Foundation
On unexpected failures: "You can give a model a complex coding problem and it succeeds, but ask it to figure out how to stack three irregularly shaped blocks, and it fails completely." — Source: TED Talk
On solving the paradox: "To fix the gap between smart and stupid, we cannot just feed models more text; we have to fundamentally rethink how they represent reality." — Source: GatesNotes

## Part 2: The Dark Matter of Intelligence

On defining common sense: "Common sense is the dark matter of intelligence—it is the vast, unspoken, implicit knowledge that you and I have that holds everything together." — Source: TED Talk
On unspoken rules: "We do not talk about the obvious. Because it is so obvious, nobody writes it down, which means language models do not learn it from reading the web." — Source: MacArthur Foundation
On physical constraints: "Humans intuitively know that a glass will break if pushed off a table; AI has to predict the next word in a sentence about a glass falling without actually understanding gravity." — Source: Mindscape Podcast
On social dynamics: "A significant part of our intelligence involves understanding social norms and reading the room, which requires inferring intent rather than just translating words." — Source: Stanford HAI
On the limits of text: "If we want to teach AI the dark matter of intelligence, text alone is insufficient. We need multimodal grounding that mimics how humans interact with the physical world." — Source: Time Magazine
On building knowledge graphs: "Projects like ATOMIC and COMET were designed to map out this unspoken, inferential knowledge so that algorithms could reference it directly." — Source: MacArthur Foundation
On the invisibility of context: "Context is invisible to a machine until it is explicitly modeled, whereas humans filter every interaction through a massive contextual net." — Source: TWIML AI Podcast
On basic causality: "Current systems struggle with basic cause-and-effect because they map statistical correlations, which are an incomplete shadow of the true causal dark matter." — Source: Mindscape Podcast
On the interstitial glue: "Intelligence is not just facts and figures; it is the interstitial glue that connects disparate pieces of information into a coherent worldview." — Source: GatesNotes
On measuring the unknown: "The exact fraction of knowledge that humans have but never write down is unknown, but my speculation is that it is overwhelmingly large." — Source: TED Talk

## Part 3: The Fallacy of Scale

On the narrative of size: "The current industry narrative is that larger is always better, but scaling up an inherently flawed architecture will not fix its foundational blind spots." — Source: Time Magazine
On environmental impact: "Relying purely on massive scale is economically and environmentally unsustainable, creating an arms race that ignores efficiency." — Source: UN Security Council Briefing
On human efficiency: "Our own minds are living proof that a path away from pouring billions into ever-larger models is possible, as the human brain operates on the energy of a lightbulb." — Source: Time Magazine
On diminishing returns: "While scaling laws guarantee predictable gains in language perplexity, they do not guarantee the emergence of reliable, compositional reasoning." — Source: Stanford HAI
On David vs. Goliath: "We do not always need a Goliath model. Sometimes a smaller, highly focused model can outperform a massive one if it is trained with better algorithms and data." — Source: MacArthur Foundation
On brute force vs. elegance: "Instead of relying on the brute force of scaling, the AI community needs to prioritize algorithmic elegance and structural innovation." — Source: TWIML AI Podcast
On the illusion of progress: "Just making a model ten times larger gives the illusion of progress, but it often just masks the same underlying fragility." — Source: Mindscape Podcast
On hardware reliance: "If the only way to solve AI is to build larger data centers, we have failed to understand the nature of intelligence itself." — Source: GatesNotes
On alternative architectures: "We must actively explore neuro-symbolic approaches and models that integrate structured knowledge, rather than assuming deep learning alone at scale will solve everything." — Source: Stanford HAI
On scientific fragility: "The pure scaling approach is scientifically fragile because we cannot fully explain why it works, making it impossible to guarantee it will behave safely." — Source: Time Magazine

## Part 4: Pluralistic Alignment and Moral Reasoning

On diverse perspectives: "We should not align AI to a single, averaged human preference; pluralistic alignment ensures models can respect diverse values across different communities." — Source: Stanford HAI
On defeasible morality: "Moral reasoning is defeasible, meaning our judgments often change when new context is introduced. AI must learn to adapt to nuanced ethical dilemmas." — Source: MacArthur Foundation
On procedural reasoning: "We must move beyond evaluating the outcomes of moral decisions and instead evaluate the procedural reasoning the AI uses to arrive at its conclusion." — Source: TWIML AI Podcast
On the Overton window: "An Overton pluralistic model is designed to present a spectrum of reasonable, diverse responses rather than insisting on one universal truth." — Source: Stanford HAI
On teaching ethics: "Teaching an AI ethics is similar to teaching it common sense; both require understanding the unspoken social contracts that govern human behavior." — Source: Mindscape Podcast
On human oversight: "Because moral values vary heavily by culture, we need steerable models that allow human users to define the value distribution." — Source: Time Magazine
On the MoReBench project: "Creating benchmarks for moral reasoning helps us test how models navigate complex dilemmas where there is no strictly correct answer." — Source: MacArthur Foundation
On avoiding homogenization: "If we rely on a few massive models controlled by central entities, we risk homogenizing global culture and erasing minority viewpoints." — Source: UN Security Council Briefing
On collaboration: "Solving AI alignment requires deep collaboration with moral philosophers and psychologists, not just computer scientists." — Source: GatesNotes

## Part 5: Benchmarks and Evaluation

On static benchmarks: "When a model beats a static benchmark, it often means the benchmark is saturated, not that the model has achieved human-level intelligence." — Source: TWIML AI Podcast
On the Turing Test: "The classical Turing Test is no longer sufficient; we need evaluation environments that can distinguish between mere fluency and genuine comprehension." — Source: Stanford HAI
On adversarial testing: "To truly evaluate a model, we must use adversarial benchmarks like WinoGrande that actively search for the system's blind spots." — Source: MacArthur Foundation
On metric limitations: "Traditional metrics reward surface-level pattern matching, which gives us a false sense of security regarding a model's capabilities." — Source: Mindscape Podcast
On evaluating toxicity: "We must build robust frameworks to evaluate toxic degeneration in language models to prevent them from amplifying the worst parts of the internet." — Source: Time Magazine
On physical interaction: "Benchmarks like PIQA test physical interaction question answering, revealing how poorly models grasp the spatial realities of objects." — Source: Stanford HAI
On social IQ: "Evaluating a model's social intelligence requires testing its ability to infer intent and predict human reactions in everyday scenarios." — Source: MacArthur Foundation
On reference-free metrics: "Developing reference-free metrics like CLIPScore allows us to evaluate model outputs dynamically without relying on static, human-written answers." — Source: TWIML AI Podcast
On jagged capability: "Because AI capabilities are highly jagged, a model can score 90% on an evaluation yet fail disastrously on a minor variant of the same task." — Source: TED Talk

## Part 6: The Nature of the Human Mind

On biological inspiration: "Our brains process information highly efficiently; mimicking the principles of biological intelligence should guide how we build synthetic models." — Source: GatesNotes
On developmental learning: "A human child learns physics by knocking over blocks and dropping toys, a multi-sensory process that text-only models completely miss." — Source: Mindscape Podcast
On human creativity: "Human intelligence is marked by the struggle to generate creative output, even when our underlying understanding of a concept is profound." — Source: Time Magazine
On folk psychology: "A core component of human cognition is folk psychology—our intuitive ability to attribute beliefs, desires, and intents to others." — Source: Stanford HAI
On implicit biases: "The human mind naturally relies on heuristics and implicit assumptions, which allow us to process complex environments rapidly." — Source: MacArthur Foundation
On reasoning vs. recall: "Humans use active reasoning to solve novel problems, whereas models heavily rely on recalling patterns from their training data." — Source: TWIML AI Podcast
On the nature of thought: "True thought requires an internal model of the world that allows for simulation and forecasting, not just backward-looking prediction." — Source: GatesNotes
On human values: "Our values are not static code; they are dynamic, shifting based on context, empathy, and lived experience." — Source: Stanford HAI
On biological constraints: "The biological constraints of the human brain forced evolution to prioritize extreme efficiency, a design principle we have abandoned in AI." — Source: Mindscape Podcast

## Part 7: Democratizing AI with Smaller Models

On open-source research: "Democratizing AI means ensuring that powerful tools are accessible to researchers globally, not just locked behind corporate walls." — Source: UN Security Council Briefing
On breaking monopolies: "Focusing on smaller, high-quality models allows universities and smaller organizations to compete with massive tech monopolies." — Source: Time Magazine
On knowledge distillation: "We can unlock immense potential by distilling the reasoning capabilities of massive models into smaller, more agile architectures." — Source: TWIML AI Podcast
On synthetic data: "Using high-quality synthetic data to train smaller models helps bridge the performance gap without requiring astronomical computing resources." — Source: Stanford HAI
On global equity: "AI should be a tool that serves the global majority, which requires building infrastructure that can be run without massive energy expenditure." — Source: UN Security Council Briefing
On model orchestration: "Smaller models can act as efficient orchestrators, delegating tasks to specific tools rather than trying to contain all knowledge internally." — Source: MacArthur Foundation
On lowering barriers: "By making models smaller, we lower the barrier to entry, allowing diverse voices to participate in steering AI development." — Source: Time Magazine
On targeted utility: "A small model trained explicitly for a specific domain can often outperform a massive, generalized model while using a fraction of the compute." — Source: GatesNotes
On sustainable progress: "Sustainable progress in AI requires a commitment to algorithmic innovation that prioritizes efficiency as a primary metric of success." — Source: TWIML AI Podcast

## Part 8: The Adventure of Research

On the unknown: "Working in AI right now has the feeling of a grand adventure; you are constantly exploring unknown territory and encountering unexpected phenomena." — Source: New York Times
On asking new questions: "The role of academic research is not just to build better products, but to ask fundamental questions about the nature of intelligence." — Source: GatesNotes
On curiosity: "When you see an unexpected failure in a model, it fuels the curiosity to dive deeper and find out what else is missing from our understanding." — Source: TED Talk
On interdisciplinary horizons: "The future of AI research must be interdisciplinary, pulling insights from cognitive neuroscience, linguistics, and philosophy." — Source: Stanford HAI
On challenging assumptions: "Good research requires the willingness to challenge the prevailing assumptions of the industry, even when everyone else is moving in one direction." — Source: Time Magazine
On the role of universities: "Universities play a critical role in AI because they can prioritize long-term, foundational inquiries over short-term commercial gains." — Source: GatesNotes
On scientific rigor: "We have to redefine scientific rigor in AI, moving beyond raw leaderboard scores to deeply analyze how and why models fail." — Source: TWIML AI Podcast
On future generations: "Part of the adventure is mentoring the next generation of researchers to think critically about the societal impacts of the technology they build." — Source: MacArthur Foundation
On embracing uncertainty: "The rapid evolution of AI means we must become comfortable with uncertainty, using it as a catalyst for deeper exploration and discovery." — Source: Mindscape Podcast
