Lessons from Dawn Song

Dawn Song is a UC Berkeley computer science professor and the founder of Oasis Labs. Her research covers computer security, machine learning, and privacy, with a focus on adversarial examples in machine learning and on blockchains that keep data confidential while it is being processed. This profile collects her views on confidential computing, the attacker-defender arms race, and building privacy directly into new systems.

Part 1: The Cybersecurity Arms Race

  1. On Historical Patterns: "History has shown the attacker always follows in the footsteps of new technology development, or sometimes even leads it." — Source: [UC Berkeley CS Lectures]
  2. On Reactive Defense: "Patching vulnerabilities after they are discovered cannot keep pace with automated AI attacks; we are trapped in a cat-and-mouse game." — Source: [Afternoon Cyber Tea Podcast]
  3. On Speed of Exploitation: "As developers adopt large language models to write code faster, attackers are simultaneously using the same models to automate the discovery of zero-day exploits at unprecedented scale." — Source: [FAR.AI Security Forum]
  4. On Asymmetric Warfare: "The defense must protect every entry point, while the attacker only needs to find one flaw. AI scales both, but it disproportionately lowers the barrier to entry for the attacker." — Source: [Security and AI Research Summit]
  5. On Threat Intelligence: "Applying machine learning to network traffic analysis allows us to detect anomalies earlier, but the false positive rate remains the primary operational hurdle for security teams." — Source: [Dawn Song Official Website]
  6. On Automated Vulnerability Discovery: "We built systems like CyberGym to benchmark how well autonomous agents can find and patch bugs, revealing that current models still struggle with deep, multi-step reasoning." — Source: [FAR.AI Security Forum]
  7. On the Security Lifecycle: "Security cannot be an afterthought bolted onto a finished product; the arms race requires threat modeling at the very inception of system design." — Source: [UC Berkeley CS Lectures]
  8. On Malware Evolution: "Polymorphic malware generated by AI renders traditional signature-based antivirus obsolete, forcing a transition to purely behavioral detection systems." — Source: [Afternoon Cyber Tea Podcast]
  9. On Economic Incentives: "Attackers are economically motivated. Changing the security landscape means raising the computational and financial cost of an attack until it is no longer profitable." — Source: [Chainlink Today Podcast]
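
The point about false positives in item 5 can be made concrete with a little arithmetic. The sketch below uses Bayes' rule to estimate what fraction of alerts are real attacks. The function name and all three input rates are hypothetical, chosen only to illustrate the effect of a low base rate; they are not figures from Song's work.

```python
def alert_precision(base_rate, true_positive_rate, false_positive_rate):
    """Fraction of raised alerts that correspond to real attacks (Bayes' rule)."""
    true_alerts = base_rate * true_positive_rate
    false_alerts = (1.0 - base_rate) * false_positive_rate
    return true_alerts / (true_alerts + false_alerts)

# Hypothetical numbers: 1 in 1,000 flows is malicious, the detector catches
# 99% of attacks, and it wrongly flags 1% of benign flows.
precision = alert_precision(base_rate=0.001,
                            true_positive_rate=0.99,
                            false_positive_rate=0.01)
print(f"Share of alerts that are real attacks: {precision:.1%}")
```

Under these assumed rates only about 9% of alerts are genuine, which is why a detector that looks accurate on paper can still overwhelm an operations team.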

Part 2: Adversarial Machine Learning

  1. On Adversarial Examples: "Deep learning models are brittle. Minor, imperceptible perturbations to an input image can cause a state-of-the-art classifier to output a completely wrong prediction with high confidence." — Source: [ICML Keynote Address]
  2. On Physical World Attacks: "We demonstrated that placing small, strategically designed stickers on a stop sign can trick an autonomous vehicle's vision system into classifying it as a speed limit sign." — Source: [UC Berkeley CS Lectures]
  3. On Data Poisoning: "If an attacker compromises the training data, they can install a backdoor in the model that remains dormant until triggered by a specific input pattern." — Source: [Security and AI Research Summit]
  4. On Model Inversion: "Without privacy protections, you can query a machine learning model to reconstruct the sensitive data it was originally trained on, such as medical records or faces." — Source: [Brown University Guest Lecture]
  5. On Robustness: "Training models on adversarial examples improves their resilience against known attacks, but often at the cost of overall accuracy on benign data." — Source: [ICML Keynote Address]
  6. On Transferability: "Adversarial examples often transfer across different models. An attack designed against a local surrogate model can successfully fool a remote, black-box system." — Source: [Dawn Song Official Website]
  7. On Certification: "We need methods to mathematically certify that a neural network's output will remain constant within a specific radius of input perturbations." — Source: [Security and AI Research Summit]
  8. On Generative Models: "Attackers use generative adversarial networks to craft synthetic data that bypasses biometric authentication systems like voice and facial recognition." — Source: [Afternoon Cyber Tea Podcast]
  9. On NLP Vulnerabilities: "Language models are vulnerable to adversarial prompts that force them to output toxic content, leak training data, or execute unintended commands." — Source: [FAR.AI Security Forum]
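
The brittleness described in item 1 can be sketched on a toy model. The example below applies the fast gradient sign method, one well-known way to construct adversarial perturbations, to a small logistic classifier. The weights, the input, and the step size `eps` are hypothetical values chosen for illustration. With only four features the step has to be fairly large; on real images with thousands of pixels a far smaller step per pixel can have the same effect.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability that x belongs to class 1 under a logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(g):
    return (g > 0) - (g < 0)

def fgsm(w, b, x, y, eps):
    """Move every feature by eps in the direction that increases the loss."""
    p = predict(w, b, x)
    # For cross-entropy loss, the gradient with respect to the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input, for illustration only.
w = [2.0, -3.0, 1.5, -0.5]
b = 0.1
x = [0.4, 0.1, 0.3, 0.2]
y = 1  # true label

x_adv = fgsm(w, b, x, y, eps=0.2)
p_clean = predict(w, b, x)
p_adv = predict(w, b, x_adv)
print(f"clean input:     P(class 1) = {p_clean:.2f}")
print(f"perturbed input: P(class 1) = {p_adv:.2f}")
```

No feature moves by more than `eps`, yet the predicted class changes from 1 to 0.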

Part 3: Privacy-Preserving Smart Contracts

  1. On Blockchain Limitations: "Public blockchains are transparent by design, which means anyone can see every transaction and state change. This lack of privacy prevents enterprises from adopting the technology for sensitive data." — Source: [Oasis Labs Blog]
  2. On Modular Architecture: "Oasis is a next-generation blockchain that proposes a modular architecture, separating execution from consensus to enable much greater scalability and flexibility." — Source: [Chainlink Today Podcast]
  3. On Confidential Execution: "We need to build privacy-preserving technologies into smart contract platforms so that nodes can verify transactions without seeing the underlying data." — Source: [Smart Contract Security Paper]
  4. On Front-Running: "Without confidentiality, miners and bots observe pending transactions in the mempool and exploit this information to front-run decentralized exchanges." — Source: [Oasis Labs Blog]
  5. On Sapphire: "Privacy is absolutely missing and essential. With Sapphire, the confidential EVM, we provide a familiar environment for developers while encrypting the state and memory." — Source: [Chainlink Today Podcast]
  6. On MEV Mitigation: "By keeping transaction details encrypted until execution, we can fundamentally eliminate most forms of malicious maximal extractable value on the blockchain." — Source: [DroomDroom Interview]
  7. On Decentralized Identity: "Smart contracts that can process private data enable users to prove their identity or creditworthiness without actually revealing the underlying documents." — Source: [Smart Contract Security Paper]
  8. On Data Tokenization: "Privacy-preserving blockchains allow us to tokenize data, turning raw information into a secure, tradable asset where the owner dictates usage terms." — Source: [Captivate.fm Interview]
  9. On Enterprise Adoption: "Healthcare and financial institutions will only run logic on a public ledger if there is a mathematical or hardware-backed guarantee of confidentiality." — Source: [Oasis Labs Blog]
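
The front-running problem in items 4 and 6 comes from transaction details being visible before execution. The sketch below illustrates the general idea of hiding details until they can no longer be exploited, using a simple commit-reveal scheme. This is a swapped-in technique for illustration, not how Oasis or Sapphire implement confidentiality, and the function names and order format are hypothetical.

```python
import hashlib
import hmac
import secrets

def commit(order: bytes, salt: bytes) -> str:
    """Publish only a hash; the order itself stays hidden from observers."""
    return hashlib.sha256(salt + order).hexdigest()

def reveal_ok(commitment: str, order: bytes, salt: bytes) -> bool:
    """Check a revealed order against the commitment published earlier."""
    return hmac.compare_digest(commitment, commit(order, salt))

# Phase 1: the trader publishes a commitment. The fixed-length random salt
# stops observers from guessing the order by hashing likely candidates.
salt = secrets.token_bytes(16)
order = b"BUY 100 TOKEN @ 1.25"   # hypothetical order format
commitment = commit(order, salt)

# Phase 2: once ordering is fixed, the trader reveals the order and salt.
honest = reveal_ok(commitment, order, salt)
tampered = reveal_ok(commitment, b"BUY 900 TOKEN @ 1.25", salt)
```

Commit-reveal needs two rounds and leaks the fact that an order exists, which is part of the motivation for the hardware-backed confidential execution described above.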

Part 4: Data as Property and User Rights

  1. On the Current Paradigm: "Users are currently forced to choose between using convenient services and maintaining their personal privacy. This binary is a failure of technology." — Source: [UC Berkeley CS Lectures]
  2. On Data Monopolies: "Individuals are treated as products for big organizations, generating data that trains models they neither control nor benefit from." — Source: [Captivate.fm Interview]
  3. On User Value: "Technology should maximize user value, not corporate value. We need systems that act as secure intermediaries for third-party data analysis." — Source: [Brown University Guest Lecture]
  4. On Data Sovereignty: "You should be able to grant a machine learning model temporary access to compute over your data, knowing it cannot copy or memorize the raw inputs." — Source: [DroomDroom Interview]
  5. On the Data Economy: "We envision a responsible data economy where users receive financial compensation when their anonymized data is used to improve commercial AI models." — Source: [Dawn Song Official Website]
  6. On Consent: "Consent today is a broken system of unreadable terms of service. Cryptographic controls can enforce granular, revocable consent at the protocol level." — Source: [Captivate.fm Interview]
  7. On Federated Learning: "Instead of sending all user data to a central server, we can send the model to the user's device, train it locally, and only aggregate the encrypted updates." — Source: [Brown University Guest Lecture]
  8. On Differential Privacy: "By injecting calibrated noise into datasets, we can allow researchers to extract population-level statistics while mathematically guaranteeing the privacy of any single individual." — Source: [Security and AI Research Summit]
  9. On the Creator Economy: "Data as property extends to digital art and content creation. Provenance mechanisms ensure creators are credited and compensated as AI remixes their work." — Source: [Oasis Labs Blog]
  10. On Regulation vs. Technology: "While policy and regulation are necessary to protect users, they are insufficient on their own. We must build the enforcement mechanisms directly into the code." — Source: [UC Berkeley CS Lectures]
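
The calibrated noise mentioned in item 8 can be sketched with the Laplace mechanism, a standard way to achieve differential privacy for counting queries. In this common formulation the noise is added to the query answer rather than to the stored records. The function names, the sample records, and the value of `epsilon` are assumptions for illustration; a production system would rely on a vetted library.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng):
    """Release a count with epsilon-differential privacy."""
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one person changes a count by at most 1
    # (sensitivity 1), so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical records: ages of survey participants.
ages = [23, 35, 41, 29, 52, 67, 38, 45, 31, 58]
rng = random.Random(0)
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of participants aged 40+: {noisy:.2f}")
```

A smaller `epsilon` means more noise and stronger privacy. Individual answers are perturbed, but averages over many such releases stay close to the true count.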

Part 5: Formal Verification and Secure-by-Design

  1. On the Promise of Verification: "The most promising path for defenders is proactive, secure-by-design approaches using AI-assisted formal verification to eliminate entire classes of vulnerabilities." — Source: [FAR.AI Security Forum]
  2. On Provable Security: "We want to move from empirical testing, which only shows the presence of bugs, to mathematical proofs that show the absolute absence of specific flaws." — Source: [UC Berkeley CS Lectures]
  3. On Scaling Proofs: "Historically, formal verification was too labor-intensive for large codebases. Large language models are now making it feasible to auto-generate proof invariants for complex software." — Source: [Afternoon Cyber Tea Podcast]
  4. On Hardware Verification: "Security guarantees at the software layer mean nothing if the underlying hardware architecture has side-channel vulnerabilities like Spectre or Meltdown." — Source: [Smart Contract Security Paper]
  5. On Smart Contract Audits: "Manual audits of smart contracts are prone to human error. We need automated theorem provers that can verify a contract adheres to its intended economic logic before deployment." — Source: [Chainlink Today Podcast]
  6. On Code Generation: "If an AI is generating code for a developer, it must simultaneously generate the formal proofs confirming that the code does not contain buffer overflows or reentrancy bugs." — Source: [FAR.AI Security Forum]
  7. On Memory Safety: "Transitioning to memory-safe languages like Rust is a practical first step, but formal verification takes us further by ensuring the semantic correctness of the application." — Source: [Security and AI Research Summit]
  8. On Legacy Systems: "Retrofitting security onto legacy C++ codebases is incredibly difficult; we use symbolic execution to map out the state space and identify edge-case vulnerabilities." — Source: [UC Berkeley CS Lectures]
  9. On Cryptographic Implementation: "The math behind a cryptographic protocol is usually sound, but the software implementation is where bugs happen. We must verify the code maps exactly to the mathematical specification." — Source: [Smart Contract Security Paper]
  10. On the Future of Defense: "When every piece of critical infrastructure runs on provably secure code, attackers will be forced to target the physical hardware or human operators instead." — Source: [Brown University Guest Lecture]
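
The gap between testing and proof in item 2 can be illustrated on a tiny scale. The sketch below checks a property for every possible input of an 8-bit midpoint computation. Exhaustive enumeration only works because the domain is small; real formal verification tools reason symbolically about much larger state spaces. The function names and the 8-bit width are assumptions made for this example.

```python
BITS = 8
MASK = (1 << BITS) - 1  # largest value of an unsigned 8-bit integer

def mid_naive(lo, hi):
    # The sum wraps around at 2**BITS, as it would in a fixed-width register.
    return ((lo + hi) & MASK) // 2

def mid_safe(lo, hi):
    # Never forms a value larger than hi, so it cannot overflow.
    return lo + (hi - lo) // 2

def counterexample(mid):
    """Return the first (lo, hi) with lo <= hi that violates
    lo <= mid(lo, hi) <= hi, or None if the property holds everywhere."""
    for lo in range(MASK + 1):
        for hi in range(lo, MASK + 1):
            if not lo <= mid(lo, hi) <= hi:
                return (lo, hi)
    return None

print("naive:", counterexample(mid_naive))
print("safe: ", counterexample(mid_safe))
```

A handful of spot checks with small numbers would pass for both versions. Only the check over the full input space shows that the naive version fails and that the safe version holds for every 8-bit input.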

Part 6: Agentic AI and Emerging Risks

  1. On Autonomous Capabilities: "Agentic AI is fundamentally different from a chatbot. It observes an environment, reasons over multiple steps, and executes actions using external tools." — Source: [Medium Summary]
  2. On Expanded Attack Surfaces: "When AI models are given API access to write files, send emails, or execute shell commands, the potential blast radius of an adversarial attack expands dramatically." — Source: [FAR.AI Security Forum]
  3. On Prompt Injection: "Agentic systems are highly susceptible to indirect prompt injection, where malicious instructions are hidden on a webpage the agent is tasked to summarize." — Source: [Afternoon Cyber Tea Podcast]
  4. On Goal Misalignment: "An autonomous agent optimizing for a specific metric might find a destructive shortcut to achieve its goal if the constraints are not rigorously defined." — Source: [Security and AI Research Summit]
  5. On AI Safety: "AI safety is focused on preventing the system from causing harm to users or the environment, which is distinct from protecting the system itself." — Source: [Medium Summary]
  6. On Sandboxing Agents: "We must build strict containment environments for AI agents, ensuring they operate with least-privilege access and require human approval for high-stakes actions." — Source: [UC Berkeley CS Lectures]
  7. On Benchmarking: "We lack robust frameworks for testing how autonomous agents behave in complex, adversarial environments. Current benchmarks are too static to measure dynamic reasoning." — Source: [FAR.AI Security Forum]
  8. On Multi-Agent Systems: "When multiple AI agents interact, negotiate, and trade with one another, we face new systemic risks and unpredictable emergent behaviors." — Source: [Dawn Song Official Website]
  9. On Deception: "Advanced models can learn to output what the evaluator wants to hear during training, while harboring different behavioral tendencies that manifest during deployment." — Source: [Brown University Guest Lecture]
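
The containment idea in item 6 can be sketched as a gate that sits between an agent and its tools. The example below is a minimal illustration of least privilege plus human approval. The class name `ToolGate`, the tool names, and the approval callback are all hypothetical, and a real sandbox would also need process-level and network-level isolation.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ToolGate:
    allowed: set                 # tools the agent may use at all
    needs_approval: set          # subset that requires a human decision
    approve: Callable            # hypothetical callback: (tool, args) -> bool
    log: list = field(default_factory=list)

    def call(self, tool, func, *args):
        """Run func(*args) only if the tool is permitted and, where
        required, a human has approved this specific call."""
        if tool not in self.allowed:
            self.log.append(("denied", tool))
            raise PermissionError(f"tool not permitted: {tool}")
        if tool in self.needs_approval and not self.approve(tool, args):
            self.log.append(("rejected", tool))
            raise PermissionError(f"human approval refused: {tool}")
        self.log.append(("ran", tool))
        return func(*args)

# Example policy: reading files is allowed, sending email needs sign-off,
# and anything else (such as shell access) is denied outright.
gate = ToolGate(
    allowed={"read_file", "send_email"},
    needs_approval={"send_email"},
    approve=lambda tool, args: False,  # stand-in for a real approval prompt
)
```

Routing every tool call through one gate also yields an audit log, which helps when reconstructing what an agent did after an indirect prompt injection.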

Part 7: Confidential Computing and Enclaves

  1. On Trusted Execution Environments: "Hardware enclaves like Intel SGX provide a secure isolated region of memory. The operating system can manage the hardware, but it cannot peek inside the enclave." — Source: [Oasis Labs Blog]
  2. On Cloud Privacy: "Confidential computing shifts the trust model. You no longer have to trust the cloud provider or their system administrators with your unencrypted data." — Source: [UC Berkeley CS Lectures]
  3. On Data in Use: "We have solved encryption for data at rest and data in transit. Confidential computing solves the final piece: keeping data encrypted while it is being processed." — Source: [Captivate.fm Interview]
  4. On Secure Aggregation: "Multiple hospitals can place patient data into a trusted execution environment, train a diagnostic model, and extract the model without ever seeing each other's records." — Source: [Brown University Guest Lecture]
  5. On Side-Channel Mitigation: "Hardware enclaves are not perfect. We must design algorithms to be oblivious, meaning their memory access patterns do not leak information about the underlying data." — Source: [Smart Contract Security Paper]
  6. On Attestation: "Remote attestation allows a user to cryptographically verify that the cloud server is running the exact, unmodified code inside a genuine hardware enclave." — Source: [Oasis Labs Blog]
  7. On Key Management: "If the keys to the enclave are generated and held exclusively within the hardware, even a subpoena to the cloud provider cannot force the decryption of the data." — Source: [DroomDroom Interview]
  8. On Blockchain Integration: "Oasis leverages trusted execution environments in its compute layer, allowing smart contracts to run securely off-chain and commit only the encrypted results to the ledger." — Source: [Chainlink Today Podcast]
  9. On Performance Overhead: "There is always a performance cost to encryption and isolation, but advances in hardware are making confidential computing fast enough for real-time machine learning inference." — Source: [Dawn Song Official Website]
  10. On Open Source Hardware: "To build ultimate trust, the hardware designs for these enclaves must eventually move toward open-source architectures like RISC-V, allowing public auditing." — Source: [UC Berkeley CS Lectures]
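
The attestation flow in item 6 can be sketched in a few lines. The example below is a heavily simplified model: real attestation in systems such as Intel SGX relies on asymmetric signatures rooted in the hardware manufacturer, whereas this sketch substitutes an HMAC with a shared key so that it runs with the standard library alone. The function names and the key handling are assumptions for illustration.

```python
import hashlib
import hmac
import os

def measure(code: bytes) -> bytes:
    """Measurement of the enclave contents: a hash of the loaded code."""
    return hashlib.sha256(code).digest()

def make_quote(hw_key: bytes, loaded_code: bytes, nonce: bytes):
    """Enclave side: bind the measurement to the verifier's fresh nonce."""
    m = measure(loaded_code)
    tag = hmac.new(hw_key, m + nonce, hashlib.sha256).digest()
    return m, tag

def verify_quote(hw_key: bytes, expected_code: bytes, nonce: bytes, m, tag) -> bool:
    """Verifier side: the quote must be authentic and must match the
    exact code the user expects to be running."""
    expected_tag = hmac.new(hw_key, m + nonce, hashlib.sha256).digest()
    return (hmac.compare_digest(tag, expected_tag)
            and hmac.compare_digest(m, measure(expected_code)))

hw_key = os.urandom(32)          # stand-in for the hardware-rooted key
nonce = os.urandom(16)           # fresh per request, prevents replay
audited = b"def handler(x): return x + 1"
modified = b"def handler(x): leak(x); return x + 1"

good = verify_quote(hw_key, audited, nonce, *make_quote(hw_key, audited, nonce))
bad = verify_quote(hw_key, audited, nonce, *make_quote(hw_key, modified, nonce))
```

The check fails as soon as the loaded code differs from the audited code by a single byte, which is the property that lets a user trust a remote machine they do not control.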

Part 8: The Dual-Use Nature of AI

  1. On Symbiosis: "Security can really help AI, and AI needs security. The two fields are fundamentally intertwined and will dictate the trajectory of future computing." — Source: [Brown University Guest Lecture]
  2. On Defensive Scaling: "We must use AI to write better fuzzers, analyze massive logs for intrusion patterns, and generate patches faster than human analysts ever could." — Source: [Afternoon Cyber Tea Podcast]
  3. On Offensive Automation: "The same language model that explains a complex vulnerability to a security researcher can be prompted to write an exploit for that exact vulnerability." — Source: [FAR.AI Security Forum]
  4. On Phishing: "Generative AI drastically lowers the cost of spear-phishing. Attackers can now generate flawless, highly personalized lures at a massive scale." — Source: [Medium Summary]
  5. On Model Democratization: "Open-sourcing powerful models accelerates defensive research and transparency, but it permanently removes our ability to prevent malicious actors from using them." — Source: [Security and AI Research Summit]
  6. On Deepfakes: "The dual-use of generative video means we can create incredible art and educational content, but we simultaneously face a crisis of synthetic evidence and impersonation." — Source: [Captivate.fm Interview]
  7. On AI Governance: "Regulating the mathematics of a model is nearly impossible. Governance must focus on the deployment layer and the applications built on top of the technology." — Source: [Dawn Song Official Website]
  8. On the Talent Gap: "The speed of AI development outpaces our ability to train security professionals. We must rely on AI assistants to augment our existing security operations centers." — Source: [Afternoon Cyber Tea Podcast]
  9. On Long-Term Optimism: "Despite the risks, I remain optimistic. If we prioritize secure-by-design principles today, AI will ultimately favor the defender and create a more resilient digital infrastructure." — Source: [UC Berkeley CS Lectures]