Visual summary of operating lessons from John Tukey.

John Tukey was an American statistician and mathematician who spent his career at Princeton and Bell Labs arguing for practical data exploration over rigid formulas. He coined the term "bit", is credited with the first published use of "software" in its computing sense, and invented visual tools like the box plot. This collection gathers his rules, quotes, and frameworks for approaching data, teaching, and scientific collaboration.

Part 1: The Philosophy of Approximation

  1. On the right question: "Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise." — Source: [The Future of Data Analysis]
  2. On being right: "Be approximately right rather than exactly wrong." — Source: [WikiQuote]
  3. On precision: "There is no point in being precise when you don't know what you're talking about." — Source: [AZ Quotes]
  4. On exactness vs approximation: "An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem." — Source: [Motley Bytes]
  5. On effort: "If a thing is not worth doing, it is not worth doing well." — Source: [WordPress Collection]
  6. On prioritizing problems: "Statistical analysis should focus on what the data actually says rather than forcing it to answer perfectly a hypothetical question that does not matter." — Source: [Statistical Science Interview]
  7. On empirical science: "Data analysis should be treated as an empirical science rather than just a branch of pure mathematics." — Source: [Berkeley.edu]
  8. On models: "The purpose of models is not to fit the data, but to sharpen the questions." — Source: [Bookey]
  9. On rigidity: "Traditional mathematical statistics are often too rigid for the messy reality of experimental data." — Source: [Wikipedia]
  10. On discovering truth: "The true value of a statistical method is in its ability to reveal something useful, not in its mathematical elegance." — Source: [Amstat Videos]

Part 2: Exploratory Data Analysis (EDA)

  1. On EDA as detective work: "Exploratory data analysis is detective work—numerical detective work—or counting detective work—or graphical detective work." — Source: [Nveil]
  2. On the nature of EDA: "Exploratory data analysis is an attitude, a flexibility, and a reliance on insight rather than blindness to help solve real world statistical problems." — Source: [Bookey]
  3. On learning order: "It is important to understand what you CAN DO before you learn to measure how WELL you seem to have DONE it." — Source: [Today in Science]
  4. On finding clues: "Unless the detective finds the clues, judge or jury has nothing to consider." — Source: [Nveil]
  5. On exploratory vs confirmatory: "Statistics had become too focused on confirmatory analysis (p-values) at the expense of looking at the data without preconceived notions." — Source: [Querio]
  6. On visualization rules: "The three core rules of data analysis are: Make a picture. Make a picture. Make a picture." — Source: [SAS]
  7. On the value of pictures: "The greatest value of a picture is when it forces us to notice what we never expected to see." — Source: [AZ Quotes]
  8. On basic tools: "If we need a short suggestion of what exploratory data analysis is... It is an attitude; a flexibility; and some graph paper." — Source: [St. Andrews]
  9. On numerical vs graphical: "Numerical quantities focus on expected values, graphical summaries on unexpected values." — Source: [SAS]
  10. On inventing the boxplot: "The box-and-whisker plot was designed to help researchers immediately see the structure of their data before applying formal statistical tests." — Source: [Medium]

Part 3: The Reality of Data

  1. On extracting answers: "The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data." — Source: [Bookey]
  2. On magic wands: "Data may not contain the answer... Statistics is not a magic wand." — Source: [WikiQuote]
  3. On shaping data: "Data is like clay. With the right tools, you can shape it into something meaningful." — Source: [AZ Quotes]
  4. On pie charts: "There is no data that can be displayed in a pie chart, that cannot be displayed BETTER in some other type of chart." — Source: [St. Andrews]
  5. On the process: "Data analysis is a process, not a single event." — Source: [Motley Bytes]
  6. On robustness: "Data methods must be robust and resistant so they aren't easily ruined by a few extreme outliers." — Source: [Scribd]
  7. On data contamination: "Many optimal classical methods like least squares can lose much of their efficiency when data are even slightly contaminated by heavier-tailed, non-normal distributions." — Source: [Scribd]
  8. On outliers: "A data point more than 1.5 times the Interquartile Range (IQR) above the third quartile or below the first quartile should be flagged as a potential outlier." — Source: [Medium]
  9. On messy data: "The rigid nature of standard statistics is unsuited to the inherently messy and unpredictable nature of real-world experimental data." — Source: [Wikipedia]
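
The 1.5 x IQR rule in item 8 can be sketched in a few lines of Python. This is a minimal illustration, not Tukey's own procedure: the function names `tukey_fences` and `flag_outliers` and the sample data are made up for this example, and the quartiles come from the standard library's "inclusive" method, which is only one of several conventions (Tukey's original hinges can differ slightly on small samples).

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    Assumption: quartiles use statistics.quantiles(method="inclusive");
    other quartile conventions can move the fences slightly.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values that fall outside the fences (potential outliers)."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample: nine ordinary readings and one extreme value.
data = [2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
print(tukey_fences(data))   # (-2.5, 15.5)
print(flag_outliers(data))  # [50]
```

Note that the rule only flags a point for a closer look. Whether 50 is a recording error or the most interesting observation in the sample is a question for the analyst, not the formula.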

Part 4: Coining Terms and Language

  1. On naming "bit": "Well, isn't the word obviously bit?" — Source: [Princeton University]
  2. On rejecting clunky terms: "Before settling on 'bit', alternative terms like 'binit' and 'bigit' were considered and dismissed." — Source: [Princeton University]
  3. On defining "software": "Today the 'software' comprising the carefully planned interpretive routines, compilers, and other aspects of automative programming are at least as important to the modern electronic calculator as its 'hardware'..." — Source: [American Mathematical Monthly]
  4. On anagrams in science: "He coined the term 'cepstrum' as an anagram of 'spectrum' to describe techniques for analyzing time-series data." — Source: [Wikipedia]
  5. On resampling techniques: "He introduced the term 'jackknife' for a method of estimating the bias and variance of a statistic by systematically leaving out observations." — Source: [Scribd]
  6. On visual terminology: "He popularized the 'stem-and-leaf display' as a way to group data points without losing the original numbers." — Source: [Medium]
  7. On language as a tool: "Tukey acted as an amateur linguist for mathematics, creating words that helped solidify abstract concepts into concrete tools." — Source: [Princeton University]
  8. On the first use of bit: "The word 'bit' first entered the formal scientific record in Claude Shannon’s 1948 paper, crediting Tukey for the suggestion." — Source: [Wikipedia]
  9. On defining the EDA field: "By naming 'Exploratory Data Analysis,' he successfully created a distinct vocabulary to separate it from Confirmatory Data Analysis." — Source: [St. Andrews]
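
The leave-one-out idea behind the jackknife in item 5 can be sketched as follows. This is a textbook version under stated assumptions, not a transcription of Tukey's papers: the function name `jackknife` is hypothetical, the input is assumed to be a plain list with at least two observations, and `statistic` is any function that maps a list to a number.

```python
import statistics


def jackknife(values, statistic):
    """Leave-one-out jackknife for a statistic of a list of numbers.

    Returns (bias_corrected_estimate, bias_estimate, variance_estimate).
    """
    n = len(values)
    full = statistic(values)
    # Recompute the statistic n times, each time leaving out one observation.
    leave_one_out = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    loo_mean = sum(leave_one_out) / n
    bias = (n - 1) * (loo_mean - full)
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out)
    return full - bias, bias, variance


# Hypothetical sample.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]

# The plug-in variance (dividing by n) is biased; the jackknife correction
# recovers the usual unbiased sample variance (dividing by n - 1).
corrected, bias, _ = jackknife(sample, statistics.pvariance)
print(corrected, statistics.variance(sample))
```

For the sample mean the estimated bias is zero and the jackknife variance reduces to the familiar s^2 / n, which is a quick sanity check on any implementation.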

Part 5: Computing and Algorithms

  1. On the FFT algorithm: "The Fast Fourier Transform, co-developed with James Cooley, reduced the cost of computing a discrete Fourier transform from O(N^2) to O(N log N) operations, revolutionizing digital signal processing." — Source: [ResearchGate]
  2. On algorithmic impact: "The FFT algorithm became the backbone for modern telecommunications, audio/video compression, and medical imaging." — Source: [ResearchGate]
  3. On solving problems mentally: "He was known to solve complex university scheduling problems in his head while lying flat on his back on a table." — Source: [Princeton University]
  4. On election night algorithms: "He developed the early computer algorithms used by NBC to project presidential election winners from early voting returns." — Source: [Project Euclid]
  5. On early computing: "He recognized early on that the instructions fed into machines, the software, would require as much rigorous planning as the physical hardware itself." — Source: [Automox]
  6. On detecting signals: "He utilized advanced spectral analysis to detect faint signals, specifically applying it to the detection of underground nuclear tests." — Source: [Wikipedia]
  7. On computing resources: "He understood that computational efficiency was the only way to make complex data analysis feasible on mid-20th-century hardware." — Source: [ResearchGate]
  8. On multiple comparisons: "He developed Tukey's Honestly Significant Difference (HSD) test so that analysts comparing many group means at once do not find false positives simply because they ran a large number of comparisons." — Source: [Wikipedia]
  9. On downweighting extremes: "He created Tukey’s Biweight, a robust estimator that automatically downweights outliers so they do not disproportionately skew the results." — Source: [Scribd]
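
The drop from O(N^2) to O(N log N) in item 1 comes from splitting a transform of length N into two transforms of length N/2 and reusing their results. The sketch below is the common textbook radix-2 form of that idea, not the exact formulation of the 1965 paper: the function names `naive_dft` and `fft` are chosen for this example, and the fast version assumes the input length is a power of two.

```python
import cmath


def naive_dft(x):
    """Direct discrete Fourier transform: N output terms, each a sum of N products."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 FFT (assumes len(x) is a power of two)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# Hypothetical signal of length 8; both routes should agree up to rounding.
signal = [1, 2, 3, 4, 0, 0, 0, 0]
fast = fft(signal)
slow = naive_dft(signal)
print(max(abs(a - b) for a, b in zip(fast, slow)))
```

Each level of recursion does O(N) work and there are log2(N) levels, which is where the O(N log N) total comes from. Production code would normally call an optimized library routine rather than a recursive sketch like this one.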

Part 6: Teaching and Students

  1. On first experiences: "The first time I was in a statistics course, I was there to teach it." — Source: [WordPress Collection]
  2. On teaching constraints: "Teaching data analysis is not easy, and the time allowed is always far from sufficient." — Source: [Princeton University]
  3. On seminar rules: "They need not know the answers to the problems they talked about." — Source: [Biographical Memoirs]
  4. On open questioning: "The audience could ask almost any question any time." — Source: [Biographical Memoirs]
  5. On creating a safe space: "Remarks were never to be held against the one speaking out." — Source: [Biographical Memoirs]
  6. On crosswords and certainty: "Doing statistics is like doing crosswords except that one cannot know for sure whether one has found the solution." — Source: [Biographical Memoirs]
  7. On department building: "He was the driving force behind the creation of Princeton’s Department of Statistics in 1965, serving as its first chairman." — Source: [UChicago]
  8. On learning by doing: "He gave his students crossword books with the answer keys removed to teach them how to navigate analytical ambiguity." — Source: [Biographical Memoirs]
  9. On lecture style: "He possessed an eccentric delivery style, sometimes ambling to the podium in baggy clothes and sitting cross-legged on a table to answer student questions." — Source: [Amstat Videos]

Part 7: Collaboration and Consulting

  1. On playing in backyards: "The best thing about being a statistician is that you get to play in everyone's backyard." — Source: [AZ Quotes]
  2. On consulting: "A consultant is a man who thinks with other people's brains." — Source: [WordPress Collection]
  3. On responsibility: "The statistician cannot evade the responsibility for understanding the process he applies or recommends." — Source: [WikiQuote]
  4. On the dual career: "For four decades, he seamlessly maintained simultaneous roles as a leading researcher at Bell Labs and a professor at Princeton." — Source: [Berkeley.edu]
  5. On interdisciplinary science: "He viewed statistics not as an isolated mathematical branch, but as the ultimate interdisciplinary tool for all sciences." — Source: [IMStat]
  6. On chemistry and math: "Having started in chemistry before falling over the fence into mathematics, he always maintained a cross-disciplinary perspective in collaborative work." — Source: [Princeton University]
  7. On understanding the subject: "He insisted that statisticians must immerse themselves in the subject matter of their collaborators rather than just blindly processing their numbers." — Source: [Project Euclid]
  8. On Bell Labs culture: "Working at Bell Labs alongside figures like Claude Shannon provided the perfect environment for his collaborative approach to applied problem-solving." — Source: [Wikipedia]
  9. On practical utility: "He prioritized practical utility over mathematical perfection, making him an ideal partner for engineers and applied scientists." — Source: [St. Andrews]
  10. On the omnipresent work ethic: "Colleagues and family described his work habits as omnipresent, always ready to discuss a problem regardless of the time or location." — Source: [Amstat Videos]

Part 8: Science and Public Policy

  1. On sampling bias: "A random selection of three people would have been better than a group of 300 chosen by Kinsey." — Source: [Wikipedia]
  2. On survey methodology: "His work evaluating the Kinsey Report became a landmark defense of the absolute necessity of probability sampling in social science." — Source: [Wikipedia]
  3. On environmental advocacy: "He chaired the National Research Council committee that issued early warnings about chlorofluorocarbons depleting the Earth's ozone layer." — Source: [MBounthavong]
  4. On global policy: "In 1972, he served as a U.S. delegate to the UN Conference on the Human Environment, addressing global environmental degradation." — Source: [MBounthavong]
  5. On military defense: "During WWII, he worked at the Fire Control Research Office, designing trajectory systems for B-29 bombers and anti-aircraft guns." — Source: [Wikipedia]
  6. On code-breaking: "His wartime mathematical work included significant contributions to cryptographic efforts deciphering German Enigma communications." — Source: [Wikipedia]
  7. On the U-2 spy plane: "His expertise in signal processing and data analysis led him to work on classified national security projects, including the U-2 spy plane program." — Source: [Project Euclid]
  8. On policy communication: "He believed that scientists had a public duty to communicate the statistical realities of environmental and social risks directly to policymakers." — Source: [MBounthavong]
  9. On the scientific generalist: "He epitomized the scientific generalist, refusing to limit his statistical methods to the classroom and instead applying them directly to the era's most pressing national challenges." — Source: [Wikipedia]