Toward a Science of Reproducibility
Wolfgang Pauli was one of the second-tier founding fathers of quantum mechanics, just below the top rank of Planck, Bohr, Heisenberg, Dirac, and Schrödinger. Physicists joke that whenever Pauli walked into a laboratory, the experiment would go haywire, and the apparatus would have to be restarted after he was gone.
Physics lore has it that there was a horseshoe hanging over the door to the laboratory of Niels Bohr. A visitor asked, “Surely, Dr Bohr, you don’t believe in the old superstition that a horseshoe can bring good luck?” Bohr replied, “Of course not, but the charm works even for people who don’t believe in it.”
Subjectivity has posed a fundamental problem for the scientific method ever since the earliest formulation of QM (the Copenhagen interpretation) explicitly included an observer in the prescription for experimental prediction. In QM, unlike classical physics, the answer you get from an experiment depends on the question you ask. The contradiction was sharpened in 1964, when J.S. Bell proved that there is no trivial interpretation in which an objective “truth” hides behind the Uncertainty Principle. Bell proved that the observer’s (subjective, free) choice of what to measure affects reality at a fundamental level, and that this effect can be felt before or after he makes his measurement. It is masked by the uncertainty principle to the extent that it cannot be used to pass information through quantum entanglement; nevertheless, Bell’s Theorem proves that the choice of what to measure has an effect on real, measurable physical outcomes far from the place where the measurement is made.
Researchers in psi and the paranormal claimed that Bell’s Theorem vindicated their science, and offered a framework for understanding results they had been chronicling for at least a century. But physicists and chemists remained confident that Bell’s Theorem was nothing they had to think about. Their results were objectively reproducible, no matter what philosophy of QM one preferred.
That all changed with the marathon experimental series of Bob Jahn and Brenda Dunne, conducted at the Princeton PEAR lab over nearly three decades. They looked for a subjective effect on a purely physical phenomenon, measuring voltage fluctuations in a source of quantum noise. They found small effects that eventually accumulated 7 sigma significance, equivalent to p < 10⁻¹¹, or odds against chance of about a hundred billion to one. The revolutionary significance of this finding is that physical measurables are affected not just by physical causes but by mental causes as well. This principle has been vindicated in other contexts in the decades since, most notably by the interference fringe experiments of Dean Radin.
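The sigma-to-probability conversion here is standard normal-distribution arithmetic, which readers can check for themselves. A back-of-the-envelope sketch (the 7-sigma figure is taken from the essay; the math is just the Gaussian tail probability):

```python
import math

def one_tailed_p(sigma):
    """One-tailed p-value for a z-score: the upper tail of the standard normal."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))

p = one_tailed_p(7)
print(p)      # roughly 1.3e-12, comfortably below 1e-11
print(1 / p)  # odds against chance: on the order of 10^11 to 10^12 to one
```

This is why a 7-sigma deviation is conventionally described as odds of very roughly a hundred billion to one against chance.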
Science since Francis Bacon has been based on the axiom that there is an objective reality, and that different people studying that reality, if they are sufficiently careful, will always perceive that same reality. After Jahn and Dunne, we are 99.999999999% sure this is false.
This is a crisis for our scientific understanding and methodology so deep that no one wants to consider the changes to the foundations of science that must logically follow. The entire world scientific community is in denial.
How can we re-formulate our understanding of science in response?
I don’t claim to have an answer to this question, but the purpose of this essay is to propose an approach. Our next step toward a scientific worldview that comprehends the experimenter and the apparatus (the subject and the object) is to study reproducibility in a wide range of contexts—physics, chemistry, biology, psychology, and parapsychology. Let’s map out the reproducibility of a wide range of different experiments, and see if we can find the correlates. What does reproducibility depend on? Perhaps from an empirical science of reproducibility, we can then take a step toward a more theoretical science of subject and object.
We’re not starting from scratch. Reproducibility has been studied in a wide range of scientific disciplines, and there are generalities that most scientists believe. Here are some elements of conventional wisdom—a starting place which I believe to be pretty far from the truth.
- Experiments in parapsychology are the least reproducible, with effects all over the map.
- Experiments in psychology are much more reproducible, but are difficult because human subjects are different, so you can never really repeat the same experiment.
- Experiments in medicine depend, similarly, on the individuals involved. In addition, most of the problems with reproducibility in biomedicine can be traced to laboratories or investigators that have a financial interest in the outcome.
- Experiments in biology of genetically identical plants or animals are statistically repeatable.
- Experiments in biochemistry are technically difficult, but in principle should be absolutely repeatable.
- Experiments in inorganic chemistry are repeatable.
- Experiments in macroscopic physics are absolutely repeatable.
- Experiments in microscopic physics are perfectly repeatable, but only on a statistical basis. At the level of single quanta, half the information is always missing, and there is an element of pure chance in every measurement. Since this is “pure chance”, we can completely understand the difference between any two runs of the same experiment (at different times or in different labs) based on mathematical statistics alone.
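That last piece of conventional wisdom can be made concrete: if each run of a quantum experiment really is a sequence of independent chance outcomes, then the spread between any two runs is fully predicted by binomial statistics. Here is a toy sketch of that claim, using a seeded pseudorandom generator as a stand-in for a quantum noise source (the function and lab names are illustrative, not any particular apparatus):

```python
import math
import random

def run_experiment(n, p=0.5, seed=None):
    """Simulate n binary quantum measurements, each a 'hit' with probability p."""
    rng = random.Random(seed)
    return sum(rng.random() < p for _ in range(n))

n = 100_000
hits_a = run_experiment(n, seed=1)  # "lab A"
hits_b = run_experiment(n, seed=2)  # "lab B"

# Under pure chance, hits_a - hits_b is approximately normal with
# mean 0 and standard deviation sqrt(2 * n * p * (1 - p)).
sd = math.sqrt(2 * n * 0.5 * 0.5)
z = (hits_a - hits_b) / sd
print(f"difference between runs, in sigma units: {z:.2f}")
```

On this view, two labs disagree only within a few sigma of each other, and the size of the disagreement needs no explanation beyond the mathematics of chance. The PEAR claim described above is precisely that this accounting sometimes fails.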
We don’t know what the landscape might look like a decade from now after the science of reproducibility is systematically mapped. But we know already that the conventional understanding is wrong at just about every turn.
- Experiments in parapsychology are as reproducible as other psychology experiments, perhaps more reproducible because the researchers have learned (in response to reductionist critics) to be so cautious about possible confounders in their experiments.
- There is a well-established “experimenter effect” in psychology experiments. Meta-analyses of whole classes of experiments have led to the discovery of time-reversed causation and other anomalies.
- John Ioannidis, in the most cited study in the history of epidemiology, has convinced us that the vast majority of biomedical studies are unreproducible.
- Genetically identical individuals have personalities of their own. Even single-cell protists behave in individual ways.
- Anyone who works in a biology lab knows about the “batch effect” — the experimenter running the same experiment in the same lab on the same equipment gets different results.
- Rupert Sheldrake has compiled lists of melting points that change systematically over time, generally becoming higher. He explains this with his theory of morphic resonance, according to which physical laws are habits of the universe that can become established with repetition.
- Some physical measurements seem to be far more reproducible than others. This is not just a question of wide error bars for some experiments and narrow error bars for others. There are physical constants that seem to change over time in a way that can’t be explained by the availability of better and better experimental techniques. Measurements of the gravitational constant and of the speed of light both seem to have changed systematically over the 20th Century. I wrote about this two years ago in an essay titled “The Zeroth Law of Science.”
- The absolute randomness of quantum uncertainty is built into the foundations of quantum theory, and, as claimed above, the experiments of Jahn and Dunne erode that foundation.
The bottom line
I hope I’ve convinced you that our map of the reproducibility landscape across experimental domains is all wrong. Perhaps I’ve convinced you that surveying that landscape anew would be a useful step toward a science of the subjective.
Who wants to join me in taking this on?