by Jim Baggott
No facts without theory
The process of building a body of accepted scientific facts is often fraught with difficulty, and rarely runs smoothly. We might be tempted to think that once we have built it, this body of evidence forms a clear, neutral, unambiguous substrate on which scientific theories can be contrived. Surely the facts form a ‘blank sheet of paper’, on which the theorists can exercise their creativity?
But this is not the case. It is in fact impossible to make an observation or perform an experiment without the context of a supporting theory in some shape or form. French physicist and philosopher Pierre Duhem once suggested that we go into a laboratory and ask a scientist performing some basic experiments on electrical conductivity to explain what he is doing:
Is he going to answer: ‘I am studying the oscillations of the piece of iron carrying this mirror?’ No, he will tell you that he is measuring the electrical resistance of a coil. If you are astonished, and ask him what meaning these words have, and what relation they have to the phenomena he has perceived and which you at the same time perceived, he will reply that your question would require some long explanations, and he will recommend that you take a course in electricity.10
Facts are never theory-neutral; they are never free of contamination from some theory or other. As we construct layer upon layer of theoretical understanding of phenomena, the concepts of our theories become absorbed into the language we use to describe the phenomena themselves. Facts and theory become hopelessly entangled.
If you doubt this, just look back over the previous paragraphs concerning the search for the Higgs boson at CERN.
This brings us to our second principle.
The Fact Principle. Our knowledge and understanding of empirical reality are founded on verified scientific facts derived from careful observation and experiment. But the facts themselves are not theory-neutral. Observation and experiment are simply not possible without reference to a supporting theory of some kind.
So how do scientists turn this hard-won body of evidence into a scientific theory?
Theory from facts: anything goes?
The naïve answer is to say that theories are derived through a process of induction. Scientists use the data to evolve a system of generalizations, built on abstract concepts. The generalizations may be elevated to the status of natural patterns or ‘laws’. The laws in turn are explained as the logical and inevitable result of the properties and behaviour of a system of theoretical concepts and theoretical entities.
A suitable example appears to be provided by the German mathematician and astronomer Johannes Kepler, who deduced his three laws of planetary motion after years spent mulling over astronomical data collected by the eccentric Dane Tycho Brahe, at Benatky Castle and Observatory near Prague.
Brahe’s painstaking observations of the motion of the planet Mars suggested a circular orbit around the sun, to within an accuracy of about eight minutes of arc. But this was not good enough for Kepler:
… if I had believed that we could ignore these eight minutes, I would have patched up my hypothesis accordingly. But since it was not permissible to ignore them, those eight minutes point the road to a complete reformulation of astronomy.11
Brahe’s observations were just too good.
In his book Astronomia Nova (New Astronomy), published in 1609, Kepler used Brahe’s data to argue that the planets move not in circular orbits around the sun, but in elliptical orbits with the sun at one focus. For this scheme to work, he had to assume that the earth behaves like just any other planet, also moving around the sun in an orbit described by an ellipse.*
This means that a planet moves closer to the sun for some parts of its orbit and further away for other parts. Kepler also noted a balance between the distance of the planet from the sun and the speed of its motion in the orbit. A planet moves faster around that part of its orbit that takes it closest to the sun, and more slowly in that part of its orbit that is more distant. An imaginary line drawn from the sun to the planet will sweep out an area as the planet moves in its orbit. Kepler deduced that the balance between speed and proximity to the sun means that no matter where the planet is in its orbit, such an imaginary line will sweep out equal areas in equal times.
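In modern notation, which Kepler himself did not have, this ‘equal areas in equal times’ rule can be stated compactly as the constancy of the areal velocity:

$$\frac{dA}{dt} = \tfrac{1}{2}\, r^2 \frac{d\theta}{dt} = \text{constant}$$

where r is the sun–planet distance and θ is the angle swept out along the orbit. As r shrinks, dθ/dt must grow to compensate: this is precisely Kepler’s observation that a planet speeds up as it approaches the sun.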
In 1618, Kepler added a third law. The cube of the mean radius of the orbit divided by the square of the period (the time taken for a planet to complete one trip around the sun) is approximately constant for all the planets in the solar system.
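The third law is easy to check for yourself. The following sketch uses rounded present-day values (semi-major axes in astronomical units, periods in Earth years, not Brahe’s original data) for the six planets known to Kepler; in these units the ratio comes out very close to 1 for all of them:

```python
# Quick check of Kepler's third law: a**3 / T**2 should be (roughly) the
# same for every planet. Semi-major axis a in astronomical units (AU),
# orbital period T in Earth years -- rounded modern values.
planets = {
    "Mercury": (0.387, 0.241),
    "Venus":   (0.723, 0.615),
    "Earth":   (1.000, 1.000),
    "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn":  (9.537, 29.457),
}

for name, (a, T) in planets.items():
    print(f"{name:8s}  a^3 / T^2 = {a**3 / T**2:.4f}")
# Every ratio agrees with 1.0 to within a fraction of one per cent.
```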
Kepler had used Brahe’s facts to develop a set of empirical laws based on the abstract concept of an elliptical orbit. In 1687, Isaac Newton deepened our understanding by devising a theory that explained the origins of Kepler’s elliptical orbits in terms of other abstract concepts — the forces acting between bodies — in three laws of motion and a law of universal gravitation.
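For the record, Newton’s law of universal gravitation states that two bodies with masses m₁ and m₂, separated by a distance r, attract each other with a force

$$F = \frac{G m_1 m_2}{r^2}$$

where G is the gravitational constant. All three of Kepler’s empirical laws follow as mathematical consequences of this inverse-square force combined with Newton’s laws of motion.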
So, this seems all very straightforward. Kepler induced his three laws from Brahe’s extensive set of accurate astronomical data. Newton then ‘stood on the shoulders of giants’, using Kepler’s conclusions, among many others, to derive his own laws, thus driving science inexorably in the direction of ultimate truth. Right?
Wrong. Whenever historians examine the details of scientific discoveries, they inevitably find only confusion and muddle, vagueness and error, good fortune often pointing the way to the right answers for the wrong reasons. Occasionally they find true genius. Theorizing involves a deeply human act of creativity. And this, like humour, doesn’t fare well under any kind of rational analysis.
Kepler himself was a mystic. He was pursuing a theory of the solar system in which the sun-centred orbits of the six planets then known were determined by a sequence of Platonic solids, set one inside the other like a Russian matryoshka doll. In arriving at his three laws, he assumed that the planets were held in their orbits by magnetic-like emanations from the sun, which dragged them around like spokes in a wheel. He inverted the roles of gravity and inertia. He made numerous basic arithmetic errors which happened to cancel out.
He stumbled on the mathematical formula that describes the planetary orbits but did not realize that this represented the equation for an ellipse. In frustration, he rejected this formula and, growing increasingly convinced that the orbit must be an ellipse instead, tried that, only to discover that it was the same formula that he’d just abandoned. ‘Ah, what a foolish bird I have been,’ he wrote.12
Not to worry. Newton figured it all out 69 years later, clearing the confusion, sorting the muddle, clarifying the vagueness and eradicating the errors. Except that he didn’t. Although there can be no doubting that Newton’s mechanics represented a tremendous advance, in truth he had simply replaced one mystery with another.
Kepler’s elliptical orbits were no longer a puzzle. They could be explained in terms of the balance between gravitational attraction and the speed with which a planet moves in its orbit. We were left instead to puzzle over the origin of Newton’s force of gravity, which appeared to act instantaneously, at a distance, and with no intervening medium to carry it. Newton himself lamented:
That gravity should be innate, inherent, and essential to matter, so that one body may act upon another, at a distance through a vacuum, without the mediation of anything else, by and through which their action and force may be conveyed from one to another, is to me so great an absurdity, that I believe no man who has in philosophical matters a competent faculty of thinking, can ever fall into it.13
He was accused of introducing ‘occult elements’ into his mechanics. This mystery would not be resolved for another two hundred years, when Einstein replaced Newton’s gravity with the interplay between matter and curved spacetime.
What this excursion into the history of science tells us is that while there can be no doubt at all that Kepler and Newton developed theories and made discoveries that represented real advances in our knowledge of empirical reality, the methods by which these advances were made do not lend themselves to a simple, convenient rationalization. It seems odd that we can’t devise a ‘scientific’ theory about how science is actually supposed to work, a universal scientific method applicable to all science for all time. But it’s a fact.
As Einstein himself admitted: ‘There is no logical path to these laws; only intuition, resting on sympathetic understanding of experience, can reach them.’
Now intuition is all about the acquisition of knowledge without inference or the use of logical reasoning. Often jammed in the door of intuition we find the foot of speculation. If it is the case that in the act of scientific creativity ‘anything goes’, then the door is wedged firmly open for all manner of speculative theorising.*
The importance of the abstract
When a theory is constructed, it will contain concepts that are more or less familiar, often depending on the degree of speculation involved. One important point to note is that the concepts that form the principal base ingredients of any scientific theory are utterly abstract.
For example, in the standard model of particle physics we find elementary particles such as electrons and quarks. These are already rather abstract conceptual entities, but they are not the ones I’m referring to here. Underpinning the theory on which the standard model is based is the abstract mathematical concept of the point particle. This is an idealization, an assumption that for mathematical convenience particles like electrons and quarks can be represented as though they have no spatial extension, with all their mass concentrated to an infinitesimally small point.
Now when the theory is used to make predictions on distance scales much larger than the particles themselves, the assumption of point particles is not likely to cause much of a problem. But as we probe ever smaller distance scales, we can expect to run into trouble. String theory was developed, in part, as a way of avoiding problems with point particles. In this case, the abstract mathematical concept of a zero-dimensional point particle is replaced by another abstract mathematical concept of a one-dimensional string (and, subsequently, many-dimensional ‘branes’).
These mathematical abstractions form a kind of ‘toolkit’ that theorists use to construct their theories. It is a toolkit of points, limits, asymptotes, infinities, infinitesimals, and much more.* There is little or nothing we can do about this abstraction, but we must not forget it is there. It will have some very important implications for scientific methodology.
The Theory Principle. Although physical theories are constructed to describe empirical facts about reality, they are nevertheless founded on abstract mathematical (we could even say metaphysical) concepts. The process of abstraction from facts to theories is highly complex, intuitive and not subject to simple, universal rules applicable to all science for all time. In the act of scientific creation, any approach is in principle valid provided it yields a theory that works.
The Theory Principle raises some obvious questions. Specifically, given the complex and intuitive nature of scientific creation, how are we supposed to know if a theory ‘works’?
Putting theories to the test
This, at least, appears to have a straightforward answer. We know a theory works because we can test it. Now this may be a test against pre-existing facts — observations or experimental data — that are necessarily different from the facts from which the theory sprang. Or it may involve a whole new set of observations or a new series of experiments specifically designed for the purpose.
There are plenty of examples from history. Einstein’s general theory of relativity correctly predicted observations of the advance in the perihelion (the point of closest approach to the sun) of the planet Mercury — a pre-existing fact that couldn’t be explained using Newton’s theory of universal gravitation. The general theory of relativity also predicted that the path of light from distant stars should be bent in the vicinity of the sun, a prediction borne out by observations during a solar eclipse in May 1919, recorded by an expedition led by British astrophysicist Arthur Eddington.*
It is when we try to push beyond the test itself in search of a more specific criterion for scientific methodology that we start to run into more problems. In the 1920s, a group of philosophers that came to be known as the Vienna Circle sought to give verification pride of place. For a theory to be scientific, they argued, it needs to be verifiable by reference to the hard facts of empirical reality. This sounds reasonable, until we realize that this leaves us with no certainty. As British philosopher Bertrand Russell put it:
But the real question is: Do any number of cases of a law being fulfilled in the past afford evidence that it will be fulfilled in the future? If not, it becomes plain that we have no ground whatever for expecting the sun to rise tomorrow … It is to be observed that all such expectations are only probable; thus we have not to seek for a proof that they must be fulfilled, but only for some reason in favour of the view that they are likely to be fulfilled.14
This is a life-or-death issue, as Russell went on to explain: ‘The man who has fed the chicken every day throughout its life at last wrings its neck instead, showing that more refined views as to the uniformity of nature would have been useful to the chicken.’15
Verifiability won’t do, argued Austrian philosopher Karl Popper. He suggested falsifiability instead. Theories can never be verified in a way that provides us with certainty, as the chicken can attest, but they can be falsified. A theory should be regarded as scientific if it can in principle be falsified by an appeal to the facts.
But this won’t do either. To see why, let’s take another look at an episode from the history of planetary astronomy.
The planet Uranus was discovered by William Herschel in 1781.** When Newton’s mechanics were used to predict what its orbit should be, the prediction was found to disagree with the observed orbit. What happened? Was this example of disagreement between theory and observation taken to falsify the basis of the calculations, and hence the entire structure of Newtonian mechanics?
No, it wasn’t.
Remember that theories are built out of abstract mathematical concepts, such as point particles or gravitating bodies treated as though all their mass is concentrated at their centres. If we think about how Newton’s laws are actually applied to practical situations, such as the calculation of planetary orbits, then we are forced to admit that no application is possible without a whole series of so-called auxiliary assumptions or hypotheses. And, when faced with potentially falsifying data, the tendency of most scientists is not to throw out an entire theoretical structure (especially one that has stood the test of time), but instead to tinker with the auxiliary assumptions.
This is what happened in this case. The auxiliary assumption that was challenged was the (unstated) one that the solar system consists of just seven planets. British astronomer John Couch Adams and French mathematician Urbain Le Verrier independently proposed that this assumption be abandoned in favour of the introduction of an as yet unobserved eighth planet that was perturbing the orbit of Uranus. In 1846 the German astronomer Johann Galle discovered the new planet, subsequently called Neptune, less than one degree from its predicted position.
This does not necessarily mean that a theory can never be falsified, but it does mean that falsifiability is not a robust criterion for a scientific method. Emboldened by his success, in 1859 Le Verrier challenged the same auxiliary assumption in attempting to solve the problem of the anomaly in the perihelion of Mercury. He proposed another as yet unobserved planet — which he called Vulcan — between the sun and Mercury itself.
No such planet could be found. When confronted by potentially falsifying data, either the theory itself or at least one of the auxiliary assumptions required to apply it must be modified, but the observation or experiment does not tell us which. In fact, in this case it was Newton’s theory of universal gravitation that was at fault.
So, neither verifiability nor falsifiability provides a sufficiently robust criterion for defining ‘science’. And yet history shows that some theories have indeed been falsified and that others have been at least temporarily ‘verified’, in the sense that they have passed all the tests that have been thrown at them so far.
It seems to me that the most important defining criterion is therefore the testability of the theory. Whether we seek to verify it or falsify it, and irrespective of what we actually do with the theory once we know the test results, to qualify as a scientific theory it should in principle be testable.
The Testability Principle. The principal requirement of a scientific theory is that it should in some way be testable through reference to existing or new facts about empirical reality. The test exposes the veracity or falsity of the theory, but there is a caveat. The working stuff of theories is itself abstract and metaphysical. Getting this stuff to apply to the facts or a test situation typically requires wrapping the abstract concepts in a blanket of auxiliary assumptions; some explicitly stated, many taken as read. This means that a test is rarely decisive. When a test shows that a theory is false, the theory is not necessarily abandoned. It may simply mean that one or more of the auxiliary assumptions are wrong.
I want to be clear that the demand for testability in the sense that I’m using this term should not be interpreted as a demand for an immediate yes-no, right-wrong evaluation. Theories take time to develop properly, and may even be perceived to fail if subjected to tests before their concepts, limitations and rules of application are fully understood. Think of testability instead as more of a professional judgement than a simple one-time evaluation.
The Testability Principle demands that scientific theories be actually or potentially capable of providing tests against empirical facts. Isn’t this rather loose? How can we tell if a novel theoretical structure has the potential for yielding predictions that can be tested? For sure, it would be a lot easier if this was all black and white. But I honestly don’t think it’s all that complicated. A theory which, despite considerable effort, shows absolutely no promise of progressing towards testability should not be regarded as a scientific theory. A theory that continually fails repeated tests is a failed theory.