by Brian Cox
To elevate all of what we just said to science rather than numerology we have some explaining to do. Firstly, we need to explain why the chemical properties are similar for elements in the same vertical column. What is clear from our scheme is that the first element in each of the first three rows starts off the process of filling levels with increasing values of n. Specifically, hydrogen starts things off with a single electron in the otherwise empty n = 1 level, lithium starts off the second row with a single electron in the n = 2 level and sodium starts the third row with a single electron in the otherwise empty n = 3 level. The third row is a little odd because the n = 3 level can hold eighteen electrons and there are not eighteen elements in the third row. We can guess at what is happening though – the first eight electrons fill up the n = 3 levels with l = 0 and l = 1, and then (for some reason) we should switch to the fourth row. The fourth row now contains the remaining ten electrons from the n = 3 levels with l = 2 and the eight electrons from the n = 4 levels with l = 0 and l = 1. The fact that the rows are not entirely correlated with the value of n indicates that the link between the chemistry and the energy-level counting is not as simple as we have been making out. However, it is now known that potassium and calcium, the first two elements in the fourth row, do have electrons in the n = 4, l = 0 level and that the next ten elements (from scandium to zinc) have their electrons in the belated n = 3, l = 2 levels.
Figure 7.2. Filling the energy levels of krypton. The dots represent electrons and the horizontal lines represent the energy levels, labelled by the quantum numbers n, l and m. We have grouped together levels with different values of m but the same values of n and l.
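The counting behind this filling scheme can be sketched in a few lines of Python. This is only a bookkeeping illustration, not a calculation of any energies: the function names `capacity`, `shell_capacity` and `fill` are our own, the filling order is simply copied from the description above, and it assumes two electrons per (n, l, m) level with m running over 2l + 1 values for each l. A few real atoms (chromium and copper, for instance) depart from this simple order.

```python
def capacity(l: int) -> int:
    """Electrons held by the levels with a given l: (2l + 1) values of m, two electrons each."""
    return 2 * (2 * l + 1)


def shell_capacity(n: int) -> int:
    """Total electrons held by all levels with a given n (l runs from 0 to n - 1)."""
    return sum(capacity(l) for l in range(n))


# Filling order up to krypton as (n, l) pairs, copied from the text:
# note that (4, 0) is filled before (3, 2).
FILLING_ORDER = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (4, 0), (3, 2), (4, 1)]


def fill(electrons: int) -> dict:
    """Place electrons into levels in the assumed order; returns {(n, l): count}."""
    occupancy = {}
    remaining = electrons
    for n, l in FILLING_ORDER:
        if remaining == 0:
            break
        placed = min(remaining, capacity(l))
        occupancy[(n, l)] = placed
        remaining -= placed
    if remaining:
        raise ValueError("this sketch only goes as far as krypton (36 electrons)")
    return occupancy


print(shell_capacity(3))   # 18: the n = 3 levels can hold eighteen electrons
print(fill(19))            # potassium: its last electron sits in the n = 4, l = 0 level
print(fill(36))            # krypton: every level in the list is full
```

The fourth row then has capacity(2) + capacity(0) + capacity(1) = 10 + 2 + 6 = 18 places, which matches the ten deferred n = 3, l = 2 electrons plus the eight n = 4 electrons described above.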
To understand why the filling up of the n = 3 and l = 2 levels is deferred until after calcium requires an explanation of why the n = 4, l = 0 levels, which contain the electrons in potassium and calcium, are of lower energy than the n = 3, l = 2 levels. Remember, the ‘ground state’ of an atom will be characterized by the lowest-energy configuration of the electrons, because any excited state can always lower its energy by the emission of a photon. So when we have been saying that ‘this atom contains these electrons sitting in those energy levels’ we are telling you the lowest energy configuration of the electrons. Of course, we have not made any attempt to actually compute the energy levels, so we aren’t really in a position to rank them in order of energy. In fact it is a very difficult business to calculate the allowed electron energies in atoms with more than two electrons, and even the two-electron case (helium) is not so easy. The simple idea that the levels are ranked in order of increasing n comes from the much easier calculation for the hydrogen atom, where it is true that the n = 1 level has the lowest energy followed by the n = 2 levels, then come the n = 3 levels and so on.
The obvious implication of what we just said is that the elements on the far right of the periodic table correspond to atoms in which a set of levels has just been completely filled. In particular, for helium the n = 1 level is full, whilst for neon the n = 2 level is full, and for argon the n = 3 level is fully populated, at least for l = 0 and l = 1. We can develop these ideas a little further and understand some important ideas in chemistry. Fortunately we aren’t writing a chemistry textbook, so we can be brief and, at the risk of dismissing an entire subject in a single paragraph, here we go.
The key observation is that atoms can stick together by sharing electrons – we will meet this idea in the next chapter when we explore how a pair of hydrogen atoms can bind to make a hydrogen molecule. The general rule is that elements ‘like’ to have all their energy levels neatly filled up. In the case of helium, neon, argon and krypton, the levels are already completely full, and so they are ‘happy’ on their own – they don’t ‘bother’ reacting with anything. For the other elements, they can ‘try’ to fill their levels by sharing electrons with other elements. Hydrogen, for example, needs one extra electron to fill its n = 1 level. It can achieve this by sharing an electron with another hydrogen atom. In so doing, it forms a hydrogen molecule, with chemical symbol H2. This is the common form in which hydrogen gas exists. Carbon has four electrons out of a possible eight in its n = 2, l = 0 and l = 1 levels, and would ‘like’ another four if possible to fill them up. It can achieve this by binding together with four hydrogen atoms to form CH4, the gas known as methane. It can also do it by binding with two oxygen atoms, which themselves need two electrons to complete their n = 2 set. This leads to CO2 – carbon dioxide. Oxygen could also complete its set by binding with two hydrogen atoms to make H2O – water. And so on. This is the basis of chemistry: it is energetically favourable for atoms to fill their energy levels with electrons, even if that is achieved by sharing with a neighbour. Their ‘desire’ to do this, which ultimately stems from the principle that things tend to their lowest energy state, is what drives the formation of everything from water to DNA. In a world abundant in hydrogen, oxygen and carbon we now understand why carbon dioxide, water and methane are so common.
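The electron bookkeeping in this paragraph can be checked with a toy sketch. It is a deliberately crude model and nothing more: the table `NEEDS` records how many extra electrons each atom ‘wants’ according to the text, and the hypothetical helper `sharing_balances` assumes that one central atom shares electrons with each of its partners, so that every shared electron reduces the need of both atoms by one.

```python
# Extra electrons each atom needs to fill its outermost set of levels,
# as stated in the text (hydrogen 1, carbon 4, oxygen 2).
NEEDS = {"H": 1, "C": 4, "O": 2}


def sharing_balances(central, partners):
    """True if the central atom's need equals the combined need of its partners."""
    return NEEDS[central] == sum(NEEDS[p] for p in partners)


print(sharing_balances("H", ["H"]))                 # H2
print(sharing_balances("C", ["H", "H", "H", "H"]))  # CH4, methane
print(sharing_balances("C", ["O", "O"]))            # CO2, carbon dioxide
print(sharing_balances("O", ["H", "H"]))            # H2O, water
print(sharing_balances("C", ["H", "H", "H"]))       # CH3: one electron short
```

In this toy model the four molecules named in the text all balance, whereas a carbon atom with only three hydrogen atoms does not.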
This is very encouraging, but we have a final piece of the jigsaw to explain: why is it that only two electrons can occupy each available energy level? This is a statement of the Pauli Exclusion Principle, and it is clearly necessary if everything we have been discussing is to hang together. Without it, the electrons would crowd together in the lowest possible energy level around every nucleus, and there would be no chemistry, which is worse than it sounds, because there would be no molecules and therefore no life in the Universe.
The idea that two and only two electrons can occupy each energy level does seem quite arbitrary, and historically nobody had any idea why it should be the case when the idea was first proposed. The initial breakthrough was made by Edmund Stoner, the son of a professional cricketer (who took eight wickets against South Africa in 1907, for those who read their Wisden Cricketers’ Almanack) and a former student of Rutherford’s who later ran the physics department at the University of Leeds. In October 1924, Stoner proposed that there should be two electrons allowed in each (n, l, m) energy level. Pauli developed Stoner’s proposal and in 1925 he published a rule that Dirac named after him a year later. The Exclusion Principle, as first proposed by Pauli, states that no two electrons in an atom can share the same quantum numbers. The problem he faced was that it appeared that two electrons could share each set of n, l and m values. Pauli got round the problem by simply introducing a new quantum number. This was an ansatz; he didn’t know what it represented, but it had to take on one of only two values. Pauli wrote that, ‘We cannot give a more precise reason for this rule.’ Further insight came in 1925, in a paper by George Uhlenbeck and Samuel Goudsmit. Motivated by precise measurements of atomic spectra, they identified Pauli’s extra quantum number with a real, physical property of the electron known as ‘spin’.
The basic idea of spin is quite simple, and dates back to 1903, well before quantum theory. Just a few years after the discovery of the electron, German physicist Max Abraham proposed that the electron was a tiny, spinning electrically charged sphere. If this were true, then electrons would be affected by magnetic fields, depending on the orientation of the field relative to their spin axis. In their 1925 paper, which was published three years after Abraham’s death, Uhlenbeck and Goudsmit noted that the spinning ball model couldn’t work because, in order to explain the observed data, the electron would have to be spinning faster than the speed of light. But the spirit of the idea was correct – the electron does possess a property called spin, and it does affect its behaviour in a magnetic field. Its true origin, however, is a direct and rather subtle consequence of Einstein’s Theory of Special Relativity that was only properly appreciated when Paul Dirac wrote down an equation describing the quantum behaviour of the electron in 1928. For our purposes, we shall need only acknowledge that electrons do come in two types, which we refer to as ‘spin up’ and ‘spin down’, and the two are distinguished by having opposite values of their angular momentum, i.e. it is like they are spinning in opposite directions. It’s a pity that Abraham died just a few years before the true nature of electron spin was discovered, because he never gave up his conviction that the electron was a little sphere. In his obituary in 1923, Max Born and Max von Laue wrote: ‘He was an honourable opponent who fought with honest weapons and who did not cover up a defeat by lamentation and nonfactual arguments … He loved his absolute ether, his field equations, his rigid electron, just as a youth loves his first flame, whose memory no later experience can extinguish.’ If only all of one’s opponents were like Abraham.
Our goal in the remainder of this chapter is to explain why it is that electrons behave in the strange way articulated by the Exclusion Principle. As ever, we shall make good use of those quantum clocks.
Figure 7.3. Two electrons scattering.
We can attack the question by thinking about what happens when two electrons ‘bounce’ off each other. Figure 7.3 illustrates a particular scenario where two electrons, labelled ‘1’ and ‘2’, start out somewhere and end up somewhere else. We have labelled the final locations A and B. The shaded blobs are there to remind us that we have not yet thought about just what happens when two electrons interact with each other (the details are irrelevant for the purposes of this discussion). All we need to imagine is that electron 1 hops from its starting place and ends up at the point labelled A. Likewise, electron 2 ends up at the point labelled B. This is what is illustrated in the top of the two pictures in the figure. In fact, the argument we are about to present works fine even if we ignore the possibility that the electrons might interact. In that case, electron 1 hops to A oblivious to the meanderings of electron 2 and the probability of finding electron 1 at A and electron 2 at B would be simply a product of two independent probabilities.
For example, suppose the probability of electron 1 hopping to point A is 45% and the probability of electron 2 hopping to point B is 20%. The probability of finding electron 1 at A and electron 2 at B is 0.45 × 0.2 = 0.09 = 9%. All we are doing here is using the logic that says that the chances of tossing a coin and getting ‘tails’ and rolling a dice and getting a ‘six’ at the same time is one-half multiplied by one-sixth, which is equal to 1/12 (i.e. just over 8%).
As the figure illustrates, there is a second way that the two electrons can end up at A and B. It is possible for electron 1 to hop to B whilst electron 2 ends up at A. Suppose that the chance of finding electron 1 at B is 5% and the chance of finding electron 2 at A is 20%. Then the probability of finding electron 1 at B and electron 2 at A is 0.05 × 0.2 = 0.01 = 1%.
We therefore have two ways of getting our two electrons to A and B – one with a probability of 9% and one with a probability of 1%. The probability of getting one electron at A and one at B, if we don’t care which is which, should therefore be 9% + 1% = 10%. Simple; but wrong.
The error is in supposing that it is possible to say which electron arrives at A and which one arrives at B. What if the electrons are identical to each other in every way? This might sound like an irrelevant question, but it isn’t. Incidentally, the suggestion that quantum particles might be strictly identical was first made in relation to Planck’s black body radiation law. A little-known physicist called Ladislas Natanson had pointed out, as far back as 1911, that Planck’s law was incompatible with the assumption that photons could be treated as identifiable particles. In other words, if you could tag a photon and track its movements, then you wouldn’t get Planck’s law.
If electrons 1 and 2 are absolutely identical then we must describe the scattering process as follows: initially there are two electrons, and a little later there are still two electrons located in different places. As we’ve learnt, quantum particles do not travel along well-defined trajectories, and this means that there really is no way of tracking them, even in principle. It therefore makes no sense to claim electron 1 appeared at A and electron 2 at B. We simply can’t tell, and it is therefore meaningless to label them. This is what it means for two particles to be ‘identical’ in quantum theory. Where does this line of reasoning take us?
Look at the figure again. For this particular process, the two probabilities we associated with the two diagrams (9% and 1%) are not wrong. They are, however, not the whole story. We know that quantum particles are described by clocks, so we should associate a clock with electron 1 arriving at A with a size equal to the square root of 45%. Likewise there is a clock associated with electron 2 arriving at B and it has a size equal to the square root of 20%.
Now comes a new quantum rule – it says that we are to associate a single clock with the process as a whole, i.e. there is a clock whose size squared is equal to the probability to find electron 1 at A and electron 2 at B. In other words, there is a single clock associated with the upper picture in Figure 7.3. We can see that this clock must have a size equal to the square root of 9%, because that is the probability for the process to happen. But what time does it read? Answering this question is the domain of Chapter 10 and it involves the idea of clock multiplication. As far as this chapter is concerned, we don’t need to know the time, we only need the important new rule that we have just stated, but which is worth repeating because it is a very general statement in quantum theory: we should associate a single clock with each possible way that an entire process can happen. The clock we associate with finding a single particle at a single location is the simplest illustration of this rule, and we have managed to get this far in the book with it. But it is a special case, and as soon as we start to think about more than one particle we need to extend the rule.
This means that there is a clock of size equal to 0.3 associated with the upper picture in the figure. Likewise, there is a second clock of size equal to 0.1 (because 0.1 squared is 0.01 = 1%) associated with the lower picture in the figure. We therefore have two clocks and we want a way to use them to determine the probability to find an electron at A and another at B. If the two electrons were distinguishable then the answer would be simple – we would just add together the probabilities (and not the clocks) associated with each possibility. We would then obtain the answer of 10%.
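For readers who like to see the numbers laid out, the arithmetic so far can be sketched as follows. The variable names are our own; the percentages are the ones used in the example, and the only rule applied is that a clock’s size is the square root of the corresponding probability. The sum at the end is the answer that would hold if the electrons could be told apart.

```python
import math

# Probabilities from the worked example in the text.
p_1_to_A, p_2_to_B = 0.45, 0.20   # upper picture in Figure 7.3
p_1_to_B, p_2_to_A = 0.05, 0.20   # lower picture in Figure 7.3

# Independent hops: the probability of each picture is a simple product.
p_upper = p_1_to_A * p_2_to_B     # 0.09, i.e. 9%
p_lower = p_1_to_B * p_2_to_A     # 0.01, i.e. 1%

# One clock per picture, with size equal to the square root of the probability.
size_upper = math.sqrt(p_upper)   # 0.3
size_lower = math.sqrt(p_lower)   # 0.1

# Distinguishable electrons: add the probabilities, not the clocks.
p_distinguishable = p_upper + p_lower   # 0.10, i.e. 10%

print(round(size_upper, 6), round(size_lower, 6), round(p_distinguishable, 6))
```

Nothing here says what time either clock reads; only the sizes are fixed by the probabilities.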
But if there is absolutely no way of telling which of the diagrams actually happened, which is the case if the electrons are indistinguishable from each other, then following the logic we’ve developed for a single particle as it hops from place to place, we should seek to combine the clocks. What we are after is a generalization of the rule which states that, for one particle, we should add together the clocks associated with all of the different ways that the particle can reach a particular point in order to determine the probability to find the particle at that point. For a system of many identical particles, we should combine together all the clocks associated with all of the different ways that the particles can reach a set of locations in order to determine the probability that particles will be found at those locations. This is important enough to merit reading a few times – it should be clear that this new law for combining clocks is a direct generalization of the rule we have been using for a single particle. You may have noticed that we have been very careful with our wording, however. We did not say that the clocks should necessarily be added together – we said that they should be combined together. There is a good reason for our caution.
The obvious thing to do would be to add the clocks together. But before leaping in we should ask whether there is a good reason why this is correct. This is a nice example of not taking things for granted in physics – exploring our assumptions often leads to new insights, as it will do in this instance. Let’s take a step back, and think of the most general thing we could imagine. This would be to allow for the possibility of giving one of the clocks a turn or a shrink (or expansion) before we add them. Let’s explore this possibility in more detail.
What we are doing is saying, ‘I have two clocks and I want to combine them to make a single clock, so that I can use that to tell me what the probability is for the two electrons to be found at A and B. How should I combine them?’ We are not pre-empting the answer, because we want to understand if adding clocks together really is the rule we should use. It turns out that we do not have much freedom at all, and simply adding clocks is, intriguingly, one of only two possibilities.
To streamline the discussion, let’s refer to the clock corresponding to particle 1 hopping to A and particle 2 hopping to B as clock 1. This is the clock associated with the upper picture in Figure 7.3. Clock 2 corresponds to the other option, where particle 1 hops to B instead. Here is an important realization: if we give clock 1 a turn before adding it to clock 2, then the final probability we calculate must be the same as if we choose to give clock 2 the same turn before adding it to clock 1.
To see this, notice that swapping the labels A and B around in our diagrams clearly cannot change anything. It is just a different way of describing the same process. But swapping A and B around swaps the diagrams in Figure 7.3 around too. This means that if we decide to wind clock 1 (corresponding to the upper picture) before adding it to clock 2, then this must correspond precisely to winding the clock 2 before adding it to clock 1, after we’ve swapped labels. This piece of logic is crucial, so it’s worth hammering home. Because we have assumed that there is no way of telling the difference between the two particles, then we are allowed to swap the labels around. This implies that a turn on clock 1 must give the same answer as when we apply the same turn to clock 2, because there is no way of telling the clocks apart.
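The requirement can be tried out numerically. The sketch below is an illustration under stated assumptions rather than a proof: it represents each clock as a complex number (size and angle), which is a standard way to do the sums, and it uses the sizes 0.3 and 0.1 from the example. The times on the two clocks are not given in the text, so the angles chosen here are arbitrary, and the helper names `probability`, `turn` and `symmetric` are our own.

```python
import cmath
import math

# Clocks as complex numbers: size 0.3 and 0.1 as in the example.
# The angles (clock readings) are ASSUMED purely for illustration.
clock_1 = cmath.rect(0.3, 0.0)
clock_2 = cmath.rect(0.1, 1.0)


def probability(wound, other, factor):
    """Apply `factor` (a turn and/or shrink) to one clock, add the other, square the size."""
    return abs(factor * wound + other) ** 2


def turn(angle):
    """A pure turn through `angle` radians, with no shrinking or expansion."""
    return cmath.rect(1.0, angle)


def symmetric(factor, tol=1e-12):
    """Does winding clock 1 give the same probability as winding clock 2 instead?"""
    return abs(probability(clock_1, clock_2, factor)
               - probability(clock_2, clock_1, factor)) < tol


print(symmetric(turn(0.0)))           # no turn at all: plain addition of the clocks
print(symmetric(turn(math.pi / 2)))   # a quarter turn
```

For these assumed clocks, plain addition passes the check while a quarter turn fails it, which gives a feel for how restrictive the requirement is. A check on one pair of clocks cannot by itself settle which rules are allowed in general.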