Fermi-Dirac statistics is the statistical system that applies to particles obeying Pauli’s famous exclusion principle. The exclusion principle states that no two identical fermions in the same system can be in identical states. Pauli exclusion is the explanation for many physical phenomena, including much of the structure of the periodic table of elements. We have seen that atoms are composed of a positively charged nucleus surrounded by a “cloud” of electrons. Electrons are fermions, so no two of them can occupy the same state: the ground (lowest energy) orbital can hold only two electrons, one for each spin orientation. The others must occupy higher energy states. That the electrons in an atom must occupy increasingly energetic shells explains, among other things, why metals are shiny and excellent electric conductors. It explains why some elements are gases at room temperature and others are solids. In fact, the variety of interactions between the electrons in various states of the different elements gives rise to much of the field of chemistry. Without the exclusion principle, chemistry would be a very different field!
Another application of the exclusion principle is the stability of white dwarf stars. White dwarfs are the leftover remains of a once active star. They no longer have an internal energy source to stabilize them against the crushing force of their own gravity. Why then, do they not collapse on themselves? The answer is the Pauli exclusion principle. One electron in a white dwarf cannot get too near the state of a nearby electron. This provides a pressure force that stabilizes the star against its own gravity.
Bosons are essentially the opposite of fermions, and they do not obey Pauli exclusion. In fact, bosons “like” to be in the same state. Photons are bosons, and the fact that they like to be in the same state is exploited in lasers and explains why lasers produce such coherent monochromatic light. One of the most interesting things about bosons is that many of their properties can be observed not just at the very small scale typical of quantum theory, but on a macroscopic scale. This is a direct result of their ability to occupy the same quantum state. If a gas of identical bosons is sufficiently cool, all the particles in the gas will tend to occupy their lowest energy state and will begin to act coherently. This gas is known as a Bose-Einstein condensate. If there are a large number of particles in the gas, then the entire gas will exhibit uniquely quantum mechanical properties on a macroscopic scale. In the last two decades, techniques for ultra-cooling gases have been developed, allowing the production of Bose-Einstein condensates for the first time. As these techniques are refined, it will be interesting to study the many peculiar quantum phenomena that will be observable in the macroscopic world.
THE QUANTUM THEORY OF THE ELECTRON
BY
PAUL A.M. DIRAC
Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, Vol. 117, No. 778. (Feb. 1, 1928) pp. 610–624.
The new quantum mechanics, when applied to the problem of the structure of the atom with point-charge electrons, does not give results in agreement with experiment. The discrepancies consist of “duplexity” phenomena, the observed number of stationary states for an electron in an atom being twice the number given by the theory. To meet the difficulty, Goudsmit and Uhlenbeck have introduced the idea of an electron with a spin angular momentum of half a quantum and a magnetic moment of one Bohr magneton. This model for the electron has been fitted into the new mechanics by Pauli, and Darwin, working with an equivalent theory, has shown that it gives results in agreement with experiment for hydrogen-like spectra to the first order of accuracy.
The question remains as to why Nature should have chosen this particular model for the electron instead of being satisfied with the point-charge. One would like to find some incompleteness in the previous methods of applying quantum mechanics to the point-charge electron such that, when removed, the whole of the duplexity phenomena follow without arbitrary assumptions. In the present paper it is shown that this is the case, the incompleteness of the previous theories lying in their disagreement with relativity, or, alternatively, with the general transformation theory of quantum mechanics. It appears that the simplest Hamiltonian for a point-charge electron satisfying the requirements of both relativity and the general transformation theory leads to an explanation of all duplexity phenomena without further assumption. All the same there is a great deal of truth in the spinning electron model, at least as a first approximation. The most important failure of the model seems to be that the magnitude of the resultant orbital angular momentum of an electron moving in an orbit in a central field of force is not a constant, as the model leads one to expect.
§ 1. PREVIOUS RELATIVITY TREATMENTS
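The relativity Hamiltonian according to the classical theory for a point electron moving in an arbitrary electro-magnetic field with scalar potential A0 and vector potential A is
$$F \equiv -\left(\frac{W}{c} + \frac{e}{c}A_0\right)^2 + \left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 + m^2c^2,$$
where p is the momentum vector. It has been suggested by Gordon that the operator of the wave equation of the quantum theory should be obtained from this F by the same procedure as in non-relativity theory, namely, by putting
$$W = ih\frac{\partial}{\partial t}, \qquad p_r = -ih\frac{\partial}{\partial x_r}, \quad r = 1, 2, 3,$$
in it. This gives the wave equation
$$F\psi \equiv \left[-\left(ih\frac{\partial}{c\,\partial t} + \frac{e}{c}A_0\right)^2 + \sum_r\left(-ih\frac{\partial}{\partial x_r} + \frac{e}{c}A_r\right)^2 + m^2c^2\right]\psi = 0, \tag{1}$$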
the wave function ψ being a function of x1, x2, x3, t. This gives rise to two difficulties.
The first is in connection with the physical interpretation of ψ. Gordon, and also independently Klein, from considerations of the conservation theorems, make the assumption that if ψm, ψn are two solutions,
$$\rho_{mn} = -\frac{e}{2mc^2}\left\{ih\left(\bar\psi_n\frac{\partial\psi_m}{\partial t} - \psi_m\frac{\partial\bar\psi_n}{\partial t}\right) + 2eA_0\,\psi_m\bar\psi_n\right\}$$
and
$$\mathbf{I}_{mn} = -\frac{e}{2m}\left\{-ih\left(\bar\psi_n\,\mathrm{grad}\,\psi_m - \psi_m\,\mathrm{grad}\,\bar\psi_n\right) + 2\frac{e}{c}\mathbf{A}\,\psi_m\bar\psi_n\right\}$$
are to be interpreted as the charge and current associated with the transition m → n. This appears to be satisfactory so far as emission and absorption of radiation are concerned, but is not so general as the interpretation of the non-relativity quantum mechanics, which has been developed sufficiently to enable one to answer the question: What is the probability of any dynamical variable at any specified time having a value lying between any specified limits, when the system is represented by a given wave function ψn? The Gordon-Klein interpretation can answer such questions if they refer to the position of the electron (by the use of ρnn), but not if they refer to its momentum, or angular momentum or any other dynamical variable. We should expect the interpretation of the relativity theory to be just as general as that of the non-relativity theory.
The general interpretation of non-relativity quantum mechanics is based on the transformation theory, and is made possible by the wave equation being of the form
$$(H - W)\psi = 0, \tag{2}$$
i.e., being linear in W or ∂ /∂ t, so that the wave function at any time determines the wave function at any later time. The wave equation of the relativity theory must also be linear in W if the general interpretation is to be possible.
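The second difficulty in Gordon’s interpretation arises from the fact that if one takes the conjugate imaginary of equation (1), one gets
$$\left[-\left(-\frac{W}{c} + \frac{e}{c}A_0\right)^2 + \left(-\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 + m^2c^2\right]\bar\psi = 0,$$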
which is the same as one would get if one put −e for e. The wave equation (1) thus refers equally well to an electron with charge e as to one with charge −e. If one considers for definiteness the limiting case of large quantum numbers one would find that some of the solutions of the wave equation are wave packets moving in the way a particle of charge −e would move on the classical theory, while others are wave packets moving in the way a particle of charge e would move classically. For this second class of solutions W has a negative value. One gets over the difficulty on the classical theory by arbitrarily excluding those solutions that have a negative W. One cannot do this on the quantum theory, since in general a perturbation will cause transitions from states with W positive to states with W negative. Such a transition would appear experimentally as the electron suddenly changing its charge from −e to e, a phenomenon which has not been observed. The true relativity wave equation should thus be such that its solutions split up into two non-combining sets, referring respectively to the charge −e and the charge e.
In the present paper we shall be concerned only with the removal of the first of these two difficulties. The resulting theory is therefore still only an approximation, but it appears to be good enough to account for all the duplexity phenomena without arbitrary assumptions.
§ 2. THE HAMILTONIAN FOR NO FIELD
Our problem is to obtain a wave equation of the form (2) which shall be invariant under a Lorentz transformation and shall be equivalent to (1) in the limit of large quantum numbers. We shall consider first the case of no field, when equation (1) reduces to
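$$(-p_0^2 + \mathbf{p}^2 + m^2c^2)\psi = 0, \tag{3}$$
if one puts
$$p_0 = \frac{W}{c} = ih\frac{\partial}{c\,\partial t}.$$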
The symmetry between p0 and p1, p2, p3 required by relativity shows that, since the Hamiltonian we want is linear in p0, it must also be linear in p1, p2 and p3. Our wave equation is therefore of the form
$$(p_0 + \alpha_1p_1 + \alpha_2p_2 + \alpha_3p_3 + \beta)\psi = 0, \tag{4}$$
where for the present all that is known about the dynamical variables or operators α1, α2, α3, β is that they are independent of p0, p1, p2, p3, i.e., that they commute with t, x1, x2, x3. Since we are considering the case of a particle moving in empty space, so that all points in space are equivalent, we should expect the Hamiltonian not to involve t, x1, x2, x3. This means that α1, α2, α3, β are independent of t, x1, x2, x3, i.e., that they commute with p0, p1, p2, p3. We are therefore obliged to have other dynamical variables besides the co-ordinates and momenta of the electron, in order that α1, α2, α3, β may be functions of them. The wave function ψ must then involve more variables than merely x1, x2, x3, t.
Equation (4) leads to
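$$0 = (-p_0 + \alpha_1p_1 + \alpha_2p_2 + \alpha_3p_3 + \beta)(p_0 + \alpha_1p_1 + \alpha_2p_2 + \alpha_3p_3 + \beta)\psi = \left[-p_0^2 + \sum\alpha_1^2p_1^2 + \sum(\alpha_1\alpha_2 + \alpha_2\alpha_1)p_1p_2 + \beta^2 + \sum(\alpha_1\beta + \beta\alpha_1)p_1\right]\psi, \tag{5}$$
where the ∑ refers to cyclic permutation of the suffixes 1, 2, 3. This agrees with (3) if
$$\alpha_r^2 = 1, \quad \alpha_r\alpha_s + \alpha_s\alpha_r = 0 \ (r \neq s), \quad r, s = 1, 2, 3; \qquad \beta^2 = m^2c^2, \quad \alpha_r\beta + \beta\alpha_r = 0.$$
If we put β = α4 mc, these conditions become
$$\alpha_\mu^2 = 1, \qquad \alpha_\mu\alpha_\nu + \alpha_\nu\alpha_\mu = 0 \ (\mu \neq \nu), \qquad \mu, \nu = 1, 2, 3, 4. \tag{6}$$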
We can suppose the αµ’s to be expressed as matrices in some matrix scheme, the matrix elements of αµ being, say, αµ (ζʹ ζ”). The wave function ψ must now be a function of ζ as well as x1, x2, x3, t.
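The result of αµ multiplied into ψ will be a function (αµψ) of x1, x2, x3, t, ζ defined by
$$(\alpha_\mu\psi)(x, t, \zeta) = \sum_{\zeta'}\alpha_\mu(\zeta\zeta')\,\psi(x, t, \zeta').$$
We must now find four matrices αµ to satisfy the conditions (6). We make use of the matrices
$$\sigma_1 = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix},$$
which Pauli introduced to describe the three components of spin angular momentum. These matrices have just the properties
$$\sigma_r^2 = 1, \qquad \sigma_r\sigma_s + \sigma_s\sigma_r = 0 \quad (r \neq s), \tag{7}$$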
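that we require for our α’s. We cannot, however, just take the σ’s to be three of our α’s, because then it would not be possible to find the fourth. We must extend the σ’s in a diagonal manner to bring in two more rows and columns, so that we can introduce three more matrices ρ1, ρ2, ρ3 of the same form as σ1, σ2, σ3, but referring to different rows and columns, thus:
$$\sigma_1 = \begin{pmatrix}0&1&0&0\\1&0&0&0\\0&0&0&1\\0&0&1&0\end{pmatrix}, \quad \sigma_2 = \begin{pmatrix}0&-i&0&0\\i&0&0&0\\0&0&0&-i\\0&0&i&0\end{pmatrix}, \quad \sigma_3 = \begin{pmatrix}1&0&0&0\\0&-1&0&0\\0&0&1&0\\0&0&0&-1\end{pmatrix},$$
$$\rho_1 = \begin{pmatrix}0&0&1&0\\0&0&0&1\\1&0&0&0\\0&1&0&0\end{pmatrix}, \quad \rho_2 = \begin{pmatrix}0&0&-i&0\\0&0&0&-i\\i&0&0&0\\0&i&0&0\end{pmatrix}, \quad \rho_3 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix}.$$
The ρ’s are obtained from the σ’s by interchanging the second and third rows, and the second and third columns. We now have, in addition to equations (7), also
$$\rho_r^2 = 1, \qquad \rho_r\rho_s + \rho_s\rho_r = 0 \ (r \neq s), \qquad \rho_r\sigma_t = \sigma_t\rho_r. \tag{7'}$$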
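If we now take
$$\alpha_1 = \rho_1\sigma_1, \qquad \alpha_2 = \rho_1\sigma_2, \qquad \alpha_3 = \rho_1\sigma_3, \qquad \alpha_4 = \rho_3,$$
all the conditions (6) are satisfied, e.g.,
$$\alpha_1^2 = \rho_1\sigma_1\rho_1\sigma_1 = \rho_1^2\sigma_1^2 = 1, \qquad \alpha_1\alpha_2 = \rho_1\sigma_1\rho_1\sigma_2 = \rho_1^2\sigma_1\sigma_2 = -\rho_1^2\sigma_2\sigma_1 = -\alpha_2\alpha_1.$$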
The following equations are to be noted for later reference
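$$\rho_1\rho_2 = i\rho_3 = -\rho_2\rho_1, \qquad \sigma_1\sigma_2 = i\sigma_3 = -\sigma_2\sigma_1, \tag{8}$$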
together with the equations obtained by cyclic permutation of the suffixes.
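The relations (6), (7ʹ) and (8) can be checked numerically. The following is a minimal sketch in plain Python, assuming the 4×4 σ's are built as 1 ⊗ s and the ρ's as s ⊗ 1 from the 2×2 Pauli matrices s (one construction consistent with the statement that the ρ's arise from the σ's by interchanging the second and third rows and columns). The helper names `matmul`, `add`, `scale` and `kron` are illustrative, not part of the paper.

```python
# Check of relations (6), (7') and (8) with explicit 4x4 matrices.
# Assumption: sigma_r = 1 (x) s_r and rho_r = s_r (x) 1, with s_r the
# 2x2 Pauli matrices. Helper names are illustrative only.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    # Kronecker product of two 2x2 matrices, giving a 4x4 matrix.
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]
ONE = kron(I2, I2)
ZERO = scale(0, ONE)

# alpha_1..alpha_3 = rho_1 sigma_r, alpha_4 = rho_3
alpha = [matmul(rho[0], sigma[r]) for r in range(3)] + [rho[2]]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

# (6): each alpha squares to 1 and distinct alphas anticommute.
cond6 = all(
    anticommutator(alpha[m], alpha[n]) == (scale(2, ONE) if m == n else ZERO)
    for m in range(4) for n in range(4))

# (7'): every rho commutes with every sigma.
cond7p = all(matmul(rho[r], sigma[t]) == matmul(sigma[t], rho[r])
             for r in range(3) for t in range(3))

# (8) with cyclic permutations: rho_a rho_b = i rho_c, same for sigma.
cond8 = all(
    matmul(m[a], m[(a + 1) % 3]) == scale(1j, m[(a + 2) % 3])
    for m in (rho, sigma) for a in range(3))

print(cond6, cond7p, cond8)
```

All entries are 0, ±1 or ±i, so the comparisons are exact and no floating-point tolerance is needed.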
The wave equation (4) now takes the form
$$[p_0 + \rho_1(\boldsymbol{\sigma}, \mathbf{p}) + \rho_3mc]\psi = 0, \tag{9}$$
where σ denotes the vector (σ1, σ2, σ3 ).
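As a hedged numerical illustration of why the linear equation reproduces the quadratic one, the sketch below treats p as ordinary numbers (a plane wave) and checks that the operator ρ1(σ, p) + ρ3mc squares to (p² + m²c²) times the unit matrix. The sample values p = (1, 2, 2), mc = 4 and p0 = 5 are arbitrary choices made here so that the arithmetic is exact, and the matrix construction via Kronecker products is an assumption of this sketch.

```python
# Plane-wave check: (rho_1 (sigma, p) + rho_3 mc)^2 = (p^2 + m^2 c^2) * 1.
# Sample numbers and helper names are illustrative assumptions.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]
ONE = kron(I2, I2)
ZERO = scale(0, ONE)

p = (1, 2, 2)   # momentum components, arbitrary units
mc = 4          # rest-mass term, same units

sigma_dot_p = ZERO
for s_r, p_r in zip(sigma, p):
    sigma_dot_p = add(sigma_dot_p, scale(p_r, s_r))

X = add(matmul(rho[0], sigma_dot_p), scale(mc, rho[2]))
X_squared = matmul(X, X)
expected = scale(sum(c * c for c in p) + mc * mc, ONE)   # 25 * unit matrix

# With p0 = 5 the product (-p0 + X)(p0 + X) vanishes identically.
p0 = 5
product = matmul(add(scale(-p0, ONE), X), add(scale(p0, ONE), X))

print(X_squared == expected, product == ZERO)
```

The cross terms cancel only because ρ1 and ρ3 anticommute, which is the role played by the conditions (6).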
§ 3. PROOF OF INVARIANCE UNDER A LORENTZ TRANSFORMATION
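Multiply equation (9) by ρ3 on the left-hand side. It becomes, with the help of (8),
$$[\rho_3p_0 + i\rho_2(\sigma_1p_1 + \sigma_2p_2 + \sigma_3p_3) + mc]\psi = 0.$$
Putting
$$p_0 = ip_4, \qquad \rho_3 = \gamma_4, \qquad \rho_2\sigma_r = \gamma_r, \quad r = 1, 2, 3, \tag{10}$$
we have
$$\left[i\sum\gamma_\mu p_\mu + mc\right]\psi = 0, \qquad \mu = 1, 2, 3, 4. \tag{11}$$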
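The pµ transform under a Lorentz transformation according to the law
$$p'_\mu = \sum_\nu a_{\mu\nu}p_\nu,$$
where the coefficients aµν are c-numbers satisfying
$$\sum_\mu a_{\mu\nu}a_{\mu\tau} = \delta_{\nu\tau}, \qquad \sum_\tau a_{\mu\tau}a_{\nu\tau} = \delta_{\mu\nu}.$$
The wave equation therefore transforms into
$$\left[i\sum\gamma'_\mu p'_\mu + mc\right]\psi = 0, \tag{12}$$
where
$$\gamma'_\mu = \sum_\nu a_{\mu\nu}\gamma_\nu.$$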
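Now the γµ, like the αµ, satisfy
$$\gamma_\mu^2 = 1, \qquad \gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 0 \quad (\mu \neq \nu).$$
These relations can be summed up in the single equation
$$\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2\delta_{\mu\nu}.$$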
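The single anticommutation relation for the γ's can be verified directly. This is a small sketch under the assumption that γ4 = ρ3 and γr = ρ2σr, with the 4×4 matrices built by Kronecker products from the Pauli matrices; the helper names are illustrative.

```python
# Check gamma_mu gamma_nu + gamma_nu gamma_mu = 2 delta_mu_nu
# for gamma_r = rho_2 sigma_r (r = 1, 2, 3) and gamma_4 = rho_3.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]
ONE = kron(I2, I2)

gamma = [matmul(rho[1], sigma[r]) for r in range(3)] + [rho[2]]

clifford_ok = all(
    add(matmul(gamma[m], gamma[n]), matmul(gamma[n], gamma[m]))
    == scale(2 if m == n else 0, ONE)
    for m in range(4) for n in range(4))

print(clifford_ok)
```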
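We have
$$\gamma'_\mu\gamma'_\nu + \gamma'_\nu\gamma'_\mu = \sum_{\tau\lambda}a_{\mu\tau}a_{\nu\lambda}(\gamma_\tau\gamma_\lambda + \gamma_\lambda\gamma_\tau) = 2\sum_{\tau\lambda}a_{\mu\tau}a_{\nu\lambda}\delta_{\tau\lambda} = 2\sum_\tau a_{\mu\tau}a_{\nu\tau} = 2\delta_{\mu\nu}.$$
Thus the $\gamma'_\mu$ satisfy the same relations as the γµ. Thus we can put, analogously to (10),
$$\gamma'_4 = \rho'_3, \qquad \gamma'_r = \rho'_2\sigma'_r,$$
where the ρʹ’s and σʹ’s are easily verified to satisfy the relations corresponding to (7), (7ʹ) and (8), if $\rho'_2$ and $\rho'_1$ are defined by $\rho'_2 = -i\gamma'_1\gamma'_2\gamma'_3$, $\rho'_1 = -i\rho'_2\rho'_3$.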
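To see the invariance of the anticommutation relations in a concrete case, the sketch below forms γʹµ = Σν aµν γν for one particular transformation and checks that the γʹ's obey the same relations as the γ's. The choice of a boost along the first axis with rapidity 0.7 is an assumption made here for illustration; because p0 = ip4, the coefficients mixing the first and fourth components are imaginary, and the orthogonality conditions hold without complex conjugation.

```python
# Transformed gammas for a boost along axis 1 (illustrative choice).
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]
ONE = kron(I2, I2)
gamma = [matmul(rho[1], sigma[r]) for r in range(3)] + [rho[2]]

phi = 0.7  # rapidity, arbitrary sample value
ch, sh = math.cosh(phi), math.sinh(phi)
a = [[ch, 0, 0, 1j * sh],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [-1j * sh, 0, 0, ch]]

# Orthogonality: sum_tau a[mu][tau] * a[nu][tau] = delta_mu_nu.
orth_error = max(
    abs(sum(a[m][t] * a[n][t] for t in range(4)) - (1 if m == n else 0))
    for m in range(4) for n in range(4))

# gamma'_mu = sum_nu a[mu][nu] * gamma_nu
gamma_p = []
for m in range(4):
    g = scale(0, ONE)
    for n in range(4):
        g = add(g, scale(a[m][n], gamma[n]))
    gamma_p.append(g)

clifford_error = max(
    max_abs_diff(
        add(matmul(gamma_p[m], gamma_p[n]), matmul(gamma_p[n], gamma_p[m])),
        scale(2 if m == n else 0, ONE))
    for m in range(4) for n in range(4))

print(orth_error < 1e-12, clifford_error < 1e-12)
```

The transformed matrices are no longer all Hermitian for a boost, which is consistent with the text's use of general canonical transformations.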
We shall now show that, by a canonical transformation, the ρʹ’s and σʹ’s may be brought into the form of the ρ’s and σ’s. From the equation $\rho_3'^2 = 1$, it follows that the only possible characteristic values for $\rho'_3$ are ±1. If one applies to $\rho'_3$ a canonical transformation with the transformation function $\rho'_1$, the result is
$$\rho'_1\rho'_3(\rho'_1)^{-1} = -\rho'_3\rho'_1(\rho'_1)^{-1} = -\rho'_3.$$
Since characteristic values are not changed by a canonical transformation, $\rho'_3$ must have the same characteristic values as $-\rho'_3$. Hence the characteristic values of $\rho'_3$ are +1 twice and −1 twice. The same argument applies to each of the other ρʹ’s, and to each of the σʹ’s.
Since $\rho'_3$ and $\sigma'_3$ commute, they can be brought simultaneously to the diagonal form by a canonical transformation. They will then have for their diagonal elements each +1 twice and −1 twice. Thus, by suitably rearranging the rows and columns, they can be brought into the form ρ3 and σ3 respectively. (The possibility $\rho'_3 = \pm\sigma'_3$ is excluded by the existence of matrices that commute with one but not with the other.)
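The characteristic-value argument can be illustrated on the unprimed matrices themselves. The sketch below, which assumes the Kronecker-product construction of ρ and σ used for illustration here, checks that transforming ρ3 with ρ1 gives −ρ3, that both ρ3 and σ3 have zero trace (so +1 and −1 each occur twice), and that the four joint diagonal entries of (ρ3, σ3) are the four distinct sign pairs.

```python
# Illustration of the characteristic-value argument with rho_3, sigma_3.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]

rho1, rho3, sigma3 = rho[0], rho[2], sigma[2]

# rho_1 is its own inverse, so the transform is rho_1 rho_3 rho_1.
flipped = matmul(matmul(rho1, rho3), rho1) == scale(-1, rho3)

trace_rho3 = sum(rho3[i][i] for i in range(4))
trace_sigma3 = sum(sigma3[i][i] for i in range(4))

# Joint diagonal entries: each sign combination appears exactly once.
pairs = {(rho3[i][i], sigma3[i][i]) for i in range(4)}

print(flipped, trace_rho3, trace_sigma3, sorted(pairs))
```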
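Any matrix containing four rows and columns can be expressed as
$$c + \sum_r c_r\sigma_r + \sum_r c'_r\rho_r + \sum_{rs}c_{rs}\rho_r\sigma_s, \tag{13}$$
where the sixteen coefficients c, cr, cʹr, crs are c-numbers. By expressing $\sigma'_1$ in this way, we see, from the fact that it commutes with $\rho'_3 = \rho_3$ and anticommutes with $\sigma'_3 = \sigma_3$, that it must be of the form
$$\sigma'_1 = c_1\sigma_1 + c_2\sigma_2 + c_{31}\rho_3\sigma_1 + c_{32}\rho_3\sigma_2,$$
i.e., of the form
$$\sigma'_1 = \begin{pmatrix}0&a_{12}&0&0\\a_{21}&0&0&0\\0&0&0&a_{34}\\0&0&a_{43}&0\end{pmatrix}.$$
The condition $\sigma_1'^2 = 1$ shows that a12a21 = 1, a34a43 = 1. If we now apply the canonical transformation: first row to be multiplied by $(a_{21}/a_{12})^{1/2}$ and third row to be multiplied by $(a_{43}/a_{34})^{1/2}$, and first and third columns to be divided by the same expressions, $\sigma'_1$ will be brought into the form of σ1, and the diagonal matrices $\sigma'_3$ and $\rho'_3$ will not be changed.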
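The statement that any 4×4 matrix can be written in the sixteen-term form (13) can be made concrete. In the sketch below the sixteen basis matrices 1, σs, ρr, ρrσs are generated as Kronecker products, and the coefficients are read off with the trace formula c = tr(B M)/4. That formula is not from the paper; it is an assumption of this sketch that relies on the basis matrices being trace-orthogonal with tr(B B) = 4. The test matrix M is arbitrary.

```python
# Expand an arbitrary 4x4 matrix in the 16 matrices 1, sigma, rho, rho*sigma.
# The trace formula for the coefficients is an assumption of this sketch.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
P = [I2] + PAULI

# kron(I2, s) = sigma, kron(s, I2) = rho, kron(s_r, s_s) = rho_r sigma_s.
basis = [kron(x, y) for x in P for y in P]

# Arbitrary complex test matrix.
M = [[(i + 2 * j) + 1j * (i * j - 1) for j in range(4)] for i in range(4)]

coeffs = [trace(matmul(b, M)) / 4 for b in basis]

rebuilt = scale(0, basis[0])
for c, b in zip(coeffs, basis):
    rebuilt = add(rebuilt, scale(c, b))

error = max(abs(rebuilt[i][j] - M[i][j]) for i in range(4) for j in range(4))
print(len(coeffs), error < 1e-12)
```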
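If we now express $\rho'_1$ in the form (13) and use the conditions that it commutes with $\sigma'_1 = \sigma_1$ and $\sigma'_3 = \sigma_3$ and anticommutes with $\rho'_3 = \rho_3$, we see that it must be of the form
$$\rho'_1 = c'_1\rho_1 + c'_2\rho_2.$$
The condition $\rho_1'^2 = 1$ shows that $c_1'^2 + c_2'^2 = 1$, or $c'_1 = \cos\theta$, $c'_2 = \sin\theta$. Hence $\rho'_1$ is of the form
$$\rho'_1 = \begin{pmatrix}0&0&e^{-i\theta}&0\\0&0&0&e^{-i\theta}\\e^{i\theta}&0&0&0\\0&e^{i\theta}&0&0\end{pmatrix}.$$
If we now apply the canonical transformation: first and second rows to be multiplied by $e^{i\theta}$ and first and second columns to be divided by the same expression, $\rho'_1$ will be brought into the form ρ1, and σ1, σ3, ρ3 will not be altered. $\rho'_2$ and $\sigma'_2$ must now be of the form ρ2 and σ2, on account of the relations $i\rho'_2 = \rho'_3\rho'_1$, $i\sigma'_2 = \sigma'_3\sigma'_1$.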
Thus by a succession of canonical transformations, which can be combined to form a single canonical transformation, the ρʹ’s and σʹ’s can be brought into the form of the ρ’s and σ’s. The new wave equation (12) can in this way be brought back into the form of the original wave equation (11) or (9), so that the results that follow from this original wave equation must be independent of the frame of reference used.
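The last canonical transformation in the argument can be tried numerically. The sketch below assumes a matrix of the form cos θ ρ1 + sin θ ρ2 with an arbitrary sample angle, multiplies the first and second rows by e^{iθ} and divides the first and second columns by the same factor (written here as S M S⁻¹ with a diagonal S), and checks that the result is ρ1 while σ1, σ3 and ρ3 are unchanged.

```python
# Row/column rescaling that brings cos(t) rho_1 + sin(t) rho_2 to rho_1.
import cmath
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]

theta = 0.9  # arbitrary sample angle
rho1_prime = add(scale(math.cos(theta), rho[0]),
                 scale(math.sin(theta), rho[1]))

phase = cmath.exp(1j * theta)
S = [[phase, 0, 0, 0], [0, phase, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
S_inv = [[1 / phase, 0, 0, 0], [0, 1 / phase, 0, 0],
         [0, 0, 1, 0], [0, 0, 0, 1]]

def transform(m):
    # Multiply rows 1, 2 by e^{i theta}; divide columns 1, 2 by it.
    return matmul(matmul(S, m), S_inv)

err_rho1 = max_abs_diff(transform(rho1_prime), rho[0])
err_fixed = max(max_abs_diff(transform(m), m)
                for m in (sigma[0], sigma[2], rho[2]))

print(err_rho1 < 1e-12, err_fixed < 1e-12)
```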
§ 4. THE HAMILTONIAN FOR AN ARBITRARY FIELD
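To obtain the Hamiltonian for an electron in an electromagnetic field with scalar potential A0 and vector potential A, we adopt the usual procedure of substituting p0 + (e/c)A0 for p0 and p + (e/c)A for p in the Hamiltonian for no field. From equation (9) we thus obtain
$$\left[p_0 + \frac{e}{c}A_0 + \rho_1\left(\boldsymbol{\sigma}, \mathbf{p} + \frac{e}{c}\mathbf{A}\right) + \rho_3mc\right]\psi = 0. \tag{14}$$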
This wave equation appears to be sufficient to account for all the duplexity phenomena. On account of the matrices ρ and σ containing four rows and columns, it will have four times as many solutions as the non-relativity wave equation, and twice as many as the previous relativity wave equation (1). Since half the solutions must be rejected as referring to the charge + e on the electron, the correct number will be left to account for duplexity phenomena. The proof given in the preceding section of invariance under a Lorentz transformation applies equally well to the more general wave equation (14).
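We can obtain a rough idea of how (14) differs from the previous relativity wave equation (1) by multiplying it up analogously to (5). This gives, if we write eʹ for e/c,
$$0 = \left[-(p_0 + e'A_0) + \rho_1(\boldsymbol{\sigma}, \mathbf{p} + e'\mathbf{A}) + \rho_3mc\right]\left[(p_0 + e'A_0) + \rho_1(\boldsymbol{\sigma}, \mathbf{p} + e'\mathbf{A}) + \rho_3mc\right]\psi$$
$$= \left[-(p_0 + e'A_0)^2 + (\boldsymbol{\sigma}, \mathbf{p} + e'\mathbf{A})^2 + m^2c^2 + \rho_1\left\{(\boldsymbol{\sigma}, \mathbf{p} + e'\mathbf{A})(p_0 + e'A_0) - (p_0 + e'A_0)(\boldsymbol{\sigma}, \mathbf{p} + e'\mathbf{A})\right\}\right]\psi. \tag{15}$$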
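We now use the general formula, that if B and C are any two vectors that commute with σ,
$$(\boldsymbol{\sigma}, \mathbf{B})(\boldsymbol{\sigma}, \mathbf{C}) = \sum\sigma_1^2B_1C_1 + \sum(\sigma_1\sigma_2B_1C_2 + \sigma_2\sigma_1B_2C_1) = (\mathbf{B}, \mathbf{C}) + i\sum\sigma_3(B_1C_2 - B_2C_1) = (\mathbf{B}, \mathbf{C}) + i(\boldsymbol{\sigma}, \mathbf{B}\times\mathbf{C}). \tag{16}$$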
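Formula (16) is easy to test for ordinary numerical vectors, which commute with everything. The sketch below uses the 2×2 Pauli matrices and two arbitrary integer vectors chosen here for illustration, and compares (σ, B)(σ, C) with (B, C) + i(σ, B × C).

```python
# Numerical check of (sigma, B)(sigma, C) = (B, C) + i (sigma, B x C)
# for commuting (purely numerical) vectors B and C.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

ONE = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def sigma_dot(v):
    # Returns v1*sigma_1 + v2*sigma_2 + v3*sigma_3.
    total = scale(0, ONE)
    for s, component in zip(PAULI, v):
        total = add(total, scale(component, s))
    return total

B = (1, 2, 3)    # arbitrary sample vectors
C = (4, -1, 2)

dot = sum(b * c for b, c in zip(B, C))
cross = (B[1] * C[2] - B[2] * C[1],
         B[2] * C[0] - B[0] * C[2],
         B[0] * C[1] - B[1] * C[0])

lhs = matmul(sigma_dot(B), sigma_dot(C))
rhs = add(scale(dot, ONE), scale(1j, sigma_dot(cross)))

print(lhs == rhs)
```

When B and C are operators that do not commute with each other, as for B = C = p + eʹA, the cross product no longer vanishes for B = C, which is where the field-dependent term in the next step comes from.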
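Taking B = C = p + eʹA, we find
$$(\boldsymbol{\sigma}, \mathbf{p} + e'\mathbf{A})^2 = (\mathbf{p} + e'\mathbf{A})^2 + i\sum\sigma_3\left[(p_1 + e'A_1)(p_2 + e'A_2) - (p_2 + e'A_2)(p_1 + e'A_1)\right] = (\mathbf{p} + e'\mathbf{A})^2 + he'(\boldsymbol{\sigma}, \mathrm{curl}\,\mathbf{A}).$$
Thus (15) becomes
$$0 = \left[-(p_0 + e'A_0)^2 + (\mathbf{p} + e'\mathbf{A})^2 + m^2c^2 + e'h(\boldsymbol{\sigma}, \mathrm{curl}\,\mathbf{A}) - ie'h\rho_1\left(\boldsymbol{\sigma}, \mathrm{grad}\,A_0 + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}\right)\right]\psi$$
$$= \left[-(p_0 + e'A_0)^2 + (\mathbf{p} + e'\mathbf{A})^2 + m^2c^2 + e'h(\boldsymbol{\sigma}, \mathbf{H}) + ie'h\rho_1(\boldsymbol{\sigma}, \mathbf{E})\right]\psi,$$
where E and H are the electric and magnetic vectors of the field.
This differs from (1) by the two extra terms
$$\frac{eh}{c}(\boldsymbol{\sigma}, \mathbf{H}) + \frac{ieh}{c}\rho_1(\boldsymbol{\sigma}, \mathbf{E})$$
in F. These two terms,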
when divided by the factor 2m, can be regarded as the additional potential energy of the electron due to its new degree of freedom. The electron will therefore behave as though it has a magnetic moment (eh/2mc)σ and an electric moment (ieh/2mc)ρ1σ. This magnetic moment is just that assumed in the spinning electron model. The electric moment, being a pure imaginary, we should not expect to appear in the model. It is doubtful whether the electric moment has any physical meaning, since the Hamiltonian in (14) that we started from is real, and the imaginary part only appeared when we multiplied it up in an artificial way in order to make it resemble the Hamiltonian of previous theories.
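For a sense of scale, the magnitude eh/2mc of the magnetic moment is one Bohr magneton, with h read as Planck's constant divided by 2π. The sketch below evaluates it in SI units, where the corresponding expression is eħ/2m without the factor c. The constants are CODATA 2018 values, and the 1 tesla field used for the energy estimate is an arbitrary choice made here for illustration.

```python
# Bohr magneton in SI units: mu_B = e * hbar / (2 * m_e).
# (The Gaussian-unit form in the text, e h / 2 m c, carries an extra 1/c.)

E_CHARGE = 1.602176634e-19      # elementary charge, C (CODATA 2018)
HBAR = 1.054571817e-34          # reduced Planck constant, J s
M_ELECTRON = 9.1093837015e-31   # electron mass, kg

mu_B = E_CHARGE * HBAR / (2 * M_ELECTRON)   # J/T

# Energy difference between the two spin orientations in a field B,
# taking the moment to be exactly one Bohr magneton (illustrative).
B_FIELD = 1.0                               # tesla, arbitrary sample value
splitting_joule = 2 * mu_B * B_FIELD
splitting_ev = splitting_joule / E_CHARGE

print(f"mu_B = {mu_B:.4e} J/T")
print(f"splitting at 1 T = {splitting_ev:.3e} eV")
```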