We have now to turn our attention to a new point which distinguishes the new from the classical theory and may result in rendering Larmor’s theorem invalid. This new point is the neglect of the terms in $H^2$ entering in the expression for the magnetic energy. These can certainly be neglected in the classical theory for orbits of small dimensions but cannot be neglected for orbits of large dimensions or hyperbolic paths. In the limiting case of a free electron the frequency of revolution is just twice the normal Larmor precession frequency. In quantum mechanics all these orbits, the distant as well as the near ones, are so intimately connected due to the peculiar kinematics and geometry that the justification of neglecting $H^2$ is no longer evident, for the probability of transition even from the unexcited state to that of a free electron is always considerable. For the oscillator we have certainly the normal Zeeman effect. For the nuclear atom, however, it is possible that the intimate connection between inner and outer orbits may lead to different results. There are, on the other hand, powerful arguments against such an explanation, particularly the intimate connection between the anomalous Zeeman effect and the multiplet structure of spectral lines. A new physical concept seems to be required here. Such an idea has been formulated by Uhlenbeck and Goudsmit, but here I can only indicate it. Pauli, from the study of multiplets, has been led to attribute to each electron not three quantum numbers, as would correspond to its number of degrees of freedom, but four. Until now this has been considered as something purely formal, to be eliminated if possible. Uhlenbeck and Goudsmit, however, take this hypothesis seriously. They attribute to the electron a proper rotation and a corresponding magnetic field determined by the fourth quantum number.
Preliminary calculations by Heisenberg and Jordan have shown that this idea forms a basis for an exact theory of the anomalous Zeeman effect, but I am unable to give further details on this at present.
LECTURE 18
Pauli’s theory of the hydrogen atom.
We now come to the crucial question for the whole new theory: Is it able to account for the properties of the hydrogen atom? Let us recall that the explanation of the hydrogen spectrum (Balmer’s formula) was the first great success of Bohr’s theory and has since remained its keynote. If the new theory failed here it would have to be abandoned in spite of its many conceptual advantages, but, as Pauli has shown, it stands the test successfully. I can give here only the fundamental ideas and results of this development, as yet unpublished.
In the classical theory of Keplerian motion it is customary to operate with polar coördinates. This process fails here because it does not seem possible to consider angular variables as matrices. Pauli avoided this difficulty by retaining rectangular coördinates and introducing an additional coördinate, the radius vector $r$, which is related to $x$, $y$, $z$ by the relation $r^2 = x^2 + y^2 + z^2$.
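The process will first be explained using the classical model. We have the energy function
$$H = \frac{1}{2\mu}\,\mathbf{p}^2 - \frac{Ze^2}{r} \tag{1}$$
(with $\mu$ the mass of the electron and $Ze$ the nuclear charge) and the equations of motion,
$$\dot{\mathbf{p}} = -\frac{Ze^2}{r^3}\,\mathbf{r}, \qquad \mathbf{p} = \mu\,\dot{\mathbf{r}}. \tag{2}$$
From these follows that the angular momentum
$$\mathbf{M} = \mathbf{r} \times \mathbf{p} \tag{3}$$
is constant with respect to time. There follows, further, by using the relation
$$\mathbf{M} \times \mathbf{r} = (\mathbf{r} \times \mathbf{p}) \times \mathbf{r} = \mathbf{p}\,r^2 - (\mathbf{p}\cdot\mathbf{r})\,\mathbf{r},$$
that the vector
$$\mathbf{A} = \frac{1}{Ze^2\mu}\,\mathbf{M}\times\mathbf{p} + \frac{\mathbf{r}}{r} \tag{4}$$
is also constant with respect to time. Placing $M = |\mathbf{M}|$ we obtain at once
$$\mathbf{A}\cdot\mathbf{r} = -\frac{M^2}{Ze^2\mu} + r,$$
which is the equation of a conic. If we take the $xy$-plane in the plane of this curve and the $x$-axis in the direction of $\mathbf{A}$, that is
$$A_x = A, \qquad A_y = A_z = 0,$$
we obtain
$$r - Ax = \frac{M^2}{Ze^2\mu},$$
or, in polar coördinates with $x = r\cos\varphi$,
$$r = \frac{M^2}{Ze^2\mu}\,\frac{1}{1 - A\cos\varphi}.$$
$A$ is therefore the eccentricity and we find for the energy $W$
$$1 - A^2 = -\frac{2W}{\mu Z^2 e^4}\,M^2. \tag{5}$$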
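The classical relations of this paragraph can be checked numerically at a single point of phase space. The sketch below is only an illustration: it assumes the conserved vector has the form A = (M × p)/(Ze²μ) + r/|r| with M = r × p, works in units where μ = Ze² = 1, and uses hypothetical helper names (`cross`, `dot`, `kepler_invariants`). It verifies the conic equation A·r = r − M²/(Ze²μ) and the energy relation 1 − A² = −2WM²/(μZ²e⁴); it does not integrate the motion.

```python
import math

def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))

def kepler_invariants(r, p, mu=1.0, Ze2=1.0):
    """Return (W, M, A) for position r and momentum p.

    Assumed form (not quoted from the lecture):
    A = (M x p) / (Ze^2 mu) + r / |r|, with M = r x p.
    """
    rn = math.sqrt(dot(r, r))
    W = dot(p, p) / (2.0 * mu) - Ze2 / rn
    M = cross(r, p)
    Mxp = cross(M, p)
    A = tuple(Mxp[i] / (Ze2 * mu) + r[i] / rn for i in range(3))
    return W, M, A

# Arbitrary point on a bound orbit (W < 0), units mu = Ze^2 = 1.
mu, Ze2 = 1.0, 1.0
r = (1.0, 0.5, 0.0)
p = (-0.2, 0.9, 0.0)
W, M, A = kepler_invariants(r, p, mu, Ze2)

M2, A2 = dot(M, M), dot(A, A)
rn = math.sqrt(dot(r, r))

lhs_conic = dot(A, r)                    # A . r
rhs_conic = rn - M2 / (Ze2 * mu)         # r - M^2 / (Ze^2 mu)
lhs_energy = 1.0 - A2                    # 1 - A^2
rhs_energy = -2.0 * W * M2 / (mu * Ze2 ** 2)

print(abs(lhs_conic - rhs_conic) < 1e-12, abs(lhs_energy - rhs_energy) < 1e-12)
```

Since both identities are purely algebraic, they hold for any choice of `r` and `p`; for an ellipse (W < 0) the value `A2` lies between 0 and 1.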
This calculation can be repeated, with only slight changes, in matrix mechanics.
The matrices $x$, $y$, $z$, $r$ are commutative among themselves, as also the momentum matrices $p_x$, $p_y$, $p_z$, $p_r$. The following are also commutative:
$x$ with $p_y$, $p_z$, . . .
$p_x$ with $y$, $z$, . . .
but,
$$p_x x - x p_x = \frac{h}{2\pi i}, \ \ldots$$
The energy and the equations of motion are the same as above. From the latter there follows at once, as has been shown generally before, that the angular momentum is constant in time.
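It can be shown further that the vector
$$\mathbf{A} = \frac{1}{Ze^2\mu}\,\frac{1}{2}\,(\mathbf{M}\times\mathbf{p} - \mathbf{p}\times\mathbf{M}) + \frac{\mathbf{r}}{r}$$
is constant in time. To prove this a longer calculation is necessary and secondary commutation relations are needed, as for instance
$$p_x r - r p_x = \frac{h}{2\pi i}\,\frac{x}{r}.$$
Derivatives with respect to time are transformed by means of the formula
$$\dot{f} = \frac{2\pi i}{h}\,(Hf - fH).$$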
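As a brief consistency check of the sign conventions, one may verify that the rule for time derivatives reproduces the classical relation between momentum and velocity. The lines below are a sketch under stated assumptions: the commutation relation is taken as $p_x x - x p_x = h/2\pi i$, the time derivative as $\dot{f} = (2\pi i/h)(Hf - fH)$, and the energy function as $H = \mathbf{p}^2/2\mu - Ze^2/r$. Only the $p_x^2$ term of $H$ fails to commute with $x$.

```latex
\begin{aligned}
Hx - xH &= \frac{1}{2\mu}\,\bigl(p_x^2\,x - x\,p_x^2\bigr) \\
        &= \frac{1}{2\mu}\,\bigl[\,p_x\,(p_x x - x p_x) + (p_x x - x p_x)\,p_x\,\bigr]
         = \frac{h}{2\pi i}\,\frac{p_x}{\mu}, \\
\dot{x} &= \frac{2\pi i}{h}\,(Hx - xH) = \frac{p_x}{\mu}.
\end{aligned}
```

The result agrees with the classical equation $\mathbf{p} = \mu\,\dot{\mathbf{r}}$, which is what one expects if the energy function and the equations of motion keep their classical form.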
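The problem is now to find the constant vectors $\mathbf{M}$ and $\mathbf{A}$. For these the following commutation relations hold:
$$M_x M_y - M_y M_x = -\frac{h}{2\pi i}\,M_z, \qquad A_x M_x - M_x A_x = 0, \qquad A_x M_y - M_y A_x = -\frac{h}{2\pi i}\,A_z, \ \ldots \tag{6}$$
$$A_x A_y - A_y A_x = \frac{h}{2\pi i}\,\frac{2W}{\mu Z^2 e^4}\,M_z, \ \ldots \tag{7}$$
Finally, the following equation is found
$$1 - \mathbf{A}^2 = -\frac{2W}{\mu Z^2 e^4}\left(\mathbf{M}^2 + \frac{h^2}{4\pi^2}\right). \tag{8}$$
This equation differs from the corresponding classical equation only by the term added to $\mathbf{M}^2$. This is just one important characteristic of the new theory.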
In the solution of these equations $W$ is always a diagonal matrix, but the constant components of the vector matrices $\mathbf{M}$, $\mathbf{A}$ are not diagonal matrices, as was shown in general above. According to our previous results, the requirement that, besides $W$, $M_z$ and $\mathbf{M}^2$ be diagonal matrices has a definite meaning: it corresponds to the addition of a weak axially symmetrical perturbing field of force, the energy of which depends on $M_z$ and $\mathbf{M}^2$.
The same method used above (Lecture 16) can now be applied to determine the vector $\mathbf{M}$. Then Equations (6) are exactly the same as Equations (4) and (5), Lecture 16, except that the coördinates $q_{lx}$, $q_{ly}$, $q_{lz}$ are replaced by $A_x$, $A_y$, $A_z$. Instead of the quantum numbers $n_1$, $n_2$ we write the usual symbols $k$ and $m$, where $k$ determines the total angular momentum, denoted by $j$ above, and $m$ its $z$-component. We then have
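$$\mathbf{M}^2 = \frac{h^2}{4\pi^2}\,k(k+1), \qquad M_z = \frac{h}{2\pi}\,m, \tag{9}$$
where $m$ runs through a complete series of half or whole numbers from $-k$ to $+k$. Further we obtain for $A_x$, $A_y$, $A_z$ expressions quite similar to those found before for $q_{lx}$, $q_{ly}$, $q_{lz}$, for example the following one: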
Consider in Equations (7) a given $W$ and the smallest possible $k$. Then a closer discussion shows that the equations in question can be satisfied if $m$ is zero and only zero, hence $k_{\min} = 0$. Herein is contained the integrality of $k$ and $m$.
Formula (7) gives further for the functions $C(k+1, k) = C(k, k+1)$ the equation,
W is here assumed to be negative, i.e., ellipse, not hyperbola; R is Rydberg’s constant. As solution of this equation we obtain
where $k_m$ is the maximum value of $k$ for a given $|W|$. We now have the components of $\mathbf{A}$ and, therefore, also the value of
that is,
(10)
There follows finally from Equation (8):
$$W = -\frac{RhZ^2}{(k_m + 1)^2}.$$
If we write $n = k_m + 1$, then $n$ corresponds to the principal quantum number of Bohr’s theory and takes the values 1, 2, 3, . . . . For a given $n$, $k$ has the values $k = 0, 1, 2, \ldots, n - 1$. We have thus found Balmer’s formula
$$W = -\frac{RhZ^2}{n^2} \tag{11}$$
and have shown at the same time how each term is split up on removal of the degeneration by the addition of weak perturbing forces. This split is given by $k = 0, 1, 2, \ldots, n - 1$; $m = -k, -k + 1, \ldots, k - 1, k$.
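The counting of sub-levels stated here is easy to enumerate. The following is a minimal sketch; the function names are hypothetical, and the term value is assumed to be W = −RhZ²/n², expressed in units of Rh. It lists the pairs (k, m) belonging to a given n and shows that their number is n².

```python
def hydrogen_states(n):
    """List the (k, m) pairs for principal quantum number n:
    k = 0, 1, ..., n - 1 and m = -k, ..., +k."""
    return [(k, m) for k in range(n) for m in range(-k, k + 1)]

def balmer_term(n, Z=1):
    """Term value in units of R*h, assuming W = -R h Z^2 / n^2."""
    return -Z ** 2 / n ** 2

for n in (1, 2, 3):
    states = hydrogen_states(n)
    # The number of (k, m) pairs for a given n equals n**2.
    print(n, balmer_term(n), len(states))
```

For n = 1 the list contains the single pair (0, 0), which is the statement that the normal state has k = 0 and m = 0.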
A characteristic trait of the new theory is that the value k = n does not occur. It follows in particular that in the unexcited state n = 1, k = 0 and, therefore, m = 0; in other words, the normal state is not magnetic. This result must be revised, however, if the rotating magnetic electron of Uhlenbeck and Goudsmit is accepted.
Pauli has succeeded in deriving in a similar way the Stark effect for the hydrogen atom. In this case also no additional conditions need be imposed. The same holds in the case where an electric and a magnetic field act in arbitrary directions (crossed fields). Just here the classical theory of multiply periodic systems encountered great difficulties, for the frequencies of the overtones are made up of two fundamental periods (the electric frequency $\nu_e$ and the magnetic Larmor frequency $\nu_m$) and, therefore, protracted commensurabilities appear when the fields are varied, that is, equations of the form
$$\tau_1 \nu_e + \tau_2 \nu_m = 0$$
with integral $\tau_1$, $\tau_2$. This means that arbitrarily small adiabatic changes of the electric field, for instance, will produce degeneration. The validity of Ehrenfest’s adiabatic hypothesis is no longer certain and, therefore, the quantum rules become doubtful. All these difficulties disappear in the new theory.
Pauli has also attacked the theory of fine structure (relativistic change of mass), without yet quite attaining his goal.
LECTURE 19
Connection with the theory of Hermitian forms—Aperiodic motions and continuous spectra.
Let us now inquire how aperiodic motions, such as hyperbolic orbits in the hydrogen atom, can be treated in the new theory. It is to be expected a priori that there is no essential difference in the treatment of periodic and aperiodic processes because the postulate of periodicity does not appear explicitly in the fundamental equations. The notion of matrix can be generalized at once so as to permit the representation of aperiodic processes. The indices $n$, $m$ have only to be considered as continuous variables and the matrix product defined by the integral
$$pq = \left(\int p(nk)\,q(km)\,dk\right),$$
but difficulties appear at once if we attempt to generalize the notion of unit matrix to include these continuous matrices. This must be done because the unit matrix enters in the commutation relation
$$pq - qp = \frac{h}{2\pi i}\,1. \tag{1}$$
That function $f(nm)$ is to be taken as unit matrix which vanishes for $n \neq m$ and becomes infinite for $n = m$ in such a way that the integrals
$$\int f(nk)\,dk \quad\text{and}\quad \int f(kn)\,dk$$
become unity, for then
$$qf = \left(\int q(nk)\,f(km)\,dk\right) = (q(nm)) = q$$
and at the same time $fq = q$. It is clear that operating with such unusual functions is not convenient. In circumventing this difficulty a way indicated by entirely different lines of reasoning has been followed.
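The behaviour required of the continuous unit matrix can be imitated on a grid. This is only a discretized illustration, not the limiting object itself: the integral is replaced by a Riemann sum with step `dk`, and the singular function by a matrix carrying 1/dk on its diagonal. The names `integral_product` and `discrete_unit` are hypothetical.

```python
def integral_product(p, q, dk):
    """Riemann-sum version of (pq)(n, m) = integral of p(n, k) q(k, m) dk."""
    size = len(p)
    return [[sum(p[n][k] * q[k][m] for k in range(size)) * dk
             for m in range(size)] for n in range(size)]

def discrete_unit(size, dk):
    """Grid stand-in for the unit 'function': 1/dk for n == m, else 0."""
    return [[(1.0 / dk if n == m else 0.0) for m in range(size)]
            for n in range(size)]

dk = 0.25
size = 4
q = [[float(n + 2 * m) for m in range(size)] for n in range(size)]
f = discrete_unit(size, dk)

qf = integral_product(q, f, dk)   # reproduces q
fq = integral_product(f, q, dk)   # reproduces q
row_integral = sum(f[0][k] for k in range(size)) * dk  # equals 1

print(qf == q, fq == q, row_integral)
```

As `dk` is made smaller the diagonal entries 1/dk grow without limit while each row integral stays equal to unity, which is the inconvenient property referred to in the text.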
In classical mechanics the known theory of oscillation of a system is intimately related to the theory of quadratic forms. Oscillations occur when the potential energy is a “definite” quadratic form of the variables, i.e., one which does not change its sign. For two variables $x$, $y$, for instance,
$$U = \tfrac{1}{2}\,(a_{11}x^2 + 2a_{12}xy + a_{22}y^2).$$
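Oscillations are obtained in the simplest way by transforming this form to a sum of squares by means of the linear transformation,
$$x = \alpha_{11}\xi + \alpha_{12}\eta, \qquad y = \alpha_{21}\xi + \alpha_{22}\eta.$$
We now try to effect this transformation in such a way that the kinetic energy, which is already a sum of squares, retains this characteristic and is transformed into
$$T = \frac{\mu}{2}\,(\dot{x}^2 + \dot{y}^2) = \frac{\mu}{2}\,(\dot{\xi}^2 + \dot{\eta}^2).$$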
As the velocities are transformed in the same way as the coördinates we have the condition that the linear transformation must leave the quantity $x^2 + y^2$ invariant, i.e.,
$$x^2 + y^2 = \xi^2 + \eta^2.$$
Such transformations are called “orthogonal.” They correspond geometrically to rotations of the coördinate system around the origin in the $xy$-plane, because for such a rotation the distance $r$ or $r^2 = x^2 + y^2$ is in fact invariant. Now an equation of the form
$$a_{11}x^2 + 2a_{12}xy + a_{22}y^2 = 2U = \text{const.}$$
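with a definite left-hand side represents an ellipse with the origin at the center. This ellipse has two principal axes $a$, $b$. If these are chosen as $\xi\eta$-axes the equation of the ellipse becomes
$$\frac{\xi^2}{a^2} + \frac{\eta^2}{b^2} = 1, \quad\text{that is,}\quad \kappa_1\xi^2 + \kappa_2\eta^2 = 2U,$$
where $a^2 = 2U/\kappa_1$ and $b^2 = 2U/\kappa_2$, and we have the desired expression. The equations of motion are now
$$\mu\,\ddot{\xi} + \kappa_1\xi = 0, \qquad \mu\,\ddot{\eta} + \kappa_2\eta = 0,$$
and, therefore, the frequencies are
$$\nu_1 = \frac{1}{2\pi}\sqrt{\frac{\kappa_1}{\mu}}, \qquad \nu_2 = \frac{1}{2\pi}\sqrt{\frac{\kappa_2}{\mu}}.$$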
Similar relations hold for any arbitrary number of degrees of freedom.
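For two variables the principal-axis transformation can be carried out explicitly. The sketch below assumes the rotation x = ξ cos θ − η sin θ, y = ξ sin θ + η cos θ, a mass `mu` set to 1, and a definite form (both κ positive); the function names are hypothetical. The angle is chosen so that the mixed term in ξη vanishes.

```python
import math

def principal_axes(a11, a12, a22):
    """Rotate (x, y) into (xi, eta) so that
    a11 x^2 + 2 a12 x y + a22 y^2 = kappa1 xi^2 + kappa2 eta^2.
    Returns (kappa1, kappa2, theta)."""
    # The xi*eta coefficient vanishes when tan(2 theta) = 2 a12 / (a11 - a22).
    theta = 0.5 * math.atan2(2.0 * a12, a11 - a22)
    c, s = math.cos(theta), math.sin(theta)
    kappa1 = a11 * c * c + 2.0 * a12 * c * s + a22 * s * s
    kappa2 = a11 * s * s - 2.0 * a12 * c * s + a22 * c * c
    return kappa1, kappa2, theta

def frequencies(kappa1, kappa2, mu=1.0):
    """Proper frequencies nu = sqrt(kappa / mu) / (2 pi) of a definite form."""
    return (math.sqrt(kappa1 / mu) / (2.0 * math.pi),
            math.sqrt(kappa2 / mu) / (2.0 * math.pi))

kappa1, kappa2, theta = principal_axes(2.0, 1.0, 2.0)
nu1, nu2 = frequencies(kappa1, kappa2)
print(kappa1, kappa2, theta, nu1, nu2)
```

For the form 2x² + 2xy + 2y² the rotation angle is π/4 and the coefficients are κ₁ = 3, κ₂ = 1 up to rounding; their sum and product equal a₁₁ + a₂₂ and a₁₁a₂₂ − a₁₂², as they must for an orthogonal transformation.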
Formerly, in order to interpret line spectra, attempts were made to construct mechanical systems which should have just the observed lines as proper frequencies, but none of them gave rise to useful results; in other words, none led to oscillating systems built out of known elementary particles (protons and electrons) governed by known laws or reasonable modifications of them and having these frequencies.
In our new theory the relation between the principal axes of the quadratic form and the frequency enters again, except that instead of the observed frequencies the values of the terms or energy levels occur. These appear as the reciprocal axes of a certain Hermitian form. The frequencies appear later as differences between terms.
To each matrix $a = (a(nm))$ corresponds a bilinear form
$$A(x, y) = \sum_{nm} a(nm)\,x_n y_m \tag{2}$$
of two systems of variables. If the matrix is Hermitian,
$$\tilde{a}^{*} = a, \quad\text{i.e.,}\quad a^{*}(mn) = a(nm), \tag{3}$$
where the symbol $\sim$ indicates interchange of rows and columns and the symbol $*$ change to conjugate complex quantities, then the form $A$ assumes real values if we place for the variables $y_n$ the values conjugate to $x_n$:
$$A(x, x^{*}) = \sum_{nm} a(nm)\,x_n x_m^{*}. \tag{4}$$
Let us recall the rule, easily proved, that $\widetilde{(ab)} = \tilde{b}\,\tilde{a}$. Applying a linear transformation to $x_n$,
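$$x_n = \sum_l v(ln)\,y_l \tag{5}$$
with the complex matrix $v = (v(ln))$ the bilinear form $A$ is transformed into
$$A(x, x^{*}) = B(y, y^{*}) = \sum_{nm} b(nm)\,y_n y_m^{*},$$
where
$$b(nm) = \sum_{kl} v(nk)\,a(kl)\,v^{*}(ml),$$
or in matrix notation
$$b = v\,a\,\tilde{v}^{*}. \tag{6}$$
The matrix $b$ is said to be the transform of $a$. The matrix $b$ is again Hermitian, for
$$\tilde{b}^{*} = \widetilde{(v\,a\,\tilde{v}^{*})}^{*} = v\,\tilde{a}^{*}\,\tilde{v}^{*} = v\,a\,\tilde{v}^{*} = b. \tag{7}$$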
The matrix $v$ is said to be orthogonal if the corresponding transformation leaves the Hermitian unit form
$$E(x, x^{*}) = \sum_n x_n x_n^{*}$$
invariant. According to the result just obtained this is true when, and only when,
$$v\,\tilde{v}^{*} = 1. \tag{8}$$
For a finite number of variables the same theorems hold in general for Hermitian forms as for real quadratic forms. Here also there always exists an orthogonal principal-axis transformation by which $A$ becomes a sum of squares,
$$A(x, x^{*}) = \sum_n W_n\,y_n y_n^{*}.$$
For matrices this means that there exists a matrix $v$ for which
$$v\,a\,\tilde{v}^{*} = v\,a\,v^{-1} = W, \tag{9}$$
where $W = (W_n \delta_{mn})$ is a diagonal matrix.
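The principal-axis theorem can be illustrated with the smallest non-trivial case, a 2×2 Hermitian matrix. The sketch below is hedged accordingly: it assumes the transform is formed as v a ṽ* (ṽ* being the transposed conjugate of v), requires a non-vanishing off-diagonal element, and uses hypothetical helper names. The rows of v are the complex conjugates of the normalized characteristic vectors of a.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Interchange rows and columns and take complex conjugates."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def principal_axis_matrix(a):
    """For a 2x2 Hermitian a with a[0][1] != 0, return (v, W1, W2)
    with v a dagger(v) = diag(W1, W2) and v dagger(v) = 1."""
    a11, a22 = a[0][0].real, a[1][1].real
    a12 = a[0][1]
    mean = 0.5 * (a11 + a22)
    half_gap = math.sqrt((0.5 * (a11 - a22)) ** 2 + abs(a12) ** 2)
    rows = []
    for W in (mean + half_gap, mean - half_gap):
        u = (complex(a12), complex(W - a11))  # characteristic vector for W
        norm = math.sqrt(abs(u[0]) ** 2 + abs(u[1]) ** 2)
        rows.append([u[0].conjugate() / norm, u[1].conjugate() / norm])
    return rows, mean + half_gap, mean - half_gap

a = [[2.0 + 0j, 1j], [-1j, 2.0 + 0j]]      # Hermitian: a(21) = conj(a(12))
v, W1, W2 = principal_axis_matrix(a)
b = matmul(matmul(v, a), dagger(v))         # diagonal up to rounding
unit = matmul(v, dagger(v))                 # unit matrix up to rounding
print(W1, W2)
```

For this example the characteristic values are 3 and 1. The closed-form construction is specific to two rows and columns; for larger matrices a numerical routine would be needed.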
A similar theorem exists for infinite matrices in all cases so far investigated. It may happen that, in the right-hand side of these equations, n takes, besides a discrete series of values, a continuous series, to each of which correspond integral components in our formulas. The quantities Wn are called “characteristic values,” their totality constitutes the “mathematical” spectrum of the form, consisting of a “point” spectrum and an “interval” spectrum. This spectrum is, as pointed out before and as will be shown presently, identical with the “term spectrum” of physics, while the “frequency spectrum” is obtained from the former by difference relations.
The transformation along the principal axes gives at once the solution of the dynamical problem which can be formulated as follows: Let any system of coördinates and momenta $q_k^0$, $p_k^0$ be given satisfying the commutation relations, for instance those of a system of uncoupled resonators. A transformation $(q_k^0, p_k^0) \to (q_k, p_k)$ must be found leaving the commutation relations (1) invariant and transforming the energy into a diagonal matrix. According to the theorem above an orthogonal matrix $S$ for which
$$S\,\tilde{S}^{*} = 1, \qquad \tilde{S}^{*}\,S = 1$$
exists such that by the transformation,
$$p_k = S\,p_k^0\,S^{-1}, \qquad q_k = S\,q_k^0\,S^{-1}, \tag{10}$$
1. the Hermitian character of $p_k^0$, $q_k^0$ is conserved for $p_k$, $q_k$,
2. the commutation relations remain invariant,
3. the energy is transformed into a diagonal matrix
$$H(p, q) = S\,H(p^0, q^0)\,S^{-1} = W. \tag{11}$$
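The first two properties follow in a few lines. The derivation below is a sketch which assumes that the transformation has the form $p_k = S p_k^0 S^{-1}$, $q_k = S q_k^0 S^{-1}$ with $S^{-1} = \tilde{S}^{*}$, and that the commutation relation reads $pq - qp = (h/2\pi i)\,1$.

```latex
\begin{aligned}
p_k q_k - q_k p_k
  &= S p_k^0 S^{-1}\,S q_k^0 S^{-1} - S q_k^0 S^{-1}\,S p_k^0 S^{-1} \\
  &= S\,\bigl(p_k^0 q_k^0 - q_k^0 p_k^0\bigr)\,S^{-1}
   = S\,\frac{h}{2\pi i}\,1\,S^{-1} = \frac{h}{2\pi i}\,1, \\[4pt]
\tilde{p}_k^{*}
  &= \widetilde{\bigl(S\,p_k^0\,\tilde{S}^{*}\bigr)}^{*}
   = S\,\tilde{p}_k^{0*}\,\tilde{S}^{*}
   = S\,p_k^0\,\tilde{S}^{*} = p_k .
\end{aligned}
```

The same argument applies to any function of $p$ and $q$ built up by sums and products, since $S^{-1}S = 1$ may be inserted between every pair of factors; this is why the energy transforms as $H(p, q) = S\,H(p^0, q^0)\,S^{-1}$.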
It is important to add that the transforming matrix and the series of $W$-values may have continuous parts. This has been shown by Hilbert and Hellinger for a certain class of infinite matrices belonging to the so-called “bounded forms.” The same must be expected a priori of our matrices which in general do not satisfy the condition of bounded forms. A continuous series of energy values $W$ or of terms $W/h$ is thus obtained. Accordingly there are evidently three kinds of elements in the coördinate matrices:
(1) Those for which both $m$ and $n$ belong to the discrete series of values of $W$. These correspond to jumps between periodic orbits and give the line spectrum.
(2) Those for which n belongs to the discrete and m to the continuous series of values of W or conversely. These correspond to jumps between periodic and aperiodic orbits and give those known continuous spectra which exist beyond the limits of line series.
(3) Those for which both n and m belong to the continuous series of W-values. These correspond to jumps between two aperiodic orbits and give the continuous spectrum in the proper sense.
The actual mathematical calculation of the continuous spectrum on the basis of this theory is, however, impossible, partly on account of the intricacy of the calculations and more particularly because of difficulties of convergence. The integrals are improper or altogether divergent. This is related to the fact that aperiodic motions approach uniform rectilinear motion asymptotically in the limit of infinite distance. This motion has evidently no period and represents the case of greatest singularity. It is not amenable to matrix representation, even if continuous matrices are mustered for the purpose.