In the case of atomic theory, we have certainly introduced, as fundamental constituents, magnitudes of very doubtful observability, as, for instance, the position, velocity and period of the electron. What we really want to calculate by means of our theory and can be observed experimentally, are the energy levels and the emitted light frequencies derivable from them. The mean radius of the atom (atomic volume) is also an observable quantity which can be determined by the methods of the kinetic theory of gases or other analogous methods. On the other hand, no one has been able to give a method for the determination of the period of an electron in its orbit or even the position of the electron at a given instant. There seems to be no hope that this will ever become possible, for in order to determine lengths or times, measuring rods and clocks are required. The latter, however, consist themselves of atoms and therefore break down in the realm of atomic dimensions. It is necessary to see clearly the following points: All measurements of magnitudes of atomic order depend on indirect conclusions; but the latter carry weight only when their train of thought is consistent with itself and corresponds to a certain region of our experience. But this is precisely not the case for atomic structures such as we have considered so far. I have already called attention to the points where the theory fails.
At this stage it appears justified to give up altogether the description of atoms by means of such quantities as “coördinates of the electrons” at a given time, and instead utilize such magnitudes as are really observable. To the latter belong, besides the energy levels which are directly measurable by electron impacts and the frequencies which are derivable from them and which are also directly measurable, the intensity and the polarization of the emitted waves. We therefore take from now on the point of view that the elementary waves are the primary data for the description of atomic processes; all other quantities are to be derived from them. That this standpoint offers more possibilities than the assumption of electronic motions is best understood by considering the Compton effect.
If an X-ray wave of frequency ν impinges on free or loosely bound electrons, it transmits to the latter impacts in every direction. At the same time a secondary X-radiation is emitted having a frequency ν′ dependent on the azimuth. According to Compton and Debye this can be quantitatively explained if the energies hν, hν′ and the momenta hν/c, hν′/c are ascribed to the waves and then the laws of conservation of energy and of impulse are applied to the light-quanta and the electrons. But if the process is considered from the point of view of the wave theory, then the change of frequency must be interpreted as a Doppler effect. A calculation of the velocity of the wave-center then gives extremely large values in the direction of the primary X-ray and not in that of the electron. We have therefore struck upon a case in which motion of the electron and motion of the wave-center do not coincide. In the classical theory, where the emitted waves are determined by the harmonic components of the electronic motion, this is of course absolutely unexplainable. We therefore stand before a new fact which forces us to decide whether the electronic motion or the wave shall be looked upon as the primary act. After all theories which postulate the motion have proved unsatisfactory, we investigate if this is also the case for the waves.
FIG. 15
To begin with, consider processes which in the classical theory would correspond to a one-dimensional motion given by a Fourier series for the coördinate q,
$$q(t) = \sum_{\tau} q_\tau\, e^{2\pi i \nu \tau t}. \tag{1}$$
We consider now, not the motion q(t), but the set of all the elementary oscillations $q_\tau\, e^{2\pi i \nu \tau t}$
and try to change them so that they are suitable for the representation, not of the higher harmonics of the motion, but of the real waves of an atom.
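The frequencies are therefore not in general harmonics (ντ), but can be expressed according to Ritz’s combination principle as differences of every pair of terms of the series
$$\frac{W_1}{h},\ \frac{W_2}{h},\ \frac{W_3}{h},\ \ldots$$
We therefore write
$$\nu(nm) = \frac{1}{h}\,(W_n - W_m). \tag{2}$$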
To every jump n → m corresponds an amplitude and a phase that we denote by the complex amplitude,
$$q(nm) = |q(nm)|\, e^{i\delta(nm)}. \tag{3}$$
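The set of all possible oscillations is best expressed by ordering them in a square array of terms,
$$\begin{pmatrix} q(11)\,e^{2\pi i \nu(11) t} & q(12)\,e^{2\pi i \nu(12) t} & \cdots \\ q(21)\,e^{2\pi i \nu(21) t} & q(22)\,e^{2\pi i \nu(22) t} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$$
We abbreviate this to
$$q = \left( q(nm)\, e^{2\pi i \nu(nm) t} \right). \tag{4}$$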
In order that this array correspond to a real Fourier series q(t) the condition δ(mn) = −δ(nm) must be added, or its equivalent, that q(nm) be transformed into its conjugate q*(nm) by an interchange of m and n, i.e.,
$$q(mn) = q^*(nm), \tag{5}$$
because for a real Fourier series the corresponding relation holds.
The manifold of elementary oscillations is thus naturally represented by a two-dimensional array, while the manifold of harmonics of a motion is represented by a one-dimensional series, $C_1 e^{2\pi i \nu t},\ C_2 e^{2\pi i 2\nu t},\ C_3 e^{2\pi i 3\nu t},\ \ldots$
It is for this reason that, in the theory presented so far, it was necessary to consider simultaneously a whole series of motions, i.e., the stationary states, which are distinguished by another index, i.e., the quantum number n, whereby C and ν become functions of n. The array found in this way has neither the correct frequencies nor a simple and unique correspondence to the jumps.
We must now find the laws determining the amplitudes q(nm) and the frequencies ν(nm). For this purpose we utilize the principle of making the new laws as similar as possible to those of classical mechanics, for the fact that the classical theory of conditioned-periodic motions is in a position to account qualitatively for many quantum phenomena shows that the essential point is not the overthrow of mechanics, but rather a change from classical geometry and kinematics to the new method of representation by means of elementary waves.
As the simplest example in classical mechanics we consider the oscillator. We are already familiar with the fact that everything follows once the potential energy is known. The potential energy can also be expressed in terms of elementary waves, because the square of a Fourier series is also a Fourier series,
$$q(t)^2 = \sum_{\tau} D_\tau\, e^{2\pi i \nu \tau t}, \tag{6}$$
where
$$D_\tau = \sum_{\sigma} C_\sigma\, C_{\tau - \sigma}.$$
The set of quantities $D_\tau$ represents therefore the function $q^2(t)$ in quite the same way as the set $C_\tau$ represents the function q(t). This can be translated into our square array, as follows: We ask, is it possible to find a multiplication rule for q(nm) by which, out of every array q, we can construct a new array which we shall write symbolically $q^2$, but in which no new frequencies appear? The latter condition is essential and corresponds to the theorem of classical theory, that the square of a Fourier series, or the product of two such series with the same fundamental frequency, is also a Fourier series with the same fundamental frequency.
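The classical theorem quoted here can be checked numerically. The following is a minimal sketch in plain Python, assuming the usual convolution rule $D_\tau = \sum_\sigma C_\sigma C_{\tau-\sigma}$ for the coefficients of the squared series; the coefficient values, the frequency `nu`, and the function names are hypothetical and chosen only for illustration.

```python
import cmath

def fourier_eval(coeffs, nu, t):
    """Evaluate sum_tau C_tau * exp(2*pi*i*nu*tau*t) for a dict {tau: C_tau}."""
    return sum(c * cmath.exp(2j * cmath.pi * nu * tau * t)
               for tau, c in coeffs.items())

def fourier_square(coeffs):
    """Coefficients D_tau = sum_sigma C_sigma * C_(tau - sigma) of the squared series."""
    out = {}
    for sigma, c_sigma in coeffs.items():
        for rho, c_rho in coeffs.items():
            out[sigma + rho] = out.get(sigma + rho, 0) + c_sigma * c_rho
    return out

# Hypothetical coefficients with C_(-tau) = conj(C_tau), so that q(t) is real.
C = {-2: 0.1 - 0.2j, -1: 0.5 + 0.3j, 1: 0.5 - 0.3j, 2: 0.1 + 0.2j}
D = fourier_square(C)

nu, t = 1.7, 0.123                    # arbitrary fundamental frequency and time
lhs = fourier_eval(C, nu, t) ** 2     # q(t)^2 computed directly
rhs = fourier_eval(D, nu, t)          # the series built from the D_tau
print(abs(lhs - rhs))                 # should be at rounding-error level
```

Every index of `D` is again an integer, so the squared series contains only multiples of the same fundamental frequency, which is the property the matrix product is required to imitate.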
This question can be answered by looking upon the square array from the point of view of the mathematician, considering it as a matrix and applying the known rule for the multiplication of matrices. The product of the two matrices a = (a(nm)), b = (b(nm))
is defined by the matrix
$$ab = \left( \sum_{k} a(nk)\, b(km) \right). \tag{7}$$
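If we apply this rule to our array of elementary waves q and multiply it by another array p which has the same frequencies ν(nm), we obtain
$$qp = \left( \sum_{k} q(nk)\, e^{2\pi i \nu(nk) t}\; p(km)\, e^{2\pi i \nu(km) t} \right),$$
but we have
$$\nu(nk) + \nu(km) = \nu(nm).$$
Therefore
$$qp = \left( \sum_{k} q(nk)\, p(km)\; e^{2\pi i \nu(nm) t} \right), \tag{8}$$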
that is, the symbolic product has the same frequencies as its factors. This formula is a profound generalization of the rule for obtaining the Fourier coefficients of the product of two Fourier series. We see that the multiplication rule of matrices is very closely connected with Ritz’s principle of combination.
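The link between the matrix product and the combination principle can be illustrated with a small numerical sketch. It assumes frequencies of the form ν(nm) = T_n − T_m for some term values T_n; the particular term values, the 3×3 amplitudes, and the helper names are hypothetical.

```python
import cmath

def matmul(a, b):
    """Ordinary matrix product of two square arrays given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

terms = [0.0, 1.3, 1.9]   # hypothetical term values T_n (deliberately not harmonic)

def nu(n, m):
    """Combination principle: every frequency is a difference of two terms."""
    return terms[n] - terms[m]

def with_time(amp, t):
    """Attach the factor exp(2*pi*i*nu(nm)*t) to each amplitude."""
    size = len(amp)
    return [[amp[n][m] * cmath.exp(2j * cmath.pi * nu(n, m) * t)
             for m in range(size)] for n in range(size)]

q = [[0, 1 + 1j, 0.5], [1 - 1j, 0, 2j], [0.5, -2j, 0]]
p = [[0.2, 1j, 0], [-1j, 0.1, 3], [0, 3, -0.4]]

t = 0.37
lhs = matmul(with_time(q, t), with_time(p, t))   # multiply the oscillating arrays
rhs = with_time(matmul(q, p), t)                 # multiply amplitudes, then attach nu(nm)
max_diff = max(abs(lhs[n][m] - rhs[n][m]) for n in range(3) for m in range(3))
print(max_diff)                                  # should be at rounding-error level
```

The agreement relies only on ν(nk) + ν(km) = ν(nm); with frequencies that are not differences of terms, the two sides would in general differ.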
We now give the fundamental rules of matrix calculus. Addition and subtraction are performed by carrying out the required operation on each element:
$$a \pm b = \left( a(nm) \pm b(nm) \right). \tag{9}$$
The notation can be simplified further by dropping the factors $e^{2\pi i \nu(nm) t}$. The matrix q = (q(nm)) represents therefore one coördinate.
The derivative of a matrix with respect to time is the matrix
$$\dot q = \left( 2\pi i\, \nu(nm)\, q(nm) \right), \tag{10}$$
where again the exponential factor is dropped. The operation of differentiation can also be expressed in terms of multiplication of matrices. For this purpose we introduce the unit matrix
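$$1 = (\delta_{nm}), \tag{11}$$
where
$$\delta_{nm} = \begin{cases} 1 & \text{for } n = m, \\ 0 & \text{for } n \neq m. \end{cases}$$
From this we form a diagonal matrix
$$W = (W_n\, \delta_{nm}). \tag{12}$$
We now multiply this matrix by the matrix (q(nm)). In this connection we note an extremely important theorem in the development of the theory, i.e., that the multiplication of matrices is not commutative. We have
$$Wq = \left( \sum_{k} W_n \delta_{nk}\, q(km) \right) = \left( W_n\, q(nm) \right), \qquad qW = \left( \sum_{k} q(nk)\, W_k \delta_{km} \right) = \left( q(nm)\, W_m \right).$$
If we take the difference,
$$Wq - qW = \left( (W_n - W_m)\, q(nm) \right), \tag{13}$$
we see that, from Ritz’s combination principle,
$$h\nu(nm) = W_n - W_m,$$
follows the formula,
$$\dot q = \frac{2\pi i}{h}\,(Wq - qW). \tag{14}$$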
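The non-commutativity and the role of the diagonal matrix can be seen in a small example. The sketch below assumes the diagonal matrix has elements W_n δ_nm (the symbol W is the one used for this matrix in Lecture 12); the values of W_n, the matrix q, and the helper name are hypothetical.

```python
def matmul(a, b):
    """Ordinary matrix product of two square arrays given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

levels = [1.0, 2.5, 4.5]   # hypothetical diagonal entries W_n
N = len(levels)
W = [[levels[n] if n == m else 0.0 for m in range(N)] for n in range(N)]

# Hypothetical array q(nm), chosen so that q(mn) is the conjugate of q(nm).
q = [[0.0, 1 + 2j, 0.5], [1 - 2j, 0.0, 3j], [0.5, -3j, 0.0]]

Wq = matmul(W, q)   # row n of q multiplied by W_n
qW = matmul(q, W)   # column m of q multiplied by W_m

difference = [[Wq[n][m] - qW[n][m] for m in range(N)] for n in range(N)]
expected = [[(levels[n] - levels[m]) * q[n][m] for m in range(N)] for n in range(N)]
max_diff = max(abs(difference[n][m] - expected[n][m])
               for n in range(N) for m in range(N))
print(Wq != qW, max_diff)
```

Each element of Wq − qW is the element q(nm) multiplied by W_n − W_m, which is why, up to a constant factor, forming this difference acts like differentiation with respect to time when the frequencies are differences of the W_n.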
LECTURE 11
The commutation rule and its justification by a correspondence consideration—Matrix functions and their differentiation with respect to matrix arguments.
We shall now try to translate classical mechanics, as slightly altered as possible, into matrix form. To each coördinate matrix q corresponds a momentum matrix p. We form out of these matrices, by matrix addition and multiplication, in some cases repeated an infinite number of times, the Hamiltonian function H and try to establish the analogue of the canonical differential equations. Here we again encounter the difficulty that products are now non-commutative; qp is not in general equal to pq. At this point the quantum theory makes its appearance. I maintain that the condition
$$pq - qp = \frac{h}{2\pi i}\, 1 \tag{1}$$
must be introduced, whereby Planck’s constant h is bound up closely with the foundations of the theory. This relation can be made plausible by showing that, in the case of large quantum numbers, it becomes identical with the quantum condition for periodic systems. This limiting case can be described more accurately as follows: We consider large values of m and n, and assume that all q(mn), p(mn) are vanishingly small except if |m − n| = τ is small compared with m and n. For simplicity we consider only the case where $p = \mu \dot q$, therefore $p(mn) = 2\pi i \mu\, \nu(mn)\, q(mn)$.
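Let us consider especially the diagonal elements of our quantum condition (1),
$$\sum_{k} \left( p(nk)\, q(kn) - q(nk)\, p(kn) \right) = \frac{h}{2\pi i}, \tag{2}$$
or
$$\sum_{k} \left( \nu(nk) - \nu(kn) \right) q(nk)\, q(kn) = -\frac{h}{4\pi^2 \mu},$$
for which we can write
$$\sum_{\tau > 0} \left\{ \left[ \nu(n, n+\tau) - \nu(n+\tau, n) \right] |q(n, n+\tau)|^2 + \left[ \nu(n, n-\tau) - \nu(n-\tau, n) \right] |q(n, n-\tau)|^2 \right\} = -\frac{h}{4\pi^2 \mu},$$
or, since ν(mn) = −ν(nm),
$$\sum_{\tau > 0} \left\{ \nu(n+\tau, n)\, |q(n+\tau, n)|^2 - \nu(n, n-\tau)\, |q(n, n-\tau)|^2 \right\} = \frac{h}{8\pi^2 \mu}.$$
If we place $f_\tau(n) = \nu(n, n-\tau)\, |q(n, n-\tau)|^2$,
we may write
$$\sum_{\tau > 0} \left( f_\tau(n+\tau) - f_\tau(n) \right) = \frac{h}{8\pi^2 \mu}.$$
If we pass to the limit $n \gg \tau$, we obtain the classical formulas. Placing nh = I, we obtain
$$\nu(n, n-\tau) = \frac{W_n - W_{n-\tau}}{h} \to \tau\, \frac{dW}{dI} = \tau\nu, \tag{3}$$
which is the classical frequency of the τth harmonic. Further, the corresponding amplitude is $q(n, n-\tau) \to q_\tau(I)$.
Therefore $f_\tau(n) \to f_\tau(I) = \nu\tau\, |q_\tau(I)|^2$
and
$$\sum_{\tau > 0} \tau\, \frac{\partial f_\tau}{\partial I} = \frac{1}{8\pi^2 \mu}. \tag{4}$$
This formula, however, is the quantum condition of Bohr’s theory, $\oint p\, dq = I = hn$,
for if we set
$$q(t) = \sum_{\tau} q_\tau\, e^{2\pi i \nu \tau t}$$
we obtain
$$I = \mu \int_0^{1/\nu} \dot q^2\, dt = 8\pi^2 \mu \sum_{\tau > 0} \tau f_\tau,$$
and differentiating with respect to I
$$1 = 8\pi^2 \mu \sum_{\tau > 0} \tau\, \frac{\partial f_\tau}{\partial I}, \tag{5}$$
in agreement with the limit given above.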
These correspondence considerations justify in a certain sense the diagonal elements of the fundamental relation (1). In order to approach as closely as possible to commutativity it is reasonable to set all elements except those on the diagonal equal to zero. Owing to this commutation law, calculations with matrices become determinate. We can therefore construct functions of p and q by repeated multiplications and additions.
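As an illustration of the commutation rule, the following sketch builds truncated matrices for the harmonic oscillator and evaluates pq − qp. Several things here are assumptions not taken from this passage: the amplitudes q(n, n±1) are the standard textbook values for the oscillator, the units are chosen so that h/2π = 1, µ = 1 and the oscillator's angular frequency is 1, and the matrices are cut off at a hypothetical size N. The momentum is formed as p(nm) = 2πiµν(nm)q(nm), which in these units is i(n − m)q(nm).

```python
import math

def matmul(a, b):
    """Ordinary matrix product of two square arrays given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

N = 6   # hypothetical cut-off; the true matrices are infinite

# Assumed textbook amplitudes: only neighbouring states are connected.
q = [[0.0] * N for _ in range(N)]
for n in range(N - 1):
    q[n][n + 1] = q[n + 1][n] = math.sqrt((n + 1) / 2)

# p = mu * dq/dt element by element, with 2*pi*nu(nm) = n - m in these units.
p = [[1j * (n - m) * q[n][m] for m in range(N)] for n in range(N)]

pq, qp = matmul(p, q), matmul(q, p)
comm = [[pq[n][m] - qp[n][m] for m in range(N)] for n in range(N)]

# With h/2pi = 1 the value h/(2*pi*i) equals -1j.
print([comm[n][n] for n in range(N)])
```

All diagonal elements except the last come out as h/2πi, and the off-diagonal elements vanish. The last diagonal element is spoiled by the cut-off, which reflects the fact that the relation cannot hold exactly for finite matrices.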
We have for instance the energy-function of the harmonic oscillator (Mass = µ):
(6)
To form the canonical equations we must first introduce the operation of differentiation. The derivative of a matrix function f(x) with respect to the argument-matrix x is defined by
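$$\frac{df}{dx} = \lim_{\alpha \to 0} \frac{1}{\alpha} \left\{ f(x + \alpha) - f(x) \right\}, \tag{7}$$
where α(mn) is the product of the unit matrix by a number α, $\alpha(mn) = \alpha\, \delta_{mn}$.
The multiplication by such a matrix, or its reciprocal
$$\alpha^{-1}(mn) = \frac{1}{\alpha}\, \delta_{mn},$$
is commutative and therefore our definition has a unique meaning.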
We have, for instance,
that is
Similarly
that is
The product rule
$$\frac{d(\varphi\psi)}{dx} = \frac{d\varphi}{dx}\, \psi + \varphi\, \frac{d\psi}{dx} \tag{8}$$
is proved as in ordinary calculus:
where it should be observed that the order φ, ψ must be conserved. From this we deduce at once
whence it follows that all the rules of ordinary differential calculus hold. The partial derivative of a matrix function of several argument-matrices f(x1,x2...) with respect to one of them, say x1, is obtained by applying our definition of differentiation to x1 only, while x2,x3 ... are held constant.
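The definition of the matrix derivative can be tried out numerically. The sketch below assumes the definition as a difference quotient in which the argument matrix is shifted by a small multiple α of the unit matrix, and compares the result for f(x) = x³ with the ordinary-calculus answer 3x². The matrix `x`, the step `alpha`, and the function names are hypothetical.

```python
def matmul(a, b):
    """Ordinary matrix product of two square arrays given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift(x, alpha):
    """Return x + alpha * (unit matrix)."""
    n = len(x)
    return [[x[i][j] + (alpha if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def cube(x):
    return matmul(x, matmul(x, x))

def derivative(f, x, alpha=1e-6):
    """Difference quotient (f(x + alpha*1) - f(x)) / alpha with a small, finite alpha."""
    fx, fs = f(x), f(shift(x, alpha))
    n = len(x)
    return [[(fs[i][j] - fx[i][j]) / alpha for j in range(n)] for i in range(n)]

x = [[1.0, 2.0], [0.5, 1.5]]
numeric = derivative(cube, x)
exact = [[3 * v for v in row] for row in matmul(x, x)]   # 3 x^2
max_err = max(abs(numeric[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(max_err)   # small, of the order of alpha
```

Because the shift is a multiple of the unit matrix, it commutes with x, and the expansion of (x + α)³ proceeds exactly as for ordinary numbers; this is the reason the familiar differentiation rules carry over.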
LECTURE 12
The canonical equations of mechanics—Proof of the conservation of energy and of the “frequency condition”—Canonical transformations—The analogue of the Hamilton-Jacobi differential equation.
We can now write the canonical equations
$$\dot q = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial q}. \tag{1}$$
They form in reality an infinite number of equations for an infinite number of unknowns, for the matrices on the right and left-hand sides must be equal element by element.
To establish the law of conservation of energy we need the following lemmas: Let f(qp) be any matrix function of p and q. Then
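$$fq - qf = \frac{h}{2\pi i}\, \frac{\partial f}{\partial p}, \qquad pf - fp = \frac{h}{2\pi i}\, \frac{\partial f}{\partial q}. \tag{2}$$
To prove these relations we first assume that they are true for any two given functions φ and ψ and show that they are also true for φ + ψ and φψ. For φ + ψ this is trivial; for φψ a simple calculation gives
$$\varphi\psi q - q\varphi\psi = \varphi\,(\psi q - q\psi) + (\varphi q - q\varphi)\,\psi = \frac{h}{2\pi i} \left( \varphi\, \frac{\partial \psi}{\partial p} + \frac{\partial \varphi}{\partial p}\, \psi \right) = \frac{h}{2\pi i}\, \frac{\partial (\varphi\psi)}{\partial p},$$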
and an analogous relation for pφψ − φψp. But our relations hold for f = p and f = q and therefore hold for every function, as functions have already been defined by repeated application of the elementary operations. By Equations (14), Lecture 10, and (2), this Lecture, we may write the canonical equations (1)
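$$\dot q = \frac{2\pi i}{h}\,(Wq - qW) = \frac{2\pi i}{h}\,(Hq - qH), \qquad \dot p = \frac{2\pi i}{h}\,(Wp - pW) = \frac{2\pi i}{h}\,(Hp - pH), \tag{3}$$
or
$$(W - H)\,q - q\,(W - H) = 0, \qquad (W - H)\,p - p\,(W - H) = 0.$$
W − H is therefore commutable with p and q, hence also with any function of p and q, in particular with H(pq). We thus have
$$(W - H)\,H - H\,(W - H) = 0$$
or WH − HW = 0.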
From this follows, by Equation (14), Lecture 10,
$$\dot H = \frac{2\pi i}{h}\,(WH - HW) = 0, \tag{4}$$
which proves the conservation of energy. H is thus seen to be a diagonal matrix
$$H = (H_n\, \delta_{nm}). \tag{5}$$
For the elements, the first of Equations (3) can be written $q(nm)\,(W_n - W_m) = q(nm)\,(H_n - H_m)$.
Therefore,
$$h\nu(nm) = H_n - H_m, \tag{6}$$
whence Bohr’s frequency condition follows as a consequence of our postulates. By a suitable choice of an arbitrary constant we can place
$$W_n = H_n, \tag{7}$$
and this gives to the Ritz combination principle the more precise meaning of the Einstein-Bohr frequency condition.
The whole proof can also be reversed. We know that the principle of conservation of energy and the frequency condition are correct. If, therefore, the energy-function H is given as an analytic function of any two variables P, Q, then, provided that
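$$PQ - QP = \frac{h}{2\pi i},$$
the canonical equations
$$\dot Q = \frac{\partial H}{\partial P}, \qquad \dot P = -\frac{\partial H}{\partial Q}$$
hold. This is true because the expressions HP − PH and HQ − QH can always, as we have shown, be interpreted in two ways, either as partial derivatives of H or, as H is constant, as derivatives of Q or P with respect to time. Therefore, we understand by a canonic transformation pq → PQ one for which
$$pq - qp = PQ - QP = \frac{h}{2\pi i}, \tag{8}$$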
for then the canonical equations hold for p, q as well as for P, Q.
A general transformation which satisfies this condition is
$$P = S p S^{-1}, \qquad Q = S q S^{-1}, \tag{9}$$
where S is any arbitrary matrix. Probably, this is the most general canonic transformation. It has the simple property that for any function f(PQ) the relation
$$f(PQ) = S\, f(pq)\, S^{-1} \tag{10}$$
holds, where f(pq) is formed from f(PQ) by replacing P by p and Q by q without changing the form of the function. We shall show that if this theorem is true for two functions φ, ψ, it is also true for φ + ψ and φψ. For φ + ψ it is evident. For φψ we have $\varphi(PQ)\,\psi(PQ) = S\varphi(pq)S^{-1}\, S\psi(pq)S^{-1} = S\varphi(pq)\psi(pq)S^{-1}$.
As the proposition holds for f = p or f = q, it holds in general for all analytic functions.
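The argument can be checked on small matrices. The sketch below assumes the transformation P = S p S⁻¹, Q = S q S⁻¹ used in the proof above and verifies both f(PQ) = S f(pq) S⁻¹ for one polynomial and the corresponding statement for pq − qp. The 2×2 matrices, the test function `f`, and the helper names are hypothetical; finite matrices cannot satisfy the commutation rule itself, so only the transformation property is tested.

```python
def matmul(a, b):
    """Ordinary matrix product of two square arrays given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse_2x2(s):
    (a, b), (c, d) = s
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transform(x, s):
    """Return S x S^(-1)."""
    return matmul(matmul(s, x), inverse_2x2(s))

def f(p, q):
    """Hypothetical test function f(p, q) = p q + q q p; the order of factors matters."""
    return add(matmul(p, q), matmul(matmul(q, q), p))

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

p = [[1.0, 2.0], [0.5, -1.0]]
q = [[0.0, 1.0], [3.0, 2.0]]
S = [[2.0, 1.0], [1.0, 1.0]]   # any invertible matrix would do

P, Q = transform(p, S), transform(q, S)
func_diff = max_abs_diff(f(P, Q), transform(f(p, q), S))
comm_diff = max_abs_diff(sub(matmul(P, Q), matmul(Q, P)),
                         transform(sub(matmul(p, q), matmul(q, p)), S))
print(func_diff, comm_diff)   # both should be at rounding-error level
```

Since pq − qp is itself a function of p and q, a commutator equal to a multiple of the unit matrix is left unchanged by the transformation, because S commutes with the unit matrix.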
The importance of the canonic transformations is based on the following theorem: If any pair of variables p0, q0 is given which satisfies the condition