Let us consider the general problem of conservative systems in classical mechanics. The Hamilton-Jacobi equation runs
$$\frac{\partial W}{\partial t} + T\left(q_k, \frac{\partial W}{\partial q_k}\right) + V(q_k) = 0. \tag{1}$$
W is the action function, i.e. the time integral of the Lagrange function T − V along a path of the system, as a function of the end points and the time. $q_k$ is a representative position co-ordinate; T is the kinetic energy as a function of the q’s and momenta, being a quadratic form of the latter, for which, as prescribed, the partial derivatives of W with respect to the q’s are written. V is the potential energy. To solve the equation put
$$W = -Et + S(q_k) \tag{2}$$
and obtain
$$2T\left(q_k, \frac{\partial W}{\partial q_k}\right) = 2(E - V). \tag{1'}$$
E is an arbitrary integration constant and signifies, as is known, the energy of the system. Contrary to the usual practice, we have let the function W remain itself in (1ʹ), instead of introducing the time-free function of the co-ordinates, S. That is a mere superficiality.
Equation (1ʹ) can now be very simply expressed if we make use of the method of Heinrich Hertz. It becomes, like all geometrical assertions in configuration space (space of the variables qk ), especially simple and clear if we introduce into this space a non-Euclidean metric by means of the kinetic energy of the system.
Let $\bar{T}$ be the kinetic energy as a function of the velocities $\dot{q}_k$, not of the momenta as above, and let us put for the line element
$$ds^2 = 2\bar{T}(q_k, \dot{q}_k)\,dt^2. \tag{3}$$
The right-hand side now contains dt only externally and represents (since $\dot{q}_k\,dt = dq_k$) a quadratic form of the $dq_k$’s.
After this stipulation, conceptions such as angle between two line elements, perpendicularity, divergence and curl of a vector, gradient of a scalar, Laplacian operation (= div grad) of a scalar, and others, may be used in the same simple way as in three-dimensional Euclidean space, and we may use in our thinking the Euclidean three-dimensional representation with impunity, except that the analytical expressions for these ideas become a very little more complicated, as the line element (3) must everywhere replace the Euclidean line element. We stipulate that in what follows, all geometrical statements in q-space are to be taken in this non-Euclidean sense.
One of the most important modifications for the calculation is that we must distinguish carefully between covariant and contravariant components of a vector or tensor. But this complication is not any greater than that which occurs in the case of an oblique set of Cartesian axes.
The $dq_k$’s are the prototype of a contravariant vector. The coefficients of the form $2\bar{T}$, which depend on the $q_k$’s, are therefore of a covariant character and form the covariant fundamental tensor. 2T is the contravariant form belonging to $2\bar{T}$, because the momenta are known to form the covariant vector belonging to the speed vector $\dot{q}_k$, the momentum being the velocity vector in covariant form. The left side of (1ʹ) is now simply the contravariant fundamental form, in which the $\partial W/\partial q_k$ are brought in as variables. The latter form the components of the vector grad W, which is covariant by its nature.
(The expressing of the kinetic energy in terms of momenta instead of speeds has then this significance, that covariant vector components can only be introduced in a contravariant form if something intelligible, i.e. invariant, is to result.)
Equation (1ʹ) is equivalent thus to the simple statement
$$(\operatorname{grad} W)^2 = 2(E - V), \tag{1''}$$
or
$$|\operatorname{grad} W| = \sqrt{2(E - V)}. \tag{1'''}$$
This requirement is easily analysed. Suppose that a function W, of the form (2), has been found, which satisfies it. Then this function can be clearly represented for every definite t, if the family of surfaces W = const, be described in q-space and to each member a value of W be ascribed.
Now, on the one hand, as will be shown immediately, equation (1”ʹ) gives an exact rule for constructing all the other surfaces of the family and obtaining their W-values from any single member, if the latter and its W-value are known. On the other hand, if the sole necessary data for the construction, viz. one surface and its W-value, be given quite arbitrarily, then from the rule, which presents just two alternatives, there may be completed one of the functions W fulfilling the given requirement. Provisionally, the time is regarded as constant. The construction rule therefore exhausts the contents of the differential equation; each of its solutions can be obtained from a suitably chosen surface and W-value.
Let us consider the construction rule. Let the value W0 be given in Fig. 1 to an arbitrary surface. In order to find the surface W0 + dW0, take either side of the given surface as the positive one, erect the normal at each point of it and cut off (with due regard to the sign of dW0) the step
$$ds = \frac{dW_0}{\sqrt{2(E - V)}}. \tag{4}$$
The locus of the end points of the steps is the surface W0 + dW0. Similarly, the family of surfaces may be constructed successively on both sides.
The construction has a double interpretation, as the other side of the given surface might have been taken as positive for the first step. This ambiguity does not hold for later steps, i.e. at any later stage of the process we cannot change arbitrarily the sign of the sides of the surface, at which we have arrived, as this would involve in general a discontinuity in the first differential coefficient of W. Moreover, the two families obtained in the two cases are clearly identical; the W-values merely run in the opposite direction.
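The construction rule can be tried numerically. The following is a minimal sketch, assuming a single degree of freedom (so that each "surface" is just a point and the normal is the q-axis) and a hypothetical linear potential V(q) = g·q in the mass-weighted co-ordinate of (3); the function names and numerical values are illustrative and not part of the original text. It applies the step (4) repeatedly and compares the accumulated W-value with the closed-form integral of √(2(E − V)).

```python
import math


def construct_surfaces(potential, energy, q_start, w_start, d_w, n_steps):
    """Repeat the step ds = dW / sqrt(2(E - V)) along one normal direction.

    Returns a list of (q, W) pairs, one per constructed surface.
    """
    q, w = q_start, w_start
    surfaces = [(q, w)]
    for _ in range(n_steps):
        twice_kinetic = 2.0 * (energy - potential(q))
        if twice_kinetic <= 0.0:
            raise ValueError("left the region where E > V")
        q += d_w / math.sqrt(twice_kinetic)  # step (4) along the normal
        w += d_w                             # the new surface carries W + dW
        surfaces.append((q, w))
    return surfaces


# Assumed illustrative values (not from the text).
G = 1.0   # slope of the hypothetical potential V(q) = G * q
E = 5.0   # energy constant


def linear_potential(q):
    return G * q


def exact_w(q):
    """Closed form of the integral of sqrt(2(E - G q')) from 0 to q."""
    return ((2.0 * E) ** 1.5 - (2.0 * (E - G * q)) ** 1.5) / (3.0 * G)


surfaces = construct_surfaces(linear_potential, E, 0.0, 0.0, 1e-3, 2000)
q_end, w_end = surfaces[-1]
print(f"q = {q_end:.5f}, constructed W = {w_end:.5f}, exact W = {exact_w(q_end):.5f}")
```

Starting instead with a negative `d_w` builds the family on the other side of the initial surface, which corresponds to the two alternatives mentioned in the text.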
FIG. 1
Let us consider now the very simple dependence on the time. For this, (2) shows that at any later (or earlier) instant t + tʹ, the same group of surfaces illustrates the W-distribution, though different W-values are associated with the individual members, namely, from each W-value ascribed at time t there must be subtracted Etʹ. The W-values wander, as it were, from surface to surface according to a definite, simple law, and for positive E in the direction of W increasing. Instead of this, however, we may imagine that the surfaces wander in such a way that each of them continually takes the place and exact form of the following one, and always carries its W-value with it. The rule for this wandering is given by the fact that the surface W0 at time t + dt must have reached that place, which at t was occupied by the surface W0 + Edt. This will be attained according to (4), if each point of the surface W0 is allowed to move in the direction of the positive normal through a distance
$$ds = \frac{E\,dt}{\sqrt{2(E - V)}}. \tag{5}$$
That is, the surfaces move with a normal velocity
$$u = \frac{ds}{dt} = \frac{E}{\sqrt{2(E - V)}}, \tag{6}$$
which, when the constant E is given, is a pure function of position.
Now it is seen that our system of surfaces W = const. can be conceived as the system of wave surfaces of a progressive but stationary wave motion in q-space, for which the value of the phase velocity at every point in the space is given by (6). For the normal construction can clearly be replaced by the construction of elementary Huygens waves (with radius (5)), and then of their envelope. The “index of refraction” is proportional to the reciprocal of (6), and is dependent on the position but not on the direction. The q-space is thus optically non-homogeneous but is isotropic. The elementary waves are “spheres”, though of course—let me repeat it expressly once more—in the sense of the line-element (3).
The function of action W plays the part of the phase of our wave system. The Hamilton-Jacobi equation is the expression of Huygens’ principle. If, now, Fermat’s principle be formulated thus,
$$0 = \delta\int_{P_1}^{P_2}\frac{ds}{u} = \delta\int_{P_1}^{P_2}\frac{ds\,\sqrt{2(E - V)}}{E} = \delta\int_{t_1}^{t_2}\frac{2T}{E}\,dt = \frac{1}{E}\,\delta\int_{t_1}^{t_2}2T\,dt, \tag{7}$$
we are led directly to Hamilton’s principle in the form given by Maupertuis (where the time integral is to be taken with the usual grain of salt, i.e. T + V = E = constant, even during the variation). The “rays”, i.e. the orthogonal trajectories of the wave surfaces, are therefore the paths of the system for the value E of the energy, in agreement with the well-known system of equations
$$p_k = \frac{\partial W}{\partial q_k}, \tag{8}$$
which states that a set of system paths can be derived from each special function of action, just like a fluid motion from its velocity potential. (The momenta $p_k$ form the covariant velocity vector, which equations (8) assert to be equal to the gradient of the function of action.)
Although in these deliberations on wave surfaces we speak of velocity of propagation and Huygens’ principle, we must regard the analogy as one between mechanics and geometrical optics, and not physical or undulatory optics. For the idea of “rays”, which is the essential feature in the mechanical analogy, belongs to geometrical optics; it is only clearly defined in the latter. Also Fermat’s principle can be applied in geometrical optics without going beyond the idea of index of refraction. And the system of W-surfaces, regarded as wave surfaces, stands in a somewhat looser relationship to mechanical motion, inasmuch as the image point of the mechanical system in no wise moves along the ray with the wave velocity u, but, on the contrary, its velocity (for constant E) is proportional to $1/u$. It is given directly from (3) as
$$v = \frac{ds}{dt} = \sqrt{2T} = \sqrt{2(E - V)}. \tag{9}$$
This non-agreement is obvious. Firstly, according to (8), the system’s point velocity is great when grad W is great, i.e. where the W-surfaces are closely crowded together, i.e. where u is small. Secondly, from the definition of W as the time integral of the Lagrange function, W alters during the motion (by (T − V)dt in the time dt), and so the image point cannot remain continuously in contact with the same W-surface.
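The inverse relation between the two velocities can be made concrete with a few numbers. The sketch below assumes arbitrary illustrative values of E and V (in units where the mass-weighted line element (3) applies) and uses hypothetical function names; it evaluates the wave velocity (6) and the velocity of the image point (9) and shows that their product stays equal to E, so that one grows as the other shrinks.

```python
import math


def phase_velocity(energy, potential_value):
    """Normal velocity of the W-surfaces, u = E / sqrt(2(E - V))."""
    return energy / math.sqrt(2.0 * (energy - potential_value))


def point_velocity(energy, potential_value):
    """Velocity of the image point, v = sqrt(2(E - V))."""
    return math.sqrt(2.0 * (energy - potential_value))


E = 2.0  # assumed energy, arbitrary units
rows = []
for v_pot in (0.0, 0.5, 1.0, 1.5):  # assumed sample values of V
    u = phase_velocity(E, v_pot)
    v = point_velocity(E, v_pot)
    rows.append((v_pot, u, v, u * v))
    print(f"V = {v_pot:.1f}: u = {u:.4f}, v = {v:.4f}, u*v = {u * v:.4f}")
```

As V rises towards E the image point slows down while the wave surfaces move faster, which is the non-agreement described in the text.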
And important ideas in wave theory, such as amplitude, wave length, and frequency—or, speaking more generally, the wave form—do not enter into the analogy at all, as there exists no mechanical parallel; even of the wave function itself there is no mention beyond that W has the meaning of the phase of the waves (and this is somewhat hazy owing to the wave form being undefined).
If we find in the whole parallel merely a satisfactory means of contemplation, then this defect is not disturbing, and we would regard any attempt to supply it as idle trifling, believing the analogy to be precisely with geometrical, or at furthest, with a very primitive form of wave optics, and not with the fully developed undulatory optics. That geometrical optics is only a rough approximation for Light makes no difference. To preserve the analogy on the further development of the optics of q-space on the lines of wave theory, we must take good care not to depart markedly from the limiting case of geometrical optics, i.e. must choose the wave length sufficiently small, i.e. small compared with all the path dimensions. Then the additions do not teach anything new; the picture is only draped with superfluous ornaments.
So we might think to begin with. But even the first attempt at the development of the analogy to the wave theory leads to such striking results, that a quite different suspicion arises: we know to-day, in fact, that our classical mechanics fails for very small dimensions of the path and for very great curvatures. Perhaps this failure is in strict analogy with the failure of geometrical optics, i.e. “the optics of infinitely small wave lengths”, that becomes evident as soon as the obstacles or apertures are no longer great compared with the real, finite, wave length. Perhaps our classical mechanics is the complete analogy of geometrical optics and as such is wrong and not in agreement with reality; it fails whenever the radii of curvature and dimensions of the path are no longer great compared with a certain wave length, to which, in q-space, a real meaning is attached. Then it becomes a question of searching for an undulatory mechanics, and the most obvious way is the working out of the Hamiltonian analogy on the lines of undulatory optics.
§ 2. “GEOMETRICAL” AND “UNDULATORY” MECHANICS
We will at first assume that it is fair, in extending the analogy, to imagine the above-mentioned wave system as consisting of sine waves. This is the simplest and most obvious case, yet the arbitrariness, which arises from the fundamental significance of this assumption, must be emphasized. The wave function has thus only to contain the time in the form of a factor, sin (. . . ), where the argument is a linear function of W. The coefficient of W must have the dimensions of the reciprocal of action, since W has those of action and the phase of a sine has zero dimensions. We assume that it is quite universal, i.e. that it is not only independent of E, but also of the nature of the mechanical system. We may then at once denote it by $2\pi/h$. The time factor then is
$$\sin\left(\frac{2\pi W}{h} + \text{const.}\right) = \sin\left(-\frac{2\pi E t}{h} + \frac{2\pi S(q_k)}{h} + \text{const.}\right). \tag{10}$$
Hence the frequency ν of the waves is given by
$$\nu = \frac{E}{h}. \tag{11}$$
Thus we get the frequency of the q-space waves to be proportional to the energy of the system, in a manner which is not markedly artificial. This is only true of course if E is absolute and not, as in classical mechanics, indefinite to the extent of an additive constant. By (6) and (11) the wave length is independent of this additive constant, being
$$\lambda = \frac{u}{\nu} = \frac{h}{\sqrt{2(E - V)}}, \tag{12}$$
and we know the term under the root to be double the kinetic energy. Let us make a preliminary rough comparison of this wave length with the dimensions of the orbit of a hydrogen electron as given by classical mechanics, taking care to notice that a “step” in q-space has not the dimensions of length, but length multiplied by the square root of mass, in consequence of (3). λ has similar dimensions. We have therefore to divide λ by the dimension of the orbit, a cm., say, and by the square root of m, the mass of the electron. The quotient is of the order of magnitude of
$$\frac{h}{mva},$$
where v represents for the moment the electron’s velocity (cm./sec.). The denominator mva is of the order of the mechanical moment of momentum, and this is at least of the order of $10^{-27}$ for Kepler orbits, as can be calculated from the values of electronic charge and mass independently of all quantum theories. We thus obtain the correct order for the limit of the approximate region of validity of classical mechanics, if we identify our constant h with Planck’s quantum of action; and this is only a preliminary attempt.
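The order-of-magnitude estimate can be reproduced in a few lines. The following is a rough sketch in CGS units, assuming a circular Kepler orbit of radius 10⁻⁸ cm (an assumed typical atomic dimension, not a value given in the text) and rounded modern values for the electronic charge, the electronic mass and Planck's constant. For such an orbit the force balance m v²/a = e²/a² gives mva = √(m e² a), which needs no quantum theory.

```python
import math

# Rounded constants in CGS units (assumed reference values).
ELECTRON_CHARGE_ESU = 4.803e-10   # statcoulomb
ELECTRON_MASS_G = 9.109e-28       # gram
PLANCK_H_ERG_S = 6.626e-27        # erg * second

# Assumed orbit dimension of atomic size (illustrative choice).
ORBIT_RADIUS_CM = 1.0e-8


def kepler_angular_momentum(mass, charge, radius):
    """m*v*a for a circular orbit, from m v^2 / a = e^2 / a^2."""
    return math.sqrt(mass * charge ** 2 * radius)


angular_momentum = kepler_angular_momentum(
    ELECTRON_MASS_G, ELECTRON_CHARGE_ESU, ORBIT_RADIUS_CM
)
# Wave length divided by orbit dimension and sqrt(m): h / (m v a).
ratio = PLANCK_H_ERG_S / angular_momentum

print(f"m*v*a   ~ {angular_momentum:.2e} erg s")
print(f"h/(mva) ~ {ratio:.1f}")
```

With these assumed inputs the moment of momentum comes out of the order of 10⁻²⁷ erg s and the ratio is of order unity, so the wave length is comparable with the orbit itself, as the text concludes.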
If in (6), E is expressed by means of (11) in terms of ν, then we obtain
$$u = \frac{h\nu}{\sqrt{2(h\nu - V)}}. \tag{6'}$$
The dependence of the wave velocity on the energy thus becomes a particular kind of dependence on the frequency, i.e. it becomes a law of dispersion for the waves. This law is of great interest. We have shown in § 1 that the wandering wave surfaces are only loosely connected with the motion of the system point, since their velocities are not equal and cannot be equal. According to (9), (11), and (6ʹ) the system’s velocity v has thus also a concrete significance for the wave. We verify at once that
$$v = \frac{d\nu}{d\left(\dfrac{\nu}{u}\right)}, \tag{13}$$
i.e. the velocity of the system point is that of a group of waves, included within a small range of frequencies (signal-velocity). We find here again a theorem for the “phase waves” of the electron, which M. de Broglie had derived, with essential reference to the relativity theory, in those fine researches, to which I owe the inspiration for this work. We see that the theorem in question is of wide generality, and does not arise solely from relativity theory, but is valid for every conservative system of ordinary mechanics.
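Relation (13) can be checked numerically from the dispersion law (6ʹ). The sketch below assumes units in which h = 1 and arbitrary illustrative values of ν and V at a fixed point of q-space; the function names are hypothetical. It differentiates the reciprocal wave length ν/u with respect to ν by a central difference and compares the result with the velocity (9) of the image point.

```python
import math

H = 1.0  # assumed unit choice: h = 1


def phase_velocity(nu, v_pot):
    """Dispersion law (6'): u = h*nu / sqrt(2(h*nu - V))."""
    return H * nu / math.sqrt(2.0 * (H * nu - v_pot))


def reciprocal_wavelength(nu, v_pot):
    """nu / u, i.e. 1 / lambda, at a point where the potential is v_pot."""
    return nu / phase_velocity(nu, v_pot)


def group_velocity(nu, v_pot, d_nu=1e-6):
    """d(nu) / d(nu/u), estimated by a central finite difference."""
    d_recip = (reciprocal_wavelength(nu + d_nu, v_pot)
               - reciprocal_wavelength(nu - d_nu, v_pot))
    return 2.0 * d_nu / d_recip


def point_velocity(nu, v_pot):
    """Velocity (9) of the image point, with E = h*nu."""
    return math.sqrt(2.0 * (H * nu - v_pot))


nu, v_pot = 3.0, 1.0  # assumed sample values
print(group_velocity(nu, v_pot), point_velocity(nu, v_pot))
```

The analytic reason is short: ν/u = √(2(hν − V))/h, whose derivative with respect to ν is 1/√(2(hν − V)), the reciprocal of v.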
We can utilise this fact to institute a much more innate connection between wave propagation and the movement of the representative point than was possible before. We can attempt to build up a wave group which will have relatively small dimensions in every direction. Such a wave group will then presumably obey the same laws of motion as a single image point of the mechanical system. It will then give, so to speak, an equivalent of the image point, so long as we can look on it as being approximately confined to a point, i.e. so long as we can neglect any spreading out in comparison with the dimensions of the path of the system. This will only be the case when the path dimensions, and especially the radius of curvature of the path, are very great compared with the wave length. For, in analogy with ordinary optics, it is obvious from what has been said that not only must the dimensions of the wave group not be reduced below the order of magnitude of the wave length, but, on the contrary, the group must extend in all directions over a large number of wave lengths, if it is to be approximately monochromatic. This, however, must be postulated, since the wave group must move about as a whole with a definite group velocity and correspond to a mechanical system of definite energy (cf. equation 11).
So far as I see, such groups of waves can be constructed on exactly the same principle as that used by Debye and von Laue to solve the problem in ordinary optics of giving an exact analytical representation of a cone of rays or of a sheaf of rays. From this there comes a very interesting relation to that part of the Hamilton-Jacobi theory not described in § 1, viz. the well-known derivation of the equations of motion in integrated form, by the differentiation of a complete integral of the Hamilton-Jacobi equation with respect to the constants of integration. As we will see immediately, the system of equations called after Jacobi is equivalent to the statement: the image point of the mechanical system continuously corresponds to that point, where a certain continuum of wave trains coalesces in equal phase.
In optics, the representation (strictly on the wave theory) of a “sheaf of rays” with a sharply defined finite cross-section, which proceeds to a focus and then diverges again, is thus carried out by Debye. A continuum of plane wave trains, each of which alone would fill the whole space, is superposed. The continuum is produced by letting the wave normal vary throughout the given solid angle. The waves then destroy one another almost completely by interference outside a certain double cone; they represent exactly, on the wave theory, the desired limited sheaf of rays and also the diffraction phenomena, necessarily occasioned by the limitation. We can represent in this manner an infinitesimal cone of rays just as well as a finite one, if we allow the wave normal of the group to vary only inside an infinitesimal solid angle. This has been utilised by von Laue in his famous paper on the degrees of freedom of a sheaf of rays. Finally, instead of working with waves, hitherto tacitly accepted as purely monochromatic, we can also allow the frequency to vary within an infinitesimal interval, and by a suitable distribution of the amplitudes and phases can confine the disturbance to a region which is relatively small in the longitudinal direction also. So we succeed in representing analytically a “parcel of energy” of relatively small dimensions, which travels with the speed of light, or when dispersion occurs, with the group velocity. Thereby is given the instantaneous position of the parcel of energy (if the detailed structure is not in question) in a very plausible way as that point of space where all the superposed plane waves meet in exactly agreeing phase.
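The longitudinal part of this construction can be illustrated in one dimension. The following is a minimal sketch, not Debye's or von Laue's actual calculation: it assumes V = 0, units in which h/2π = 1, and a Gaussian distribution of amplitudes over a narrow band of wave numbers (all illustrative choices, with hypothetical function names). With (6) and (9) the dispersion law then reads ω = k²/2, so the phase velocity is k/2 and the group velocity is k. The sketch superposes the plane waves and locates the point where they reinforce one another.

```python
import cmath
import math


def packet_modulus(x, t, k0, dk=1.0, n_k=121):
    """|sum of plane waves| with Gaussian weights centred on wave number k0."""
    total = 0j
    for i in range(n_k):
        k = k0 + 3.0 * dk * (2.0 * i / (n_k - 1) - 1.0)  # k0 - 3dk ... k0 + 3dk
        weight = math.exp(-0.5 * ((k - k0) / dk) ** 2)
        omega = 0.5 * k * k                              # assumed law for V = 0
        total += weight * cmath.exp(1j * (k * x - omega * t))
    return abs(total)


def peak_position(t, k0=20.0, x_min=0.0, x_max=20.0, n_x=1001):
    """Grid point where the superposed waves agree best in phase."""
    step = (x_max - x_min) / (n_x - 1)
    grid = [x_min + i * step for i in range(n_x)]
    return max(grid, key=lambda x: packet_modulus(x, t, k0))


K0 = 20.0  # assumed central wave number
for t in (0.0, 0.25, 0.5):
    print(f"t = {t:.2f}: peak at x = {peak_position(t, K0):.2f}, "
          f"group velocity * t = {K0 * t:.2f}")
```

The point of agreeing phase advances with the group velocity k0, twice the phase velocity k0/2 in this example, which is the one-dimensional counterpart of the "parcel of energy" described above.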