ON A RELATIVISTICALLY INVARIANT FORMULATION OF THE QUANTUM THEORY OF WAVE FIELDS
BY
SIN-ITIRO TOMONAGA
From Progress of Theoretical Physics, Vol. I, pg. 27 (1946)
1. THE FORMALISM OF THE ORDINARY QUANTUM THEORY OF WAVE FIELDS
Recently Yukawa(1) has made a comprehensive consideration about the basis of the quantum theory of wave fields. In his article he has pointed out the fact that the existing formalism of the quantum field theory is not yet perfectly relativistic.
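Let v(xyz) be the quantity specifying the field, and λ(xyz) denote its canonical conjugate. Then the quantum theory requires commutation relations of the form:
(1)  $$ [v(xyz), v(x'y'z')] = [\lambda(xyz), \lambda(x'y'z')] = 0, \quad [v(xyz), \lambda(x'y'z')] = i\hbar\,\delta(x-x')\,\delta(y-y')\,\delta(z-z') $$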
but these have quite non-relativistic forms.
The equations (1) give the commutation relations between the quantities at different points (xyz) and (x′y′z′) at the same instant of time t. The concept “same instant of time at different points” has, however, a definite meaning only if one specifies some definite Lorentz frame of reference. Thus this is not a relativistically invariant concept.
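The frame dependence of "same instant of time" can be made concrete with a small numerical sketch. It is only an illustration: the units (c = 1), the boost velocity 0.6c and the helper name `boost` are assumptions introduced here, not part of the original argument.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch

def boost(t, x, beta):
    """Lorentz-transform the event (t, x) to a frame moving with velocity beta*C along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    t_new = gamma * (t - beta * x / C)
    x_new = gamma * (x - beta * C * t)
    return t_new, x_new

# Two events at the same instant t = 0 but at different places in the first frame.
event_a = (0.0, 0.0)
event_b = (0.0, 1.0)

ta, _ = boost(*event_a, beta=0.6)
tb, _ = boost(*event_b, beta=0.6)

# In the boosted frame the two events are separated in time (about -0.75 here),
# so an equal-time commutation relation refers to a particular frame.
time_gap = tb - ta
print(time_gap)
```

The two events are simultaneous in the first frame only; this is the sense in which relations of the form (1) presuppose a definite Lorentz frame.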
Further, the Schrödinger equation for the ψ-vector representing the state of the system has the form:
(2)  $$ \left( H + \frac{\hbar}{i}\frac{\partial}{\partial t} \right)\psi = 0 $$
where H is the operator representing the total energy of the field which is given by the space integral of a function of v and λ. As we adopt here the Schrödinger picture, v and λ are operators independent of time. The vector representing the state is in this picture a function of the time, and its dependence on t is determined by (2).
Also the differential equation (2) is no less non-relativistic. In this equation the time variable t plays a role quite distinct from the space coordinates x, y and z. This situation is closely connected with the fact that the notion of probability amplitude does not fit with the relativity theory.
As is well known, the vector ψ has, as the probability amplitude, the following physical meaning: Suppose the representation which makes the field quantity v(xyz) diagonal. Let ψ[v′(xyz)] denote the representative of ψ in this representation. Then the representative ψ[v′(xyz)] is called the probability amplitude, and its absolute square
(3)  $$ W[v'(xyz)] = \bigl|\psi[v'(xyz)]\bigr|^2 $$
gives the relative probability of v(xyz) having the specified functional form v′(xyz) at the instant of time t. In other words: Suppose a plane which is parallel to the xyz-plane and intercepts the time axis at t. Then the probability that the field has the specified functional form v′(xyz) on this plane is given by (3).
As one sees, a plane parallel to the xyz-plane plays here a significant role. But such a plane is defined only by referring to a certain frame of reference. Thus the probability amplitude is not a relativistically invariant concept in the space-time world.
2. FOUR-DIMENSIONAL FORM OF THE COMMUTATION RELATIONS
As stated above, the laws of the quantum theory of wave fields are usually expressed as mathematical relations between quantities having their meanings only in some specified Lorentz frame of reference. But since it is proved that the whole contents of the theory are of course relativistically invariant, it must be certainly possible to build up the theory on the basis of concepts having relativistic space-time meanings. Thus, in his consideration, Yukawa has required with Dirac(2) a generalization of the notion of probability amplitude to fit with the relativity theory. We shall now show below that the generalization of the theory on these lines is in fact possible to the relativistically necessary and sufficient extent. Our results are, however, not so general as expected by Dirac and by Yukawa, but are already sufficiently general in so far as it is required by the relativity theory.
Let us suppose for simplicity that there are only two fields interacting with each other. The case of a greater number of fields can also be treated in the same way. Let v1 and v2 denote the quantities specifying the fields. The canonically conjugate quantities are λ1 and λ2 respectively. Then between these quantities the commutation relations
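(4)  $$ [v_r(xyz), v_s(x'y'z')] = [\lambda_r(xyz), \lambda_s(x'y'z')] = 0, \quad [v_r(xyz), \lambda_s(x'y'z')] = i\hbar\,\delta(x-x')\,\delta(y-y')\,\delta(z-z')\,\delta_{rs}, \quad r, s = 1, 2 $$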
must hold. The ψ-vector satisfies the Schrödinger equation
(5)  $$ \left( H_1 + H_2 + H_{12} + \frac{\hbar}{i}\frac{\partial}{\partial t} \right)\psi = 0 $$
In this equation H1 and H2 mean respectively the energy of the first and the second field. H1 is given by the space integral of a function of v1 and λ1, H2 by the space integral of a function of v2 and λ2. Further, H12 is the interaction energy of the fields and is given by the space integral of a function of both v1, λ1 and v2, λ2. We assume (i) that the integrand of H12, i.e. the interaction-energy density, is a scalar quantity, and (ii) that the energy densities at two different points (but at the same instant of time) commute with each other. In general, these two facts follow from the single assumption: the interaction term in the Lagrangian does not contain the time derivatives of v1 and v2.
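If this energy density is denoted by H12(xyz), the argument distinguishing it from the total interaction energy H12, then we have
(6)  $$ H_{12} = \int H_{12}(xyz)\,dx\,dy\,dz $$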
As we adopt here the Schrödinger picture, the quantities v and λ in H1, H2 and H12 are all operators independent of time.
Thus far we have merely summarized the well-known facts. Now, as the first stage of making the theory relativistic, we suppose the unitary operator
(7)  $$ U = \exp\left\{ \frac{i}{\hbar}\,(H_1 + H_2)\,t \right\} $$
and introduce the following unitary transformations of v and λ, and the corresponding transformation of Ψ:
(8)  $$ V_r = U v_r U^{-1}, \quad \Lambda_r = U \lambda_r U^{-1}, \quad \Psi = U\psi, \quad r = 1, 2 $$
As stated above, v and λ in (5) are quantities independent of time. But V and Λ obtained from them by means of (8) contain t through U. Thus they depend on t by
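(9)  $$ i\hbar\,\dot{V}_r = V_r H_r - H_r V_r, \quad i\hbar\,\dot{\Lambda}_r = \Lambda_r H_r - H_r \Lambda_r, \quad r = 1, 2 $$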
These equations must necessarily have covariant forms against Lorentz transformations, because they are just the field equations for the fields when they are left alone without interacting with each other.
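The origin of this time dependence can be checked in a few lines. The sketch below assumes the sign convention U = exp{(i/ħ)(H1 + H2)t}, which matches a Schrödinger equation written as (H + (ħ/i)∂/∂t)ψ = 0; the convention is an assumption of the sketch. It also uses the fact that H1 and H2 commute with each other and with U, and that H_s commutes with V_r for s ≠ r.

```latex
\begin{aligned}
i\hbar\,\dot V_r
  &= i\hbar\left( \dot U\, v_r\, U^{-1} + U\, v_r\, \frac{d U^{-1}}{dt} \right) \\
  &= -(H_1 + H_2)\, V_r + V_r\,(H_1 + H_2) \\
  &= V_r H_r - H_r V_r , \qquad r = 1, 2 .
\end{aligned}
```

The same steps with λ_r in place of v_r give the corresponding equation for Λ_r. No interaction term appears, which is why these are the equations of the fields left alone.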
Now, the solutions of these “vacuum equations,” the equations which the fields must satisfy, when they are left alone, together with the commutation relations (4), give rise to the relations of the following forms:
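(10)  $$ [V_r(xyzt), V_s(x'y'z't')] = A_{rs}(x-x', y-y', z-z', t-t'), \quad [V_r(xyzt), \Lambda_s(x'y'z't')] = B_{rs}(x-x', y-y', z-z', t-t'), \quad [\Lambda_r(xyzt), \Lambda_s(x'y'z't')] = C_{rs}(x-x', y-y', z-z', t-t') $$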
where Ars , Brs and Crs are functions which are combinations of the so-called four-dimensional δ-functions and their derivatives.(3) One denotes usually these four-dimensional δ-functions by Dr (xyzt), r = 1, 2. They are defined by
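(11)  $$ D_r(xyzt) = \frac{1}{16\pi^3} \iiint \frac{e^{i(k_x x + k_y y + k_z z + c k_r t)} - e^{i(k_x x + k_y y + k_z z - c k_r t)}}{i\,k_r}\; dk_x\, dk_y\, dk_z $$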
with
(12)  $$ k_r = \sqrt{k_x^2 + k_y^2 + k_z^2 + \chi_r^2} $$
χr being the constant characteristic to the field r. It can be easily proved that these functions are relativistically invariant.
Since (10) gives, in contrast with (4), the commutation relations between the fields at two different world points (xyzt) and (x′y′z′t′), it no longer contains the notion of same instant of time. Therefore, (10) is sufficiently relativistic, presupposing no special frame of reference. We call (10) the four-dimensional form of the commutation relations.
One property of D(xyzt) will be mentioned here: When the world point (xyzt) lies outside the light cone whose vertex is at the origin, then D(xyzt) vanishes identically:
(13)  $$ D(xyzt) = 0 \quad \text{for} \quad x^2 + y^2 + z^2 - c^2 t^2 > 0 $$
It follows directly from (13) that, if the world point (x′y′z′t′) lies outside the light cone whose vertex is at the world point (xyzt), the right-hand sides of (10) always vanish. In words: Suppose two world points P and P′. When these points lie outside each other’s light cones, the field quantities at P and the field quantities at P′ commute with each other.
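A short argument shows why D vanishes outside the light cone. The sketch assumes that D is given by the usual Fourier integral over the two frequencies ±ck with k = (k_x² + k_y² + k_z² + χ²)^{1/2}; the normalization 1/16π³ is an assumption here and plays no role. Only the antisymmetry in t and the Lorentz invariance of D are used.

```latex
\begin{aligned}
D(x, y, z, 0)
  &= \frac{1}{16\pi^3} \iiint
     \frac{e^{i(k_x x + k_y y + k_z z)} - e^{i(k_x x + k_y y + k_z z)}}{i\,k}\,
     dk_x\, dk_y\, dk_z = 0 , \\[4pt]
x^2 + y^2 + z^2 - c^2 t^2 > 0
  &\;\Longrightarrow\; \text{there is a Lorentz frame in which } t' = 0 \\
  &\;\Longrightarrow\; D(x, y, z, t) = D(x', y', z', 0) = 0 .
\end{aligned}
```

The first line holds because the two exponentials coincide at t = 0. The second uses the fact that any point outside the light cone can be brought to t′ = 0 by a Lorentz transformation.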
3. GENERALIZATION OF THE SCHRÖDINGER EQUATION
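Next we consider the vector Ψ obtained from ψ by means of the unitary transformation U. We see from (5), (7) and (8) that this Ψ, considered as a function of t, satisfies
(14)  $$ \left\{ \int H_{12}\bigl(V_1(xyzt), V_2(xyzt), \ldots\bigr)\,dx\,dy\,dz + \frac{\hbar}{i}\frac{\partial}{\partial t} \right\}\Psi = 0 $$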
One sees that t plays also here a role distinct from x, y and z: also here a plane parallel to the xyz-plane has a special significance. So we must in some way remove this unsatisfactory feature of the theory.
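The passage from the Schrödinger equation to an equation containing only the interaction term can be checked numerically on a toy model. The sketch below is not a field theory: it assumes a two-level system with ħ = 1, a diagonal "free" energy diag(E0, E1) standing in for H1 + H2, and a constant coupling G·σx standing in for H12. All names and numerical values are hypothetical.

```python
import cmath
import math

HBAR = 1.0                  # units with hbar = 1 (assumption)
E0, E1, G = 0.3, 1.1, 0.25  # hypothetical free energies and coupling strength

def psi(t, psi0=(1 + 0j, 0j)):
    """Schroedinger-picture state exp(-i*H*t/hbar) psi0 for H = diag(E0, E1) + G*sigma_x."""
    a, bz, bx = 0.5 * (E0 + E1), 0.5 * (E0 - E1), G
    b = math.hypot(bz, bx)
    c, s = math.cos(b * t / HBAR), math.sin(b * t / HBAR)
    phase = cmath.exp(-1j * a * t / HBAR)
    # exp(-iHt) = phase * (c*1 - i*s*(bz*sigma_z + bx*sigma_x)/b)
    m00, m11, m01 = c - 1j * s * bz / b, c + 1j * s * bz / b, -1j * s * bx / b
    return (phase * (m00 * psi0[0] + m01 * psi0[1]),
            phase * (m01 * psi0[0] + m11 * psi0[1]))

def big_psi(t):
    """Transformed state Psi = U psi with U = exp(+i*diag(E0, E1)*t/hbar)."""
    p = psi(t)
    return (cmath.exp(1j * E0 * t / HBAR) * p[0],
            cmath.exp(1j * E1 * t / HBAR) * p[1])

def h12_transformed(t):
    """U * (G*sigma_x) * U^-1, the interaction expressed in the transformed quantities."""
    w = (E0 - E1) * t / HBAR
    return ((0j, G * cmath.exp(1j * w)), (G * cmath.exp(-1j * w), 0j))

def residual(t, dt=1e-5):
    """Largest component of (H12_transformed + (hbar/i) d/dt) Psi, by central difference."""
    plus, minus, now = big_psi(t + dt), big_psi(t - dt), big_psi(t)
    v = h12_transformed(t)
    worst = 0.0
    for r in range(2):
        deriv = (plus[r] - minus[r]) / (2 * dt)
        h_psi = v[r][0] * now[0] + v[r][1] * now[1]
        worst = max(worst, abs(h_psi + (HBAR / 1j) * deriv))
    return worst

print(residual(0.7))  # close to zero, limited by the finite-difference step
```

The residual vanishes up to discretization error, so in this toy model the transformed state obeys an equation in which only the transformed interaction appears, with the free energies absorbed into the time dependence of the operators.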
This improvement can be attained in the way similar to that in which Dirac(4) has built up the so-called many-time formalism of the quantum mechanics. We will now recall this theory.
The Schrödinger equation for the system containing N charged particles interacting with the electromagnetic field is given by
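(15)  $$ \left\{ H_{el} + \sum_{n=1}^{N} H_n\bigl(q_n, p_n, a(q_n)\bigr) + \frac{\hbar}{i}\frac{\partial}{\partial t} \right\}\psi = 0 $$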
Here Hel means the energy of the electromagnetic field, Hn the energy of the nth particle. Hn contains, besides the kinetic energy of the nth particle, the interaction energy between this particle and the field through a(qn), qn being the coordinates of the particle and a the potential of the field. pn in (15) means as usual the momentum of the nth particle.
We consider now the unitary operator
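(16)  $$ u = \exp\left\{ \frac{i}{\hbar}\,H_{el}\,t \right\} $$
and introduce the unitary transformation of a:
(17)  $$ \mathfrak{A}(q_n, t) = u\,a(q_n)\,u^{-1} $$
and the corresponding transformation of ψ:
(18)  $$ \Phi = u\,\psi $$
Then we see that Φ satisfies the equation
(19)  $$ \left\{ \sum_{n=1}^{N} H_n\bigl(q_n, p_n, \mathfrak{A}(q_n, t)\bigr) + \frac{\hbar}{i}\frac{\partial}{\partial t} \right\}\Phi = 0 $$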
In contrast with a, which was independent of time (Schrödinger picture), the transformed potential 𝔄 contains t through u. To emphasize this, we have written t explicitly as the argument of 𝔄. We can prove that 𝔄 satisfies the Maxwell equations in vacuo (accurately speaking, we need special considerations for the equation div 𝔈 = 0).
The equation (19) is the starting point of the many-time theory. In this theory one then introduces the function Φ(q1t1, q2t2, . . . , qNtN) containing as many time variables t1, t2, . . . , tN as the number of the particles in place of the function Φ(q1, q2, . . . , qN, t) containing only one time variable, and supposes that this Φ(q1t1, q2t2, . . . , qNtN) satisfies simultaneously the following N equations:
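(20)  $$ \left\{ H_n\bigl(q_n, p_n, \mathfrak{A}(q_n, t_n)\bigr) + \frac{\hbar}{i}\frac{\partial}{\partial t_n} \right\}\Phi(q_1 t_1, q_2 t_2, \ldots, q_N t_N) = 0, \quad n = 1, 2, \ldots, N $$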
This Φ(t1,t2,...,tN), which is a fundamental quantity in the many-time theory, is related to the ordinary probability amplitude Φ(t) by
(21)  $$ \Phi(t) = \Phi(t, t, \ldots, t) $$
Now, the simultaneous equations (20) can be solved when and only when the N² conditions
(22)  $$ \bigl( H_n H_{n'} - H_{n'} H_n \bigr)\,\Phi = 0 $$
are satisfied for all pairs of n and n′. If the world point (qntn) lies outside the light cone whose vertex is at the point (qn′tn′), we can prove that HnHn′ − Hn′Hn = 0. As a result, the function satisfying (20) can exist in the region where
(23)  $$ (q_n - q_{n'})^2 - c^2\,(t_n - t_{n'})^2 \ge 0 $$
is satisfied simultaneously for all values of n and n′.
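The role of the integrability condition can be illustrated by a toy model with two "particle times". The sketch assumes ħ = 1 and replaces the two Hamiltonians by constant Pauli matrices acting on a single two-component amplitude; this choice, and the names `rot`, `apply` and `path_difference`, are hypothetical and serve only to show that the result of advancing t1 and t2 is independent of the order exactly when the Hamiltonians commute.

```python
import math

def rot(axis, theta):
    """exp(-i*theta*sigma_axis) as a 2x2 tuple of tuples; axis is 'x' or 'z'."""
    c, s = math.cos(theta), math.sin(theta)
    if axis == "z":
        return ((complex(c, -s), 0j), (0j, complex(c, s)))
    if axis == "x":
        return ((complex(c, 0.0), complex(0.0, -s)),
                (complex(0.0, -s), complex(c, 0.0)))
    raise ValueError("axis must be 'x' or 'z'")

def apply(m, v):
    """Multiply the 2x2 matrix m into the two-component vector v."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def path_difference(axis1, axis2, t1=0.4, t2=0.9):
    """Advance t1 then t2, and t2 then t1; return the largest component difference."""
    phi0 = (1 + 0j, 0j)
    u1, u2 = rot(axis1, t1), rot(axis2, t2)
    first_t1 = apply(u2, apply(u1, phi0))
    first_t2 = apply(u1, apply(u2, phi0))
    return max(abs(a - b) for a, b in zip(first_t1, first_t2))

commuting = path_difference("z", "z")      # H1 and H2 commute: a unique Phi(t1, t2)
non_commuting = path_difference("z", "x")  # they do not: the two routes disagree
print(commuting, non_commuting)
```

When the two routes disagree, no single-valued function of (t1, t2) can satisfy both equations at once, which is the content of the condition (22).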
According to Bloch(5) we can give Φ(q1t1, q2 t2 , . . . , qNtN) a physical meaning when its arguments lie in the region given by (23). Namely
(24)  $$ W(q_1 t_1, q_2 t_2, \ldots, q_N t_N) = \bigl|\Phi(q_1 t_1, q_2 t_2, \ldots, q_N t_N)\bigr|^2 $$
gives the relative probability that one finds the value q1 in the measurement of the position of the first particle at the instant of time t1, the value q2 in the measurement of the position of the second particle at the instant of time t2, . . . and the value qN in the measurement of the position of the N th particle at the instant of time tN.
This is the outline of the many-time formalism of the quantum mechanics. We will now return to our main subject. If we compare our equation (14) with the equation (19) of the many-time theory, we notice a marked similarity between these two equations. In (19) stands the suffix n, which designates the particle, while in (14) stand the variables x, y and z, which designate the position in space. Further, Φ is a function of the N independent variables q1, q2, . . . , qN, qn giving the position of the nth particle, while Ψ is a functional of the infinitely many “independent variables” v1(xyz) and v2(xyz), v1(xyz) and v2(xyz) giving the fields at the position (xyz). Corresponding to the sum Σn Hn in (19) the integral ∫ H12 dx dy dz stands in (14). In this way, to the suffix n in (19) which takes the values 1, 2, 3, . . . , N correspond the variables x, y and z which take continuously all values from −∞ to +∞.
Such a similarity suggests that we introduce infinitely many time variables txyz, which we may call local time, each for one position (xyz) in the space, just as we have introduced N time variables, particle times, t1, t2, . . . , tN, each for one particle. The only difference is that we use in our case infinitely many time variables whereas we have used N time variables in the ordinary many-time theory.
Corresponding to the transition from the use of the function with one time variable to the use of the function of N time variables, we must now consider the transition from the use of Ψ (t) to the use of a functional Ψ [txyz] of infinitely many time variables txyz.
We now regard txyz as a function of (xyz) and consider its variation εxyz which differs from zero only in a small domain V0 in the neighbourhood of the point (x0y0z0). We will define the partial differential coefficient of the functional Ψ [txyz] with respect to the variable tx0y0z0 in the following manner:
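(25)  $$ \frac{\partial \Psi}{\partial t_{x_0 y_0 z_0}} = \lim_{\varepsilon \to 0,\; V_0 \to 0} \frac{\Psi[t_{xyz} + \varepsilon_{xyz}] - \Psi[t_{xyz}]}{\int_{V_0} \varepsilon_{xyz}\,dx\,dy\,dz} $$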
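The definition of the derivative with respect to a local time can be tried out numerically. The sketch below makes strong simplifying assumptions: one space dimension, ħ = 1, a grid of spacing 0.01, and a hypothetical functional Ψ[t] = exp(−i ∫ h(x) t(x) dx) with an ordinary (commuting) function h(x) in place of an operator density. For this functional the derivative at x0 should equal −i h(x0) Ψ[t].

```python
import cmath

DX = 0.01
XS = [i * DX for i in range(-200, 201)]  # one-dimensional grid (assumption)

def h(x):
    """Hypothetical c-number 'energy density'."""
    return 1.0 + 0.5 * x * x

def functional(t_values):
    """Psi[t] = exp(-i * integral of h(x) t(x) dx), evaluated as a Riemann sum."""
    action = sum(h(x) * t for x, t in zip(XS, t_values)) * DX
    return cmath.exp(-1j * action)

def local_time_derivative(t_values, i0, eps=1e-4, half_width=1):
    """(Psi[t + bump] - Psi[t]) / integral(bump), with a small bump around grid point i0."""
    bumped = list(t_values)
    for j in range(i0 - half_width, i0 + half_width + 1):
        bumped[j] += eps
    bump_integral = eps * (2 * half_width + 1) * DX
    return (functional(bumped) - functional(t_values)) / bump_integral

t_surface = [0.1 * x for x in XS]  # a tilted choice of local times t(x)
i0 = 250                           # grid index of x0 = 0.5
numeric = local_time_derivative(t_surface, i0)
exact = -1j * h(XS[i0]) * functional(t_surface)
print(abs(numeric - exact))        # small; shrinks with the bump height and width
```

In this toy case h Ψ + (1/i) ∂Ψ/∂t_x = 0 at every point, so the functional obeys a local-time equation at each x separately, the situation that the infinitely many time variables are introduced to describe.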
We then generalize (14), and regard
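(26)  $$ \left\{ H_{12}(x, y, z, t) + \frac{\hbar}{i}\frac{\partial}{\partial t_{xyz}} \right\}\Psi = 0 $$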
the infinitely many simultaneous equations corresponding to the N equations (20), as the fundamental equations of our theory. In (26) we have written, for simplicity, H12(x, y, z, t) in place of H12(V1(xyz, t), V2(xyz, t), . . . ). In general, when we have a function F(V, Λ) of V and Λ, we will write simply F(x, y, z, t) for F(V(xyz, txyz), Λ(xyz, txyz)), or still more simply F(P), P denoting the world point with the coordinates (xyz, txyz). Thus F(P′) means F(x′, y′, z′, t′) or, more precisely, F(V(x′y′z′, tx′y′z′), Λ(x′y′z′, tx′y′z′)).
We will now adopt the equation (26) as the basis of our theory. For V1(P), V2(P), Λ1(P) and Λ2(P) in H12 the commutation relations (10) hold, where D(xyzt) has the property (13). As a consequence, we have
(27)  $$ H_{12}(P)\,H_{12}(P') - H_{12}(P')\,H_{12}(P) = 0 $$
when the point P lies a finite distance apart from P’ and outside the light cone whose vertex is at P. Further, from our assumption (ii) the relation (27) holds also when P and P’ are two adjacent points approaching in a space-like direction. Thus our system of equations (26) is integrable when the surface defined by the equations t = txyz, considering txyz as a function of x, y and z, is space-like.
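Whether a given surface t = txyz is space-like can be tested pointwise: the surface is space-like where c·|grad t| < 1. The sketch below checks this condition by central differences on a finite set of sample points. The units (c = 1), the sample grid and the two example surfaces are assumptions chosen for illustration, and a check on sample points is of course not a proof for the whole surface.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch

def is_space_like(t_of, points, h=1e-6):
    """Return True if c*|grad t| < 1 at every sample point (x, y, z)."""
    for x, y, z in points:
        tx = (t_of(x + h, y, z) - t_of(x - h, y, z)) / (2 * h)
        ty = (t_of(x, y + h, z) - t_of(x, y - h, z)) / (2 * h)
        tz = (t_of(x, y, z + h) - t_of(x, y, z - h)) / (2 * h)
        if C * C * (tx * tx + ty * ty + tz * tz) >= 1.0:
            return False
    return True

def gentle(x, y, z):
    """A rippled surface whose slope never exceeds 0.3*sqrt(2)."""
    return 0.3 * math.sin(x + y)

def steep(x, y, z):
    """A tilted plane with slope 1.5, too steep to be space-like."""
    return 1.5 * x

samples = [(0.5 * i, 0.5 * j, 0.0) for i in range(-4, 5) for j in range(-4, 5)]
gentle_ok = is_space_like(gentle, samples)  # True
steep_ok = is_space_like(steep, samples)    # False
print(gentle_ok, steep_ok)
```

The criterion involves only the invariant distinction between space-like and time-like directions, so the classification of a surface does not depend on the frame in which the slope is computed.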
In this way, a functional of the variable surface in the space-time world is determined by the functional partial differential equations (26). Corresponding to the relation (21) in the many-time theory, Ψ[txyz] reduces to the ordinary Ψ(t) when the surface reduces to a plane parallel to the xyz-plane.
The variable surface t = txyz can be of any (space-like) form in the space-time world, and we need not presuppose any Lorentz frame of reference to define such a surface. Therefore, this Ψ[txyz] is a relativistically invariant concept. The restriction that the surface must be space-like causes no trouble, since the property that a surface is space-like or time-like does not depend on a special choice of the reference system. It is not necessary, from the standpoint of the relativity theory, to admit also time-like surfaces for the variable surface, as was required by Dirac and by Yukawa. Thus we consider that Ψ[txyz] introduced above is already the sufficient generalization of the ordinary ψ-vector, and assume that the quantum-theoretical state of the fields is represented by this functional vector.
Let C denote the surface defined by the equation t = txyz . Then Ψ is a functional of the surface C. We write this as Ψ[C]. On C we take a point P, whose coordinates are (xyz, txyz), and suppose a surface C’ which overlaps C except in a small domain about P. We denote the volume of the small world lying between C and C’ by dωp. Then we may write (25) also in the form:
(28)  $$ \frac{\delta \Psi[C]}{\delta C_P} = \lim_{C' \to C} \frac{\Psi[C'] - \Psi[C]}{d\omega_P} $$
Then (26) can be written in the form:
(29)  $$ \left\{ H_{12}(P) + \frac{\hbar}{i}\frac{\delta}{\delta C_P} \right\}\Psi[C] = 0 $$
This equation (29) has now a perfect space-time form. In the first place, H12 is a scalar according to our assumption (i); in the second place, the commutation relations between V(P) and Λ(P) contained in H12 have the four-dimensional form (10); and finally the differentiation is defined by (28) quite independently of any frame of reference.
A direct conclusion drawn from (29) is that Ψ[C’] is obtained from Ψ[C] by the following infinitesimal transformation:
(30)  $$ \Psi[C'] = \left\{ 1 - \frac{i}{\hbar}\,H_{12}(P)\,d\omega_P \right\}\Psi[C] $$
When there exist in the space-time world two surfaces C1 and C2 a finite distance apart, we need only to repeat the infinitesimal transformations in order to obtain Ψ[C2] from Ψ[C1]. Thus
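(31)  $$ \Psi[C_2] = \prod_{C_1}^{C_2} \left\{ 1 - \frac{i}{\hbar}\,H_{12}(P)\,d\omega_P \right\}\Psi[C_1] $$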
The meaning of this equation is as follows: We divide the world region lying between C1 and C2 into small elements dωP (it is necessary that each world element be surrounded by two space-like surfaces). We consider for each world element the infinitesimal transformation 1 − (i/ħ) H12(P) dωP. Then we take the product of these transformations, the order of the factors being taken from C1 to C2. This product then transforms Ψ[C1] into Ψ[C2].
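The ordered product of infinitesimal transformations can be imitated on a very small lattice. The sketch assumes ħ = 1, two lattice sites that remain space-like separated throughout, and a hypothetical local density H(site, t) = cos(t)·σx + sin(t)·σz acting only on the two-level system at that site. Densities at different sites therefore commute, while densities at the same site at different times do not. Two different ways of pushing the surface from C1 to C2 are compared.

```python
import math

def local_step(state, site, t, d_omega):
    """Apply 1 - i*H(site, t)*d_omega to a two-site state of four amplitudes (hbar = 1)."""
    cx, cz = math.cos(t), math.sin(t)
    out = [0j] * 4
    for idx, amp in enumerate(state):
        bit = (idx >> site) & 1
        sz = 1.0 if bit == 0 else -1.0
        # identity plus the diagonal sigma_z part
        out[idx] += amp * (1 - 1j * d_omega * cz * sz)
        # sigma_x part: flips the two-level system at this site
        out[idx ^ (1 << site)] += amp * (-1j * d_omega * cx)
    return out

def evolve(state, schedule, d_omega):
    """Multiply the infinitesimal transformations in the order given by schedule."""
    for site, t in schedule:
        state = local_step(state, site, t, d_omega)
    return state

STEPS = 200
DT = 1.0 / STEPS
times = [(k + 0.5) * DT for k in range(STEPS)]

# Route 1: push the surface forward at site 0 all the way, then at site 1.
site_by_site = [(0, t) for t in times] + [(1, t) for t in times]
# Route 2: advance both sites alternately, keeping the surface nearly flat.
interleaved = [(s, t) for t in times for s in (0, 1)]

psi_c1 = [0.6 + 0j, 0j, 0j, 0.8j]  # hypothetical state on the initial surface C1
via_route_1 = evolve(psi_c1, site_by_site, DT)
via_route_2 = evolve(psi_c1, interleaved, DT)
gap = max(abs(a - b) for a, b in zip(via_route_1, via_route_2))
print(gap)  # zero up to rounding: both routes give the same state on C2
```

Both schedules keep the steps at each site in order of increasing time. Exchanging steps that belong to the same site would in general change the result, since the densities there do not commute.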
The surfaces C1 and C2 must here both be space-like, but otherwise they may have any form and any configuration. Thus C2 does not necessarily lie later than C1; C1 and C2 may even cross each other.
The relation of the form (31) has been already introduced by Heisenberg.(7) It can be regarded as the integral form of our generalized Schrödinger equation (29).