Monday, July 2, 2007

A peculiarly quantum measurement

It is often said that quantum theory introduces an inevitable, minimum,
disturbance into any measurement. This is true, but here I want to
describe something which at first sight appears to show exactly the
opposite effect, namely, how quantum theory enables us to make a
totally non-disturbing measurement of a type that is impossible in
classical physics.
We consider a two-state system which, in order to have a simple
picture, we regard as a box that can be either EMPTY (not contain a
particle) or FULL (contain a particle). From a large sample of such
boxes we are given the task of selecting one that we know is FULL.
The way to do this is to 'look' and see if the box contains a particle.
However, it turns out that one photon falling on the box will either pass
right through, if the box is EMPTY, or be absorbed and destroy the
particle, if the box is FULL. Since we require to use at least one photon
in order to look at the box it follows that, after we have looked, we
either confirm that the box is EMPTY, or we know that it was FULL,
but is so no longer. Clearly, it seems, we cannot select a box that
is certainly FULL. The act of verifying that it is FULL would simply
destroy the particle.
Here, amazingly, quantum mechanics provides a way to accomplish
our task. We first construct a photon interferometer, as shown in
figure 30. The photons enter at A and reach a beam-splitter (half-silvered
mirror) at B, where the wave separates into two parts of equal
magnitude travelling on the paths denoted by 1 and 2. They recombine
at a second beam splitter, C, where, by suitable choice of path lengths,
it is arranged that the two contributions to the output towards the D detector destructively interfere, so that D never records a photon. In
other words, the photons always take the E path. Next we suppose
that at a certain place on, say, path 1 we can place one of our boxes
in such a way that if it is FULL the photon will be absorbed, and the
particle in the box destroyed, whereas if it is EMPTY it will have no
effect. We then place each box in turn in the interferometer, and send
in one photon. If the photon does not appear in the detector D then we
discard the box and choose another. When we have a box for which
the detector does record a photon, then we know that we have a box
that is FULL.
It is easy to see why: if the box had been EMPTY, then it would
have no effect, and by construction of the interferometer, the photon
could not go to the detector at D. Thus if a photon is seen at D, the
box is necessarily FULL. Note, also, that a FULL box just acts as
another detector, so with beam splitters having equal probabilities of
transmission and reflection, half of the experiments with a FULL box
will result in the photon destroying the particle in the box. In the
other half, the photon will reach the second beam-splitter, at C, and
one-half of the time will pass through and reach the D detector. Thus
one-quarter of the FULL boxes will lead to a photon being seen at D,
and therefore will actually be selected as FULL. What we have here is a perfect ‘non-disturbing’ measurement, because we can see that the
photon has actually gone on the other path (path 2); nevertheless, if it
appears at the detector, it has verified that the box is FULL.
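The amplitude bookkeeping behind these fractions can be checked in a few lines. The sketch below is only an illustration, using the common convention that reflection at a symmetric beam splitter contributes a factor i (an assumption about conventions, not something stated in the text); it reproduces the numbers above: D never fires for an EMPTY box, while a FULL box absorbs the photon half of the time and sends it to D a quarter of the time.

```python
import numpy as np

# Symmetric 50/50 beam splitter acting on the two path amplitudes (1, 2).
BS = np.array([[1, 1j],
               [1j, 1]]) / np.sqrt(2)

# EMPTY box: the photon traverses both beam splitters undisturbed.
amp_in = np.array([1, 0], dtype=complex)   # photon entering at A
amp_out = BS @ (BS @ amp_in)
p_D_empty = abs(amp_out[0])**2             # port 0 = D detector
# destructive interference: D never records a photon

# FULL box on path 1: it acts as a which-path detector after the first BS.
amp_mid = BS @ amp_in
p_absorbed = abs(amp_mid[0])**2            # photon absorbed, particle destroyed
amp_path2 = np.array([0, amp_mid[1]])      # surviving (collapsed) branch, unnormalised
amp_out_full = BS @ amp_path2
p_D_full = abs(amp_out_full[0])**2         # photon reaches D: box certified FULL

print(p_D_empty, p_absorbed, p_D_full)
```

Leaving the surviving branch unnormalised means `p_D_full` is already the overall probability (one half of surviving times one half of passing through C, i.e. one quarter).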
The basic ideas behind the arguments of this section are due
to A. C. Elitzur and L. Vaidman in an unpublished article from the
University of Tel Aviv (1991). Other applications of similar ideas
are given by L. Hardy Physics Letters 167A 11 (1992) and Physical
Review Letters 68 2981 (1992).

The Bohm model

Perhaps the most significant recent development in the Bohm hidden-variable
model (see §5.2) is that physicists outside of Bohm’s own
students (and John Bell) have begun to take the model seriously. One
group (D. Dürr, S. Goldstein and N. Zanghì, Physics Letters 172A 6,
1992) have invented the rather evocative name ‘Bohmian mechanics’ to
describe it. This group have considered the requirement that the initial
distribution of positions should be consistent with the quantum theory
probability law, which, as we noted in §5.2, is necessary for the Bohm
model to agree with quantum theory. In particular, they have shown
that the requirement is expected to be satisfied for any ‘typical’
initial conditions.
Although, given that the above initial requirement holds, the Bohm
model is guaranteed by construction to agree with the statistical
predictions of quantum theory for particle positions (and hence with all
known experiments), there has been a widespread reluctance to accept
this fact, presumably because of a variety of ‘impossibility theorems’
on the lines of that due to von Neumann mentioned in §5.1. One such
theorem is often known as the Kochen-Specker-Bell theorem, which
is a strange irony because John Bell actually gave his simplified proof
of the theorem (Reviews of Modern Physics 38 447, 1966) in order to
show why it was not relevant to the Bohm model! The essence of these theorems is very similar to the non-locality arguments discussed
in §5.4 and Appendix 9. For example, in Appendix 9 we seemed to
show that the performers could not carry cards containing the answers.
Since these answers are the analogues of the hidden variables, this at
first sight means that such things are forbidden if we wish to maintain
agreement with quantum theory. The ‘error’ in this argument is that
it requires the answers to be fixed, whereas in the Bohm model they
are dynamical things which change with time, and which change in a
way that can depend upon what question is being asked of the other
performer (which is where the non-locality enters). The situation here
is sometimes described by saying that measurements are ‘contextual’,
a fancy way of saying that quantum systems in general cannot be
separated into independent parts, and that the answer you get depends
upon the question (i.e., the result depends on the apparatus).
It should be emphasised that the Bohm model looks after all this
automatically. In fact, on re-reading the remarks I wrote at the end
of §5.1, I think I was being unfair to the Bohm model in saying
that it was ‘contrived’. This suggests that much effort was required
in order to devise something that would work, whereas, in fact,
trajectories are defined by one simple property, namely that if we
have many identical systems with identical wavefunctions, and with
particle positions distributed according to the quantum probability law
at a particular time t0, then this fact will remain true at other times.
Actually this does not quite define the trajectory uniquely: the Bohm
model is just the simplest possibility.
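For one special case, a freely spreading Gaussian wave-packet, the Bohm trajectories are known in closed form: each particle simply rides the dilation of the packet, x(t) = x0 · σ(t)/σ(0). The sketch below (in units with ħ = m = 1, an arbitrary choice for illustration) checks the property just described: start points distributed according to the quantum probability law at t = 0 remain so distributed at a later time.

```python
import numpy as np

rng = np.random.default_rng(0)
hbar, m = 1.0, 1.0
sigma0, t = 1.0, 3.0

# Width of |psi|^2 for a freely spreading Gaussian packet at time t
sigma_t = sigma0 * np.sqrt(1 + (hbar * t / (2 * m * sigma0**2))**2)

# Closed-form Bohm trajectories for this wavefunction: pure dilation
x0 = rng.normal(0.0, sigma0, size=200_000)  # quantum-distributed start points
xt = x0 * sigma_t / sigma0

# Equivariance: the ensemble at time t still matches |psi(x, t)|^2
print(np.std(xt), sigma_t)
```

The sampled spread at time t agrees with the quantum width σ(t), which is the defining property of the trajectories mentioned above.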
I shall now describe a very idealised experiment which shows how
all this works in practice. First, it is necessary to note that in most
versions of the Bohm model trajectories only exist for ‘matter’ particles,
in particular, for the electrons and nucleons that are the constituents of
matter. All these particles have spin equal to ½. Particles of spin zero
or one, e.g., the photon, do not have trajectories, so, in this sense, we
should say that the Bohm model does not have photons. Why then do
we apparently see ‘photons’? Specifically, referring to the experiment
described in §1.4, why do detectors appear to say that a photon either
goes through the barrier of §1.4 or is reflected, when we know that the
wave does both? We shall see how the existence of matter trajectories
answers this question.
In order to make the calculations as simple as possible, we take
as the measuring device a single particle, moving in one dimension,
initially in a stationary, localised, wave-packet, and suppose that a photon wave-packet interacts with this to give it a momentum. The
details of this interaction are not important. If the detector is placed in,
say, the path of the transmitted wave and if the barrier is removed so
that there is only a transmitted wave, then it is easy to calculate that the
detector particle, initially at rest, will acquire a velocity. Observation
of this velocity will correspond to the photon having been detected.
Thus we have a detector that works properly: a photon wave comes
along and is detected through the motion of the detector particle, i.e.,
the movement of a pointer.
Now let us restore the barrier, so that the photon wave is a
superposition of transmitted and reflected parts (see figure 28). Again
it is possible to calculate what happens to the detector, and it turns out
that, for some initial positions of the detector particle, it moves, and
for others it does not. As indicated in figure 28, the important thing
here is the position of the detector particle, i.e. the hidden-variable,
relative to the position of the detector wave-packet, which of course
is what we refer to as the position of the detector. Thus, whether or
not the detector detects the photon depends on the initial position of its
particle. If it does, we would say that the photon has been transmitted;
if it does not we would say that the photon has been reflected. (Note
that, as in the collapse models discussed in the previous section, these
statements are really statements about the detector, rather than about
the photon). To be more explicit we consider, for simplicity, the case
where transmission and reflection are equally likely (so that P_R = P_T
in the equations of §4.5), and take a symmetrical initial wave-packet for
the detector. Then those initial starting positions that are on the near
side (relative to the incident photon) will not detect the photon; those
that are on the far side will. This actually follows simply from the
fact that trajectories cannot cross. Provided the distribution of initial
positions, in many repeats of the same experiment, are in accordance
with quantum theory (and hence in this case symmetrical between the
two sides), it follows that the photon will be detected in half of the
experiments, i.e., it will be transmitted with 50 per cent probability as
required. Symbolically, with suitable conventions, this means:

x0 > 0 → transmission
and
x0 < 0 → reflection

where x0 is the initial position of the particle in the detector and we
have taken the detector to be centred at the origin, x = 0 (see figure
28).
Clearly, very similar considerations hold if we put a detector instead
in the path of the reflected beam. Then we find the analogous results:
y0 > 0 → reflection
and
y0 < 0 → transmission
where here y0 is the initial position of the particle in the ‘reflection’
detector, which is centred at y = 0.
Next we consider what happens if we have both detectors, one in
the path of the transmitted beam, and the other in the path of the
reflected beam, as shown in figure 29. If these detectors behaved
independently, i.e., acted as if the other were not present, then there
would be the possibility of violating the experimental results (and
also the predictions of quantum theory). For example, if the starting
positions of the detector particles happened to satisfy x0 > 0 and y0 > 0
then, according to what we saw above, both detectors would record the
photon, which would then appear to have been both transmitted and
reflected! In fact, however, this is where the contextuality becomes

evident. It is straightforward to calculate that the first detector records
the photon, which is therefore transmitted, if
x0 - y0 > 0.
Otherwise, the second detector records the photon, corresponding
to its being reflected. In general, it is the relative position of the
particles in the two detectors that determines whether a particular event
is observed as a transmitted or reflected photon.
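A few lines of simulation make the contextuality concrete. This is only a sketch of the selection rules quoted above; the Gaussian form of the initial packets is an illustrative assumption, since only their symmetry matters.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
# Symmetric initial packets for the two detector particles, centred
# on x = 0 and y = 0, in the equally-likely case P_R = P_T.
x0 = rng.normal(0.0, 1.0, size=n)
y0 = rng.normal(0.0, 1.0, size=n)

# Naive, independent detectors would BOTH fire whenever x0 > 0 and y0 > 0
frac_both_naive = np.mean((x0 > 0) & (y0 > 0))   # ~0.25: a contradiction

# Contextual rule: only the relative position x0 - y0 decides the outcome
frac_transmitted = np.mean(x0 - y0 > 0)          # ~0.5, as quantum theory requires

print(frac_both_naive, frac_transmitted)
```

About a quarter of the runs would give the impossible ‘both transmitted and reflected’ outcome under the naive rule, while the relative-position rule gives exactly the required 50 per cent transmission.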
We emphasise again that in this experiment, because we have
assumed there are no photon trajectories, it is the properties of the
detectors that give rise to the apparent existence of ‘photons’ which
appear in specific places. When we say, for example, that the photon
is transmitted we mean no more than that an appropriate detector has,
or has not, recorded a photon. The model is designed to agree with
the predictions of orthodox quantum theory at the level of the output
of detectors, because it is these that correspond to observations. This
last point is particularly significant if we consider experiments where
particles that do have trajectories are used to trigger detectors. In
certain rather special cases it can be shown that the detector records
the particle even though the particle trajectory did not pass through it,
and conversely. One can most easily regard this as being due to nonclassical
effects of the quantum potential (see B. Englert, M.O. Scully,
G. Süssmann and H. Walther Z. Naturforsch. 47a 1175 (1992) and C.
Dewdney, L. Hardy and E.J. Squires Physics Letters 184A 6 (1993) for
further details).
Two books covering all aspects of the Bohm model have recently
been published. The Quantum Theory of Motion (Cambridge University
Press, 1993) by Peter Holland, an ex-student of David Bohm, gives
an extremely thorough and detailed treatment of the model and its
applications. The book by David Bohm and Basil Hiley, The Undivided
Universe (Routledge, London, 1993), which was completed just before
Bohm’s death, contains fewer details of calculations in the Bohm
model but more on the general problem of the interpretation of
quantum theory, and comparison with other suggested solutions of the
measurement problem.

Recent developments of quantum theory

Models with explicit collapse
In §3.7, and Appendix 7, we considered how the measurement problem
of quantum mechanics could be solved by changing the theory so that a
wavefunction would evolve with time to become a state corresponding
to a unique value of the observable that was being measured. Two
difficulties with this approach were noted, namely, it seemed to require
prior knowledge of what was to be measured (since a state cannot in
general correspond to a unique value of several observables), and also
it had to happen very quickly in circumstances involving observation,
but at most very slowly in the many situations where the Schrödinger
equation is known to work very well.
An explicit model, in which both these difficulties were overcome,
was proposed by three Italians, GianCarlo Ghirardi, Alberto Rimini
and Tullio Weber (now universally known as GRW), in a remarkable
article published in 1986 (Physical Review D 34 470). They noted, first,
that all measurements ultimately involve the position of a macroscopic
object. (The special role of position is already used implicitly in the de
Broglie-Bohm model, as was noted in §5.2). Thus the measurement
problem can be solved if wavefunctions evolve so as to ensure
that macroscopic objects quickly have well-defined positions. By a
macroscopic object we here mean something that can be seen by the naked
eye, say, something with a mass greater than about 10⁻⁵ gm. Similarly,
a well-defined position requires the spread of the wavefunction to be
less than an observable separation, say, less than about 10⁻² cm.
In order to achieve this end, GRW postulated that all particles suffer
(infrequent) random ‘hits’ by something that destroys (makes zero)
all their wavefunction, except that within a distance less than about 10⁻⁵ cm from some fixed position. This position is chosen randomly
with a probability weight proportional to the square magnitude of the
particle’s wavefunction, i.e., to the probability of its being found at that
position if its position were measured (see §2.2).
GRW assumed that the typical time between hits was of the order
of 10¹⁶ s, which ensures that the effects of the hits in the microscopic
world are negligible, and do not disturb the well established agreement
between quantum theory and experiment. However, even the small
macroscopic object referred to above, with mass 10⁻⁵ gm, contains
about 10¹⁸ electrons and nucleons, so typically about one hundred of
these will be hit every second. Although it might at first sight seem
that hitting a few particles out of so many would have a negligible
effect, it turns out that, in a measurement situation, just one hit is
enough to collapse the whole state: when one goes, they all go! This
is the real magic of the GRW proposal. To see how it comes about we
imagine that the macroscopic object represents some sort of detector
(a ‘pointer’) which tells us whether a particle has, or has not, passed
through a barrier (see Chapter 1). Explicitly, suppose the pointer is
in position 1, with wavefunction D1, if the particle has been reflected,
and in position 2, with wavefunction D2, if it has not. Note that, for
example, D1 corresponds to all the particles of the object being close
to position 1. We assume that, in a proper measurement, the separation
between the two positions is greater than both the size of the object
and the GRW size parameter 10⁻⁵ cm. The wavefunction describing
this situation has the form (cf §4.5):

P_R D1 + P_T D2.
Now we suppose that one of the particles is hit. The centre of the
hit will most likely occur where the wavefunction is big, i.e., in the
neighbourhood of either position 1 or position 2 (with probabilities
|P_R|² and |P_T|² respectively). Suppose the random selection chooses the
former. Then the whole wavefunction given above will be multiplied
by a function which is zero everywhere except in the neighbourhood
of position 1. Since the second term in the above state is zero except
when all the particles are near position 2, it will effectively be removed
by this hit (there are no values for the position of the hit particle
for which both factors, the hitting function and the wavefunction D2,
simultaneously differ from zero). In other words the wavefunction
will have collapsed to the state in which the particle was reflected. Notice that it is something that happens in the detector that establishes
whether or not the particle is transmitted; without a detector no such
determination is made (except within a time of around 10¹⁶ s, the
average collapse time for a single particle).
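The arithmetic of ‘when one goes, they all go’ can be sketched numerically. In the toy model below the pointer separation and packet widths are rescaled for convenience (illustrative assumptions, and a single representative pointer coordinate stands in for all the particles); only the GRW parameters themselves, a mean hit interval of about 10¹⁶ s per particle and a localisation width of about 10⁻⁵ cm, come from the proposal. A single hit near position 1 leaves the D2 branch with negligible weight.

```python
import numpy as np

tau = 1e16    # GRW: mean time between hits for one particle, seconds
a = 1e-5      # GRW: localisation width, cm
N = 1e18      # particles in a just-visible pointer, as in the text

hits_per_second = N / tau
print(hits_per_second)   # ~100 hits each second: collapse within ~10^-2 s

# Toy superposition P_R*D1 + P_T*D2 for one representative pointer
# coordinate, with the two pointer positions 1 cm apart.
x = np.linspace(-0.5, 1.5, 20001)           # pointer positions, cm
width = 1e-3                                # illustrative packet width
D1 = np.exp(-(x - 0.0)**2 / (2 * width**2))
D2 = np.exp(-(x - 1.0)**2 / (2 * width**2))
P_R = P_T = 1 / np.sqrt(2)
psi = P_R * D1 + P_T * D2

# A hit centred near position 1 multiplies psi by a Gaussian of width a,
# destroying everything that is not within ~a of that point.
hit = np.exp(-(x - 0.0)**2 / (2 * a**2))
psi_after = psi * hit

# Weight remaining in the D2 branch after the hit: effectively zero
w2 = np.sum(psi_after[x > 0.5]**2) / np.sum(psi_after**2)
print(w2)
```

One hit on one particle has removed the entire ‘transmitted’ term, exactly as described in the text.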
Since, as we have seen, even for a small detector the typical time
between the collapses is of the order of 10⁻² s, which is less than the
time it takes for a person to respond to an observation, it is clear that the
GRW model has the desired effect of giving outcomes to measurements.
As a working, realistic, model of quantum theory it is important. It
provides insight into the theory; it raises fascinating questions relating
to when a conscious observation has actually occurred, particularly
because the disappearance of the unwanted terms is only approximate
and so-called ‘tails’ always remain; it also gives a structure in which
questions like the relation with relativity can be discussed. Whether
it is true is another question. It seems very unnatural, although more
satisfying versions in which the hitting is replaced by a continuous
process (similar to that discussed in Appendix 7) have been developed
by GRW, Philip Pearle and others. A review of this work, and further
references, is given in the articles by Ghirardi and Pearle published in
Proceedings of the Philosophy of Science Association 2 pp 19 and 35
(1990).
The predictions of collapse models do not agree exactly with those
of orthodox quantum theory; for example, they give a violation of
energy conservation. It is this that puts limits on the parameters: the
process must not happen too quickly. Any bound system, initially in
its stable, lowest energy state, will have a certain probability of being
excited to a higher energy state if one of the constituents is ‘hit’. Thus,
for example, hydrogen atoms will spontaneously emit photons. Philip
Pearle and I have recently shown that the best upper limit on the rate
(i.e., lower limit on T) probably comes from the fact that protons are
known to be stable up to something like years. These protons are
in fact bound states of three quarks, and every time a quark is ‘hit’
there is a very small probability that the proton will go to an excited
state which will spontaneously decay. The fact that such decays have
not been observed puts severe restrictions on GRW-type models (and
may even rule out some simple versions).
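The excitation mechanism can be illustrated with a toy bound state. Here a harmonic-well ground state stands in for the bound system (an illustrative assumption; the widths are exaggerated so the effect is visible, whereas for realistic GRW parameters it is tiny): a narrow hit reshapes the wavefunction, so there is a non-zero probability that the system is no longer in its ground state afterwards.

```python
import numpy as np

# Ground state of a particle in a harmonic well (width sigma), hit by a
# GRW-style localisation of width a << sigma; illustrative widths only.
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
sigma, a = 1.0, 0.1

psi0 = np.exp(-x**2 / (4 * sigma**2))
psi0 /= np.sqrt(np.sum(psi0**2) * dx)       # normalised ground state

hit = np.exp(-x**2 / (2 * a**2))            # hit centred on the well
psi_after = psi0 * hit
psi_after /= np.sqrt(np.sum(psi_after**2) * dx)

# Probability that the system is still in its ground state after the hit
p_stay = (np.sum(psi0 * psi_after) * dx)**2
print(1 - p_stay)                           # non-zero excitation probability
```

For the real GRW parameters the excitation probability per hit is minute, but multiplied by the number of protons and the age of the experiment it yields the bound discussed above.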
In one sense it is an advantage for a model that it gives clear,
distinctive predictions, because this allows the possibility that it might
be verified. On the other hand, in the absence of any positive evidence
for the unconventional effects, the fact that the free parameters of the model have to be chosen rather carefully (to make the process
happen fast enough in a measurement situation, but not so fast as to give
unobserved effects elsewhere) is a negative feature; why should nature
have apparently conspired so carefully to hide something from us?
A partial answer to this last question might lie in the possibility that
the parameters of the collapse are not in fact independent of the other
constants of the physical world, but arise in particular from gravity, as
suggested in Appendix 7. In his wide-ranging book, The Emperor’s
New Mind (Oxford University Press, 1989), Roger Penrose gives other
reasons for believing that gravity might be associated with collapse.
He also develops the idea that the human mind’s ability to go beyond
the limits of ‘algorithmic computation’, i.e., the use of a closed set of
rules, shows that it can only be explained by really new physics, and
that such new physics, which would be ‘non-computable’, might well
be associated with the collapse of the wavefunction.

Early history and the Copenhagen interpretation

We have not, in this book, been greatly concerned with the
historical development of quantum theory. When an idea is new
many mistakes are made, blind alleys followed, and the really
significant features can sometimes be missed. Thus history is unlikely to be a good teacher. Nevertheless, it is of interest to look
back briefly on how the people who introduced quantum theory
into physics interpreted what they were doing.
Already we have noted that Einstein, surely the premier scientist
of this century, was always unhappy with quantum theory, which
he considered to be, in some way, incomplete. Initially his objections
seemed to be to the lack of causality implied by the theory,
and to the restrictions imposed by the uncertainty principle. He had
a long running controversy with Bohr on these issues, a controversy
which it is fair to say he lost. In addition, however, Einstein was
one of the first to realise the deeper conceptual problems. These he
was not able to resolve. Many years after the time when he was the
first to teach the world about photons, the particles of light, he admitted
that he still did not understand what they were.
Even more remarkable, perhaps, was the attitude of Schrödinger.
We recall that it was he who introduced the equation that bears his
name, and which is the practical expression of quantum theory,
with solutions that contain a large proportion of all science. In
1926, while on a visit to Copenhagen for discussions with Bohr and
Heisenberg, he remarked: ‘If all this damned quantum jumping
were really to stay, I should be sorry I ever got involved with quantum
theory.’ (This quote, which is of course a translation from the
original German, is taken from the book by Jammer, The
Philosophy of Quantum Mechanics, p 57). The ‘jumping’
presumably refers to wavefunction reduction, a phenomenon
Schrodinger realised was unexplained within the theory, which he,
like Einstein, therefore regarded as incomplete. To illustrate the
problem in a picturesque way he invented, in 1935, the
‘Schrodinger cat’ story, which we have already discussed in §4.4.
He considered it naive to believe that the cat was in an uncertain,
dead or alive, state until observed by a conscious observer, and
therefore concluded that the quantum theory could not be a proper
description of reality.
Next we mention de Broglie, who, it will be recalled, was the first
to suggest a wave nature for electrons. He was also unhappy with
the way quantum theory developed, and took the attitude that it
was wrong to abandon the classical idea that particles followed
trajectories. He believed that the role of the wavefunction was to
act as a pilot wave to guide these trajectories, an idea which paved
the way for hidden-variable theories. Thus, of the four people (Planck, Einstein, Schrodinger, de
Broglie) who probably played the leading roles in starting quantum
theory, three became, and remained, dissatisfied with the way it
developed and with its accepted ‘orthodoxy’. This orthodoxy is
primarily due to the other three major figures in the early development
of the theory, Bohr and, to a lesser extent, Heisenberg and
Born. It has become known as the ‘Copenhagen’ interpretation.
A precise account of what the Copenhagen interpretation actually
is does not exist. Quotations from Bohr’s articles do not always
seem to be consistent (which is not surprising in view of the fact
that the ideas were being developed as the articles were being
written). Almost certainly, two present-day physicists, who both
believe that they subscribe to the orthodox (Copenhagen) interpretation,
would give different accounts of what it actually means.
Nevertheless there are several key features which, with varying
degrees of emphasis, would be likely to be present. We shall
endeavour to describe these.
(i) Bohr made much use of the notion of ‘complementarity’:
particle and wave descriptions complement each other; one is
suitable for one set of experiments, the other for different
experiments. Thus, since the two descriptions are relevant to
different experiments, it does not make sense to ask whether they
are consistent with each other. Neither should be used outside its
own domain of applicability.
(ii) The interpretation problems of quantum theory rest on
classical ways of thinking which are wrong and should be abandoned.
If we abandon them then we will have no problems. Thus
questions which can only be asked using classical concepts are not
permitted. Classical physics enters only through the so-called ‘correspondence’
principle, which says that the results of quantum
theory must agree with those of classical mechanics in the region
of the parameters where classical mechanics is expected to work.
This idea, originally used by Planck, played an important role in
the discovery of the correct form of quantum theory.
(iii) The underlying philosophy was strongly ‘anti-realist’ in tone.
To Bohr: ‘There is no quantum world. There is only an abstract
quantum physical description. It is wrong to think that the task of
physics is to find out how nature is. Physics concerns what we can say about nature.’ Thus the Copenhagen interpretation and the
prevailing fashion in philosophy, which inclined to logical
positivism, were mutually supportive. The only things that we are
allowed to discuss are the results of experiments. We are not
allowed to ask, for example, which way a particle goes in the interference
experiment of §1.4. The only way to make this a sensible
question would be to consider measuring the route taken by the
particle. This would give us a different experiment for which there
would not be any interference. Similarly, Bohr’s reply to the alleged
demonstration of the incompleteness of quantum theory, based on
the EPR experiment, was that it was meaningless to speak of the
state of the two particles prior to their being measured. (It should
be noted that Einstein himself had made remarks which were in this
spirit. Indeed Heisenberg, a convinced advocate of the Copenhagen
interpretation, was apparently helped along this line by one such
remark: ‘It is the theory which decides what we can observe.’)
(iv) All this leaves aside the question of what constitutes a
‘measurement’ or an ‘observation’. It is possible that somewhere in
the back of everyone’s mind there lurked the idea of apparatuses
that were ‘classical’, i.e. that did not obey the rules of quantum
theory. In the early days the universality of quantum theory was
not appreciated, so it was more reasonable to divide the world into,
on the one hand, observed systems which obeyed the rules of
quantum mechanics, and, on the other, measuring devices, which
were classical.
These, then, are the ingredients of the Copenhagen interpretation.
It is very vague and answers few of the questions; anybody
who thinks about the subject today would be unlikely to find it
satisfactory: yet it became the accepted orthodoxy. We have
already, in §5.2, suggested reasons why this should be so. The
theory was a glorious success, nobody had any better answers to the
questions, so all relaxed in the comfortable glow of the fact that
Bohr had either answered them or told us that they should not be
asked.
I was a research student in Manchester in the 1950s. Rosenfeld
was the head of the department and the Copenhagen interpretation
reigned unquestioned. One particular Christmas, the department
visited the theoretical physics department in Birmingham to sing
carols (that, at least, was the excuse). Some of the carols were parodied. In particular, I remember the words we used for the carol
that normally begins ‘The boar’s head in hand bear I’. They were:
At Bohr’s feet I lay me down,
For I have no theories of my own
His principles perplex my mind,
But he is oh so very kind.
Correspondence is my cry, I don’t know why,
I don’t know why.
But we were all afraid to ask!

More quantum mystery

Quantum theory has been the basis of almost all the theoretical
physics of this century. It has progressed steadily, indeed gloriously.
The early years established the idea of quanta, particularly
for light, then came the applications to electrons which led to all
the developments in atomic physics and to the solution of
chemistry, so that already in 1929 Dirac could write that ‘The
underlying physical laws necessary for the mathematical theory of
a large part of physics and the whole of chemistry are thus completely known...’ (Proceedings of the Royal Society A123 714).
The struggle to combine quantum theory with special relativity,
discussed in the preceding section, occupied the period from the
1930s to the present, and its successes have ranged from quantum
electrodynamics to QCD, the theory of strong interactions. We are
now at the stage where much is understood and there is confidence
to tackle the remaining problems, like that of producing a quantum
theory of gravity.
The interpretation problem has been known since the earliest
days of the subject (recall Einstein’s remark mentioned in §1.1),
but here progress has been less rapid. The ‘Copenhagen’ interpretation,
discussed in the next section, convinced many people that the
problems were either solved or else were insoluble. The first really
new development came in 1935 with the EPR paper, which, as we
have seen, purported to show that quantum theory was incomplete.
We must then wait until the 1950s for Bell’s demolition of the von
Neumann argument regarding the impossibility of hidden-variable
theories, and, later, for his theorem about possible results of local
theories in the EPR experiment. Throughout the whole period there
were also steady developments leading to satisfactory hidden-variable
theories. At present, attempts are being made to see if
these are, or if they can be made, compatible with the requirements
of special relativity.
What progress can we expect in the future? In the very nature of
the case, new insights and exciting developments are unlikely to be
predictable. We can, however, suggest a few areas where they
might occur.
Let us consider, first, possible experiments. There is much interest
at present in checking the accuracy of simple predictions of
quantum theory, in order, for example, to see whether there is any
indication of non-linear effects. No such indications have been seen
at the present time, but continuing checks, to better accuracy and
in different circumstances, will continue to be made.
Another area where there is active work being done is in the
possibility of measuring interference effects with macroscopic
objects, or at least with objects that have many more degrees of
freedom than electrons or photons. The best hope for progress here
lies in the use of SQUIDS (superconducting quantum interference
devices). These are superconducting rings, with radii of several
centimetres, in which it is hoped that interference phenomena between currents in the rings, as predicted by quantum theory, can be
observed. Such observations will verify (or otherwise) the predictions
of quantum theory for genuinely macroscopic objects. In
particular, it should be possible to see interference between states
that are macroscopically different, and thereby verify that a system
can be in a quantum mechanical superposition of two such states
(cf the discussion of Schrödinger’s cat, etc, in §4.3).
The success of quantum theory, combined with its interpretation
problems, should always provide an incentive to experimentalists to
find some result which it cannot predict. Many people would
probably say that they are unlikely to find such a result, but the
rewards for so doing would be great. If something could be shown
to be wrong with the experimental predictions of orthodox quantum
theory then we would, at last, perhaps have a real clue to
understanding it.
It must be admitted that the likelihood of there being any practical
applications arising from possible discoveries in this area is
extremely low. There are many precedents, however, that should
prevent us from totally excluding them. We have already noted in
§5.6 that genuine observation of wavefunctions, were it ever to be
possible, might lead to the possibility of instantaneous transmission
of signals. To allow ourselves an even more bizarre (some would
say ridiculous) speculation, we recall that, as long as the wavefunction
is not reduced, then all parts of it evolve with time according
to the Schrödinger equation. Thus, for example, the quantum
world contains the complete story of what happens at all subsequent
times to both the transmitted and reflected parts of the wavefunction
in a barrier experiment. Suppose then that a computer is
programmed by a non-reduced wavefunction which contains many
different programs. In principle this is possible; different input keys
could be pressed according to the results (‘unobserved’, of course)
of a selection of barrier type experiments, or, more easily, according
to the spin projections of particles along some axis. As long as
the wavefunction is not reduced, the computer performs all the
programs simultaneously. This is the ultimate in parallel processing!
If we observe the output answer by normal means we select one
set of results of the experiments, and hence one program giving a
single answer. The unreduced output wavefunction, however, contains
the answers to all the programs. It is unlikely that we will
ever be able to read this information, but . . .
On the theoretical side, we have already mentioned the possibility
that the difficulties with making a quantum theory of gravity
just might be related to the defects of quantum theory. Maybe
some of our difficulties with non-locality suggest that our notions
of time and space are incomplete. If, for example, our three dimensions
of space are really embedded in a space of more dimensions
then we might imagine that points of space which seem to us to be
far separated are in reality close together (just as the points on a
ball of string are all close, except to an observer who, for some
reason, can only travel along the string).
Bearing in mind the issue of causality, we might ask why we
expect this to exist in the first place, in particular, why we believe
that the past causes the present. Indeed we could wonder why there
is such a difference between the past, which we remember, and the
future, which we don’t! In case we are tempted to think these things
are just obvious, we should note that the fundamental laws of
physics are completely neutral with regard to the direction of time,
i.e. they are unchanged if we change the sign of the time variable.
In this respect time is just like a space variable, for which it is clear
that one direction is not in any fundamental respect different from
any other. Concepts like ‘past’ and ‘future’, separated by a ‘now’,
do not have a natural place in the laws of physics. Presumably this
is why Einstein was able to write to a friend that the distinction
between past and present was only a ‘stubbornly persistent
illusion’.
It may well be that, in order to understand quantum theory, we
need totally new ways of thinking, ways that somehow go beyond
these illusions. Whether we will find them, or whether we are so
conditioned that they are for ever outside our scope is not at
present decidable.

Quantum theory and relativity

This is a difficult section, from which we shall learn little that has
obvious relevance to our theme. Nevertheless, the section must be
included since its subject is very important and is an extremely
successful part of theoretical physics. There is also the possibility,
or the hope, that it could one day provide the answers to our
problems.
The mysteries that we met in Chapter One arose from certain
experimental facts. We have learned that quantum theory predicts
the facts but does not explain the mysteries. Now we must learn
that quantum theory also meets another separate problem, namely
that it is not compatible with special relativity.
The reason for this is that special relativity requires that the laws
of physics be the same for all observers regardless of their velocity
(provided this is uniform). This requirement implies that only relative velocities are significant, or, in other words, that there is no
meaning to absolute velocity. In practice this fact makes little
difference to physics at low velocity; it is only when velocities
become of the order of the velocity of light (3 × 10⁸ m s⁻¹) that the
new effects of special relativity are noticed.
Quantum theory, as originally developed, did not have this
property of being independent of the velocity of the observer, and
is thus inconsistent with special relativity. Although the practical
effects of this inconsistency are very tiny for the experiments we
have discussed, there are situations where they are important, and
it is natural to ask whether quantum theory can be modified to take
account of special relativity, and even to ask whether such
modifications might provide some insight into our interpretation
problems. The answer to the first of these questions is a qualified
‘yes’; to the second it is a tentative ‘no’.
The relativistic form of quantum mechanics is known as relativistic
quantum field theory. It makes use of a procedure known
as second quantisation. To appreciate what this means we recall
that, in the transition from classical to quantum mechanics, variables
like position changed from being definite to being uncertain,
with a probability distribution given by a wavefunction, i.e. a
(complex) number depending upon position. In relativistic
quantum field theory we have a similar process taken one stage
further: the wavefunctions are no longer definite but are uncertain,
with a probability given by a ‘wavefunctional’. This wavefunctional
is again a (complex) number, but it depends upon the
wavefunction, or, in the case where we wish to talk about several
different types of particle, upon several wavefunctions, one for
each type of particle. Thus we have the correspondence:

First quantisation: x, y, . . . replaced by Ψ(x, y, . . . );
Second quantisation: Ψ(x), Φ(x), . . . replaced by Z(Ψ(x), Φ(x), . . . ).
The analogue of the Schrödinger equation now tells us how the
wavefunctional changes with time.
An important practical aspect of relativistic quantum field theory
is that the total number of particles of a given type is not a fixed
number. Thus the theory permits creation and annihilation of
particles to occur, in agreement with observation.
For further details of relativistic quantum field theory we must refer to other books. (Most of these are difficult and mathematical.
An attempt to present some of the features in a simple way
is made in my book To Acknowledge the Wonder: The story of
fundamental physics, referred to in the bibliography.) There is no
doubt that the theory has been enormously successful in explaining
observed phenomena, and has indeed been a continuation of the
success story of ‘non-relativistic’ quantum theory which we outlined
in §2.5. In particular, it incorporates the extremely accurate predictions
of quantum electrodynamics, has provided a partially unified
theory of these interactions with the so-called weak interactions,
and has provided us with a good theory of nuclear forces. In spite
of these successes there are formal difficulties in the theory. Certain
‘infinities’ have to be removed and the only way of obtaining results
is to use approximation methods, which, while they appear to
work, are hard to justify with any degree of rigour.
Do we learn anything in all this which might help us with the
nature of reality? Apparently not. If, in our previous, nonrelativistic,
discussion, we regarded the wavefunction as a part of
reality, we now have to replace this by the wavefunctional, which
is even further removed from the things we actually observe. The
wavefunctions have become part of the observer-created world, i.e.
things that become real only when measured.
We must now consider the problem of making quantum theory
consistent with general relativity. Since general relativity is the
theory of gravity, this problem is equivalent to that of constructing
a quantum theory of gravity. Much effort has been devoted to this
end, but a satisfactory solution does not yet exist. Maybe the lack
of success achieved so far suggests that something is wrong with
quantum theory at this level and that, if we knew how to put it
right, we would have some clues to help with our interpretation
problem. This is perhaps a wildly optimistic hope but there are a
few positive indications. Gravity is negligible for small objects, i.e.
those for which quantum interference has been tested, but it might
become important for macroscopic objects, where, it appears,
wavefunction reduction occurs. Could gravity somehow be the
small effect responsible for wavefunction reduction, as discussed in
§3.7?
Probably the correct answer is that it cannot, but if we want
encouragement to pursue the idea we could note that the magnitudes
involved are about right. The ratio of the electric force (which is responsible for the effects seen in macroscopic laboratory
physics) to the gravitational force, between two protons, is about
10³⁶. For larger objects the gravitational force increases (in fact it
is proportional to the product of the two masses), whereas this
tends not to happen with the electric force because most objects are
approximately electrically neutral, with the positive charge on
protons being cancelled by the negative charge on electrons.
Consider, then, the forces between two massive objects, each of
which has charge equal to the charge on a proton. The electric force
will be equal to the gravitational force if the objects weigh about
10⁻⁶ g. Thus we can see that gravitational forces become of the
same order as electrical forces only when the objects are enormously
bigger than the particles used in interference effects, but
that they are certainly of the same order by the time we reach
genuine macroscopic objects. (See also the remarks at the end of
Appendix 7.)
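The magnitudes quoted above can be checked with a few lines of arithmetic. The sketch below uses standard values of the physical constants; the comparison itself simply follows the text, computing the electric-to-gravitational force ratio for two protons and the mass at which the two forces balance for objects each carrying one proton charge:

```python
import math

# standard SI values of the constants involved
k = 8.988e9        # Coulomb constant, N m^2 C^-2
G = 6.674e-11      # gravitational constant, N m^2 kg^-2
e = 1.602e-19      # proton charge, C
m_p = 1.673e-27    # proton mass, kg

# ratio of electric to gravitational force between two protons;
# the separation cancels, since both forces fall off as 1/r^2
ratio = (k * e**2) / (G * m_p**2)

# mass at which gravity equals the electric force between unit charges:
# k e^2 = G m^2  =>  m = sqrt(k e^2 / G)
m_balance = math.sqrt(k * e**2 / G)

print(f"force ratio for protons ~ {ratio:.2e}")       # ~ 1.2e36
print(f"balance mass ~ {m_balance * 1e3:.2e} g")      # ~ 1.9e-6 g, i.e. about 10^-6 g
```

Both numbers agree with those in the text: the ratio is enormous for protons, yet gravity catches up for objects of only about a microgram.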
We end this section by noting a few other points. General
relativity is all about time and space, about the fact that our
apparently ‘flat’ space is only an approximation, about the
possibility that there are singular times of creation, and/or extinction,
about the existence of black holes with their strange effects.
Some of these facts could be relevant, but at the present time all
must be speculation. As an example of such speculation we mention
the suggestion of Penrose that there might be some sort of
trade-off between the creation of black holes and the reduction of
wave packets (see the article by Penrose, ‘Gravity and State Vector
Reduction’ in Quantum Concepts in Space and Time, ed C J Isham
and R Penrose [Oxford: Oxford University Press 1985]).

The Mysteries of the Quantum World

Readers who have read this far are probably confused. Normally
this is not a good situation to be in at the start of the last chapter
of a book. Here, however, it could mean that we have at least
learned something: the quantum world is very strange. Certain
experimentally observed phenomena contradict any simple picture
of an external reality. Although such phenomena are correctly
predicted by quantum theory, this theory does not explain how they
occur, nor does it resolve the contradictions.
What else ought we to have learned? We have seen, again on the
basis of experiment, that a local picture of reality is false. In other
words, the assumption that what happens in a given region of space
is not affected by what happens in another, sufficiently distant,
region is contrary to observation.
Nothing else is certain. We have met questions which appear to
have several possible answers. None of these answers, however, are
convincing. Indeed, it is probably closer to the truth to say that all
are, to our minds, equally implausible. The quantum world teaches
us that our present ways of thinking are inadequate.
I have tried to give a quick survey of the questions and their
possible answers in tables 6.1 and 6.2. The first of these tables
presents the problem purely in terms of the potential barrier experiment
introduced in §1.3. No reference is made here to quantum
theory or its concepts.

Can signals travel faster than light?

According to the special theory of relativity, the velocity of light
(or, more generally, of electromagnetic radiation) in vacuum is a
fundamental property of time and space. The rules for combining
velocities, and the laws of mechanics, etc, ensure that nothing can
move with a velocity that exceeds this.
It would take us too far outside the scope of this book to explain
special relativity; we can, however, assert with confidence that it is
now firmly based on experimental observation and that it is a vital
ingredient of the structure of contemporary theoretical physics.
That its effects are not immediately obvious in our everyday
experience is due to the large size of the velocity of light,
c = 3 × 10⁸ m s⁻¹.
How then do we understand the fact that, according to quantum
theory, wavefunction reduction happens instantaneously over
arbitrarily large distances and, further, that such behaviour is
apparently confirmed by experiment?

The first thing to notice here is that we cannot actually use this
type of wavefunction reduction to transmit real messages from one
macroscopic object to another. To help us appreciate what is meant
by this statement we should distinguish the transmission of a
message between two observers from what happens when the two
observers both receive a message. For example, two people, one on
Earth and one on Mars, could make an agreement that they will
meet at a particular time either on Earth or on Mars. In order to
determine which, they might agree to measure spins, in a prearranged
direction, of electrons emitted in a particular EPR experiment.
If they obtained + 1/2 they would wait on their own planet,
whereas if they obtained -1/2 they would travel to the other’s
planet. The correlation between the results of their measurements,
noted in §5.4, would ensure that the meeting would take place. It
would be possible for them to make their measurements at the same
time, so they would receive the message telling them the place of
the meeting simultaneously. However this message would not have
been sent from one to the other.
We contrast this with the situation where the prior agreement is
that the person on Earth will decide the venue and then try to communicate
this to the person on Mars. How can he use the EPR type
of experiment to transmit this message? The only option he has is
either to make a measurement of the spin of the electron or not to
make the measurement. A code could have been agreed: the
measurement of the spin of A along a previously decided direction
would mean that the meeting is to be on Earth, whereas no such
measurement would mean that Mars would be the venue. Thus, at
a particular time, he decides on his answer-he either makes the
measurement or he does not. Immediately the wavefunction of B
‘knows’ this answer; in particular, if it is Earth then B will have a
definite spin along the chosen direction, otherwise it will not.
The person on Mars, however, although he can observe the
particle B, cannot ‘read’ this information because he is not able to
measure a wavefunction. There is no procedure that the observer
could use that would allow him to know whether or not the spin
of B was definite or not.
The same conclusion is reached if we use, instead of a single
experiment, an ensemble of identical experiments. In this case,
if we decide on the venue Earth, then we would measure the spins
of all the A particles in the specified direction. This would

immediately mean that all the B particles had a definite spin in that
direction. Now, if these were all the same, e.g. if they were all
+ 1/2, then we could verify this by simply measuring them.
However, they would not all be the same, half would be + 1/2 and
half would be - 1/2, which is exactly the same distribution we
would have obtained if the spins were not definite, i.e. if the venue
had been Mars and no measurements of A had been made.
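The statistical point made above can be illustrated with a toy simulation. Assuming the standard singlet-state rule for spin-1/2 pairs (B's result is opposite to A's with probability cos²((a − b)/2) when A has measured along angle a), the marginal distribution seen at B is 50/50 whether or not A measured, so no message arrives:

```python
import math
import random

def b_result(a_measured, a=0.0, b=0.7):
    """Outcome (+1 or -1) of B's spin measurement along angle b (radians)."""
    if a_measured:
        m = random.choice([+1, -1])              # A's result: 50/50
        p_anti = math.cos((a - b) / 2) ** 2       # singlet anticorrelation rule
        return -m if random.random() < p_anti else m
    return random.choice([+1, -1])               # A unmeasured: B alone is 50/50

def fraction_up(a_measured, trials=200_000):
    random.seed(42)
    return sum(b_result(a_measured) == +1 for _ in range(trials)) / trials

# the marginal at B is indistinguishable in the two cases
print(fraction_up(True), fraction_up(False))   # both ~ 0.5
```

Whatever the person on Earth does, the statistics on Mars are unchanged; only by comparing both sets of results afterwards (which requires an ordinary, slower-than-light channel) does the correlation show up.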
The situation could be very different if the quantum theory
description is incomplete and there are hidden variables. If these
could, by some as yet unknown means, be measured, then, since
measurements at A inevitably change these variables at B, the
possibility of sending messages at an infinite velocity would seem
to exist, in violation of the theory of special relativity. Such a violation
can be seen explicitly in some types of hidden-variable theories
where a quantum force is required to act instantaneously over
arbitrarily large distances. This contrasts with the known forces,
which in fact are due to exchange of particles and whose influence
therefore cannot travel faster than the velocity of light.
We here have another very unpleasant feature of hidden-variable
theories. It is not, however, possible to use this argument to rule
them out entirely. Special relativity has only been tested in experiments
that do not measure hidden variables; if we ever find ways
of measuring them then the theory might be shown to be wrong: generalising
results from one set of experiments to an entirely
different set has often led to mistakes.
Even within normal quantum mechanics the question of how a
wavefunction can reduce instantaneously, consistently with special
relativity, is one that requires an answer. To discuss it would take
us into relativistic quantum field theory, which is the method by
which quantum theory and special relativity are combined.
Although this theory has had many successes, it is certainly not
fully understood and at the present time does not appear to have
anything conclusive to say.

Experimental verification of the non-local predictions of quantum theory

As we discussed in §2.5, quantum theory has been successfully
applied to a truly enormous variety of problems, and its status as
a key part of modern theoretical physics, with applications ranging
from the behaviour of the early universe and the substructure of
quarks to practical matters regarding such things as chemical binding,
lasers and microchips, is unquestioned. New tests of such a
theory might therefore be seen as adding very little to our
knowledge. The reason why, in spite of this, the experiments which
we describe here have attracted so much attention is that they test
certain simple predictions of the theory which violate conditions (in
particular the Bell inequalities) that very general criteria of
localisability would lead us to expect.
Following the publication of the first of the Bell inequalities, in
1964, there has been a succession of attempts to test them against
real experiments. These experiments are, in fact, quite difficult to
do with sufficient accuracy, and the early attempts, which, with one
exception, generally supported quantum theory, were rather
inconclusive. We shall therefore confine our discussion to
the recent series of experiments which have been performed in
France by Aspect, Dalibard, Grangier and Roger.
In all these experiments a particle emits successively two photons
in such a way that their total spin is zero. We recall that photons
are the particles associated with electromagnetic radiation, e.g.
light. They are spin one particles, in contrast to the spin 1/2
particles which we have previously used in our discussion. It is
convenient to measure the ‘polarisation’ of the photons rather than
their spin projections. These are related in a way that need not concern us. The only difference we need to note is that in the
predicted expression for ⟨E⟩ the angle between the various directions
has to be doubled, i.e. we find
⟨E(a, b)⟩ = −cos 2(a − b). (5.9)
In the first experiment the spin measurements were carried out in
such a way that a particle with spin + 1 in a chosen direction was
deflected into the detector and counted, whereas a particle with spin
- 1 in the same direction was deflected away from the detector and
not counted. The experiment then measured the number of coincident
counts, i.e. counts at both sides. Because of imperfections
in the detectors it could not be assumed that no count meant that
the particle had spin - 1, it could have had spin + 1 and just been
'missed'. To take this into account it was necessary to run the
experiment with one or both of the spin detectors removed, and
then to use a modified form of the Bell inequality. We refer to the
experimental papers, listed in the bibliography, for details.
The important quantity that is measured is a suitably normalised
coincidence counting rate, which is predicted by quantum theory to
be given by
(5.10)
The factor 0.984, rather than unity, arises from imperfections in
the detectors (some particles are missed). If this prediction holds
throughout the whole range of angles then the Bell inequality is
violated. In figure 26 we show the results. The agreement with
quantum theory is perfect.
To demonstrate how effectively these results violate the Bell
inequality, and hence forever rule out the possibility of a local
realistic description of the world, the authors measured explicitly
at the angles where the violation was maximum, namely with the
configuration shown in figure 27, i.e. with a - b = b - a' =
a′ − b′ = 22.5°, and a − b′ = 67.5°. A particular quantity S which
according to the Bell inequality has to be negative, but which
according to quantum theory has to be 0.118 ± 0.005, is measured
to be 0.126 ± 0.014. It is very clear that quantum theory and not
locality wins.
In the next set of experiments both spin directions were explicitly
detected, so the set-up was closer to that envisaged in the proof of
the original Bell inequality. From the measurements, the value
of ⟨F(a, a′, b, b′)⟩, defined in the previous section, was calculated
as
Fexp = 2.697 ± 0.015 (5.11)
for the orientation given by figure 27. This exceeds the bound given
in the inequality by more than 40 times the uncertainty. On the
other hand it agrees perfectly with the prediction of quantum
theory which, again allowing for the finite size of the detectors, is
calculated to be
Fqt = 2.70 ± 0.05 (5.12)
instead of 2√2, which is the result with perfect detectors.
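Using the photon correlation of equation (5.9), the perfect-detector prediction can be reproduced directly. A minimal sketch: the orientations below implement equal 22.5° steps as in figure 27 (the specific angle values, and the sign placement in the combination F, follow the conventional CHSH form and are assumptions about the book's exact definitions):

```python
import math

def E(x, y):
    """Quantum prediction for the photon polarisation correlation, eq (5.9)."""
    return -math.cos(math.radians(2 * (x - y)))

# orientations in equal 22.5-degree steps (assumed values)
a, b, ap, bp = 0.0, 22.5, 45.0, 67.5

# CHSH combination; any local theory bounds |F| by 2
F = E(a, b) - E(a, bp) + E(ap, b) + E(ap, bp)
print(abs(F))   # ~2.83, i.e. 2*sqrt(2), comfortably above the local bound of 2
```

Allowing for the finite detector efficiency scales this ideal value of 2√2 ≈ 2.83 down towards the 2.70 quoted in the text.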

The third experiment was designed to investigate the following
question. Quantum theory suggests that measurement at A, say,
causes an instantaneous change at B, and this seems to be confirmed
by experiment. It appears therefore that ‘messages’ are sent with
infinite velocity (see the next section for further discussion of this).
Such a requirement would, however, not be needed if it were
assumed that the spin detecting instruments somehow communicate their orientations to each other prior to the emission of the
photons, rather than when a photon actually reaches a detector. In
order to eliminate this possibility it is necessary to arrange that the
orientations are ‘chosen’ after the photons have been emitted.
Clearly the time involved is too small to allow the rotation of
mechanical measuring devices, so the experiment had two spin
detectors at each side, with pre-set orientations, and used switching
devices to deflect the photons into one or the other detector. The
switches were independently controlled at random. Thus, when the
photons were emitted, the orientations that were to be used had not
been decided. We refer to the original paper for further details of
this experiment and here record only the result, which was again in
complete agreement with quantum theory, and in violation of the
Bell inequality. Of course, it could be that nothing is really random
and that the devices that controlled the switching themselves communicated
with each other prior to the start of the experiment.
Such bizarre possibilities are hard to rule out (though if we were
sufficiently clever we could arrange that the signals which switch the
detectors originate from distant, different, galaxies that, according
to present ideas of the evolution of the universe, can never
previously have been in any sort of communication).

In this series of experiments it was also possible to vary the
distance between the two detectors and so test whether the wavefunction
showed any sign of ‘reducing’ as a function of time, as it
would according to the type of theory discussed in §3.4. Even when the separation was such that the time of travel of the photons was
greater than the lifetime of the decaying states that produced them
(which might conceivably be expected to be the time scale involved
in such an effect), there was no evidence that this was happening.
Thus it appears that, once again, quantum theory has been
gloriously successful. Maybe most of the people who regularly use
it are not surprised by this; they have learned to live with its strange
non-locality. The experiments we have described confirm this
feature of the quantum world; no longer can we forget about it by
pretending that it is simply a defect of our theoretical framework.
We close this section by noting the interesting irony in the history
of the developments following the EPR paper. Einstein believed in
reality (as we do); quantum theory seemed to deny such a belief and
was therefore considered by Einstein to be incomplete. The EPR
thought experiment was put forward as an argument, in which the
idea of locality was implicitly used, to support this view. We now
realise, however, that the experiment actually demonstrates the
impossibility of there being a theory which is both complete and
local.

Bell’s theorem

This theorem, published in 1964 (Physics 1 195), expresses one of
the most remarkable results of twentieth century theoretical
physics. It exposes, in a clear quantitative manner, the real nature
of the conflict between ‘common sense’ and quantum theory which
exists in the EPR type of experiment. As we shall show, the
theorem is easy to prove (once one has seen it), but the fact that
nobody at the time of the early controversy following the publication
of the EPR paper realised that such a result could be found is
the real measure of the magnitude of John Bell’s achievement.
In order to appreciate properly the meaning of the theorem we
must first emphasise an important distinction; one which we have
indeed already met. The EPR experiment suggests that
measurements on one object (A) alter what we can predict for
subsequent measurements on another object (B), regardless of how
far apart the objects may be at the time of the measurements. There
are two completely different ways of explaining this, namely:
(i) it could be that measurements on A actually have an effect on
B, or alternatively,
(ii) it could be that measurements on A only affect our knowledge
of the state of B, i.e. they tell us something about B which was in
fact already true before the measurement.
The first of these possibilities, which Bell’s theorem shows is the
case in quantum theory, is totally contrary to the idea of locality.
The second, on the other hand, is an everyday occurrence and has
no great significance.
As a trivial example illustrating the last remark, we imagine that
a box is known to contain two billiard balls, one of which is black
and the other white. We then remove one ball, in the dark, and put
it on a rocket which flies off into space. At this stage all that we
know about the colour of this ball is that there is a 50% chance of
its being white, and a 50% chance of its being black (just like a spin
in a given direction might have a 50% chance of being either + 1/2
or - 1/2). We then look at the ball remaining in the box and if it
is black (white) we immediately know that the other ball is white
(black). Again this is superficially rather like our experiment with
two spin 1/2 particles. However we know that in no sense do we do
anything to the distant ball by looking in the box. It already was
either white or black. Because of our lack of knowledge, our
previous description of it was incomplete. A complete description
did however exist, and with such a complete description, the observation
of the colour of the remaining ball would clearly have no
effect.
The question now is whether such a complete description can
exist in the EPR spin experiment, i.e. is it possible that there is a
way of specifying the state of particle B such that measurements on
A have no effect on B? Bell’s theorem allows us to give a negative
answer to this question both on the basis of quantum theory, and
of experiment (see next section).
It is instructive to see exactly what is involved in the theorem, in
particular, how little, so we shall give the proof even though it
again involves a small amount of mathematical symbolism. (A
simpler form of the theorem, described in terms of the behaviour
of people rather than particles, can be found in Appendix 9.)
To begin, we suppose that the spin-measuring apparatus, at each
side, is connected to a machine that records the results of the
measurements. We arrange that these machines record + 1 for a
spin measurement of + 1/2 and - 1 for a measurement of - 1/2.
Let M and N be the values recorded for the A and B particles
respectively. In fact we shall be concerned only with the product of
M and N, which we denote by E. The appropriate experimental
arrangement is depicted in figure 23. (Note that throughout this section we are making the natural assumption that a measurement
gives only one result. Thus we are ignoring the many-worlds
possibility. For further discussion of this point see the article of
Stapp, ‘Bell’s theorem and the foundations of quantum physics’,
American Journal of Physics 53 306 1985.)
Not surprisingly, in view of the statistical nature of quantum
theoretical predictions, the argument requires us to consider not
just one event but many, i.e. the decay of a large number of
identical spin zero particles. For each such event we can record a
value of E (always + or - l), and we then calculate the average
over all events. This will depend upon the orientation of the two
spin detectors, which are given by the angles a and b, so we write
it as ⟨E(a, b)⟩. Thus,
⟨E(a, b)⟩ = average value of E
= average value of M · N. (5.1)
Clearly this number lies between +1 and −1.
Next we introduce the variable H which is supposed to give the
required complete description of the two spin 1/2 particles. It is not
important for our purpose whether H consists of a single number
or several numbers. However, for convenience we shall refer to it
as though it were just a single number. When we know the value
of H we know everything that can be known about the system.
Each event will be associated with a certain value of H and in a number of such events there will be a certain probability for any
particular value occurring. If the hidden-variable theory is deterministic
(a restriction we shall later drop), then the values of M and
N in a given event, and for given angles a and b, are uniquely determined
by the value of H.
Now we introduce the assumption of locality which is here
expressed by the assertion that the value of M does not depend on
b and the value of N does not depend on a. In other words, the
value we measure for the spin of the particle A cannot depend on
what we choose to measure about particle B, and vice versa. It
follows that M depends only on H and a, whilst N depends only
on H and b. We express these dependences by writing the values
obtained as M(H, a) and N(H, b) respectively. The resulting value
of E is then given by
E(H, a, b) = M(H, a)N(H, b). (5.2)
For a particular value of H this is a fixed number. Different values
of H can occur when we repeat the experiment many times, and the
average value of E that is measured will equal the average of
E(H, a, b) over these values of H, i.e. the hidden-variable theory
predicts
⟨E(a, b)⟩ = Average over H of E(H, a, b). (5.3)
At this stage we do not appear to have got very far. Since we do
not know anything about the variation of M or N with H, or about
the distribution of the values of H, all that we can say about the
predicted value of ⟨E(a, b)⟩ is that it lies between +1 and -1. This
of course we already knew.
Now comes the clever part. We consider two different orientations
for each of the spin measuring devices. Let these be denoted
by the angles a and a’ for measurements on A and by b and b’ for
measurements on B. For a fixed value of H, there are now two
values of M and two values of N, i.e. four numbers, each of which
is either +1 or -1. In table 5.1 we show all possible sets of values
for these four numbers. We also show the corresponding values for
the quantity F(H, a, a', b, b'), defined by
F(H, a, a', b, b') = E(H, a, b) + E(H, a', b')
+ E(H, a', b) - E(H, a, b'). (5.4)

In all cases this quantity is +2 or -2, from which it follows that
its average value over the (unknown) distribution of H lies between
- 2 and + 2. Hence our local hidden-variable theory predicts that
the particular combination of results defined by
⟨F(a, a', b, b')⟩ = ⟨E(a, b)⟩ + ⟨E(a', b')⟩
+ ⟨E(a', b)⟩ - ⟨E(a, b')⟩ (5.5)
satisfies
-2 ≤ ⟨F(a, a', b, b')⟩ ≤ +2. (5.6)
This is one form of the Bell inequality.
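The claim that every row of table 5.1 gives F = +2 or F = -2 can be checked by brute force. The short Python sketch below (our illustration, not part of the text) enumerates all sixteen assignments of the four outcomes and evaluates the combination (5.4):

```python
from itertools import product

values = []
for m, m2, n, n2 in product([+1, -1], repeat=4):
    # F = E(a,b) + E(a',b') + E(a',b) - E(a,b'), with each E a product M*N
    values.append(m * n + m2 * n2 + m2 * n - m * n2)

assert set(values) == {+2, -2}   # every one of the 16 rows gives +2 or -2
```

Averaging numbers that are each +2 or -2 over any distribution of H can therefore never produce a result outside the interval from -2 to +2, which is exactly the content of (5.6).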
It is important to realise that locality rather than determinism is
the key ingredient of this proof. In order to demonstrate this, we
relax the assumption that H determines the values of M and N
uniquely, and suppose instead that each value of H determines a
probability distribution for M and N. The locality assumption is
now a little more subtle. It is that the probability distribution for
M does not depend on the value measured for N, and vice versa.
To appreciate why this is so we recall that measurement of N
cannot tell us anything further about particle A, since H is intended
to be the complete description of the state of the system; equally,
because of locality, it cannot change the state of A. Hence the
probability of obtaining any given value of M does not depend on
the value measured for N.
We can then define independent averages of M and N, for each
value of H. We denote these by Mav(H, a) and Nav(H, b). Because
of the assumption of independence, the average value of the
product of M and N, which we write as Eav, is equal to the product
of the average values, i.e.
Eav(H, a, b) = Mav(H, a)Nav(H, b). (5.7)
It is now possible to prepare a table similar to that above with
Mav replacing M, etc. Instead of taking values of +1 or -1, these
quantities lie somewhere between these limits. It is then quite easy
to show that the particular combination defining F, which we now
denote by Fav, always takes a value that lies between -2 and +2.
When we then average Fav over all values of H we again obtain the
Bell inequality.
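That the bound survives when the four quantities are averages lying anywhere between -1 and +1, rather than strict ±1 outcomes, can also be checked numerically. A sketch (our own illustration): draw the four averages at random from the interval [-1, +1] and confirm that the combination defining F never exceeds 2 in magnitude:

```python
import random

random.seed(1)
worst = 0.0
for _ in range(100000):
    # four independent averages, each anywhere in [-1, +1]
    ma, ma2, nb, nb2 = (random.uniform(-1.0, 1.0) for _ in range(4))
    f = ma * nb + ma2 * nb2 + ma2 * nb - ma * nb2
    worst = max(worst, abs(f))

assert worst <= 2.0   # the bound of (5.6) survives the averaging
```

The algebraic reason is that f = nb(ma + ma2) + nb2(ma2 - ma), and |ma + ma2| + |ma2 - ma| can never exceed 2 when both numbers lie in [-1, +1].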
For a more complete discussion of the circumstances in which the
inequality, or various alternative versions of it, can be proved we
refer to the review article of Clauser and Shimony listed in the
bibliography (§6.5).
The significance of the Bell inequality lies in the fact that,
unlike the inequality we found for E, it does not have to be true
if the locality assumption is dropped. Indeed it turns out that the
inequality is violated by the predictions of quantum theory.
Before we discuss these predictions it is interesting to see why
quantum theory fails to satisfy the assumptions of the theorem. In
quantum theory the full specification of the state is the wavefunction,
so this plays the role of the quantity H. We can then define,
as above, the averages of M and N over many measurements.
However, these averages are not independent; the distribution of
values of M depends on what has been measured for N. As a simple
illustration of this we note that, with our wavefunction corresponding
to total spin zero, the average value of M or N measured
independently is zero (regardless of the angles a, b). However, in
the special case of a = b, then we know that M is always opposite
to N, so the product is always - 1. Thus the average value of MN
is - 1, which is not the product of the separate averages of M and
N.
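This failure of factorisation is easy to reproduce with perfectly anticorrelated outcomes. In the toy simulation below (our illustration, with a = b so that N is always opposite to M) the separate averages vanish while the average product is -1:

```python
import random

random.seed(0)
ms, ns = [], []
for _ in range(10000):
    m = random.choice([+1, -1])   # result for particle A along the common axis
    n = -m                        # total spin zero: B is always opposite (a = b)
    ms.append(m)
    ns.append(n)

avg_m = sum(ms) / len(ms)
avg_n = sum(ns) / len(ns)
avg_mn = sum(m * n for m, n in zip(ms, ns)) / len(ms)

assert abs(avg_m) < 0.05 and abs(avg_n) < 0.05   # separate averages near zero
assert avg_mn == -1.0                            # but <MN> = -1, not <M><N>
```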
In general, the quantum theoretical prediction for ⟨E(a, b)⟩
depends on the difference between the angles a and b. As we show
in Appendix 8 it is given by
⟨E(a, b)⟩ = -cos(a - b). (5.8)
This function is drawn in figure 24. As expected it lies between - 1
and + 1. However, and this is the reason why it leads to a conflict
with the Bell inequality, it cannot be factorised into a product of
a function of a and a function of b.
The resulting prediction for ⟨F(a, a', b, b')⟩ is now easily
found. A particularly simple case is when the angles are chosen as
in figure 25. Here the violation of the inequality is maximised, each
term in F contributing the same amount, √2/2, to the sum. Hence,
⟨F⟩ = 2√2 ≈ 2.83,
which is in clear violation of the Bell inequality.
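The arithmetic can be confirmed directly from (5.8). Figure 25 is not reproduced here, so the angles below are an assumption: the standard maximally violating choice a = 0°, a′ = 90°, b = 45°, b′ = 135°:

```python
import math

def E(a, b):
    # the quantum prediction (5.8); angles in degrees
    return -math.cos(math.radians(a - b))

# assumed maximally violating angles (figure 25 is not reproduced here)
a, a2, b, b2 = 0.0, 90.0, 45.0, 135.0
F = E(a, b) + E(a2, b2) + E(a2, b) - E(a, b2)

# each term has magnitude sqrt(2)/2, and |F| = 2*sqrt(2), about 2.83 > 2
assert abs(abs(F) - 2.0 * math.sqrt(2.0)) < 1e-12
```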

Thus the Bell inequality shows that any theory which is local
must contradict some of the predictions of quantum theory. The
world can either be in agreement with quantum theory or it can
permit the existence of a local theory; it cannot do both. The
choice lies with experiment; the experiments have
been done and, as we explain in the next chapter, the answer is
clear.

The Einstein-Podolsky-Rosen thought experiment

In 1935, Einstein, Podolsky and Rosen published a paper entitled
‘Can Quantum-Mechanical Description of Physical Reality Be
Considered Complete?’ (Physical Review 47 777), which has had,
and continues to have, an enormous influence on the interpretation
problem of quantum theory. In this paper, they proposed a simple
thought experiment and analysed the implications of the quantum
theory predictions for the outcome of the experiment. These made
explicit the essentially non-local nature of quantum theory and,
according to the authors, proved that the theory must be
incomplete, i.e. that a more complete (hidden-variable) theory
exists and might one day be discovered. Much later, as we discuss
in the next section, John Bell carried the analysis considerably
further and showed that no local hidden-variable theory could
reproduce all the predictions of quantum theory. Naturally this
work prompted experimentalists to turn the thought experiments
into real experiments, and so check whether these predictions are
correct, or whether the actual results deviated from them in such
a way as to permit the existence of a satisfactory local theory. These
experiments, which we discuss in §5.5, beautifully confirm
quantum theory.
We shall refer to the general class of experiments with the same
essential features as that proposed by Einstein, Podolsky and
Rosen as EPR experiments. The original work is sometimes called
the EPR paradox, or the EPR theorem.
The particular EPR experiment that we shall describe is
somewhat different from the original, but is more suited to our
later discussion. We consider the situation shown in figure 20, in
which a particle with zero spin at rest in the laboratory decays spontaneously
into two identical particles, each with spin 1/2. These
particles, which we call A and B respectively, will move apart with
velocities that are equal in magnitude and opposite in direction.
(This ensures that their momenta add to zero so that the total
momentum, which was initially zero, is conserved.)
The experiment now consists of measuring the spin components
of the two particles in any particular directions; in fact, for
simplicity, we consider only directions perpendicular to the direction
of motion. Thus we have an apparatus that will measure the
spin component of particle A in a direction we can specify by the
angle a. Similarly we have an apparatus to measure the spin component
of B in a direction specified by some angle b. The full
experiment is illustrated in figure 21. The form of the apparatus
used to measure the spin is irrelevant for our purpose, but in order
to demonstrate that the measurement is possible we could consider
the case in which the spin 1/2 particles are charged, e.g. electrons.
In that case the particles would have a magnetic moment which
would be in the same direction as the spin. Then to measure the
spin along a specific direction we could have a varying magnetic
field in that direction which would deflect the electron, up or down,
according to the value of the spin component.
In order to discuss the form of the results we must digress a little
to think about spin. We first recall, from the earlier discussion of
spin in §3.7 (also Appendix 8), that a measurement of a spin component
of a spin 1/2 particle in any given direction will always give
a value either + 1/2 or - 1/2, i.e. the spin is always either exactly
along the chosen direction or exactly contrary to it. Suppose, for
example, that we know the particle has a spin component +1/2 in
a particular direction (see figure 22), and that we then measure the
spin component in some other direction. Whereas according to classical
mechanics we would obtain some value in between +1/2 and -1/2
for this second measurement, in fact, according to quantum
theory, we will obtain either of the two extremes, each with a
calculable probability. This probability will depend on the angle
between the two directions, and will be such that the average value
agrees with that given by classical mechanics. Within quantum
theory it will not be possible to predict which value we will obtain
in a given measurement; the situation in fact will be very analogous
to the choice of reflection or transmission in the barrier experiment
of §1.3. Further details of all this are given in Appendix 8.
For the following
discussion the important fact we shall need to remember is that,
in quantum theory, the spin of a particle can have a definite value
in only one direction. We are free to choose this direction, but once
we have chosen it and determined a value for the spin in that direction,
the spin in any other direction will be uncertain. The fact that
when we measure the spin in this new direction we automatically
obtain a precise value implies that the measurement does something
to the particle, i.e. it forces it into one or the other spin value along
the new line. This of course is an example of wavefunction reduction
about which we have already written much.
The next thing that we need to learn is that the total spin, in any
given direction, for an isolated system, remains constant. Readers who know about such things will recognise this as being related to
the law of conservation of angular momentum. It is true in quantum
mechanics, as well as in classical mechanics; in particular, it
is true for individual events and not just for averages, a fact which
has been experimentally confirmed.

The pilot wave

I think that conventional formulations of quantum theory, and of
quantum field theory in particular, are unprofessionally vague and
ambiguous. Professional theoretical physicists ought to be able to do
better. Bohm has shown us a way.†
In the very early days of quantum theory, de Broglie, who had been
the first to associate a wavefunction with a particle, suggested that,
instead of being the complete description of the system, as in conventional
quantum theory, the true role of this wavefunction might
be to guide the motion of the particles. In such a theory the
wavefunction is therefore called a pilot wave. The particles would
always have precise trajectories, which would be determined in a
unique way from the equations of the theory. It is such trajectories
that constitute the ‘hidden variables’ of the theory.
These ideas were not well received; probably they were regarded
as a step backwards from the liberating ideas of quantum theory
to the old restrictions of classical physics. Nevertheless, and in spite
of von Neumann’s theorem discussed above, interest in hiddenvariable
theories did not completely die, and in 1952 David Bohm
produced a theory based on the pilot wave idea, which was deterministic
and yet gave the same results as quantum theory. It also
provided a clear counter-example to the von Neumann theorem.
In Bohm’s theory a system at any time is described by a
wavefunction and by the positions and velocities of all the particles.
(Since it is positions that we actually observe in experiments, it is
perhaps paradoxical that these are called the ‘hidden’ variables, in
contrast to the wavefunction.) To find the subsequent state of the
system, it is necessary first to solve the Schrödinger equation and
thereby obtain the wavefunction at later times. From this wavefunction a ‘quantum force’ can be calculated. This force is
added to the other, classical, forces in the system, e.g. those due
to electric charges, etc, and the particle paths are then calculated
in the usual classical way by Newton’s laws of motion. The quantum
force is chosen so that there is complete agreement with the
usual predictions of quantum mechanics. What we mean by this is
that, if we consider an ensemble of systems, with the same initial
wavefunction but different initial positions, chosen at random but
with a distribution consistent with that given by the wavefunction,
then at any later time the distribution of positions will again agree
with that predicted by the new wavefunction appropriate to the
time considered. It is beyond the scope of this book to discuss
further the technical details, and problems, associated with these
considerations.
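A minimal numerical sketch of guidance by the wavefunction may make this concrete, though it is our own construction and not part of Bohm's presentation. For a free Gaussian wavepacket (with ħ = m = 1 and an assumed width σ0) the guidance velocity v = Im(ψ′/ψ) can be written in closed form, and a trajectory integrated from it follows the known exact spreading x(t) = x0 √(1 + (At)²):

```python
import math

SIGMA0 = 1.0                      # assumed initial packet width
A = 1.0 / (2.0 * SIGMA0**2)       # spreading rate for hbar = m = 1

def velocity(x, t):
    # guidance velocity v = (hbar/m) * Im(psi'/psi); for this packet
    # psi'(x,t)/psi(x,t) = -x / (2*SIGMA0**2 * (1 + i*A*t))
    ratio = -x / (2.0 * SIGMA0**2 * complex(1.0, A * t))
    return ratio.imag             # hbar/m = 1 in these units

# Euler-integrate one trajectory; the exact answer is x0*sqrt(1 + (A*t)**2)
x, t, dt, T = 1.0, 0.0, 1e-4, 2.0
while t < T:
    x += velocity(x, t) * dt
    t += dt

exact = 1.0 * math.sqrt(1.0 + (A * T)**2)
assert abs(x - exact) < 1e-2      # the particle rides the spreading packet
```

An ensemble of such trajectories, started from positions distributed as |ψ(x, 0)|², remains distributed as |ψ(x, t)|² at later times, which is the agreement with quantum mechanics described above.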
In comparing the Bohm-de Broglie theory with ordinary quantum
theory we note first that, since they give the same results for all
quantities that we know how to measure, they are equally satisfactory
with regard to experiments. As far as we know both are,
in this sense, correct. The former has the added feature of being
deterministic, but with our present techniques this is not significant
experimentally. The degree to which it is regarded as a conceptual
advantage is a matter of taste.
A much more important advantage of the hidden-variable theory
is that it is precise. It is a theory of everything; no non-quantum
‘observers’ are required to collapse wavefunctions since no such
collapse is postulated. All the problems of Chapter Three
disappear.
In connection with this last observation, we should note two
points. First, it may be asked how we have been able to remove the
requirement for wavefunction collapse when, in Chapter Three, we
appeared to find it necessary. The answer lies in the fact that,
whereas previously the wavefunction was the complete description
of the system, so that there was no place for the difference between
transmission or reflection (for example) to show other than in the
wavefunction, now that we have additional variables to describe
the system this is no longer the case. The wavefunction can be identical
for both transmission and reflection, since the difference now
lies in the hidden variables, in particular in the positions of the
particle.
Secondly we should note a reservation to the remark above that the two theories always agree. Readers may indeed be wondering
how this can be, when in one case we have wavefunction collapse
but not in the other. The answer lies in the fact that wavefunction
collapse only happens in the orthodox interpretation when
macroscopic measuring devices are involved. It is only when the
wavefunction can be written as the sum of macroscopically different
pieces that some of them are dropped in the process of reduction.
Now the difference between keeping all the pieces, as in the
Bohm-de Broglie theory, and dropping some of them, as in the
orthodox theory, is only significant experimentally if they can be
made to interfere. However, such interference can only occur if the
pieces can be made identical, which as we have seen (§3.6 and
Appendix 6) is so unlikely for macroscopic objects as to be effectively
impossible. The two theories are experimentally
indistinguishable because macroscopic processes are not reversible.
Nevertheless we should emphasise that, where interference can in
principle occur, it is indeed observed. There is no positive evidence
that wavefunction reduction actually happens, so, especially in
view of the problems of Chapter Three, theories that do not require
it have a real advantage.
Given this fact it is perhaps rather remarkable that hiddenvariable
theories are not held in high regard by the general
community of quantum physicists. Why is this so? More importantly,
are there any good reasons why we should be reluctant to
accept them?
We have already hinted at some of the possible answers to the
first question. The many successes of quantum theory created an
atmosphere in which it became increasingly unfashionable to question
it; the argument between (principally) Bohr and Einstein on
whether an experiment to violate the uncertainty principle could be
designed was convincingly won by Bohr (as the debate moved into
other areas the outcome, as we shall see, was less clear); the
elegance, simplicity and economy of quantum theory contrasted
sharply with the contrived nature of a hidden-variable theory which
gave no new predictions in return for its increased complexity; the
whole hidden-variable enterprise was easily dismissed as arising
from a desire, in the minds of those too conservative to accept
change, to return to the determinism of classical physics; the
significance of not requiring wavefunction reduction could only be
appreciated when the problems associated with it had been accepted
and, for most physicists, they were not, being lost in the mumbo-jumbo
of the ‘Copenhagen’ interpretation; this interpretation, due
mainly to Bohr, acquired the status of a dogma. It appeared to say
that certain questions were not allowed so, dutifully, few people
asked them.
With regard to the second of the questions raised above (namely,
are there any good reasons for rejecting the hidden-variable
approach?), it has to be said that the picture of reality presented
by the Bohm-de Broglie theory is very strange. The quantum force
has to mimic the effects of interference so, although a particle
follows a definite trajectory, it is affected by what is happening
elsewhere. The reflected particle in figure 2 somehow ‘knows about’
the left-hand mirror, though its path does not touch it; similarly,
the particle that goes through the upper slit in the double slit experiment
shown in figure 13 ‘knows’ whether the lower slit is open or
not. This ‘knowledge’ arises through the quantum force which can
apparently operate over arbitrarily large distances. To show in
detail the effect of this force we reproduce in figure 19 the particle
trajectories for the double-slit experiment as calculated by Philippidis
et al (Il Nuovo Cimento 52B 15, 1979). We remind ourselves
that, if we are to accept the Bohm theory, then we must believe the
particles really do follow these peculiar paths. Particles have
become real again, exactly as in classical physics, the uncertainty
has gone, but the price we have paid is that the particles behave
very strangely!
Another, perhaps mainly aesthetic, objection to hidden-variable
theories of this type is that, without wavefunction reduction, we
have something similar to the many-worlds situation, i.e. the
wavefunction contains all possibilities. Unlike the many-worlds
case, these are not realised, since the particles all follow definite,
unique, trajectories, but they are nevertheless present in the
wavefunction-waiting, perhaps, one day to interfere with what
we think is the truth! Thus, in our example discussed in Appendix
2, both scenarios act out their complete time development in the
wavefunction. It is all there. The real, existing wavefunction of the
universe is an incredibly complicated object. Most of it, however,
is irrelevant to the world of particles, which are the things that we
actually observe.
The unease we feel about such apparent redundancy can be made
more explicit by expressing the problem in the following way: the pilot wave affects the particle trajectories, but the trajectories have
no effect on the pilot wave. Thus, in the potential barrier experiment,
the reflected and transmitted waves exist and propagate in
the normal way, totally independent of whether the actual particle
is reflected or transmitted. This is a consequence of the fact that the
wavefunction is calculated from the Schrödinger equation which
does not mention the hidden variables. It is a situation totally contrary
to that normally encountered in physics, where, since the time
of Newton, we have become accustomed to action and reaction
occurring together.

Hidden Variables and Non-locality

Review of hidden-variable theories
In §1.3 we saw that it is possible to repeat an experiment several
times, under apparently exactly the same conditions, and yet obtain
different results. In particular, for example, we could direct identical
particles, all with the same velocities, at identical potential
barriers, and some would be reflected and some transmitted. The
initial conditions would not uniquely determine the outcome.
Quantum theory, as explained in Chapter Two, accepts this lack
of determinism; knowledge of the initial wavefunction only
permits probabilistic statements regarding the outcome of future
measurements.
Hidden-variable theories have as their primary motivation the
removal of this randomness. To this end they regard the
‘apparently’ identical initial states as being, in reality, different;
distinguished by having different values of certain new variables,
not normally specified (and therefore referred to as ‘hidden’). The
states defined in quantum theory would not correspond to precise
values of these variables, but rather to certain specific averages over
them. In principle, however, other states, which do have precise
values for these variables, could be defined and with such initial
states the outcome of any experiment would be uniquely
determined. Thus determinism, as understood in classical physics,
would apply to all physics. Particles would then have, at all times,
precise positions and momenta, etc. The wavefunction would not
be the complete description of the system and there would be the
possibility of solving the problems with wavefunction reduction
which we met in Chapter Three. This latter fact is, to me at least,
a more powerful motivation than the desire for restoration of
determinism.
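A toy model may make the idea concrete. In the sketch below (our illustration, not an actual hidden-variable theory) the hidden variable is a single number λ, uniformly distributed over the ensemble; the outcome of the barrier experiment is uniquely determined by λ, yet the ensemble as a whole reproduces an assumed quantum transmission probability:

```python
import random

T_QUANTUM = 0.3   # assumed transmission probability given by the wavefunction

def outcome(lam):
    # deterministic: the hidden variable alone fixes the result
    return "transmitted" if lam < T_QUANTUM else "reflected"

# the quantum ensemble corresponds to lambda uniformly distributed in [0, 1]
random.seed(2)
n = 100000
hits = sum(outcome(random.uniform(0.0, 1.0)) == "transmitted" for _ in range(n))

assert abs(hits / n - T_QUANTUM) < 0.01   # statistics match the quantum value
assert outcome(0.1) == outcome(0.1)       # same lambda, same outcome, always
```

Two runs with the same λ always give the same result; the apparent randomness lies entirely in our ignorance of λ.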
Any satisfactory hidden-variable theory must, of course, agree
with experimental observations and therefore, in particular, with
all the verified predictions of quantum theory. Whether it should
agree exactly with quantum theory, or whether it might deviate
from it to a small degree, while still remaining consistent with
experiment, is an open question. The normal practice seems to have
been to seek hidden-variable theories for which the agreement is
exact. A hidden-variable theory will, of course, tell us more than
quantum theory tells us; for example, it tells us which particles will
pass through a given barrier. What we require is that it gives the
same, or very nearly the same, results for those quantities that
quantum theory can predict.
There have been, and still are, many physicists who would regard
the question of the possibility of such a hidden-variable theory,
agreeing in all measurable respects with quantum theory, as being
an unimportant issue. Readers who are still with us, however, are
presumably convinced that the quest for reality is meaningful, so
they will take a different view. The question is interesting and
worthy of our attention. Indeed, there are even pragmatic grounds
for pursuing it: different explanations of a set of phenomena, even
though they agree for all presently conceivable experiments, may
ultimately themselves suggest experiments by which they could be
distinguished. There is also the hope that better understanding of
quantum theory might help in suggesting solutions to some of the
other unsolved problems of fundamental physics.
The subject of hidden-variable theories was for many years
dominated by an alleged ‘proof’, given by von Neumann in 1932
(in his book Mathematische Grundlagen der Quantenmechanik
[Berlin: Springer], English translation published by Princeton
University Press, 1955), that such theories were impossible, i.e. that
no hidden-variable, deterministic, theory could agree with all the
predictions of quantum theory. The proof was simple and elegant;
its mathematics, though subject to much scrutiny, could not be
challenged. However, the mathematical theorem did not really
have any relevance to the physical point at issue. The reason for
this lay in one of the assumptions used to prove the theorem. We
shall give a brief account of this assumption in the following
paragraph. Since this account is rather technical and not used in the
subsequent discussion, some readers may prefer to omit it.
Let us suppose that two quantities, call them X and Y, can be
separately measured on a particular system, and that it is also
possible to measure the sum of the two quantities, X + Y, directly.
Then the assumption was that the average value of X + Y, over any
collection of identical systems, i.e. any ensemble, was equal to the
average value of X plus the average value of Y. Since, in general,
the variable X + Y is of a different kind, measured by a different
apparatus, from either X or Y, there is no reason why such an
equality should hold. Von Neumann was led to assume it because
it happens to be true in quantum theory, i.e. for those ensembles
specified by a given wavefunction. In a hidden-variable theory,
however, other states, defined by particular values of the hidden
variables, can, at least in principle, exist, and for such states the
assumption does not have to be true. Although several people
seemed vaguely to have realised this problem with von Neumann’s
theorem, it was not until 1964 that John Bell finally clarified the
issue, and removed this theoretical obstacle to hidden-variable
theories. The article was published in Reviews of Modern Physics
38 447 (1966).
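A standard counter-illustration (the one used by Bell) takes X and Y to be the spin components σx and σy, and its arithmetic can be reproduced in a few lines; the code below is our sketch. Each of X and Y is measured to be ±1, but the directly measured values of X + Y are the eigenvalues of σx + σy, namely ±√2, which are not sums of the individual values; so a dispersion-free hidden-variable state cannot satisfy value-by-value additivity, even though the quantum averages do add:

```python
import math

def eigvals_2x2_traceless(a, b, c):
    # Hermitian matrix [[a, b - i*c], [b + i*c, -a]] has eigenvalues +r and -r
    r = math.sqrt(a * a + b * b + c * c)
    return (r, -r)

sx = (0.0, 1.0, 0.0)                          # Pauli sigma_x
sy = (0.0, 0.0, 1.0)                          # Pauli sigma_y
sxy = tuple(u + v for u, v in zip(sx, sy))    # sigma_x + sigma_y

# individual measurements of X and Y give +1 or -1 ...
assert eigvals_2x2_traceless(*sx) == (1.0, -1.0)
assert eigvals_2x2_traceless(*sy) == (1.0, -1.0)

# ... but a direct measurement of X + Y gives +/-sqrt(2), not -2, 0 or +2
r, _ = eigvals_2x2_traceless(*sxy)
assert abs(r - math.sqrt(2.0)) < 1e-12
```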
At this stage we should emphasise that, although hidden variable
theories are possible, they are, in comparison to quantum theory,
extremely complicated and messy. We know the answers from
quantum theory and then we construct a hidden-variable, deterministic,
theory specifically to give these answers. The resulting
theory appears contrived and unnatural. It must, for example, tell
us whether a given particle will pass through a potential barrier for
all velocities and all shapes and sizes of the barrier. It must also tell
us the results for any type of experiment; not only for the
reflection/transmission barrier experiment of §1.3, but also for the
experiment with the mirrors. In the latter case, there can now be
no question of interference being the real explanation of what is
happening, because a given particle is certainly either reflected or
transmitted by the barrier and hence can only follow one path to
the detectors. Nevertheless, although it reaches only one of the
mirrors, which reflects it to the detectors, the path it follows must
be influenced by the other mirror. This is brought about by the
introduction of a new ‘quantum force’ which can act over
arbitrarily large distances. This quantum force is constructed in
order to give the required results. For details of all the various hidden-variable theories that are
available we refer to the excellent book by Belinfante, A survey of
hidden-variable theories [Oxford: Pergamon, 1973]. Here, we shall
only discuss a particular class of such theories; they appear to be
the most plausible and are the topic of our next section.

The many-worlds interpretation

In 1957 H Everett III wrote an article entitled ‘“Relative State”
Formulation of Quantum Mechanics’ (Reviews of Modern Physics
29 454) which introduced what has become known as the ‘many-worlds’
interpretation of quantum theory. He began by noting that
the orthodox theory requires wavefunctions to change in two
distinct ways; first, through the deterministic Schrödinger equation
and, secondly, through measurement, which causes the reduction
of the wavefunction to a new wavefunction which is not uniquely
determined. It is this second type of change that causes problems;
what is a ‘measurement’?, what are the non-quantum forces that
cause it?, how can it occur instantaneously over large distances?,
etc. Everett was in fact motivated in his work by yet another
problem: he was interested in applying quantum theory to the
whole universe, but how could he then have an ‘external’ observer
to measure anything?
The solution that Everett proposed to the problems of wavefunction
reduction was to say simply that it does not happen. Any
isolated system can be described by a wavefunction that changes
only as prescribed by the Schrödinger equation. If this system is
observed by an external observer then, in order to discuss what
happens, it is necessary to incorporate the observer into the system,
which then becomes a new isolated system. The new wavefunction,
which now describes the previous system plus the observer, is again
determined for all times by the Schrödinger equation.
To help us understand what this means we shall put it into
symbolic form. To this end we return to the barrier experiment, in
particular as this was discussed in §3.5. We write the wavefunction,
after interaction with the barrier, in the form:
P_R (Ψ← D_R^ON D_L^OFF) + P_T (Ψ→ D_R^OFF D_L^ON),

This is not really as complicated as it might appear. The Ws
describe the particle, with the arrows indicating the direction, and
the Ds the two detectors. The first bracket is then the wavefunction
of the reflected wave and the second that of the transmitted wave.
Each of these wavefunctions is taken to be 'normalised' so that it
corresponds to one particle. Then the P_R and P_T are the parameters
that give the magnitudes of the two wavefunctions. The squares of
these numbers give the probability for reflection and transmission
respectively. We notice that this wavefunction correctly describes
the correlations between the states of the detectors and those of the
particle, e.g. that if the right-hand detector is ON then the particle
has been reflected, etc. This correlation exists because, as noted in
$3.4, the wavefunction is not simply a product (in fact in this case
it is the sum of two products).
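The correlation structure of this sum-of-products wavefunction can be illustrated with a trivial sketch (ours; the magnitudes chosen are arbitrary assumptions subject only to P_R² + P_T² = 1):

```python
import math

# each term of the wavefunction: (amplitude, particle, right det., left det.)
P_R, P_T = math.sqrt(0.4), math.sqrt(0.6)     # assumed magnitudes
state = [
    (P_R, "reflected",   "ON",  "OFF"),
    (P_T, "transmitted", "OFF", "ON"),
]

# squared magnitudes give the probabilities, and they sum to one
probs = {particle: amp**2 for amp, particle, _, _ in state}
assert abs(sum(probs.values()) - 1.0) < 1e-12

# the correlation: in every term where the right-hand detector is ON,
# the particle has been reflected
for amp, particle, d_right, d_left in state:
    if d_right == "ON":
        assert particle == "reflected"
```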
According to the orthodox interpretation of quantum theory
such a wavefunction reduces, on being observed, to
Ψ← D_R^ON D_L^OFF with probability P_R²
or to
Ψ→ D_R^OFF D_L^ON with probability P_T².
(See figure 15.)
In the interpretation due to Everett, however, this reduction does
not occur. The true reality is always expressed by the full wavefunction
containing both terms. This is all very well, we are saying,
but did we not convince ourselves previously that the reduction had
to occur; that deterministic theories are not adequate to describe
observation? We certainly did, so we must examine the argument.
It relied on the fact that we, or more properly I, do not see both
pieces of the wavefunction. To me, either reflection or transmission
has occurred, not both. Clearly then, in order to understand what
is happening, it is necessary to introduce ME into the experiment
and to include ME in the wavefunction. Although my wavefunction
is very complicated the only relevant part for our purpose here is
whether I am aware of reflection or transmission. We denote these
two states of myself by ME^R and ME^T respectively. Thus the
complete wavefunction, according to Everett, is:
P_R(ΨDD)← ME^R + P_T(ΨDD)→ ME^T
where we have simplified the notation in an obvious way. Notice that again the wavefunction contains the correct correlations: if the
particle is transmitted then I have observed transmission, etc.
Previously we argued (e.g. in §4.1) that, since we are only aware
of one possibility, one of the terms in the above expression must
be eliminated. Everett would argue instead that there are two MEs,
both conscious but unaware of each other. Thus, through my
observation of what happens in the barrier experiment, I have split
the world into two worlds, each containing one possible outcome of
the observation.
Similar considerations apply to other types of observation. In all
cases the Everett interpretation requires that all possible outcomes
exist. Whenever a measurement is made we can think of the world
as separating into a collection of worlds, one for each possible
result of the measurement. It is through this way of thinking that
the name ‘many worlds’ has arisen. Such a name was not, however,
in the original Everett paper, and in some ways it is misleading. The
key point of this way of interpreting quantum theory is that
measurements are not different from other interactions; nothing
special, like wavefunction reduction, happens when a measurement
is made; everything is still described, in a deterministic way, by the
Schrödinger equation.
How can we reconcile this with our previous belief that
measurements were special? The previous argument was basically
as follows:
I am only aware of one outcome of a measurement, therefore there
is only one outcome.
Now we would argue differently:
I am only aware of one outcome of a measurement because the ME
that makes this statement is the ME associated with one particular
outcome. There are other MEs, which are associated with different
terms in the wavefunction, and which are aware of different outcomes.
The wavefunction given above for the barrier experiment
illustrates this: both of the terms exist; there are two MEs, but they
are not aware of each other.
It will be seen that, from the point of view of the many-worlds
interpretation, the ‘error’ we made earlier was that we inserted a
tacit assumption that our minds were able to look at the world from outside, and hence to conclude from our certainty of a particular
result that the other results had not occurred.
The ‘branching’ of the world into many worlds is therefore an
illusion of the conscious mind. The reality is a wavefunction which
always contains all possible results. A conscious mind is capable of
demanding a particular result (this is what we mean by making an
observation) and thereby it must select one branch in which it
exists. Since, however, all branches are equivalent, the conscious
mind must split into several conscious minds, one for each possible
branch.
Is this then the answer to the problem of reality in the quantum
world? At first sight it appears more satisfactory than our previous
ideas where consciousness seemed to have to affect wavefunctions;
now this is not required. Nevertheless the general view of the
theoretical physics community has been to reject the many-worlds
interpretation. This of course is not in itself a strong argument
against it, particularly when we realise that many writers have
rejected it on grounds that suggest they have failed to understand
it. Here I should admit that the above discussion was an attempt
to describe what I think is the most plausible form of the Everett
interpretation. The original paper, and others mentioned in the
bibliography, contain mainly the formalism of orthodox quantum
theory with little comment on the interpretation.
It is probably fair to say that much of the ‘unease’ that most of
us feel with the Everett interpretation comes from our belief, which
we hold without any evidence, that our future will be unique. What
I will be like at a later time may not be predetermined or calculable
(even if all the initial information were available), but at least I will
still be one 'I'. The many-worlds interpretation denies this. For an
example to illustrate this lack of uniqueness (some would say rather
to show how silly it is) we might return to the barrier experiment
and suppose that the right-hand detector is attached to a gun which
shoots, and kills, me if it records a particle. Then after one particle
has passed through the experiment, the wavefunction would contain
a piece with me alive and a piece with me dead. One ‘I’ would
certainly be alive, so we appear to have a sort of Russian roulette,
in which we cannot really lose! Indeed, since all ‘aging’ or ‘decaying’
processes are presumably quantum mechanical in nature, there
is always a small part of the wavefunction in which they will not
have occurred. Thus, to be completely fanciful, immortality is guaranteed: I will always be alive in the only part of the wavefunction
of which I am aware!
It is important to realise that the fact that another observer does
not see two 'I's is not an argument against this interpretation. As
soon as YOU, say, interact with me so that you can discover
whether I am alive or dead, you become two Yous, for one of
which I am dead and the other I am alive. In wavefunction
language, using the previous notation, we would have:
P_R(ΨDD)← ME^R YOU^R + P_T(ΨDD)→ ME^T YOU^T.
Neither of the two YOUs is aware that there are two MEs.
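This bookkeeping can be sketched in a few lines of Python (a purely illustrative toy; the `measure` and `observe` helpers are hypothetical and not anything from the original text). A fresh measurement splits every branch into one sub-branch per possible outcome, while a second observer looking at an outcome that is already recorded creates no new branches at all, only correlations within each existing branch:

```python
import math

# Toy Everett bookkeeping: the wavefunction is a list of branches,
# each a (record, amplitude) pair, and no branch is ever removed --
# there is no reduction.

def measure(branches, outcomes):
    """A fresh quantum measurement: every branch splits into one
    sub-branch per possible outcome, amplitudes multiplying."""
    return [(rec + (out,), amp * a)
            for rec, amp in branches
            for out, a in outcomes]

def observe(branches, observer):
    """An observer looks at the last recorded outcome: no new
    splitting, the observer's memory simply becomes correlated
    with what is already in each branch."""
    return [(rec + (observer + " sees " + str(rec[-1]),), amp)
            for rec, amp in branches]

r = 1 / math.sqrt(2)
psi = [((), 1.0)]                                # before the barrier
psi = measure(psi, [("refl", r), ("trans", r)])  # barrier: 2 branches
psi = observe(psi, "ME")    # I look: still 2 branches, two MEs
psi = observe(psi, "YOU")   # you ask me: still 2 branches, two YOUs

assert len(psi) == 2                             # no further splitting
assert all(abs(amp**2 - 0.5) < 1e-12 for _, amp in psi)
for rec, amp in psi:
    print(rec)
```

Within each branch the records of ME and YOU agree perfectly, which is why neither YOU ever meets evidence of the other ME: the disagreement lives only across branches, never inside one.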
Two final remarks in favour of the many-worlds interpretation
should be made here. It has long been known that, for many
reasons, the existence of 'life' in the universe seems to be an incredible
accident, i.e. if many of the parameters of physics had
been only a tiny bit different from their present values then life
would not have been possible. Even within the framework of
'design' it is hard to see how everything could have been correct.
However, it is possible that most of the parameters of physics were
fixed at some early stage of the universe by quantum processes, so
that in principle many values were possible. In a many-worlds
approach, anything that is possible happens, so we only need to be
sure that, for some part of the wavefunction, the parameters are
correct for life to form. It is irrelevant how improbable this is,
since, clearly, we live in the part of the wavefunction where life is
possible. We do not see the other parts. Thinking along these lines
is referred to as using the anthropic principle; for further discussion
we refer to articles listed in the bibliography.
The other remark concerns the origin of the observed difference
between past and future, i.e. the question of why the world exhibits
an asymmetry under a change in the direction of time when all the
known fundamental laws of physics are invariant under such a
change. One aspect of this asymmetry is psychological: we
remember the past but not the future. (Note that it is because of
this clear psychological distinction between past and future that we
sometimes find it hard to realise that there is a problem here, e.g.
it is possible to fool ourselves that we have derived asymmetric
laws, like that concerning the increase of entropy, from laws that
are symmetric.) The many-worlds interpretation gives an obvious
explanation of this psychological effect: my conscious mind has a unique past, but many different futures. Each time I make an
observation my consciousness will split into as many branches as
there are possible results of the observation. Some readers may
wish to note that this might allow vague, shadowy, probabilistic,
‘glimpses’ into the future: thus, a prophecy is likely to be fulfilled,
but only for one of the future MEs.