Monday, July 2, 2007

Bell’s theorem

This theorem, published in 1964 (Physics 1 195), expresses one of
the most remarkable results of twentieth century theoretical
physics. It exposes, in a clear quantitative manner, the real nature
of the conflict between ‘common sense’ and quantum theory which
exists in the EPR type of experiment. As we shall show, the
theorem is easy to prove (once one has seen it), but the fact that
nobody at the time of the early controversy following the publication
of the EPR paper realised that such a result could be found is
the real measure of the magnitude of John Bell’s achievement.
In order to appreciate properly the meaning of the theorem we
must first emphasise an important distinction; one which we have
indeed already met. The EPR experiment suggests that
measurements on one object (A) alter what we can predict for
subsequent measurements on another object (B), regardless of how
far apart the objects may be at the time of the measurements. There
are two completely different ways of explaining this, namely:
(i) it could be that measurements on A actually have an effect on
B, or alternatively,
(ii) it could be that measurements on A only affect our knowledge
of the state of B, i.e. they tell us something about B which was in
fact already true before the measurement.
The first of these possibilities, which Bell’s theorem shows is the
case in quantum theory, is totally contrary to the idea of locality.
The second, on the other hand, is an everyday occurrence and has
no great significance.
As a trivial example illustrating the last remark, we imagine that
a box is known to contain two billiard balls, one of which is black
and the other white. We then remove one ball, in the dark, and put
it on a rocket which flies off into space. At this stage all that we
know about the colour of this ball is that there is a 50% chance of
its being white, and a 50% chance of its being black (just as a spin
in a given direction might have a 50% chance of being either +1/2
or -1/2). We then look at the ball remaining in the box and if it
is black (white) we immediately know that the other ball is white
(black). Again this is superficially rather like our experiment with
two spin 1/2 particles. However we know that in no sense do we do
anything to the distant ball by looking in the box. It already was
either white or black. Because of our lack of knowledge, our
previous description of it was incomplete. A complete description
did however exist, and with such a complete description, the observation
of the colour of the remaining ball would clearly have no
effect.
The question now is whether such a complete description can
exist in the EPR spin experiment, i.e. is it possible that there is a
way of specifying the state of particle B such that measurements on
A have no effect on B? Bell’s theorem allows us to give a negative
answer to this question both on the basis of quantum theory, and
of experiment (see next section).
It is instructive to see exactly what is involved in the theorem, in
particular, how little, so we shall give the proof even though it
again involves a small amount of mathematical symbolism. (A
simpler form of the theorem, described in terms of the behaviour
of people rather than particles, can be found in Appendix 9.)
To begin, we suppose that the spin-measuring apparatus, at each
side, is connected to a machine that records the results of the
measurements. We arrange that these machines record +1 for a
spin measurement of +1/2 and -1 for a measurement of -1/2.
Let M and N be the values recorded for the A and B particles
respectively. In fact we shall be concerned only with the product of
M and N, which we denote by E. The appropriate experimental
arrangement is depicted in figure 23. (Note that throughout this
section we are making the natural assumption that a measurement
gives only one result. Thus we are ignoring the many-worlds
possibility. For further discussion of this point see the article of
Stapp, ‘Bell’s theorem and the foundations of quantum physics’,
American Journal of Physics 53 306 1985.)
Not surprisingly, in view of the statistical nature of quantum
theoretical predictions, the argument requires us to consider not
just one event but many, i.e. the decay of a large number of
identical spin zero particles. For each such event we can record a
value of E (always +1 or -1), and we then calculate the average
over all events. This will depend upon the orientation of the two
spin detectors, which are given by the angles a and b, so we write
it as $\langle E(a, b) \rangle$. Thus,
$$\langle E(a, b) \rangle = \text{Average value of } E = \text{Average value of } M \cdot N. \tag{5.1}$$
Clearly this number lies between +1 and -1.
Next we introduce the variable H which is supposed to give the
required complete description of the two spin 1/2 particles. It is not
important for our purpose whether H consists of a single number
or several numbers. However, for convenience we shall refer to it
as though it were just a single number. When we know the value
of H we know everything that can be known about the system.
Each event will be associated with a certain value of H and in a
number of such events there will be a certain probability for any
particular value occurring. If the hidden-variable theory is deterministic
(a restriction we shall later drop), then the values of M and
N in a given event, and for given angles a and b, are uniquely determined
by the value of H.
Now we introduce the assumption of locality which is here
expressed by the assertion that the value of M does not depend on
b and the value of N does not depend on a. In other words, the
value we measure for the spin of the particle A cannot depend on
what we choose to measure about particle B, and vice versa. It
follows that M depends only on H and a, whilst N depends only
on H and b. We express these dependences by writing the values
obtained as M(H, a) and N(H, b) respectively. The resulting value
of E is then given by
E(H, a, b) = M(H, a)N(H, b). (5.2)
For a particular value of H this is a fixed number. Different values
of H can occur when we repeat the experiment many times, and the
average value of E that is measured will equal the average of
E(H, a, b) over these values of H, i.e. the hidden-variable theory
predicts
$$\langle E(a, b) \rangle = \text{Average over } H \text{ of } E(H, a, b). \tag{5.3}$$
At this stage we do not appear to have got very far. Since we do
not know anything about the variation of M or N with H, or about
the distribution of the values of H, all that we can say about the
predicted value of $\langle E(a, b) \rangle$ is that it lies between +1 and -1. This
of course we already knew.
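To make (5.2) and (5.3) concrete, it can help to write down one toy local model and compute its average. The following is only a sketch under stated assumptions: H is taken to be an angle spread uniformly over a full circle, and the sign rules used for M and N are invented for illustration. Nothing in the argument of this section depends on this particular choice.

```python
import math

def M(H, a):
    """Result at A: depends only on the hidden variable H and the local angle a."""
    return 1 if math.cos(H - a) >= 0 else -1

def N(H, b):
    """Result at B: depends only on H and the local angle b (opposite sign rule)."""
    return -1 if math.cos(H - b) >= 0 else 1

def average_E(a, b, n=3600):
    """Average of E(H, a, b) = M(H, a) * N(H, b) over n equally spaced values of H."""
    total = 0
    for k in range(n):
        H = 2 * math.pi * (k + 0.5) / n  # midpoints of a uniform grid on [0, 2*pi)
        total += M(H, a) * N(H, b)       # equation (5.2) for this value of H
    return total / n                     # equation (5.3)

if __name__ == "__main__":
    for b in (0.0, math.pi / 4, math.pi / 2, math.pi):
        print(f"b = {b:.3f}   average E(0, b) = {average_E(0.0, b):+.3f}")
```

For this assumed model the average runs linearly from -1 at equal angles to +1 at opposite angles, and so stays within the limits already noted.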
Now comes the clever part. We consider two different orientations
for each of the spin measuring devices. Let these be denoted
by the angles a and a’ for measurements on A and by b and b’ for
measurements on B. For a fixed value of H, there are now two
values of M and two values of N, i.e. four numbers, each of which
is either +1 or -1. In table 5.1 we show all possible sets of values
for these four numbers. We also show the corresponding values for
the quantity $F(H, a, a', b, b')$, defined by
$$F(H, a, a', b, b') = E(H, a, b) + E(H, a', b') + E(H, a', b) - E(H, a, b'). \tag{5.4}$$
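The sixteen rows of such a table can be generated mechanically. The following is a minimal sketch in Python; the function name F, the argument names and the row layout are illustrative choices, not taken from the text.

```python
from itertools import product

def F(Ma, Ma2, Nb, Nb2):
    """F from (5.4), using E(H, x, y) = M(H, x) * N(H, y).

    The suffix '2' marks the value at the primed angle (a' or b').
    """
    return Ma * Nb + Ma2 * Nb2 + Ma2 * Nb - Ma * Nb2

# Every assignment of +1 or -1 to the four numbers M(a), M(a'), N(b), N(b').
rows = [(Ma, Ma2, Nb, Nb2, F(Ma, Ma2, Nb, Nb2))
        for Ma, Ma2, Nb, Nb2 in product((+1, -1), repeat=4)]

if __name__ == "__main__":
    print(" M(a)  M(a')  N(b)  N(b')   F")
    for row in rows:
        print("  {:+d}    {:+d}     {:+d}    {:+d}     {:+d}".format(*row))
```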

In all cases this quantity is +2 or -2, from which it follows that
its average value over the (unknown) distribution of H lies between
-2 and +2. Hence our local hidden-variable theory predicts that
the particular combination of results defined by
$$\langle F(a, a', b, b') \rangle = \langle E(a, b) \rangle + \langle E(a', b') \rangle + \langle E(a', b) \rangle - \langle E(a, b') \rangle \tag{5.5}$$
satisfies:
$$-2 \le \langle F(a, a', b, b') \rangle \le +2. \tag{5.6}$$
This is one form of the Bell inequality.
It is important to realise that locality rather than determinism is
the key ingredient of this proof. In order to demonstrate this, we
relax the assumption that H determines the values of M and N
uniquely, and suppose instead that each value of H determines a
probability distribution for M and N. The locality assumption is
now a little more subtle. It is that the probability distribution for
M does not depend on the value measured for N, and vice versa.
To appreciate why this is so we recall that measurement of N
cannot tell us anything further about particle A, since H is intended
to be the complete description of the state of the system; equally,
because of locality, it cannot change the state of A. Hence the
probability of obtaining any given value of M does not depend on
the value measured for N.
We can then define independent averages of M and N, for each
value of H. We denote these by $M^{\mathrm{av}}(H, a)$ and $N^{\mathrm{av}}(H, b)$. Because
of the assumption of independence, the average value of the
product of M and N, which we write as $E^{\mathrm{av}}$, is equal to the product
of the average values, i.e.
$$E^{\mathrm{av}}(H, a, b) = M^{\mathrm{av}}(H, a) N^{\mathrm{av}}(H, b). \tag{5.7}$$
It is now possible to prepare a table similar to that above with
$M^{\mathrm{av}}$ replacing M, etc. Instead of taking values of +1 or -1, these
quantities lie somewhere between these limits. It is then quite easy
to show that the particular combination defining F, which we now
denote by $F^{\mathrm{av}}$, always takes a value that lies between -2 and +2.
When we then average $F^{\mathrm{av}}$ over all values of H we again obtain the
Bell inequality.
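The step described as quite easy to show can be spelled out as follows. This is one possible route, not necessarily the one the author had in mind. It uses only the fact that each average lies between -1 and +1. The shorthand m, m', n, n' for the averages of M at angles a, a' and of N at angles b, b' (all at a fixed value of H) is introduced here for brevity and does not appear in the text.

```latex
\begin{align*}
F^{\mathrm{av}} &= m\,n + m'\,n' + m'\,n - m\,n' \\
                &= m\,(n - n') + m'\,(n + n'), \\
|F^{\mathrm{av}}| &\le |m|\,|n - n'| + |m'|\,|n + n'| \\
                &\le |n - n'| + |n + n'| \\
                &= 2\max(|n|, |n'|) \;\le\; 2 .
\end{align*}
```

The second inequality uses |m| and |m'| not exceeding 1, and the last uses the same bound on n and n'.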
For a more complete discussion of the circumstances in which the
inequality, or various alternative versions of it, can be proved we
refer to the review article of Clauser and Shimony listed in the
bibliography (§6.5).
The significance of the Bell inequality lies in the fact that,
unlike the inequality we found for E, it does not have to be true
if the locality assumption is dropped. Indeed it turns out that the
inequality is violated by the predictions of quantum theory.
Before we discuss these predictions it is interesting to see why
quantum theory fails to satisfy the assumptions of the theorem. In
quantum theory the full specification of the state is the wavefunction,
so this plays the role of the quantity H. We can then define,
as above, the averages of M and N over many measurements.
However, these averages are not independent; the distribution of
values of M depends on what has been measured for N. As a simple
illustration of this we note that, with our wavefunction corresponding
to total spin zero, the average value of M or N measured
independently is zero (regardless of the angles a, b). However, in
the special case of a = b, we know that M is always opposite
to N, so the product is always -1. Thus the average value of MN
is -1, which is not the product of the separate averages of M and
N.
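This failure of independence can be checked numerically. The sketch below assumes the standard joint probabilities for the total spin zero state, (1 - mn cos(a - b))/4 for results m and n, a formula which is not stated in this section but which reproduces the quantum prediction quoted in (5.8). The function names are illustrative.

```python
import math

def joint_probability(m, n, a, b):
    """P(M = m, N = n) for the spin-zero state, with m and n each +1 or -1.

    Assumption: the standard singlet-state result (1 - m*n*cos(a - b)) / 4.
    """
    return (1 - m * n * math.cos(a - b)) / 4

def averages(a, b):
    """Return the averages of M, of N, and of the product MN for angles a, b."""
    outcomes = [(m, n) for m in (+1, -1) for n in (+1, -1)]
    avg_M = sum(m * joint_probability(m, n, a, b) for m, n in outcomes)
    avg_N = sum(n * joint_probability(m, n, a, b) for m, n in outcomes)
    avg_MN = sum(m * n * joint_probability(m, n, a, b) for m, n in outcomes)
    return avg_M, avg_N, avg_MN

if __name__ == "__main__":
    avg_M, avg_N, avg_MN = averages(0.3, 0.3)  # equal angles, a = b
    print("average M :", avg_M)
    print("average N :", avg_N)
    print("average MN:", avg_MN, " product of averages:", avg_M * avg_N)
```

With equal angles the separate averages vanish while the average of the product is -1, so the product rule used in (5.7) cannot hold here.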
In general, the quantum theoretical prediction for $\langle E(a, b) \rangle$
depends on the difference between the angles a and b. As we show
in Appendix 8 it is given by
$$\langle E(a, b) \rangle = -\cos(a - b). \tag{5.8}$$
This function is drawn in figure 24. As expected it lies between -1
and +1. However, and this is the reason why it leads to a conflict
with the Bell inequality, it cannot be factorised into a product of
a function of a and a function of b.
The resulting prediction for $\langle F(a, a', b, b') \rangle$ is now easily
found. A particularly simple case is when the angles are chosen as in
figure 25. Here the violation of the inequality is maximised, each
term in F contributing the same amount, $\sqrt{2}/2$, to the sum. Hence,
$$F = 2\sqrt{2} \approx 2.83$$
which is in clear violation of the Bell inequality.
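The arithmetic is easy to reproduce from (5.8) and (5.5). In the sketch below the four angles are an assumption: they are one choice for which each term contributes the same amount, and they are not read off figure 25. The function names are likewise illustrative.

```python
import math

def E_quantum(a, b):
    """Quantum prediction (5.8) for the average product of the two results."""
    return -math.cos(a - b)

def F_quantum(a, a2, b, b2):
    """Combination (5.5); a2 and b2 stand for the primed angles a' and b'."""
    return (E_quantum(a, b) + E_quantum(a2, b2)
            + E_quantum(a2, b) - E_quantum(a, b2))

# Assumed angles in radians (hypothetical choice, not taken from figure 25).
a, a2 = 0.0, -math.pi / 2
b, b2 = 3 * math.pi / 4, math.pi / 4

if __name__ == "__main__":
    print("E(a, b)   =", E_quantum(a, b))
    print("E(a', b') =", E_quantum(a2, b2))
    print("E(a', b)  =", E_quantum(a2, b))
    print("E(a, b')  =", E_quantum(a, b2))
    print("F         =", F_quantum(a, a2, b, b2))
```

The first three averages come out at about +0.707 and the last at about -0.707, so that all four terms add with the same sign.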

Thus the Bell inequality shows that any theory which is local
must contradict some of the predictions of quantum theory. The
world can either be in agreement with quantum theory or it can
permit the existence of a local theory; both possibilities are not
allowed. The choice lies with experiment; the experiments have
been done and, as we explain in the next chapter, the answer is
clear.
