The Problem of Causality (1938)
Causality resembles the other main issues of logical investigation in that it presents the mind with puzzles. Hume's question, “Why a cause is always necessary”, and the question why the same cause should always have the same effect, are examples of difficulties which have recurred throughout the history of thought. This is not to say that such difficulties cannot be got over; it merely indicates at once the importance of an exact logic and the tendency of the human mind to depart from or fail to reach exactness. It will be argued here that by the use of certain logical considerations (and particularly by emphasis on the notion of a field) the outstanding difficulties can be removed and a straightforward theory of causality developed.
We may take our departure from the question, often asked, why it is not just as natural and defensible to think that the same phenomenon has different causes, or the same agent different effects, on the various occasions of its occurrence, as to suppose an invariable order of events. The preliminary answer (allowing for distinctions to be developed later) is that, on the assumption of variability, we could not say that there was any causal connection at all. We could, of course, point out some succession of phenomena in any given case; but we could not say that the later phenomenon in question was any more “the successor” of the earlier one than anything whatever that occurred at the later time — and similarly with the notion of a “predecessor”. It may be that, when X occurs, we rightly anticipate Y, but, since this anticipation may also be rightly made in the absence of X, we have no right to say that it was X, and not some other factor W, that was the occasion of Y's appearance in the former case. In fact, we have no right to say that anything is “the occasion of” anything else.
According to Mill, in the consideration of Plurality of Causes as a
case actually occurring in nature, “there is required no peculiar method.
When an effect is really producible by two or more causes, the process for
detecting them is in no way different from that by which we discover single
causes” (System of Logic, Bk. III, Ch. X, § 3). But
the process of “discovering single causes”, as he has expounded it, is one
of excluding the irrelevant, by the consideration that that in the absence
of which the phenomenon occurs, and that in the presence of which the
phenomenon does not occur, is not its cause. The admission of “plurality”
involves the abandonment of this position, and it would appear that any one
of the “two or more causes” could just as easily be broken up —
so that we are left with no method of discovery or even of elimination.
The point is that, in distinguishing the relevant from the irrelevant, that is to say, the necessary from the unnecessary, we are concerned with general conditions (necessity being equivalent to universality), and, if we do not find a general condition of a given occurrence, we are not answering the question that has been raised. If it were not a general question, a question of “sorts of things” and not of “mere particulars”, we should have no right to speak of the irrelevant or, as already suggested, of a “connection”. But it always is a general question. When we ask, for example, what causes this fire, it is not its being this but its being fire that we are seeking to account for. There might, indeed, be a special question of what causes fire here rather than anywhere else — it will be seen, as the discussion develops, that there is a particular sense in which “plurality” must be admitted — but, even so, it is fire, a certain sort of thing, that is the effect in question, and, if any distinction is to be made among conditions of its production, it will be a distinction between different kinds of conditions. It is natural, then, that, to the question what causes a certain sort of thing, the answer should be “a certain sort of thing”; it appears that what we are all the time seeking to establish is a general connection, that is to say, a universal proposition, to assert which is to assert that something happens invariably.
It is a curious but commonly unremarked feature of Mill's logic that,
while, in what he calls Induction, he professes to have a method of arguing
from particulars to generals, he is all the time working with generals. His
exposition of his methods depends upon the assumption that it is possible to
enumerate the features of a situation — and, as these features are, of
course, general, the situation composed by a finite number of them would
also be general. This is a difficulty of a kind which is bound to arise on
any theory of induction, and is only one of the difficulties of Mill's
theory in particular. But it brings out the fact that, since we can never
say that we have completely analysed a situation (every factor in it being
complex, every feature having itself features), Mill's methods can at most
indicate how we can verify a hypothesis previously entertained and not how
we can establish a conclusion. It is illogical to speak, as Mill does, of a
number of situations having only one feature in
common or of our varying only one factor when we are experimenting. If, for
example, a chemist introduces hydrogen into some mixture, it is not “pure”
hydrogen, hydrogen as such, that he introduces, but a particular sample of
hydrogen, differing in some respects from other samples. And while, in
actual fact, these differences may be irrelevant to the result obtained,
this is not proved by the introduction, however
careful the experimenter may have been; there is only a verification of a
postulated general connection between a certain kind of antecedent (the
entry of hydrogen) and a certain kind of consequent (whatever it may be).
But, it may be remarked, on the theory of variability referred to at the
outset, there would not even be verification; the position would be one of
sheer guesswork — a position
not relieved by the appeal to “probabilities” (the fashionable substitute for the “occult powers” of the primitive superstition or the good and bad luck of ordinary speech) in place of connections of kind.
The inconsistency in which we have seen Mill to be involved in connection with “plurality”, turns on whether a cause is a necessary as well as a sufficient condition of an event's taking place. In expounding his methods Mill implies that it is; thus, in his exposition of the Method of Agreement (l.c., ch. VIII, § 1), he says that “b and c are not effects of A, for they were not produced by it in the second experiment”, i.e., it is not sufficient for them; while the “phenomenon a cannot have been the effect of B or C, since it was produced where they were not”, i.e., they were not necessary for it. The theory of Plurality of Causes implies that a cause is only a sufficient condition of “its effect”, for, if it were necessary as well, this would imply that, in what we call bringing an effect about in different ways, we had, in introducing a second sufficient factor, also introduced the first sufficient factor, even though we were unaware of having done so — so that the event really comes about in the same way in every case.
Now it may be argued that we are frequently aware of different sufficient conditions of a certain type of event, without taking any or all of them to be necessary. To quote an example given by Mill in the chapter first cited: “One set of observations or experiments shows that the sun is a cause of heat, another that friction is a source of it, another that percussion, another that electricity, another that chemical action is such a source.” Leaving aside for the present the possibility that different questions are at issue in the different cases, we may note that on no theory will it be denied that there are various sufficient conditions of an event (indeed, on the theory of the infinite complexity of things, there will be various necessary and sufficient conditions of anything, these all being necessary and sufficient for one another). But this is not to say that sufficiency is all that is required for causality, or that there is not a question of finding a necessary feature which is common to all the sufficient conditions. That question arises at once from the consideration of relevance. If it were merely the case that, when A is given, X follows, and that it follows likewise in the presence of B and in that of C, then A, B and C might have nothing to do with its occurrence.
The point here is not the possibility of error, the absence of
conclusive proof. Generals require generals for their premises, necessity
and sufficiency can be inferred only from assertions of necessity and
sufficiency, and a causal connection can be proved only from causal
connections already known. Where we do not have such knowledge but have only
formed a hypothesis (e.g., that X always follows A), then we may get
verification of it, but verification is not proof and is quite consistent
with the falsity of the verified proposition. But, granting that
verification may be all we are looking for, the question is what sort of hypothesis we have formed in suggesting a
causal connection; and it is clear that we are at least distinguishing
conditions under which
something occurs from conditions under which it does not. If X occurred in any case, then it would be idle to say that it was conditioned in turn by A, B and C, merely because it ensued upon each of these. But if it did so ensue, and if A, B and C covered all cases, i.e., if, in A's absence, B or C was bound to be present, then X would occur in any case. It appears, then, that, even in enumerating sufficient conditions, we take them to be restricted in scope; we assume that a certain set of them (say, for the sake of simplicity, three as above) cover all cases of X's occurrence. That means that the expression “A or B or C” gives a necessary and sufficient condition of the occurrence of X. It may be, of course, that we do not look for such a set, that we are content with the knowledge that certain things are sufficient; it may be again that we should regard “A or B or C” as a cumbrous and unsatisfactory solution of our problem, and that, as already suggested, we should look for some common feature of the three that would meet the case. But the important point is that, to whatever extent we may actually prosecute our inquiries, in the mere assertion of sufficiency the problem of finding a necessary and sufficient condition is already posed, and, if we deny that there need be such a condition, we make the very use of the term “condition” pointless.
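The point about “A or B or C” may be illustrated by a small sketch. It assumes, purely for illustration, that each observed case is recorded as the set of factors present together with a note of whether X occurred; the names `Case`, `sufficient`, `necessary`, `has` and `any_of`, and the sample data, are hypothetical and belong neither to Mill nor to the argument above. Being run over a finite record, the checks can only verify the suppositions, not prove them.

```python
from typing import Callable, List, NamedTuple

class Case(NamedTuple):
    factors: frozenset  # labels of the factors present (hypothetical)
    x_occurs: bool      # whether the phenomenon X occurs in this case

Condition = Callable[[Case], bool]

def sufficient(cond: Condition, cases: List[Case]) -> bool:
    """Every recorded case satisfying the condition is a case of X."""
    return all(c.x_occurs for c in cases if cond(c))

def necessary(cond: Condition, cases: List[Case]) -> bool:
    """Every recorded case of X satisfies the condition."""
    return all(cond(c) for c in cases if c.x_occurs)

def has(factor: str) -> Condition:
    """Condition: the named factor is present."""
    return lambda c: factor in c.factors

def any_of(*factors: str) -> Condition:
    """Condition: at least one of the named factors is present."""
    return lambda c: any(f in c.factors for f in factors)

# Invented record of cases; W is a factor irrelevant to X.
cases = [
    Case(frozenset({"A"}), True),
    Case(frozenset({"B"}), True),
    Case(frozenset({"C", "W"}), True),
    Case(frozenset({"W"}), False),
    Case(frozenset(), False),
]

each_sufficient = all(sufficient(has(f), cases) for f in "ABC")
none_necessary = not any(necessary(has(f), cases) for f in "ABC")
disjunction = any_of("A", "B", "C")
disjunction_both = sufficient(disjunction, cases) and necessary(disjunction, cases)
```

On this record each of A, B and C is sufficient taken alone, none of them is necessary, and only the disjunction is both; it is in that sense that the enumeration of sufficient conditions already poses the problem of a necessary and sufficient one.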
If, then, it be contended, in accordance with Mill's first position,
that by a cause we mean a necessary and sufficient condition (more
particularly, a necessary and sufficient precedent
condition), it will appear that a hypothesis of causality really involves
two universal propositions, and, it may be said,
requires a double verification. This is what is implied in Mill's account of
the simplest use of the Direct Method of Difference, where we first observe
the absence of the factor in question as well as of the phenomenon it is
supposed to cause, this verifying the supposition that the factor is
necessary, and then observe the phenomenon ensuing upon the introduction of
the factor, this verifying the supposition that it is sufficient. It should
be observed, however, that, strictly speaking, the same observations verify
both suppositions; the observation of A and B in conjunction verifies the
supposition that A is sufficient for B and also the supposition that B is
sufficient for A or, what comes to the same thing, A is necessary for B (the
time-factor being neglected here for the sake of brevity). But the point
about the double verification is that it informs us of the existence of
“negative instances”. The difference between the suggestion that the
presence of B entails the presence of A and the suggestion that the absence
of A entails the absence of B is, it may be said, that the latter lays it
down that A is sometimes absent; and, accordingly, a
verification of its absence is of importance. But, again, in strict logic,
this is not correct. Unless A is just any condition whatever — in which case
to refer to it as a “factor” is quite off the mark — it will be a
differentiating condition; it will be sometimes present and sometimes
absent. In logical form, if A is something specific (and, if it is not, no
assertion is being made), the assertion “All B are A” is equivalent to the
assertion “All non-A are non-B”, each being the contrapositive of the other.
What is brought out
by the consideration that there may not be “negative instances” is that we are concerned not with relations between A and B in general but with their relations within certain limits or in a certain “field”; and it is the consideration of the “field” that enables us to make the theory of causality precise and to clear up the difficulties in which Mill and others are involved.
For, while a verification of a proposition is necessarily a verification of any equivalent proposition and the two contrapositives formulated above are equivalent (or imply one another), the case is different when it is a question of a “virtual contrapositive” or “contraposition” within a field. An instance in which A and B are jointly present in the field X verifies the supposition that the presence of A entails the presence of B in the field and also the supposition that the presence of B entails the presence of A. But before we translate the latter assertion into the assertion that the absence of A entails the absence of B from the field, we have to be assured that A ever is absent from the field, and this assurance is given by the “negative instances”. If, however, we knew in advance that A is sometimes absent from the field, we could make the transition in question without having to examine these instances. That is, granted that some X are not A, we can pass from “All X which are B are A” to “All X which are non-A are non-B”, and we can go through the reverse process granted that some X are B; so that, in cases where we know that examples of two terms and their opposites are all to be found within a field, we can speak of propositions like the above pair as “virtual contrapositives”, remembering that they are not strictly contrapositives and are not strictly equivalent. We can now see what is meant by speaking of a necessary and sufficient condition of a certain type of occurrence within a certain field; and, assuming it to be a precedent condition, then it is what we call a “cause”. In other words, if there is any force in the line of argument so far pursued, a cause is always a cause within a field.
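The transition just described can be sketched under an assumption which is ours and is not stated in the text: that a universal proposition “All S are P” is read in the traditional manner, as asserting also that there are S. (On a reading without this existential import the two propositions would be strictly equivalent, and no question of a merely “virtual” contrapositive would arise.) The function names and the toy fields below are hypothetical.

```python
from typing import Callable, Dict, List

Member = Dict[str, bool]
Pred = Callable[[Member], bool]

def all_are(subject: Pred, predicate: Pred, field: List[Member]) -> bool:
    """Traditional 'All S are P' within a field: there are S, and each S is P."""
    members = [m for m in field if subject(m)]
    return bool(members) and all(predicate(m) for m in members)

def is_a(m: Member) -> bool: return m["A"]
def is_b(m: Member) -> bool: return m["B"]
def non_a(m: Member) -> bool: return not m["A"]
def non_b(m: Member) -> bool: return not m["B"]

def all_b_are_a(field: List[Member]) -> bool:
    """'All X which are B are A'."""
    return all_are(is_b, is_a, field)

def all_non_a_are_non_b(field: List[Member]) -> bool:
    """'All X which are non-A are non-B'."""
    return all_are(non_a, non_b, field)

# A field in which A is never absent: there are no negative instances.
field_without_negatives = [
    {"A": True, "B": True},
    {"A": True, "B": False},
]

# The same field with one member added that is non-A (and non-B).
field_with_negatives = field_without_negatives + [{"A": False, "B": False}]
```

In the first field “All X which are B are A” holds while “All X which are non-A are non-B” fails for want of any non-A; in the second, where some X are not A, both hold. This is the sense in which the negative instance is needed before the passage can be made.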
On this theory the difficulties of “plurality” disappear. A may be
necessary and sufficient for the occurrence of B within the field X, and yet
not be necessary or sufficient for its occurrence within the field Y. And
the fact that A cannot, as we say, make a Y become B, is nothing against its
having that effect on an X and suggests no variability in the causation of B
in the field X. Thus, what makes me angry may leave you quite indifferent,
but this does not mean that there are not perfectly definite conditions of
the occurrence of anger in me. Further, it does not mean that there are not
definite conditions of the occurrence of anger in men; for what is necessary and sufficient for its occurrence in
this wider field must be necessary and sufficient for its occurrence in me,
and in you, as part of the field, but what is necessary and sufficient for
its occurrence in me may not be necessary and sufficient for its occurrence
in other men. We could inquire, again, into the conditions of its occurrence
in the still wider field of animals. That is to say,
we can have many different problems, but no one of them is definite, and
confusion can result, if we have not begun by specifying (a) the field, (b) the phenomenon which may or may not occur within the field (e.g., anger in me) and of whose occurrence we are seeking to determine the conditions.
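The dependence of the answer on the field specified can be shown in a minimal sketch. The occasions recorded, the condition chosen (“being contradicted”) and the function `necessary_and_sufficient` are all invented for illustration, and the time-factor is neglected, so that what is checked is only joint presence and joint absence.

```python
from typing import Callable, Dict, List

Occasion = Dict[str, object]
Pred = Callable[[Occasion], bool]

def necessary_and_sufficient(d: Pred, p: Pred, field: List[Occasion]) -> bool:
    """Within the field, D is present exactly when P is present."""
    return all(d(o) == p(o) for o in field)

# Invented occasions: who was concerned, whether they were contradicted,
# and whether they became angry.
occasions = [
    {"who": "me",  "contradicted": True,  "angry": True},
    {"who": "me",  "contradicted": False, "angry": False},
    {"who": "you", "contradicted": True,  "angry": False},
    {"who": "you", "contradicted": False, "angry": False},
]

def contradicted(o: Occasion) -> bool: return bool(o["contradicted"])
def angry(o: Occasion) -> bool: return bool(o["angry"])

# (a) the field is specified; (b) the phenomenon is anger within that field.
field_me = [o for o in occasions if o["who"] == "me"]
field_you = [o for o in occasions if o["who"] == "you"]

holds_in_me = necessary_and_sufficient(contradicted, angry, field_me)
holds_in_you = necessary_and_sufficient(contradicted, angry, field_you)
holds_in_both = necessary_and_sufficient(contradicted, angry, occasions)
```

On these invented data the condition is necessary and sufficient for anger in the field “me”, but not in the field “you” nor in the wider field comprising both; the question “what causes anger?” has no definite answer until the field is named.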
The inquiry into causes, in fact, is only a special case (involving, as mentioned above, a time-factor) of the solution of problems in general. In trying to determine when a phenomenon is present, and when it is absent, in a given field, we are endeavouring to divide a genus (the field) into two species, one of which has a certain property, while the other has the opposite. We are asking what distinguishes the cases in which a G is P from the cases in which a G is not P; that is, in terms of the doctrine of predicables, we are looking for a difference (or differentia) which will solve the problem posed by the variable property in the genus (e.g., by the appearance and non-appearance of anger among men). And we have solved it, or at least proposed a solution, when we say that (a) all G which are D are P, and (b) all G which are non-D are non-P — that is, that D is a necessary and sufficient condition of the occurrence of P in the field G. In other words, we have a problem when we know that a G may or may not be P, and we have a solution when we can use D as a criterion determining absolutely whether it is or not.
Now it is important to observe that “necessary and sufficient” is a symmetrical relation (if A is necessary and sufficient for B, B is necessary and sufficient for A), this arising from the fact that “necessary” and “sufficient” are converse relations (if A is necessary for B, B is sufficient for A, and vice versa); and the same applies to a necessary and sufficient condition within a field. For the two propositions, all G which are D are P and all G which are non-D are non-P, assure us that D, P and their opposites all occur in G, so that we are entitled to pass to the “virtual contrapositives”, all G which are non-P are non-D and all G which are P are D, where D appears as the property and P as the difference. This reversibility, of course, raises no logical difficulty, and the fact that either may be taken as a criterion of the other is met in practice by our selecting the more readily observable or controllable, granted that we already know the solution, and, prior to that, by our starting from a specific problem wherein something is taken as the property, and the difference is what we are looking for. It is, however, a point to be remembered in view of rationalist attempts to represent some properties or conditions as “more fundamental” than others.
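The reversibility claimed here can be checked by brute force over small fields. The sketch again assumes the traditional reading on which “All S are P” asserts that there are S; the names `all_are`, `negate` and `criterion`, and the representation of a member of G as a pair (D, P) of truth-values, are hypothetical conveniences.

```python
from itertools import product

def all_are(subject, predicate, field):
    """Traditional 'All S are P' within a field: there are S, and each S is P."""
    members = [m for m in field if subject(m)]
    return bool(members) and all(predicate(m) for m in members)

def negate(pred):
    """The opposite of a given predicate."""
    return lambda m: not pred(m)

def criterion(d, p, field):
    """All G which are D are P, and all G which are non-D are non-P."""
    return all_are(d, p, field) and all_are(negate(d), negate(p), field)

def is_d(m): return m[0]
def is_p(m): return m[1]

# The four possible kinds of member: (D, P) as a pair of truth-values.
kinds = list(product([False, True], repeat=2))

# Every field of at most four members, each member being of one of the kinds.
symmetric_everywhere = all(
    criterion(is_d, is_p, field) == criterion(is_p, is_d, field)
    for size in range(5)
    for field in product(kinds, repeat=size)
)
```

Within the fields examined, D serves as a criterion of P exactly when P serves as a criterion of D, which is what the passage to the “virtual contrapositives” requires; the check covers only small finite fields and is offered as a verification, not a proof.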
In the theory of causality this rationalism takes the form of
representing the cause as superior in reality or logical standing to the
effect. But, when a cause is taken as a necessary and sufficient precedent
condition of the occurrence of a phenomenon (its “effect”) in a certain
field, then it follows that the effect is a necessary and sufficient subsequent condition of the occurrence (or
operation) of the cause in the field. So that, granting the temporal
priority of the cause, there is no question of any logical priority; and
while, if our causal beliefs are true, we can with certainty, given the
cause, infer that the effect will occur, we can
with equal force infer, given the effect, that the cause has occurred. In this way, the cause is no more a “reason” for the effect than the effect is for the cause. But, before we can be satisfied with this solution, we have to consider a difficulty in the very conception of a “necessary and sufficient precedent condition”, a difficulty which may appear to force us to recognise a difference in status between causes and effects.
This is that, if condition A exists for a time, however short, during which condition B does not exist (and, otherwise, A does not precede B — and, if both came into existence at the same time, there would be no reason for calling one cause and the other effect), then A is not sufficient for B; a lapse of time, at least, is also required. And this will appear all the more strikingly when it is observed that causing, conceived as above, must be a transitive relation (one such that if A has it to B and B to C, A has it to C), since both “necessity and sufficiency” and precedence are transitive relations; so that, if we find a number of terms in a causal series, the first can be said to cause the last, in exactly the same sense as the last but one does. The difficulty might be met by saying that the lapse of time should be included in the statement of the condition, that having been A for some time, or having been subject to A some time ago, should be taken as the occasion of an X's being B. Or, again, it may be said that, in using the very phrase “precedent condition” in this connection, we are signifying a condition of an X's going to be B, and that A may be sufficient for that, even if it is not at all times indicative of B's presence in the field X.
This brings us to an essential point of distinction between the
consideration of causal relations and that of relations of properties; in
the latter case, we are concerned with establishing conditions under which
an X is B or under which it is not B, whereas in the former case our inquiry
is into conditions under which an X becomes B. It is
this that marks the distinction between the direct and the indirect method
of difference, in Mill's theory — in the direct method (or, at least, in
what Mill admits to be the principal type of its application) a factor is
introduced and a certain change ensues; in the indirect method (also called
the joint method of agreement and difference) we have only observation of
the joint presence and joint absence of two properties, but no indication of
the temporal priority of either to the other — if there were, that would
involve the entry of a factor and the use of the
direct method. It appears, then, that the indirect method, which, like the
rest of Mill's methods, gives only verification and not proof, can at most
verify a hypothesis of difference, i.e., of a
criterion for distinguishing, within a genus, that species which has a
certain property from that which has not. In the case of the direct method,
on the other hand, the position is that a member of the genus (or part of
the field) acquires a character which it previously
had not; and it is this acquisition, or, more exactly, the thing's now
having the character, that we speak of as an effect. For example, when
something makes me blush, it is “my blushing now” that is said to be the
effect of its operation, and not my transition from
non-blushing to blushing, though it has to be understood that I was not blushing before.
Thus the effect, in ordinary usage, corresponds to the “formal cause” in the Aristotelian classification — at any rate, to the acquisition by the “material cause” (which, here, is the field or the relevant part of the field) of the form or character in question. And, “the matter having the form” being actually the effect, we are left with the “efficient cause” as the cause proper — which is still in accordance with ordinary usage. From this point of view, the field is what is “acted upon”, and the contention that a cause is a cause in a field amounts to the assertion that any “causal law” embodies the statement both of what acts and of what is acted upon, so that the fact that something acts differently on different things implies no “exception” to law or variation in it — nor does the fact that it may act differently on the same thing at different times, for this merely indicates that the thing acted on has changed in some respects, i.e., has ceased to be a member of a certain genus and become a member of another. Nevertheless, we see that the above way of speaking puts cause and effect in different positions, since the effect characterises some member of the genus, whereas the cause, if not necessarily outside the genus (since members of the same genus do interact), at least may be so and is certainly outside the member affected, though the two enter into the same situation.
Such a difference of relation to the field does not, of course, imply any difference in logical standing, any division of reality into agents and patients; whatever we call the cause and whatever we call the effect are alike situations, and any situation can have “efficacy” in that it can be the sufficient (as well as necessary) condition of another situation. Thus, when something makes me angry, my anger (the effect) may cause amusement in someone else. The difficulty which arises is not whether the same thing can be a cause and an effect, but whether it can be a cause and an effect within the same field. It has, in any case, to be emphasised, in the consideration of causality as a transitive relation, that the same field should be in question throughout — otherwise, the argument to prove that a cause of a cause of a thing is a cause of the thing itself has an ambiguous middle. But, if the effect is regarded as characterising a member of the genus (or field) and the cause is not, it would appear that this ambiguity is always present, that there is no such thing as a “causal chain” (or transitive causality). It is quite certain that this expression is often used very loosely of cases where neither necessity nor sufficiency carries over from one link to the next.
At the same time, we commonly recognise stages
in the development of certain kinds of things, i.e., we find them to have a
regular succession of properties. But we do not say, in such cases, that the
earlier stages cause the later. And this is so not merely when, as in the
case of the “chains” which are not really linked, it is possible to have the
succession interfered with, e.g., in the “ages of man”, where the prior
stage of youth is necessary but not sufficient for the attainment of age. We
do not use the term “cause” even when we know that the later development is unavoidable; we know that whatever is alive is going to die, but we do not say that being alive is the cause of death. When, in fact, we proceed to give a causal account of the development of anything, it is by considering the interactions of the minor systems which it comprises — in addition, of course, to its interactions with other systems. If, therefore, we say that an effect is a property (or a thing's having a property) while a cause is not but is an outside thing (a thing situated in such-and-such a way towards the first thing), we are not raising any obstacles to investigation. On the contrary, we have the advantage, in regarding causation as external action, of rejecting any rationalist doctrine of development from internal resources or by “unfolding of potentialities”; and, in discarding “causal chains”, we are recognising that there is no unilinear form of development but interaction at all points.
Further working out of the theory here outlined would undoubtedly lead on to fresh problems; the attempt to give a thorough treatment of any of the main questions of logic involves us sooner or later in the others. But it has at least been indicated how the theory of the “field”, without departing violently from common conceptions, enables us to combine scientific rigour with recognition of the actual plurality of things. And it provides a solution of all those minor puzzles in which Mill, in his discussion of the subject, becomes entangled. In regard to the “invariable sequence” of day and night, for example, Mill argues soundly enough that neither is an unconditional antecedent of the other; but, lacking the conception of the field, he is unable to clear up the question altogether. To make it more precise, we have to observe that the distinction, as regards any selected portion of the earth's surface, is between its being illuminated (by the sun) and its not being illuminated. Calling the region X and “being illuminated” B, we see that the assertion that night is the cause of day amounts to saying that X's not being B is the cause of its being B; in other words, the assertion is that a certain change's not having taken place is the cause of its taking place. Obviously, to become B, X must have been non-B; but this is not an account of the conditions of the change. To speak of sequence at all is to imply the occurrence of some change, and, if the passage from non-B to B were called an “invariable sequence”, every sequence would be invariable. Once we have made our problem precise, however, there is no difficulty about the answer. 
Taking “regions of the earth's surface” as the field, we find that the cause of the acquisition of the illuminated character (and, similarly, of the unilluminated character) is the rotation of the earth in relation to the sun's rays; or, if we include the rotation in the specification of the field, we have simply the sun's rays as the cause. Any number of problems can be raised in regard to any natural phenomenon, but, once we have specified field and property, we know, at any rate, the form that an unambiguous answer will take, and we shall not be misled into taking the problem, or part of it, as its own solution.
It is, again, through failure to specify the field that Mill falls into
confusion on the distinction between cause and conditions, and wishes to treat “the whole of the antecedents” as “the real cause”. Thus, when people say that the eating of a certain dish was the cause of a person's death, Mill thinks they are leaving out of account such conditions as “a particular bodily constitution, a particular state of health, and perhaps even a certain state of the atmosphere” (l.c., ch. V, § 3); whereas it is obvious that some of these conditions should be taken not as part of the cause operating but as part of the field operated upon, since no one supposes that the eating of that dish is the cause of death in general. It is likewise failure to specify the problem, or confusion between different problems, that leads Mill into difficulties (l.c., ch. VI) regarding “exceptions” to the principle of Composition of Causes, according to which “the joint effect of causes is the sum of their separate effects”. The real question is whether what has been taken to be the effect of a factor A occurs or not when a factor C is also operating — if it does not, then A is not sufficient for its supposed effect; if it does, the fact that C is also operating is beside the point. In the special case in which the causal hypotheses are stated in quantitative terms, the test of them is still whether the quantity specified occurs or not in the given instances. And, finally, the theory of “intermixture of effects” is open to similar objections to those which have already been urged against “plurality of causes”.
Mill's main error, however, lies in the assumption, which he holds in common with other rationalists, that a situation or “phenomenon” can be analysed into a number of simple factors — that science, indeed, consists in the reduction of facts to their simple laws of connection. The recognition of the infinite complexity of things, on the other hand, leads us to see that there will be many different laws “governing” the same process, that everything goes on in various, though interrelated, ways. And just as, on this view, there will be many “differences”, each solving the problem of a certain variation within a genus, so, even allowing for the distinction that has been drawn between the two cases, there will be many causes of the acquisition of a character by a certain sort of thing, since any situation which is said to have this effect will be a complex of interrelated ways of working. Since, in fact, to have a character is itself to have a complex way of working, there will be no line of demarcation between the inquiry into differences and the inquiry into causes (and no distinction between classificatory and historical or developmental science), but the former will involve recognition of causal action within a thing (of the thing as a system), this being never unconnected with causal action without.
It will seem curious to some minds that one should say that there can
be many necessary and sufficient conditions of any
situation, since to say that something is sufficient is understood to mean
that nothing else is necessary. But there is no real difficulty here. The
recognition of equality of sides is sufficient to distinguish equilateral
from other triangles, and yet the recognition of equality of angles is an
equally sound basis of discrimination. That is to say, the recognition of equality of sides is not required for
our distinguishing the species of triangles, but the
species must have that difference even when we use another criterion. Difficulty arises only on the assumption of simple characters or factors, as when Mill speaks of “the only” difference between two sets of circumstances. But, as we have observed, any specific difference is itself complex, and the fact that there are many ways of specifying it (since it has many ways of working) does not involve it in ambiguity or lead to the denial of “invariability” in the form of universal truths. It is, of course, possible for us to make mistakes in regard to universal connections, just as in regard to particular “collocations”; but it is not possible for us to think at all without believing in some “laws”. Errors can be corrected by the testing of beliefs, though even so it is by other beliefs that we test any given one. It is only the attempt to reduce them to “elements” or rest them on “ultimates” that makes error inevitable.