Counterexamples to Stalnaker's Thesis
I like a broadly Kratzerian account of conditionals. On this account, the function of if-clauses is to restrict the space of possibilities on which the rest of the sentence is evaluated. For example, in a sentence of the form 'the probability that if A then B is x', the if-clause restricts the space of possibilities to those where A is true; the probability of B relative to this restricted space is x iff the unrestricted conditional probability of B given A is x. This account therefore validates something that sounds exactly like "Stalnaker's Thesis" for indicative conditionals:
Thesis: P(if A then C) = P(C/A).
On the account I like, if you say 'P(if A then C)' in English, you almost inevitably end up saying something that denotes the conditional probability P(C/A), rather than the unconditional probability of some proposition expressed by 'if A then C'.
So it's interesting that Vann McGee and Stefan Kaufmann have found intuitive counterexamples to Stalnaker's Thesis. One of Kaufmann's examples in "Conditioning against the grain" goes as follows. There are two bags. In bag X, most balls are red, and most of the red balls have black spots. In bag Y, few balls are red, and few of those balls have black spots. You are 75% confident that the bag in front of you is bag Y. Now consider the statement:
(1) If you pick a red ball, it will not have black spots.
Many people apparently intuit (1) to have fairly high probability. I take that to mean that they would assent to
(1') Probably, if you pick a red ball, it will not have black spots.
This contradicts the Thesis, because getting a red ball is evidence that the bag in front of you is bag X, in which case it is rather likely that the ball has black spots; so the conditional probability P(No Spots/Red) is not high.
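To put numbers on this, here is a minimal sketch; the proportions are stand-ins of my own, chosen only to fit the description above (Kaufmann's own numbers differ in detail):

```python
# Toy model of Kaufmann's bag scenario. The proportions below are
# stand-ins that fit the informal description, not Kaufmann's own.
P_X, P_Y = 0.25, 0.75                      # prior credence in the two bags
p_red = {'X': 0.9, 'Y': 0.1}               # P(Red / bag)
p_spots_given_red = {'X': 0.9, 'Y': 0.1}   # P(Spots / Red & bag)

# Global, Thesis-conforming value: P(No Spots / Red).
p_red_total = p_red['X'] * P_X + p_red['Y'] * P_Y
p_X_given_red = p_red['X'] * P_X / p_red_total   # Bayes: Red confirms bag X
p_nospots_given_red = ((1 - p_spots_given_red['X']) * p_X_given_red
                       + (1 - p_spots_given_red['Y']) * (1 - p_X_given_red))
print(p_nospots_given_red)   # ~0.3 -- low, as the Thesis predicts
```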
As Kaufmann observes, if these facts are made salient -- if one points out that picking a red ball is much more likely if it's bag X rather than Y, and that most red balls in bag X have spots -- then people's intuitions switch and they deem (1) to have low probability. So it looks like the Thesis is right about some contexts, but not about others.
Kaufmann's explanation is that there are two ways of evaluating conditional probabilities, one "local" and one "global". Globally, 'P(if A then B)' denotes P(B/A); locally, 'P(if A then B)' denotes the expectation of P(B/A) relative to a certain partition, here the partition of bags { X, Y }:
(L) P(if A then B) = P(B/AX)P(X) + P(B/AY)P(Y).
The idea, which sounds plausible, is that when we judge (1) to be probable, we hold fixed that P(Y)=0.75 and note that P(No Spots / Red & Y) is high, which by (L) means that the probability of (1) is high.
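With the same stand-in numbers as in the sketch above, (L) indeed returns a high value:

```python
# Local rule (L) with the partition { X, Y }. Note that the weights
# are the *prior* probabilities of the cells, not the posteriors
# given Red. Stand-in numbers as in the previous sketch.
P_X, P_Y = 0.25, 0.75
p_nospots_given_red_X = 0.1   # 1 - P(Spots / Red & X)
p_nospots_given_red_Y = 0.9   # 1 - P(Spots / Red & Y)

local = p_nospots_given_red_X * P_X + p_nospots_given_red_Y * P_Y
print(local)   # 0.7 -- high, matching the intuitive judgement about (1)
```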
But why would we use (L) to evaluate conditional probabilities? The "global" evaluation that conforms to Stalnaker's Thesis is predicted by the general Kratzer-style semantics of 'if' and 'probability'. Where does the "local" reading come from?
Kaufmann suggests that the two evaluations correspond to different ways of supposing A, and also that the local evaluation can be understood as giving the expected conditional chance of B given A, since chance is credence conditionalised on the true member of a relevant partition. Both of these remarks suggest that (L) could give the subjunctive conditional probability of B given A, P(B\A), rather than the indicative conditional probability P(B/A). Indeed, the kind of compartmentalised conditioning that figures in (L) is precisely what Lewis uses in "Causal Decision Theory" to define the imaging function for subjunctive conditional probabilities.
So maybe that's what's going on: when people judge (1) to be probable, they read the conditional as subjunctive. This isn't too implausible, I think, because in English the distinction between the subjunctive and indicative reading is usually only marked in the past tense. Read subjunctively, the intuitive judgement about (1) is correct, as can be seen if one enforces this reading by saying "if you were to pick a red ball, it would not have black spots".
The hypothesis that the subjunctive reading is in play might also be supported by the fact that the intuition about (1) becomes much weaker -- I think -- if the sentence is put into the past. Suppose you've drawn a ball but haven't looked at it yet. Consider:
(1'') If you picked a red ball, then it does not have black spots.
The hypothesis also fits the phenomenon that people's intuitions flip when it is pointed out that picking a red ball makes it more likely that it's bag X than bag Y: this context, where the topic is what is evidence for what, makes the indicative reading salient.
So far, so good. Unfortunately, the present story does not work for McGee's examples. Here is one Kaufmann discusses as well. Initially, you believe that Murdoch died in an accident. Then somebody who you think is probably Sherlock Holmes says that Murdoch was killed, that Brown is probably the murderer, and that in any case
(2) If Brown didn't kill Murdoch, then someone else did.
According to McGee, most people now regard (2) as highly probable. However, if it turns out that Brown didn't kill Murdoch, then you'd lose your confidence that the speaker is Holmes, and thus return to your judgment that Murdoch died in an accident. So the (indicative) conditional probability corresponding to (2) is low.
Kaufmann doesn't find this problematic, since it conforms to his local evaluation rule (L), this time using the partition { he's Holmes, he's not Holmes }. But this application of (L) cannot plausibly be taken to give the subjunctive conditional probability of someone else killing Murdoch given that Brown didn't kill him. The subjunctive probability is surely low. If you think that Brown probably killed Murdoch, you will not judge it very probable that if Brown hadn't killed him then someone else would have. Moreover, it is anyway implausible that people are reading (2) subjunctively, because it is in the past tense.
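To make the contrast vivid, here is a toy joint distribution for the Holmes case -- all numbers are stand-ins of my own, chosen only to fit the story:

```python
# H = the speaker is Holmes; B = Brown killed Murdoch;
# S = someone other than Brown killed Murdoch; A = accident.
# The numbers are invented; each conditional distribution sums to 1.
P_H = 0.8
given_H     = {'B': 0.90, 'S': 0.09, 'A': 0.01}
given_not_H = {'B': 0.05, 'S': 0.05, 'A': 0.90}

# Global: P(S / not-B). Learning not-B is strong evidence against H.
p_notB = (1 - given_H['B']) * P_H + (1 - given_not_H['B']) * (1 - P_H)
p_S_and_notB = given_H['S'] * P_H + given_not_H['S'] * (1 - P_H)
print(p_S_and_notB / p_notB)   # ~0.30 -- low, as claimed above

# Local rule (L) with partition { H, not-H }, weighted by the priors:
p_S_H = given_H['S'] / (1 - given_H['B'])             # P(S / not-B & H)
p_S_notH = given_not_H['S'] / (1 - given_not_H['B'])  # P(S / not-B & not-H)
print(p_S_H * P_H + p_S_notH * (1 - P_H))   # ~0.73 -- high, like (2)
```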
The reason why Kaufmann's rule (L) here doesn't yield subjunctive conditional probability is that it uses a bad partition { Holmes, not Holmes }. (This also makes it implausible to describe (L) as computing expected conditional chance.) Roughly speaking, the cells of a good partition would say enough about the world and its causal structure so that, combined with either the assumption that Brown did kill Murdoch or the assumption that he didn't, each cell would entail whether someone else killed Murdoch. Applying (L) to such a partition yields a low conditional probability.
(L) is partition-dependent: the "local" probability of a conditional depends on the chosen partition. By choosing a suitable partition, we can let the local probability have almost any value we like. Kaufmann stresses that not all partitions are acceptable for (L), and that the right partitions must somehow encode the "causal structure of the scenario" [p.598]. But it isn't clear why this makes { Holmes, not Holmes } acceptable.
Let's redescribe Kaufmann's first example with a different partition. Again, you get to draw a ball from either bag X or bag Y; X contains mostly red balls, most of which have black spots; Y contains few red balls, few of which have black spots; based on your evidence, you are 75% certain that the bag in front of you is bag Y. If the contents of the bags are precisely specified (as they are in Kaufmann's paper), one can calculate your probability for the hypothesis that you draw a red ball from bag Y. Call this hypothesis RY. Given your evidence, the probability of RY is quite low, say 0.05. So you're very confident that not-RY is true. Moreover, if not-RY is indeed true and you draw a red ball, then the ball can only come from bag X, in which case it probably has black spots. Now consider
(1) If you pick a red ball, it will not have black spots.
I suspect many would judge (1) to have low probability in this context, lower than P(No Spots/Red) and much lower than the subjunctive P(No Spots\Red). But the scenario is exactly the same as Kaufmann's -- I've just made a different partition salient.
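For what it's worth, with the stand-in numbers from the earlier sketches (on which P(RY) comes out at 0.075 rather than 0.05), the local computation looks like this:

```python
# Local rule (L) applied to the partition { RY, not-RY }, where RY =
# "you draw a red ball from bag Y". Stand-in numbers as before.
P_Y, p_red_given_Y = 0.75, 0.1
p_RY = p_red_given_Y * P_Y   # 0.075 -- so you're confident in not-RY

p_nospots_given_red_RY = 0.9     # given RY, the red ball is from Y
p_nospots_given_red_notRY = 0.1  # given not-RY, a red ball must be from X

local = (p_nospots_given_red_RY * p_RY
         + p_nospots_given_red_notRY * (1 - p_RY))
print(local)   # ~0.16 -- low, though the scenario itself is unchanged
```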
Here is one lesson we might draw. There aren't just two kinds of conditional probabilities, indicative and subjunctive, but infinitely many, one for each choice of a partition. Every partition induces an imaging function and thereby a type of subjunctive supposition. We could then also fold indicative conditional probability into the subjunctive kind, induced by the single-membered partition. Context usually determines which partition is salient for statements about conditional probability (i.e. for statements that look like statements about the probability of a conditional).
Maybe. But if that's true, I'd like it to follow from the general semantics of 'if' and 'probability'. Neither of these, by itself, seems to be sensitive to the contextually salient partition -- at least not to the extent required for the present proposal to work.
I prefer another, perhaps more obvious, explanation: people who intuit that (2) is probable and (1) very improbable (in the revised context) have made a mistake.
Where does the mistake come from? In part, it may come from the fact that the (standard, indicative) conditional probability is a bit hard to determine in these cases, because one has to keep track of two factors that pull in opposite directions. For example, in the case of (2), the hypothesis that Brown didn't kill Murdoch raises the probability that someone else did it within the "Holmes" cell of the partition, but it simultaneously lowers the probability of that cell and thereby the probability that Murdoch was killed at all.
More importantly, I think the mistake comes from the grammatical illusion that a question about the probability of a conditional is a question about the probability of a certain proposition. If A is a proposition and { X, Y } a partition, then of course
P(A) = P(A/X)P(X) + P(A/Y)P(Y).
So we can always evaluate the probability of a proposition by considering its probability under different hypotheses and taking the weighted average. The result never depends on the chosen partition. When asked about the probability that if A then B, we mistakenly apply the same recipe, not realising that 'the probability that if A then B on the assumption that X' denotes P(B/AX) rather than something of the form P(A->B/X).
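The contrast is easy to check numerically: conditionalising the total-probability recipe on A requires the weights P(X/A) and P(Y/A), and (L) goes wrong precisely by keeping the prior weights. A sketch with the stand-in bag numbers:

```python
# Correct decomposition: P(B/A) = P(B/AX)P(X/A) + P(B/AY)P(Y/A).
# (L) substitutes the prior weights P(X), P(Y) -- that is the mistake.
# Stand-in numbers as in the bag sketches; B = No Spots, A = Red.
P_X, P_Y = 0.25, 0.75
p_A_given_X, p_A_given_Y = 0.9, 0.1
p_B_given_AX, p_B_given_AY = 0.1, 0.9

p_A = p_A_given_X * P_X + p_A_given_Y * P_Y
p_X_given_A = p_A_given_X * P_X / p_A
p_Y_given_A = p_A_given_Y * P_Y / p_A

correct = p_B_given_AX * p_X_given_A + p_B_given_AY * p_Y_given_A
wrong = p_B_given_AX * P_X + p_B_given_AY * P_Y
print(correct, wrong)   # ~0.3 vs 0.7
```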
Consider another of McGee's examples. Quantum mechanics entails that
(3) If all atoms in this table decay within the next second, then Z amount of energy is released,
for some particular value Z. McGee suggests that if we trust quantum mechanics, then we will assign high probability to (3). However, P(Z released / table decays) is low, since seeing the table suddenly decay would dramatically lower our confidence in quantum mechanics.
If the probability of (3) is the probability of a certain proposition that's entailed by quantum mechanics, then it is clear why trusting quantum mechanics requires assigning high probability to (3). But on the Kratzerian account, there is no such proposition, at least not if the conditional is read indicatively. (It could also be read as a nomologically strict conditional, in which case the failure of Stalnaker's Thesis is unproblematic.) On the indicative reading, there is no proposition that (i) is entailed by quantum mechanics and (ii) has its probability at issue when we ask about the probability of (3). Perhaps it is the prima facie plausibility that there is a proposition satisfying (i) and (ii) that explains why we mistakenly think the probability of (3) must be high, even on the indicative reading.
Neat post, Wo. One interesting thing about the X/Y bags example in Kaufmann's paper is that it does not specify the probability of the following conjunction:
(C) the ball is white and the ball has a black dot.
It can be shown that there will be a negative correlation (viz., a disconfirmatory relation) between "the ball is red" and "the ball has a black dot" iff Pr(C) > 2/5. Perhaps subjects are somehow assuming/presupposing that Pr(C) > 2/5? If they are, then a "disconfirmation effect" could help to explain the responses. Douven has a nice paper on (dis)confirmation and the probabilities of conditionals. See:
http://www.springerlink.com/content/n22u58h119317106/fulltext.pdf
BTW -- Igor also has a short paper criticizing Kaufmann's account (on independent grounds of probabilistic inconsistency), here:
http://www.springerlink.com/content/l87u553916j6741m/fulltext.pdf