From Chance to Credence
Lewis argues that any theory of chance must explain the Principal Principle, which says that if you know that the objective chance for a certain proposition is x, then you should give that proposition a credence close to x. Anyone who proposes to reduce chance to some feature X, say primitive propensities, must explain why knowledge of X constrains rational expectations in this particular way.
How does Lewis's own theory explain that?
On Lewis's theory, the chance of an event (or proposition) is the probability-value assigned to the event by the best theory. Those 'probability-values' are just numerical values: they are not hypothetical values for some fundamental property; they need not even deserve the name "probability". However, one requirement for good theories, fit, is that they assign high probability-values to true propositions. The other requirements for good theories are simplicity and strength. The best theory is the one that strikes the best compromise between all three requirements. So the question becomes: why should information that the best theory assigns probability-value x to a proposition constrain rational expectations in the way the Principal Principle says?
First, a dummy case. Here is one version of the Principal Principle:
(PP) C(E|TH) = P(E),

where E is some event, H is arbitrary historical information, and T is a proposition saying that the objective chance of E is P(E). That is (on Lewis's analysis), T says that the probability-value assigned to E by the best theory is P(E).
Now suppose fit and strength are the only requirements for good theories. Then the best theory will simply assign 1 to all occurrent events and 0 to all others. So we know that E occurs iff the best theory assigns probability-value 1 to E:
C(E|TH) = 1 if T says that P(E) = 1,
C(E|TH) = 0 if T says that P(E) = 0.
This is just the Principal Principle restricted to chances 0 and 1. For other values, C(E|TH) is undefined, as then C(T) = 0. Presumably the Principle must be restricted to cases where C(E|TH) is defined anyway (consider inconsistent T), so that shouldn't worry us.
On Lewis's account, it is the simplicity condition on good theories that gives rise to non-trivial chances (strictly between 0 and 1). Roughly, the more weight we put on simplicity, the more values strictly between 0 and 1 we'll get. It isn't obvious why these values should correspond to rational credence. One might even argue that our interest in simple theories derives from our contingent cognitive limitations, and therefore should not be built into general constraints on rationality. So how do we get from Lewisian chance (with simplicity taken into account) to credence?
One way goes via frequencies:
Suppose the world contains lots of coin tosses, half of which come up heads. Suppose also that no reasonably simple theory predicts all the individual results; all acceptable theories assign the same probability-value to all coin tosses. Then the closer this value is to 0.5, the better the theory fits the world. That is, in many cases, the probability-values assigned by the best theory roughly match the relative frequencies.
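This can be checked with a toy calculation (the numbers and function below are my own illustration, not from the text). Here fit is measured as the probability-value a theory assigns to the whole sequence of tosses, treating tosses as independent; among candidate values, the one matching the relative frequency fits best:

```python
# Toy illustration (hypothetical numbers): how well does a theory that
# assigns probability-value p to heads fit a world with k heads in n tosses?
from math import log

def log_fit(p, k, n):
    # Log of the probability-value the theory assigns to the actual
    # sequence of outcomes, treating the tosses as independent.
    return k * log(p) + (n - k) * log(1 - p)

n, k = 1000, 500                        # a world in which half the tosses are heads
candidates = [0.3, 0.4, 0.5, 0.6, 0.7]  # candidate probability-values for heads
best = max(candidates, key=lambda p: log_fit(p, k, n))
# best is 0.5, the relative frequency of heads
```

So under this (assumed) way of scoring fit, the best-fitting uniform assignment is exactly the relative frequency.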
Now suppose you know that the best theory assigns value 0.5 to heads for all coin tosses. Since a good theory should fit the world, you may thus assume that about half of all coin tosses come up heads. And that is surely relevant to your rational expectations about any particular coin toss. Absent further information, you should arguably give credence 0.5 to the proposition that the next toss will come up heads.
But we don't get the full Principal Principle that way. For one thing, we can't take for granted that all probabilistic events have lots of duplicates to which any acceptable theory will assign the same probability-value. For another, the Principal Principle says that historical information does not affect the rational expectations based on knowledge of objective chance. But mere knowledge of relative frequencies is not that resilient: suppose you know that so far, 70% of all coin tosses have come up heads. Then if you also know that roughly 50% of all coin tosses throughout history are heads, you should presumably expect future tosses to land tails more often than heads.
Here is another, perhaps more promising path from Lewisian chance to credence. It is based on the following version of the Principal Principle:
(PP*) C(T|EH) = P(E) * C(T|H)/C(E|H)
As before, T is a proposition that assigns chances to events, P(E) is the chance T assigns to E, and H is historical background information. We get (PP*) by plugging (PP) into Bayes' Theorem:
C(T|EH) = C(E|TH) * C(T|H)/C(E|H).
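Written out, the substitution is a single step: replace C(E|TH) in Bayes' Theorem by the value (PP) gives for it.

```latex
% (PP) says C(E \mid TH) = P(E); substituting into Bayes' Theorem:
C(T \mid EH) \;=\; C(E \mid TH)\,\frac{C(T \mid H)}{C(E \mid H)}
             \;=\; P(E)\,\frac{C(T \mid H)}{C(E \mid H)}
```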
(PP*) tells us how to update our credence in T upon finding evidence E: Roughly, E confirms T to the degree that E is surprising and T assigned high chance to E.
(Again, if one takes chance as fundamental and independent of actual occurrences, it is hard to explain why that should be so: Why should observation of E confirm a theory that assigned to E a high propensity, rather than one that assigned to E a low propensity? How can information about actual occurrences be evidence for a property that is independent of actual occurrences?)
On Lewis's account of chance, (PP*) roughly says that observation of E confirms the assumption that the best theory assigns high probability-value to E. And that is no mystery if by definition of "good", a good theory assigns comparatively high probability-values to occurrent events.
How does simplicity enter the picture? Let's assume, as seems plausible, that the evidence E tells us nothing about the simplicity (and strength) of our candidate theories. It only tells us how they fit the world. So simplicity must enter into the priors: a simpler theory gets higher prior credence. Here is also where the weighing of simplicity and fit takes place: if you want to make simplicity more important than fit, give complex theories far lower priors.
Once the priors are set, (PP*) in effect tells us to update our credence in the candidate theories in proportion to how well they fit reality. (Somewhat more precisely, it tells us that upon learning E we should multiply our credence that theory T gives the correct chances by the ratio of the value T assigned to E to our prior credence in E.) If two theories have the same prior credence, we should give higher credence to the one that assigned a higher value to E.
This is exactly what one would expect if fit is a condition on good theories, and if evidence is irrelevant for the other conditions.
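As a toy numerical sketch (the theories, probability-values, and priors here are all hypothetical, chosen only for illustration): encode simplicity in the priors, then update by (PP*), i.e. multiply each prior by the probability-value the theory assigned to the observed evidence and renormalize.

```python
# Hypothetical toy model of (PP*): credence in candidate best theories,
# with simplicity entering via the priors and fit entering via the update.
theories = {
    "simple (p=0.5)":  {"P_E": 0.5, "prior": 0.6},  # simpler theory: higher prior
    "complex (p=0.9)": {"P_E": 0.9, "prior": 0.4},  # assigns a higher value to E
}

def update_on_E(theories):
    # C(E|H) is the prior expectation of E, averaged over the candidates.
    c_E = sum(t["P_E"] * t["prior"] for t in theories.values())
    # (PP*): C(T|EH) = P(E) * C(T|H) / C(E|H)
    return {name: t["P_E"] * t["prior"] / c_E for name, t in theories.items()}

posterior = update_on_E(theories)
# Observing E shifts credence toward the theory that assigned E a higher
# value, but the simpler theory's head start in the priors still counts.
```

With these made-up numbers, a single observation of E already tips credence slightly toward the less simple theory; putting more weight on simplicity would mean giving complex theories still lower priors, so that more evidence is needed to overcome them.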
I guess this should be made more precise, but it seems to me that Lewis's analysis of chances does indeed entail something close to the Principal Principle. (We know that it can't entail the full Principal Principle, because it is inconsistent with it, but near enough is good enough.) What's nice is that the converse isn't required. In particular, (PP*) doesn't tell us how to weigh simplicity against fit. Something close to (PP*) will be (non-trivially) satisfied as long as fit has at least some weight in the mix.