Isaacs and Russell on updating without evidence
Isaacs and Russell (2023) propose a new way of thinking about evidence and updating.
The standard Bayesian picture of updating assumes that an agent has some ("prior") credence function Cr and then receives some (total) new evidence E. The agent then needs to update Cr in light of E, perhaps by conditionalizing on E. There is no room, in this picture, for doubts about E. The evidence is taken on board with absolute certainty.
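In code, with made-up numbers (my illustration, not anything from Isaacs and Russell), the update looks like this:

```python
# Toy illustration of standard conditionalization: the prior Cr is a
# probability distribution over four worlds; the evidence E is a set of
# worlds that is taken on board with certainty.

prior = {"w1": 0.4, "w2": 0.3, "w3": 0.2, "w4": 0.1}
E = {"w1", "w3"}  # the total new evidence, as a set of worlds

def conditionalize(cr, e):
    """Return Cr(./e): zero out worlds outside e, renormalize the rest."""
    total = sum(p for w, p in cr.items() if w in e)
    return {w: (p / total if w in e else 0.0) for w, p in cr.items()}

posterior = conditionalize(prior, E)
print(posterior)  # w1 gets ~0.67, w3 gets ~0.33, the other worlds get 0
```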
The standard picture thereby assumes that the agent's cognitive system is perfectly sensitive to a certain aspect of the world: if E is true, the agent is certain to update on E; if E is false, the agent is certain to not update on E.
This is unrealistic. Real agents aren't perfectly sensitive to any interesting aspect of the world. Suppose an agent has, say, a high-quality temperature sensor. The state of the sensor won't be perfectly correlated with the actual temperature, across all possible conditions. In the same way, the agent's new belief function might not be perfectly sensitive to the state of its sensor.
In a more realistic (or more general) model, there may be nothing that plays the traditional role of evidence. Instead of asking how an agent should update their credence Cr in light of new evidence, we may ask how the agent should update their credence Cr given that they have such-and-such (possibly imperfect) sensitivity to such-and-such aspects of the world.
This is "updating without evidence".
I have defended a particular model of updating without evidence in Schwarz (2018). Roughly, my proposal is that an agent with an imperfect temperature sensor should Jeffrey conditionalize on a partition of different hypotheses about the temperature (not becoming certain of any). To determine the Jeffrey weights of the partition cells, the agent's credence function should be extended by a new dimension of "imaginary" propositions that are connected to sensor states in such a way that each state makes the agent certain of an imaginary proposition.
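Here is a minimal sketch of the Jeffrey step, with an invented temperature partition and invented weights; it leaves out the "imaginary proposition" construction that is supposed to fix the weights.

```python
# Sketch of Jeffrey conditionalization on a temperature partition.
# The weights give the agent's new credence in each partition cell;
# within each cell, credences are rescaled but keep their old ratios.

prior = {  # prior credence over (temperature, other) worlds -- invented numbers
    ("cold", "rain"): 0.2, ("cold", "dry"): 0.2,
    ("warm", "rain"): 0.1, ("warm", "dry"): 0.5,
}
new_weights = {"cold": 0.7, "warm": 0.3}  # Jeffrey weights, fixed by the sensor state

def jeffrey(cr, weights, cell_of):
    """Jeffrey conditionalize cr on the partition indexed by `weights`."""
    cell_mass = {c: sum(p for w, p in cr.items() if cell_of(w) == c) for c in weights}
    return {w: weights[cell_of(w)] * p / cell_mass[cell_of(w)] for w, p in cr.items()}

posterior = jeffrey(prior, new_weights, cell_of=lambda w: w[0])
print(posterior)
# cold worlds now sum to 0.7, warm worlds to 0.3, within-cell ratios preserved
```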
Isaacs and Russell (2023) propose a range of other models for updating without evidence. They offer something I don't offer in my paper: proofs that the proposed models are optimal in the sense that the relevant updates maximize expected future accuracy.
The most accurate future credence function is the omniscient credence function that assigns 1 to all truths and 0 to all falsehoods. The truly optimal update, in terms of expected future accuracy, is therefore a hypothetical update process that is guaranteed to make the agent omniscient. But such a process isn't available for agents whose cognitive system isn't perfectly sensitive to the whole truth about the world. What we want to find is the accuracy-optimal update that is compatible with the agent's limited sensitivity.
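For concreteness, one standard way of measuring inaccuracy is the Brier score; Isaacs and Russell allow for a range of measures. On this measure, the omniscient credence function is indeed perfectly accurate at its world (toy numbers again):

```python
# Brier inaccuracy of a credence function at a world, computed here over
# single-world propositions: the sum of squared gaps between credence and truth.

def brier_inaccuracy(cr, actual):
    """Sum of squared distances between credences and truth values at `actual`."""
    return sum((p - (1.0 if w == actual else 0.0)) ** 2 for w, p in cr.items())

prior = {"w1": 0.4, "w2": 0.3, "w3": 0.2, "w4": 0.1}
omniscient_at_w1 = {"w1": 1.0, "w2": 0.0, "w3": 0.0, "w4": 0.0}

print(brier_inaccuracy(prior, "w1"))             # ~0.5
print(brier_inaccuracy(omniscient_at_w1, "w1"))  # 0.0 -- perfectly accurate at w1
```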
At this point, we would need a good formal model of limited sensitivity, and of what it means for an update process to be compatible with limited sensitivity. This isn't easy. I have no such model to offer.
Isaacs and Russell have a suggestion. In their model, the available updates (for an agent with imperfect sensitivity to the world) are determined by all possible plans to follow any possible update rule. An "update rule" is defined as a function that maps every possible world to a posterior credence function. An agent follows such a rule f in a world w iff their new credence is f(w). But when an agent plans to follow the rule, they may end up with a different credence function.
Consider the truth rule that maps every world w to the credence function that is omniscient about w. In Isaacs and Russell's terminology, you can "follow" this rule. You could, for example, choose any world with maximal prior credence and give it credence 1. If you're lucky, the chosen world is the actual world. You have then "followed" the truth rule.
The strategy of choosing a world with maximal prior credence and giving it credence 1 is what Isaacs and Russell call a plan to follow the truth rule. The idea is that if you wanted to follow the truth rule, you could do no better than make a wild guess about which world is actual and give it all your credence. We can compute the expected accuracy of the credence function resulting from this plan. It's not great.
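To see why it's not great, here is a toy computation, with my own numbers and the Brier score as the measure of inaccuracy: in expectation, the wild-guess plan does worse than not updating at all.

```python
# Expected inaccuracy of the wild-guess plan for "following" the truth rule,
# compared with simply keeping the prior (made-up numbers, Brier score).

prior = {"w1": 0.4, "w2": 0.3, "w3": 0.2, "w4": 0.1}

def brier_inaccuracy(cr, actual):
    return sum((p - (1.0 if w == actual else 0.0)) ** 2 for w, p in cr.items())

def expected_inaccuracy(plan, cr):
    """Expectation, by the prior, of the inaccuracy of the planned posterior."""
    return sum(cr[w] * brier_inaccuracy(plan(w), w) for w in cr)

# The wild-guess plan: give credence 1 to w1 (a world of maximal prior
# credence), regardless of which world is actual.
guess_plan = lambda w: {"w1": 1.0, "w2": 0.0, "w3": 0.0, "w4": 0.0}
stay_put = lambda w: prior  # don't update at all

print(expected_inaccuracy(guess_plan, prior))  # ~1.2
print(expected_inaccuracy(stay_put, prior))    # ~0.7 -- the wild guess does worse
```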
In Isaacs and Russell's terminology, the choice of a world is a guess. In the plan I have described, your guess doesn't depend at all on which world is actual. But suppose you're about to learn, with certainty, whether some proposition E is true. In this case, you could plan to apply the truth rule to a randomly chosen world (with positive credence) that is compatible with what you will learn. Your "guess" would then depend on the world.
Many other update policies determine a function from worlds to posterior credences. For example, if you conditionalize on whichever of E and ¬E you learn, then you follow the condi rule that maps every E-world to Cr(*/E) and every ¬E-world to Cr(*/¬E). Accordingly, we can model conditionalization in terms of "guesses": to conditionalize, you first choose an arbitrary world w that's compatible with what you learn and then apply condi to w.
This bizarre way of modelling conditionalization allows us to compare different update plans. For any update rule f, planning to update by f can be modelled as applying f to one's guess of the world. Isaacs and Russell show (in appendix B) that in cases where the guess deterministically depends on the actual world, the rule one should ideally plan to follow (in terms of expected future accuracy) is to conditionalize on the set of worlds in which one would make the guess that one actually makes.
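As I understand the construction, a toy version looks like this (my numbers, Brier score): the plan singled out in appendix B conditionalizes on the "guess cell", and it beats, for example, applying the truth rule to one's guess.

```python
# Toy version of the guess model with a deterministic guess function that
# picks a world compatible with what one learns. A plan applies a rule to the
# guess; the plan appendix B singles out (as I read it) conditionalizes on the
# "guess cell": the set of worlds that would have produced one's actual guess.

prior = {"w1": 0.4, "w2": 0.3, "w3": 0.2, "w4": 0.1}
E = {"w1", "w2"}  # one will learn whether E is true

def guess(w):
    """Deterministic guess: an arbitrary world compatible with what one learns."""
    return "w1" if w in E else "w3"

def conditionalize(cr, e):
    total = sum(p for w, p in cr.items() if w in e)
    return {w: (p / total if w in e else 0.0) for w, p in cr.items()}

def brier_inaccuracy(cr, actual):
    return sum((p - (1.0 if w == actual else 0.0)) ** 2 for w, p in cr.items())

def expected_inaccuracy(plan, cr):
    return sum(cr[w] * brier_inaccuracy(plan(w), w) for w in cr)

# Plan 1: conditionalize on the guess cell (the appendix-B-optimal plan, as I read it).
condi_on_cell = lambda w: conditionalize(prior, {v for v in prior if guess(v) == guess(w)})
# Plan 2: apply the truth rule to the guess, i.e. give the guessed world credence 1.
truth_on_guess = lambda w: {v: (1.0 if v == guess(w) else 0.0) for v in prior}

print(expected_inaccuracy(condi_on_cell, prior))   # ~0.48
print(expected_inaccuracy(truth_on_guess, prior))  # ~0.8 -- worse
```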
If one's guess doesn't deterministically depend on the actual world, then planning to conditionalize may come apart from actually conditionalizing. Suppose, for example, that you are prone to conflating your actual evidence E with some nearby alternative E', so that you might end up conditionalizing on E' if you plan to conditionalize on E. Gallow (2021) suggests that this is what's going on in the unmarked clock case of Williamson (2014).
The condi rule still maps every E-world to Cr(*/E) and every E'-world to Cr(*/E'). But when you now plan to follow this rule, you can't be sure that you will "guess" an E-world whenever E is true. You might instead guess an E'-world. Isaacs and Russell model this by assuming that your guess depends only stochastically on which world is actual. Surprisingly, they suggest that planning to conditionalize on E is still the optimal strategy, even though the plan they prove to be optimal in appendix C looks more like Gallow's excondi rule. I haven't tried to figure out what is going on here.
At any rate, there are other cases in which planning to follow condi isn't optimal.
Isaacs and Russell consider a scenario, originally from Lasonen-Aarnio (2015), in which you are sure to recognize a friend if she walks in, but you might also mistake a stranger for your friend if a stranger walks in. Lasonen-Aarnio says that your evidence in the "good case", where it's your friend, is that the person is your friend, while your evidence in the "bad case", where it's a stranger, is neutral on whether the person is your friend.
The optimal plan here is not to conditionalize on this evidence. Instead, you should plan to be certain that the person is a stranger if your evidence is neutral, and you should plan to give credence 2/3 to the person being your friend if your evidence says they are your friend. This is easy to see intuitively: the person is your friend in two thirds of possible cases in which your evidence says that they are your friend; the person is a stranger in every case in which your evidence is neutral.
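For the record, here is the arithmetic, on what I take to be the intended numbers: equal prior odds for friend and stranger, and a fifty-fifty chance of mistaking a stranger for the friend.

```python
# The arithmetic behind the 2/3 (illustrative numbers: friend and stranger
# equally likely, and a stranger is mistaken for the friend half the time).

p_friend = 0.5
p_stranger = 0.5
p_looks_friend_given_friend = 1.0    # you are sure to recognize your friend
p_looks_friend_given_stranger = 0.5  # you might mistake a stranger for her

p_looks_friend = (p_friend * p_looks_friend_given_friend
                  + p_stranger * p_looks_friend_given_stranger)

# Credence the optimal plan gives to "friend" when the evidence says friend:
print(p_friend * p_looks_friend_given_friend / p_looks_friend)  # 2/3

# When the evidence is neutral, only stranger-worlds remain, so the optimal
# plan is certain that the person is a stranger.
```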
The case nicely illustrates a more general point made in Gallow (2021) and elsewhere: that conditionalization is not optimal in the kinds of cases that are assumed to support evidential externalism.
The friend case also illustrates that it might be better not to talk about your "evidence". In the friend scenario, your cognitive system is imperfectly sensitive to whether the person is your friend, in such a way that you can't reliably implement a plan that calls for different posterior beliefs depending on whether they are your friend (although you can reliably implement a plan that calls for a specific belief whenever the person is your friend). That's all we need to say. If the person is your friend, do you have conclusive evidence that they are your friend? Not if "evidence" is something you learn beyond doubt. If you follow the optimal plan, then your credence in this proposition is only 2/3.
Isaacs and Russell discuss yet another case in which you shouldn't plan to conditionalize. The case involves an agent who is incapable of proportioning their beliefs to their evidence: the only credence values they can give to a certain proposition are 0, 0.5, and 1. They argue that the optimal plan here depends on how exactly we measure accuracy, but it's not going to be conditionalization.
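Here is a quick way to see how the measure of accuracy can matter. (This is my illustration, not their argument.) Suppose the evidence would support credence 0.8 in the relevant proposition.

```python
# Toy illustration of why the optimal coarse credence depends on the accuracy
# measure: the agent can only adopt credence 0, 0.5, or 1 in a proposition
# whose ideal credence would be 0.8.

from math import log

p = 0.8  # the probability the agent would ideally assign
options = [0.0, 0.5, 1.0]

def expected_brier(c):
    """Expected squared-error inaccuracy of credence c."""
    return p * (1 - c) ** 2 + (1 - p) * c ** 2

def expected_log_loss(c):
    """Expected logarithmic inaccuracy of credence c (infinite at 0 and 1)."""
    return -(p * log(c) + (1 - p) * log(1 - c)) if 0 < c < 1 else float("inf")

print(min(options, key=expected_brier))     # 1.0 -- the Brier score favours going all in
print(min(options, key=expected_log_loss))  # 0.5 -- the log score favours hedging
```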
I'm sympathetic to all of this. But I have some quibbles.
For one, I don't find the guess models helpful. Intuitively, conditionalizing doesn't involve any guess about which world is actual. Even if an update rule requires some such guess – like the truth rule – why should all available plans involve the very same guess? In fact, Isaacs and Russell admit that guess models only work for certain cases. In the most general case, "planning to follow a rule" doesn't mean applying the rule to one's guess. The rule is just a parameter for the update process, they say. But why should we even think of all available updates as parameterized by such rules? In other words, why should we think that all update processes that are available to an agent with imperfect sensitivity to the world can be understood as planning to follow some rule?
I would also want to distinguish between "primary" and "secondary" norms of rationality. If your evidence supports some proposition to degree 0.8, but you are bound to be either overconfident or underconfident, so that the only credences you can give to the proposition are 0, 0.5, and 1, then we can talk about which of the available belief states would be optimal. But we should also say that all of these states are wrong. The credence you really, ideally ought to have is 0.8. Having imperfect temperature sensors, by contrast, is not a sign of irrationality.
The question I've explored in Schwarz (2018) is how an ideally rational agent with imperfect sensors might function. I don't think my proposal can be modelled in terms of a "guess" about which world is actual, nor can it be usefully understood as planning to follow some update rule, in Isaacs and Russell's sense. I'd like to think that the update I propose maximizes expected future accuracy among all updates available to an agent with imperfect sensors. But as I said above, I'm not sure how to delineate the space of available updates. Isaacs and Russell manage to prove some results, but only by imposing implausible restrictions on that space.
One last thought. Isaacs and Russell appear to be moved by Williamsonian "externalist" considerations – by Williamson's clock case, or Lasonen-Aarnio's friend case. The picture they end up with, however, is rather anti-Williamsonian.
For one thing, if taking our imperfect sensitivity to the world seriously means that we should stop talking about evidence in cases of imperfect sensitivity, then, given Williamson's E=K view, we should also stop talking about knowledge.
Also, a central aspect of Williamson's epistemology is a "good case"-"bad case" asymmetry. If your friend walks in, and you recognize her, then (on the Williamsonian account) you know that the person is your friend. You can be rationally sure that she is. The disposition you thereby manifest may lead to suboptimal results in counterfactual bad cases in which a stranger walks in, but that's OK. Isaacs and Russell assume that we should balance the accuracy of our future beliefs in good and bad cases in proportion to their prior probability. Good cases are not intrinsically favoured over bad cases, even if they are actual.
A truly externalist account of updating might say that how you should update depends on the world. If the person who walks in is your friend, then you should have a different update disposition than if they are a stranger. Such externalist accounts are not even on the table for Isaacs and Russell.