
Review of Tyler Burge: Origins of Objectivity

Earlier this year, I read Tyler Burge's Origins of Objectivity. It's a very long book. Here is an abridged version. A few comments below.

Origins of Objectivity

Representation is a basic explanatory kind in psychology that should be distinguished from mere information-carrying. The most fundamental type of representational state is perception. In perception, an organism attributes properties to objects in its environment. To do this, the organism does not need linguistic capacities, nor does it need to know (or otherwise represent) necessary and sufficient conditions for being the relevant object. Instead, the science of perception reveals that it is sufficient that the organism stands in a suitable causal relation to the object and that its perceptual state involves certain constancies (for shape or colour or distance or whatever) which characterize the object "objectively", abstracting away from contingencies of the present stimulus.

I like the starting point — to think of intentional states as explanatory scientific kinds. Burge doesn't say what exactly he means by this. I would put it as a kind of functionalism: intentional states are characterized (at least in part) by their functional inter-connections and their relationship to environmental causes, behaviour and other psychologically relevant facts.

Some counterexamples to the Benchmark Theory

In "Gandalf's solution to the Newcomb problem" (2013), Ralph Wedgwood proposes a new form of decision theory, Benchmark Theory, that is supposed to combine the good parts of Causal and Evidential Decision Theory.

Like many formulations of Causal Decision Theory, Benchmark Theory (BT) assumes a privileged partition of states that are outside the agent's causal control. Like Evidential Decision Theory, BT only considers the probability of these states conditional on a given act A. However, what is weighted by the conditional probabilities P(S_i/A) is not the absolute utility of S_i & A, but the comparative utility of S_i & A, which is determined by the difference between the absolute utility U(S_i & A) and the average absolute utility U(S_i & A') for all options A'. (This average is the benchmark B(S_i).) So the degree of choiceworthiness of an act A is given by
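The excerpt stops before the formula. Reading it off the definitions just given, it would presumably come to the following. The labels CW(A) for the choiceworthiness of A and n for the number of available options are assumptions introduced here, not taken from the excerpt; P(S_i / A) is written with the excerpt's slash notation for conditional probability.

```latex
% Reconstruction from the surrounding definitions, not a quotation.
\[
  B(S_i) \;=\; \frac{1}{n} \sum_{A'} U(S_i \,\&\, A')
\]
\[
  CW(A) \;=\; \sum_i P(S_i / A)\,\bigl[\, U(S_i \,\&\, A) - B(S_i) \,\bigr]
\]
```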

The lure of free energy

There's an exciting new theory in cognitive science. The theory began as an account of message-passing in the visual cortex, but it quickly expanded into a unified explanation of perception, action, attention, learning, homeostasis, and the very possibility of life. In its most general and ambitious form, the theory was mainly developed by Karl Friston -- see e.g. Friston 2006, Friston and Stephan 2007, Friston 2009, Friston 2010, or the Wikipedia page on the free-energy principle.

Indefinites and implicatures

Suppose I say (*), with respect to a particular gambling occasion.

(*) A gambler lost some of her savings. Another lost all of hers.

There is an implicature here that the first gambler, unlike the second, didn't lose all her savings. How does this implicature arise?

On the standard account of scalar implicatures, we should consider certain alternatives to the uttered sentences. In particular, I could have said 'A gambler lost all of her savings' instead of 'A gambler lost some of her savings'. If true, this alternative would have been more informative. Since I chose the weaker sentence, you can infer that I wasn't in a position to assert the stronger sentence. Assuming I am well-informed, you can further infer that the stronger sentence is false.

Against countable additivity

Imagine the universe has a centre that regularly produces new stars which then drift away at a constant speed. This has been going on forever, so there are infinitely many stars. We can label them by age, or equivalently by their distance from the centre: star 1 is the youngest, then comes star 2, then star 3, and so on, without end. The stars in turn produce planets at regular intervals. So the older a star, the more planets surround it. Today, something happened to one (and only one) of the planets. Let's say it exploded. Given all this, what is your credence that the unfortunate planet belonged to the first 100 stars? What about the second 100? It would be odd to think that the event is more likely to have happened at one of the first 100 stars than at one of the next 100, since the latter have far more planets. Similarly if we compare the first 1000 stars with the next 1000, or the first million with the next million, and so on. But there is no countably additive (real-valued) probability measure that satisfies this constraint.
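One way to see why, as a sketch: write a_N for the probability that the planet belonged to one of the first N stars (this notation is introduced here for illustration). The constraint is read as saying that, for every N, the first N stars are no more likely than the next N.

```latex
% a_N := P(the planet belonged to one of stars 1..N)
% Constraint: P(stars 1..N) <= P(stars N+1..2N) for every N.
\begin{align*}
  a_{2N} &= a_N + P(\text{stars } N{+}1,\dots,2N) \;\ge\; 2\,a_N
    && \text{(finite additivity plus the constraint)} \\
  a_N &\le \frac{a_{2N}}{2} \le \frac{a_{4N}}{4} \le \dots
       \le \frac{a_{2^k N}}{2^k} \le \frac{1}{2^k}
    && \text{for every } k \\
  \Rightarrow\; a_N &= 0 && \text{for every } N \\
  P(\text{some star}) &= \lim_{N \to \infty} a_N = 0
    && \text{(countable additivity)}
\end{align*}
% But the planet certainly belonged to some star, so P(some star) = 1.
```

Finite additivity alone creates no conflict: each finite group of stars can receive probability 0 while the whole collection receives 1. The contradiction appears only once the probabilities of the countably many stars are required to sum to the probability of the whole.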

Conditional chance and rational credence

Two initially plausible claims:

  1. Sometimes, a possible chance function conditionalized on a proposition A yields another possible chance function.
  2. Any rational prior credence function Cr conditional on the hypothesis Ch=f that f is the (actual, present) chance function should coincide with f; i.e., Cr(A / Ch=f) = f(A) for all A (provided that Cr(Ch=f)>0).

Claim 1 is supported by the popular idea that chances evolve by conditionalizing on history, so that the chance at time t2 equals the chance at t1 conditional on the history of events between t1 and t2. Claim 2 is a weak form of the Principal Principle and often taken to be a defining feature of chance.

Moving to Germany

Inga got a postdoc in Hamburg, so it looks like we'll be moving back to Germany at the end of the year. It's sad to leave the ANU, but we'll probably return here for at least a few months in 2014. (If only because I don't have another job yet.)

Second-order logic and Newman's problem

How much can you say about the world in purely logical terms? In first-order logic with identity, one can construct formulas like '(Ex)(Ey)~(x=y)'. But arguably, this doesn't yet mean anything. As we learned in intro logic, formulas of first-order logic have no fixed interpretation; they mean something only once we provide a domain of quantification and an assignment of values to predicate and function symbols. As it happens, '(Ex)(Ey)~(x=y)' doesn't contain any non-logical predicate or function symbols, so to make it mean anything we just need to specify a domain of quantification. For example, if the domain is the class of Western black rhinos, then the formula says that there are at least two Western black rhinos.

The input problem for Jeffrey conditioning

You can't predict the stock market by looking at tea leaves. If an episode of looking at tea leaves makes you believe that the stock market will soon collapse, then -- assuming your previous beliefs did not support the collapse hypothesis, nor the hypothesis that tea leaves predict the stock market -- your new belief is unjustified and irrational. So there are epistemic norms for how one's opinions may change through perceptual experience.

Such norms are easily accounted for in the traditional Bayesian picture where each perceptual experience is associated with an evidence proposition E on which any rational agent should condition when they have the experience. But what if perceptual experiences don't confer absolute certainty on anything? Jeffrey pointed out that if there is a partition of propositions { E_i } = E_1,...,E_n such that (1) an experience changes their probabilities to some values { p_i } = p_1,...,p_n, and (2) the experience does not affect the probabilities conditional on any member of the partition, then the new probability assigned to any proposition A is the weighted average of the old probabilities of A conditional on the members of the partition, each weighted by the new probability of the corresponding member. This rule is often called "Jeffrey conditioning" and sometimes "generalised conditioning", but unlike standard conditioning it isn't a dynamical rule at all: it is a simple consequence of the probability calculus. To get genuine epistemic norms on the dynamics of belief through perceptual experience, Jeffrey's rule must be supplemented with a story about how a given experience, perhaps together with an agent's previous belief state, may fix the partition { E_i } and values { p_i } that determine a Jeffrey update. This is the "input problem" for Jeffrey conditioning.
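In symbols, Jeffrey's rule says that the new probability of A is the sum over i of P(A / E_i) * p_i. A minimal Python sketch of such an update over a finite set of worlds follows. The function names, the four-world prior and the new weights are all made up for illustration; the sketch simply takes the partition and the values p_i as given, which is exactly what the input problem says still needs explaining.

```python
from fractions import Fraction


def prob(dist, event):
    """Probability of an event (a set of worlds) under a distribution."""
    return sum(p for world, p in dist.items() if world in event)


def jeffrey_update(prior, partition, new_probs):
    """Rescale the prior inside each cell E_i so that the cell gets
    probability p_i, leaving probabilities conditional on E_i unchanged."""
    if abs(sum(new_probs) - 1) > 1e-9:
        raise ValueError("new cell probabilities must sum to 1")
    posterior = {}
    for cell, p_new in zip(partition, new_probs):
        p_old = prob(prior, cell)
        if p_old == 0:
            raise ValueError("cannot rescale a cell with prior probability 0")
        for world in cell:
            posterior[world] = prior[world] * p_new / p_old
    return posterior


# Hypothetical example: a world is a pair (cell label, whether A is true).
prior = {
    ("E1", True): Fraction(3, 10),
    ("E1", False): Fraction(2, 10),
    ("E2", True): Fraction(1, 10),
    ("E2", False): Fraction(4, 10),
}
E1 = {w for w in prior if w[0] == "E1"}
E2 = {w for w in prior if w[0] == "E2"}
A = {w for w in prior if w[1]}

# The experience shifts P(E1) from 1/2 to 4/5 and P(E2) from 1/2 to 1/5.
posterior = jeffrey_update(prior, [E1, E2], [Fraction(4, 5), Fraction(1, 5)])

# P(A / E1) = 3/5 and P(A / E2) = 1/5 are unchanged, so the new
# probability of A is 3/5 * 4/5 + 1/5 * 1/5 = 13/25.
print(prob(posterior, A))  # 13/25
```

Exact fractions are used so that the weighted average can be checked by hand.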

Bayes factors

Suppose a rational agent makes an observation, which changes the subjective probability she assigns to a hypothesis H. In this case, the new probability of H is usually sensitive to both the observation and the prior probability. Can we factor out the prior probability to get a measure of how the experience bears on the probability of H, independently of the prior probability?

A common answer, going back to Alan Turing and I.J. Good, is to use Bayes factors. The Bayes factor B(H) for H is the ratio (P'(H)/P'(not-H))/(P(H)/P(not-H)) of new odds on H to old odds. Thus the new odds on H are the old odds multiplied by the Bayes factor. For example, if the prior credence in H was 0.25 and the posterior is 0.5, then the odds on H changed from 1:3 to 1:1, and so the Bayes factor of the update is 3. The same Bayes factor would characterise an update from probability 0.01 to about 0.03 (odds 1:99 to 1:33) or from 0.9 to about 0.96 (odds 9:1 to 27:1).
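The arithmetic in these examples can be checked with a short Python sketch. The function names are chosen here for illustration, and exact fractions are used so that the odds come out as in the text.

```python
from fractions import Fraction


def odds(p):
    """Odds on H corresponding to probability p, i.e. P(H)/P(not-H)."""
    if not 0 < p < 1:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1 - p)


def bayes_factor(prior, posterior):
    """Ratio of new odds on H to old odds on H."""
    return odds(posterior) / odds(prior)


def update(prior, factor):
    """Posterior probability obtained by multiplying the prior odds by factor."""
    new_odds = odds(prior) * factor
    return new_odds / (1 + new_odds)


# Prior 0.25 (odds 1:3) to posterior 0.5 (odds 1:1): Bayes factor 3.
print(bayes_factor(Fraction(1, 4), Fraction(1, 2)))  # 3

# The same factor applied to other priors.
print(update(Fraction(1, 100), 3))  # 1/34, about 0.03 (odds 1:33)
print(update(Fraction(9, 10), 3))   # 27/28, about 0.96 (odds 27:1)
```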
