Earlier this year, I read Tyler Burge's Origins of
Objectivity. It's a very long book. Here is an abridged version. A few comments below.
Origins of Objectivity
Representation is a basic explanatory kind in psychology that should
be distinguished from mere information-carrying. The most fundamental
type of representational state is perception. In perception, an
organism attributes properties to objects in its environment. To do
this, the organism does not need linguistic capacities, nor does it
need to know (or otherwise represent) necessary and sufficient
conditions for being the relevant object. Instead, the science of
perception reveals that it is sufficient that the organism stands in a
suitable causal relation to the object and that its perceptual state
involves certain constancies (for shape or colour or distance or
whatever) which characterize the object "objectively", abstracting
away from contingencies of the present stimulus.
I like the starting point — to think of intentional states as
explanatory scientific kinds. Burge doesn't say
what exactly he means by this. I would put it as a kind of
functionalism: intentional states are characterized (at least
in part) by their functional inter-connections and their relationship
to environmental causes, behaviour and other psychologically relevant
facts.
In "Gandalf's solution to the Newcomb problem" (2013), Ralph Wedgwood proposes a new form of decision theory, Benchmark Theory, that is supposed to combine the good parts of Causal and Evidential Decision Theory.
Like many formulations of Causal Decision Theory, Benchmark Theory
(BT) assumes a privileged partition of
states that are outside the agent's causal control. Like
Evidential Decision Theory, BT only considers the probability of these
states conditional on a given act A. However, what is weighted by the
conditional probabilities P(S_i/A) is not the absolute utility of S_i
& A, but the comparative utility of S_i & A, which is
determined by the difference between the absolute utility U(S_i & A) and the average of the absolute utilities U(S_i & A') over all options A'. (This average is the benchmark B(S_i).) So the degree of choiceworthiness of an act A is given by

sum_i P(S_i/A) (U(S_i & A) - B(S_i)).
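Just to fix ideas, here is a minimal sketch of that calculation in Python. The function names and the toy numbers are mine, not Wedgwood's; the sketch only assumes that the benchmark is the straight average over all options, as described above.

    # Sketch of the Benchmark Theory calculation described above.
    # Names and example numbers are illustrative only.

    def benchmark(state, options, U):
        """B(S_i): the average of U(S_i & A') over all options A'."""
        return sum(U[(state, a)] for a in options) / len(options)

    def choiceworthiness(act, states, options, P, U):
        """Sum over states S_i of P(S_i/A) * (U(S_i & A) - B(S_i))."""
        return sum(P[(s, act)] * (U[(s, act)] - benchmark(s, options, U))
                   for s in states)

    # A toy decision problem with two states and two options.
    states = ["S1", "S2"]
    options = ["A1", "A2"]
    U = {("S1", "A1"): 10, ("S1", "A2"): 4,
         ("S2", "A1"): 0,  ("S2", "A2"): 6}
    # P[(S, A)] is the probability of S conditional on A.
    P = {("S1", "A1"): 0.2, ("S2", "A1"): 0.8,
         ("S1", "A2"): 0.7, ("S2", "A2"): 0.3}

    for a in options:
        print(a, choiceworthiness(a, states, options, P, U))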
There's an exciting new theory in cognitive science. The theory began
as an account of message-passing in the visual cortex, but it quickly
expanded into a unified explanation of perception, action, attention,
learning, homeostasis, and the very possibility of life. In its most
general and ambitious form, the theory was mainly developed by Karl
Friston -- see e.g. Friston 2006, Friston and Stephan 2007, Friston 2009, Friston 2010, or the Wikipedia page on the free-energy principle.
Suppose I say (*), with respect to a particular gambling
occasion.
(*) A gambler lost some of her savings. Another lost all of hers.
There is an implicature here that the first gambler, unlike the
second, didn't lose all her savings. How does this implicature
arise?
On the standard account of scalar implicatures, we should consider
certain alternatives to the uttered sentences. In particular, I could
have said 'A gambler lost all of her savings' instead of 'A
gambler lost some of her savings'. If true, this alternative
would have been more informative. Since I chose the weaker sentence,
you can infer that I wasn't in a position to assert the stronger
sentence. Assuming I am well-informed, you can further infer that the
stronger sentence is false.
Imagine the universe has a centre that regularly produces new stars
which then drift away at a constant speed. This has been going on
forever, so there are infinitely many stars. We can label them by age,
or equivalently by their distance from the centre: star 1 is the
youngest, then comes star 2, then star 3, and so on, without end. The
stars in turn produce planets at regular intervals. So the older a
star, the more planets surround it. Today, something happened to one
(and only one) of the planets. Let's say it exploded. Given all this,
what is your credence that the unfortunate planet belonged to the
first 100 stars? What about the second 100? It would be odd to think
that the event is more likely to have happened at one of the first 100
stars than at one of the next 100, since the latter have far
more planets. Similarly if we compare the first 1000 stars with the
next 1000, or the first million with the next million, and so on. But
there is no countably additive (real-valued) probability measure that
satisfies this constraint.
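Spelled out (with p_n as my shorthand for the probability that the exploded planet belongs to the n-th group of 100 stars; any other grouping would serve as well), the constraint and countable additivity jointly require

    \[
      p_1 \le p_2 \le p_3 \le \dots \qquad \text{and} \qquad \sum_{n=1}^{\infty} p_n = 1.
    \]

For the sum to converge, the p_n would have to tend to 0; but a non-decreasing sequence of non-negative numbers that tends to 0 is constantly 0, and then the sum is 0 rather than 1.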
Two initially plausible claims:
- Sometimes, a possible chance function conditionalized on a proposition A yields another possible chance function.
- Any rational prior credence function Cr, when conditionalized on the hypothesis Ch=f that f is the (actual, present) chance function, should coincide with f; i.e., Cr(A / Ch=f) = f(A) for all A (provided that Cr(Ch=f) > 0).
Claim 1 is supported by the popular idea that chances evolve by
conditionalizing on history, so that the chance at time t2 equals the
chance at t1 conditional on the history of events between t1 and
t2. Claim 2 is a weak form of the Principal Principle and often taken
to be a defining feature of chance.
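To make the two claims a bit more concrete, here is a toy model in Python. The representation of chance and credence functions as dictionaries over worlds, and all the numbers, are mine; nothing here is meant to capture the details of any particular theory of chance.

    # Claim 1: conditionalizing a chance function yields another candidate
    # chance function. Claim 2: Cr(A / Ch=f) = f(A).

    def conditionalize(p, A):
        """Return the probability function p conditional on the set of worlds A."""
        total = sum(pr for w, pr in p.items() if w in A)
        return {w: (pr / total if w in A else 0.0) for w, pr in p.items()}

    # Two candidate chance functions over three worlds.
    f = {"w1": 0.5, "w2": 0.3, "w3": 0.2}
    g = {"w1": 0.1, "w2": 0.1, "w3": 0.8}

    # A prior that satisfies claim 2 by construction: credence in a world is
    # the credence in a chance hypothesis times the chance that hypothesis
    # assigns to the world.
    cr_f, cr_g = 0.6, 0.4
    Cr = {(w, "Ch=f"): cr_f * f[w] for w in f}
    Cr.update({(w, "Ch=g"): cr_g * g[w] for w in g})

    def cr_given(A, hyp):
        """Cr(A / Ch=hyp)."""
        joint = sum(pr for (w, h), pr in Cr.items() if h == hyp and w in A)
        marginal = sum(pr for (w, h), pr in Cr.items() if h == hyp)
        return joint / marginal

    A = {"w1", "w2"}
    print(cr_given(A, "Ch=f"), f["w1"] + f["w2"])  # both 0.8, as claim 2 requires
    print(conditionalize(f, A))                    # claim 1: a new chance candidate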
Inga got a postdoc in Hamburg, so it looks like we'll be moving back to Germany at the end of the year. It's sad to leave the ANU, but we'll probably return here for at least a few months in 2014. (If only because I don't have another job yet.)
How much can you say about the world in purely logical terms? In
first-order logic with identity, one can construct formulas like
'(Ex)(Ey)~(x=y)'. But arguably, this doesn't yet mean anything. As we
learned in intro logic, formulas of first-order logic have no fixed
interpretation; they mean something only once we provide a domain of
quantification and an assignment of values to predicate and function
symbols. As it happens, '(Ex)(Ey)~(x=y)' doesn't contain any
non-logical predicate and function symbols, so to make it mean
anything we just need to specify a domain of quantification. For
example, if the domain is the class of Western black rhinos, then the
formula says that there are at least two Western black rhinos.
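For illustration, here is how one might evaluate this particular formula relative to a given domain in Python. The function name and the example domains are made up; the only point is that the truth value depends on nothing about the domain beyond how many things are in it.

    def at_least_two_things(domain):
        """Truth value of '(Ex)(Ey)~(x=y)' relative to a domain of quantification."""
        return any(x != y for x in domain for y in domain)

    print(at_least_two_things({"a", "b", "c"}))  # True: more than one object
    print(at_least_two_things({"a"}))            # False: only one object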
You can't predict the stock market by looking at tea leaves. If an
episode of looking at tea leaves makes you believe that the stock
market will soon collapse, then -- assuming your previous beliefs did
not support the collapse hypothesis, nor the hypothesis that tea
leaves predict the stock market -- your new belief is unjustified and
irrational. So there are epistemic norms for how one's opinions may
change through perceptual experience.
Such norms are easily accounted for in the traditional Bayesian
picture where each perceptual experience is associated with an
evidence proposition E on which any rational agent should condition
when they have the experience. But what if perceptual experiences
don't confer absolute certainty on anything? Jeffrey pointed out that
if there is a partition of propositions { E_i } = E_1,...,E_n such
that (1) an experience changes their probabilities to some values {
p_i } = p_1,...,p_n, and (2) the experience does not affect the
probabilities conditional on any member of the partition, then the new
probability assigned to any proposition A is the weighted average of the old probabilities of A conditional on the members of the partition, weighted by the new probabilities p_i of those members. This rule is often
called "Jeffrey conditioning" and sometimes "generalised
conditioning", but unlike standard conditioning it isn't a dynamical
rule at all: it is a simple consequence of the probability
calculus. To get genuine epistemic norms on the dynamics of belief
through perceptual experience, Jeffrey's rule must be supplemented
with a story about how a given experience, perhaps together with an
agent's previous belief state, may fix the partition { E_i } and
values { p_i } that determine a Jeffrey update. This is the "input
problem" for Jeffrey conditioning.
Suppose a rational agent makes an observation, which changes the
subjective probability she assigns to a hypothesis H. In this case,
the new probability of H is usually sensitive to both the observation
and the prior probability. Can we factor out the prior probability to
get a measure of how the experience bears on the probability of H,
independently of the prior probability?
A common answer, going back to Alan Turing and I. J. Good, is to use
Bayes factors. The Bayes factor B(H) for H is the ratio
(P'(H)/P'(not-H))/(P(H)/P(not-H)) of new odds on H to old odds. Thus
the new odds on H are the old odds multiplied by the Bayes factor. For
example, if the prior credence in H was 0.25 and the posterior is 0.5,
then the odds on H changed from 1:3 to 1:1, and so the Bayes factor of
the update is 3. The same Bayes factor would characterise an update
from probability 0.01 to about 0.03 (odds 1:99 to 1:33) or from 0.9 to
about 0.96 (odds 9:1 to 27:1).
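The arithmetic in these examples is easy to check with a couple of lines of Python (the helper names are mine):

    def odds(p):
        """Convert a probability into odds on the hypothesis."""
        return p / (1 - p)

    def bayes_factor(prior, posterior):
        """B(H) = (P'(H)/P'(not-H)) / (P(H)/P(not-H)): new odds over old odds."""
        return odds(posterior) / odds(prior)

    print(bayes_factor(0.25, 0.5))    # 3.0: odds 1:3 -> 1:1
    print(bayes_factor(0.01, 1/34))   # 3.0: odds 1:99 -> 1:33
    print(bayes_factor(0.9, 27/28))   # 3.0: odds 9:1 -> 27:1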