David Rohde's Web Page
PhD Candidate in Computer Science / Physics at the University of
Queensland. Supervisors: Marcus
Gallagher (Information Technology and
Electrical Engineering) and Michael Drinkwater (Physics).
My PhD project is about the combination of different astronomical
datasets. In many astronomical datasets positional information alone
may not be sufficient to identify which observations in different
datasets are of the same object.
This PhD project comes under the umbrella of the Virtual Observatory,
which is a worldwide effort to produce tools for handling the great
influx of astronomical data. The Australian website is here - for worldwide contributors see the International Virtual Observatory
Alliance Web site. My project is about a statistical problem in
matching datasets; other computer science aspects of the same
problem are also being widely discussed, see OpenSkyQuery.
My PhD research centres upon the statistical problem of 'doing
science' when the correct match cannot be easily determined. My
initial response to this question was to treat it as a pattern
classification task to identify the correct matches.
D. J. Rohde, M. R. Gallagher, M. J. Drinkwater and
K. A. Pimbblet. Matching of Catalogues by Probabilistic Pattern
Classification. Monthly Notices of the Royal Astronomical Society,
369, 2-14, 2006. (Preprint; see MNRAS for original.)
D. Rohde, M. Drinkwater, M. Gallagher, T. Downs
and M. Doyle. Applying Machine Learning to Catalogue Matching in
Astrophysics. Monthly Notices of the Royal Astronomical Society,
360(1): 69-75, June 2005. (Preprint; see MNRAS for original.)
In more recent work I am considering the unknown status of matches to
be a nuisance parameter and am handling the problem in a Bayesian
framework. There are strong philosophical reasons for preferring
Bayesian Statistics over Classical Statistics.
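To make the idea of an unknown match status as a nuisance parameter concrete, here is a minimal sketch. It is a toy illustration and not the method used in my work: the two-component model, the function names and every number in it (p_match, sd_match, bg_mean, bg_sd, the grid of theta values) are assumptions made up for the example. Each candidate pair yields one measurement x. If the pair is a true match, x is drawn around an unknown quantity of interest theta; if not, x comes from a broad background. The match indicator is never decided, it is simply summed out of the likelihood.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_over_theta(data, thetas, p_match=0.5, sd_match=1.0,
                         bg_mean=0.0, bg_sd=5.0):
    """Grid posterior for theta with the match indicator marginalised out.

    All parameters are hypothetical and chosen for illustration only.
    A flat prior over the grid `thetas` is assumed.
    """
    log_post = []
    for theta in thetas:
        total = 0.0
        for x in data:
            # Sum over the two values of the unknown match status.
            like = (p_match * normal_pdf(x, theta, sd_match)
                    + (1.0 - p_match) * normal_pdf(x, bg_mean, bg_sd))
            total += math.log(like)
        log_post.append(total)
    # Normalise in a numerically stable way.
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    norm = sum(weights)
    return [w / norm for w in weights]


# Simulated data: about half the pairs are true matches centred on theta = 3.
random.seed(0)
true_theta = 3.0
data = [random.gauss(true_theta, 1.0) if random.random() < 0.5
        else random.gauss(0.0, 5.0)
        for _ in range(200)]

thetas = [i * 0.1 for i in range(-50, 101)]
post = posterior_over_theta(data, thetas)
theta_mean = sum(t * w for t, w in zip(thetas, post))
print("posterior mean of theta:", round(theta_mean, 2))
```

The point of the sketch is that no pair is ever classified as a match or a non-match; the uncertainty about each pair is carried through to the inference about theta.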
The first place that I became aware of the argument for Bayesian
Statistics was in Edwin Jaynes' (incomplete) book, available on the web, Probability
Theory: The Logic of Science. For such a technical book it is
extremely readable and has interesting diversions into history and
philosophy which make it very attractive. It is also straight talking
and practical and mercilessly attacks classical statistics. When
reading it, it is important to note that in many ways Jaynes is a
maverick in that he thinks it is possible to consider inference to be
objective. Jaynes has a long-running battle with Dawid, Stone and
Zidek over a technical issue (propriety of priors) which seems to
seriously challenge the possibility of objective inference - a
widespread consensus seems to be that Jaynes got this wrong! See the
Unofficial
Errata. Despite this I think the Jaynes book is a fantastic
resource, and absolutely essential in trying to think about
statistical problems clearly!
The other famous (actually more famous) figure in the philosophy of
statistics is Bruno de Finetti. His most famous works are
Foresight: Its Logical Laws, Its Subjective Sources and
Theory of Probability Volume 1 and 2. de Finetti is harder to
read than Jaynes, particularly his Theory of Probability book. de
Finetti's main contribution is the discovery of the relationship
between probability and frequency through the de Finetti
representation. This gives an alternative (purely subjective)
interpretation to probability. Jaynes's discussion of this is again very
readable in Some
Applications and Extensions of the de Finetti Representation
Theorem. The only introductory level text that embraces de
Finetti's philosophy is Operational Subjective Statistical Methods by
Frank Lad. It's a very good book. A book that's out very soon which is
similar in spirit is Bayes Linear Statistics by Michael Goldstein
and David Wooff. Bayes Linear is a very interesting compromise
position. It is easier to implement than full Bayes, but still has
strong subjective foundations.
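For reference, the de Finetti representation mentioned above can be written out in its simplest case, an infinite exchangeable sequence of binary outcomes. The notation (x_i, s_n, mu) is chosen here for illustration and is not taken from any of the books listed.

```latex
P(x_1, \dots, x_n) = \int_0^1 \theta^{s_n} (1 - \theta)^{n - s_n} \, d\mu(\theta),
\qquad s_n = \sum_{i=1}^{n} x_i , \quad x_i \in \{0, 1\}.
```

Here mu is the person's own (subjective) distribution for the limiting relative frequency of ones, lim s_n / n, which exists with probability one under exchangeability. This is the sense in which the theorem links probability and frequency without assuming that an objective "true" theta exists.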
In one sense de Finetti and Jaynes are on opposite sides of the
spectrum, de Finetti arguing for subjectivity and Jaynes objectivity.
This issue is the subject of a lengthy discussion in the new journal
Bayesian Analysis here, although it seems
the majority, perhaps even the entirety, favour subjectivity (at various
levels). The reason to use the term 'objective' is principally for
marketing - "If objective statistics exist, Bayesian methods are
still superior".
An excellent article (with discussion) is The
Philosophy of Statistics by Dennis
Lindley (JSTOR subscription required). A nice feature of stats journals is that they include
discussion from several other scientists - in this case the level of
agreement between statisticians seems extremely small. I think
scientists reading this article should feel liberated. The high level
of disagreement suggests that the empirical way any scientist presents their
results is almost certainly defensible within one statistical
paradigm or another. The review also suggests that many statisticians
(of all schools!) regard statistical inference as highly subjective
and problem dependent!
Another good overview of Bayesian Statistics is by Jose Bernardo.
Machine Learning Bayesians
David MacKay
has both his thesis
and his new
book available on the web.
Radford Neal was the
first person to apply Markov Chain Monte Carlo numerical integration
techniques to machine learning models. There are a number of
interesting papers on his website. His MCMC neural network software
is also available for download.
A new book by Carl Rasmussen and Chris
Williams, Gaussian Processes for Machine Learning,
has some chapters available for free online.
Christopher
Bishop is another leading Bayesian machine learning researcher and
the author of one of the standard neural network texts.
Criticism of Bayes
I think it is useful to starkly present the strongest anti-Bayesian
argument possible. I think reading these documents does a great deal to
displace the argument that there is a strong anti-Bayesian case.
Some
issues in the foundations of statistics. David Freedman (with
discussion). Springer subscription required. In my view one of the best
anti-Bayesian arguments is contained in this essay.
On
the consistency of Bayes Estimates. Persi Diaconis and David
Freedman (with discussion). JSTOR subscription required. A fairly
technical article - but it is easy to get the gist by reading the
discussion. The paper identifies situations where an infinite amount
of data may not bring two people into agreement. A similar situation
is identified in Probability
Theory: The Logic of Science, Chapter 5, which shows situations where
two people observing the same dataset become polarised into opposing
views. I think the fact that large datasets do not always produce consensus is a very
important issue to study and understand further; however, I can't see
how it is an anti-Bayesian argument.
The
Subjective Theory of Probability. Donald Gillies. JSTOR
subscription required. Gillies is a close follower of Karl Popper,
whose falsificationist ideas are incompatible with Bayes. However he
understands the subjectivist argument well, and this essay is a clear
expression of de Finetti's ideas.
Reflections
on Fourteen Cryptic Issues Concerning the Nature of Statistical
Inference. Sir David Cox (a frequentist) raises these questions (bias
should be obvious). It is a rare occasion where a frequentist seriously
entertains de Finetti's ideas and attempts to reject them. There is
also a
Bayesian reply.
Controversies
in the Foundations of Statistics and Why
isn't everyone a Bayesian? JSTOR subscription required. These two articles by Efron
aren't really argumentative, they
just discuss the differences between the schools from a frequentist
point of view.
The
Foundations of Statistics - are there any?. J. Kiefer. A serious
attempt at criticism of Bayes, with occasional degeneration into mud-slinging.
Revising
Previsions: A Geometric Interpretation. Michael Goldstein. JSTOR
subscription required. Some of the most
penetrating criticisms of Bayes are by Bayesians themselves; this is
one of the best examples, and a proposal for a more flexible Bayesian paradigm.
The Logic of
Chance. John Venn. This work is important historically as the
first time the frequentist interpretation of probability is developed.
It presents some criticism of the first Bayesian (Laplace).
Probability is Symmetry. Krzysztof Burdzy. It is a bit
hard to categorise this work. It is definitely a serious attempt to
criticise the Bayesian approach. However it is not very connected to
the literature as a whole and the criticisms here are very different
to the usual ones you will find. The disconnection with the
literature is not a good thing - but the original thought is of course
commendable. I hesitate in adding this to the list, because it is
unpublished material - and the document contains a large number of
errors; identifying these errors is a serious project. If Burdzy took
his ideas seriously enough to subject them to refereeing and peer review
and publish them in a journal this might be a worthy task. However at
the moment he hasn't.