David Rohde's Web Page

PhD Candidate in Computer Science / Physics at the University of Queensland. Supervisors: Marcus Gallagher (Information Technology and Electrical Engineering) and Michael Drinkwater (Physics).

My PhD project is about combining different astronomical datasets. In many cases positional information alone is not sufficient to determine which observations in different datasets correspond to the same object.

This PhD project comes under the umbrella of the Virtual Observatory, which is a worldwide effort to produce tools for handling the great influx of astronomical data. The Australian website is here - for worldwide contributors see the International Virtual Observatory Alliance web site. My project is about a statistical problem in matching datasets; other computer science aspects of the same problem are also being widely discussed, see OpenSkyQuery.

My PhD research centres upon the statistical problem of 'doing science' when the correct match cannot be easily determined. My initial response to this question was to treat it as a pattern classification task to identify the correct matches.

D. J. Rohde, M. R. Gallagher, M. J. Drinkwater and K. A. Pimbblet. Matching of Catalogues by Probabilistic Pattern Classification. Monthly Notices of the Royal Astronomical Society, 369, 2-14, 2006. (Preprint; see MNRAS for original)

D. Rohde, M. Drinkwater, M. Gallagher, T. Downs and M. Doyle. Applying Machine Learning to Catalogue Matching in Astrophysics. Monthly Notices of the Royal Astronomical Society, 360(1), 69-75, June 2005. (Preprint; see MNRAS for original)

In more recent work I am considering the unknown status of matches to be a nuisance parameter and am handling the problem in a Bayesian framework. There are strong philosophical reasons for preferring Bayesian statistics over classical statistics.
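
As a rough sketch of what treating the match status as a nuisance parameter means, the unknown matches are summed out instead of being fixed to a single best guess. The symbols here are illustrative assumptions and not the notation of the papers above: θ stands for the scientific quantities of interest, D for the combined catalogues, and m for a hypothetical discrete variable recording which candidate matches are correct.

```latex
p(\theta \mid D) = \sum_{m} p(\theta, m \mid D)
\propto p(\theta) \sum_{m} p(m \mid \theta)\, p(D \mid \theta, m)
```

Every possible assignment of matches contributes to the inference about θ, weighted by how plausible that assignment is.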

The first place that I became aware of the argument for Bayesian statistics was in Edwin Jaynes' (incomplete) book, available on the web, Probability Theory: The Logic of Science. For such a technical book it is extremely readable and has interesting diversions into history and philosophy which make it very attractive. It is also straight talking and practical, and mercilessly attacks classical statistics. When reading it, it is important to note that in many ways Jaynes is a maverick in that he thinks it is possible to consider inference to be objective. Jaynes had a long running battle with Dawid, Stone and Zidek over a technical issue (propriety of priors) which seems to seriously challenge the possibility of objective inference - a widespread consensus seems to be that Jaynes got this wrong! See the Unofficial Errata. Despite this I think the Jaynes book is a fantastic resource, and absolutely essential in trying to think about statistical problems clearly!

The other famous (actually more famous) figure in the philosophy of statistics is Bruno de Finetti. His most famous works are Foresight: Its Logical Laws, Its Subjective Sources and Theory of Probability, Volumes 1 and 2. de Finetti is harder to read than Jaynes, particularly his Theory of Probability book. de Finetti's main contribution is the discovery of the relationship between probability and frequency through the de Finetti representation. This gives an alternative (purely subjective) interpretation of probability. Jaynes' discussion of this is again very readable in Some Applications and Extensions of the de Finetti Representation Theorem. The only introductory level text that embraces de Finetti's philosophy is Operational Subjective Statistical Methods by Frank Lad. It's a very good book. A book that's out very soon which is similar in spirit is Bayes Linear Statistics by Michael Goldstein and David Wooff. Bayes Linear is a very interesting compromise position. It is easier to implement than full Bayes, but still has strong subjective foundations.
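
For reference, one standard statement of the de Finetti representation covers an infinitely exchangeable sequence of binary outcomes x_1, x_2, ... (the symbols θ, F and s are illustrative notation, not taken from the works cited above). The probability of any finite stretch of the sequence can then be written as a mixture of independent coin flips:

```latex
p(x_1, \dots, x_n) = \int_0^1 \theta^{s} (1 - \theta)^{n - s} \, dF(\theta),
\qquad s = \sum_{i=1}^{n} x_i
```

Here F plays the role of a prior over the limiting relative frequency θ, which is how a purely subjective judgement of exchangeability connects probability to frequency.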

In one sense de Finetti and Jaynes are on opposite sides of the spectrum, de Finetti arguing for subjectivity and Jaynes for objectivity. This issue is the subject of a lengthy discussion in the new journal Bayesian Analysis here, although it seems the majority, perhaps even the entirety, favour subjectivity (at various levels). The reason to use the term 'objective' is principally for marketing - "If objective statistics exist Bayesian methods are still superior".

An excellent article (with discussion) is The Philosophy of Statistics by Dennis Lindley (JSTOR subscription required). A nice feature of stats journals is that they include discussion from several other scientists - in this case the level of agreement between statisticians seems extremely small. I think scientists reading this article should feel liberated. The high level of disagreement suggests that the empirical way any scientist presents their results is almost certainly defensible within one statistical paradigm or another. The review also suggests that many statisticians (of all schools!) regard statistical inference as highly subjective and problem dependent! Another good overview of Bayesian statistics is by Jose Bernardo.

Machine Learning Bayesians

David MacKay has made both his thesis and his new book available on the web.

Radford Neal was the first person to apply Markov Chain Monte Carlo numerical integration techniques to machine learning models. There are a number of interesting papers on his website. His MCMC neural network software is also available for download.

A new book by Carl Rasmussen and Chris Williams, Gaussian Processes for Machine Learning, has some chapters available for free online.

Christopher Bishop is another leading Bayesian machine learning researcher and the author of one of the standard neural network texts.

Criticism of Bayes

I think it is useful to starkly present the strongest anti-Bayesian arguments possible. I think reading these documents does a great deal to dispel the idea that there is a strong anti-Bayesian case.

Some Issues in the Foundations of Statistics. David Freedman (with discussion). Springer subscription required. In my view one of the best anti-Bayesian arguments is contained in this essay.

On the Consistency of Bayes Estimates. Persi Diaconis and David Freedman (with discussion). JSTOR subscription required. A fairly technical article - but it is easy to get the gist by reading the discussion. The paper identifies situations where an infinite amount of data may not bring two people into agreement. A similar situation is identified in Probability Theory: The Logic of Science, Chapter 5, which shows situations where two people observing the same dataset become polarised into opposing views. I think the fact that large datasets do not always produce consensus is a very important issue to study and understand further; however, I can't see how it is an anti-Bayesian argument...

The Subjective Theory of Probability. Donald Gillies. JSTOR subscription required. Gillies is a close follower of Karl Popper, whose falsificationist ideas are incompatible with Bayes. However he understands the subjectivist argument well, and this essay is a clear expression of de Finetti's ideas.

Reflections on Fourteen Cryptic Issues Concerning the Nature of Statistical Inference. Sir David Cox (a frequentist) raises these questions (bias should be obvious). It is a rare occasion where a frequentist seriously entertains de Finetti's ideas and attempts to reject them. There is also a Bayesian reply.

Controversies in the Foundations of Statistics and Why Isn't Everyone a Bayesian? JSTOR subscription required. These two articles by Efron aren't really argumentative; they just discuss the differences between the schools from a frequentist point of view.

The Foundations of Statistics - are there any? J. Kiefer. A serious attempt at criticism of Bayes, with occasional degeneration into mud-slinging.

Revising Previsions: A Geometric Interpretation. Michael Goldstein. JSTOR subscription required. Some of the most penetrating criticisms of Bayes are by Bayesians themselves; this is one of the best examples and a proposal for a more flexible Bayesian paradigm.

The Logic of Chance. John Venn. This work is important historically as the first time the frequentist interpretation of probability is developed. It presents some criticism of the first Bayesian (Laplace).

Probability is Symmetry. Krzysztof Burdzy. It is a bit hard to categorise this work. It is definitely a serious attempt to criticise the Bayesian approach. However it is not very connected to the literature as a whole, and the criticisms here are very different to the usual ones you will find. The disconnection from the literature is not a good thing - but the original thought is of course commendable. I hesitate in adding this to the list, because it is unpublished material - and the document contains a large number of errors; identifying these errors is a serious project. If Burdzy took his ideas seriously enough to subject them to refereeing and peer review and publish them in a journal this might be a worthy task. However at the moment he hasn't.