Larry sent me this review of a book on the philosophy of statistics that Christian and I reviewed recently, which I'll paste in below. Then I'll offer a few comments of my own.
Larry writes:
After reading the reviews of Kris Burdzy's book "The Search for Certainty" that appeared on the blogs of Andrew Gelman and Christian Robert, I was tempted to dismiss the book without reading it. However, curiosity got the best of me and I ordered the book and read it. I am glad I did. I think this is an interesting and important book.
Both Gelman and Robert were disappointed that Burdzy's criticism of philosophical work on the foundations of probability did not seem to have any bearing on their work as statisticians. But that was precisely the author's point. In practice, statisticians completely ignore (or misrepresent) the philosophical foundations espoused by de Finetti (subjectivism) and von Mises (frequentism). This is itself a damning criticism of the supposed foundational edifice of statistics. Burdzy makes a convincing case that the philosophy of probability is a complete failure.
He criticizes von Mises because his theory, based on defining limits of sequences (or collectives), does not assign a probability to a given event. (There are also technical issues with the mathematical definition of a collective that von Mises was unable to resolve, but these can be fixed rigorously using modern computational complexity theory. But that doesn't blunt the force of Burdzy's main criticism.)
His criticism of de Finetti is more thorough. There is the usual criticism, namely, that subjective probability is unscientific as it is not falsifiable. Moreover, there is no guidance on how to actually set probabilities. Nor is there anything in de Finetti to suggest that probabilities should be based on informed prior opinion, as many Bayesians would argue. More surprising is Burdzy's claim that subjective probability has the same problem as von Mises' frequency theory: it does not provide probability for an individual event. This claim will raise the hackles of die-hard Bayesians. But he is right: de Finetti's coherence argument requires that you bet on several events. The rules of probability arise from the demand that you avoid a sure losing bet (a Dutch book) on the collection of bets. The argument does not work if we supply a probability only on a single event. The criticisms of de Finetti's subjectivism go beyond this and I will not attempt to summarize them.
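To make the coherence point concrete, here is a minimal Python sketch. It is an illustration only, not something taken from the book or from de Finetti. It assumes a hypothetical bettor who quotes prices for $1 tickets on an event A and on its complement and who will buy or sell either ticket at the quoted price. The function name `sure_loss` is made up for this example.

```python
def sure_loss(price_a: float, price_not_a: float) -> float:
    """Guaranteed loss for a bettor quoting these prices for $1 tickets on A and on not-A.

    Assumption (hypothetical setup): the bettor will buy or sell either
    ticket at the quoted price. Exactly one of the two tickets pays $1,
    so the pair of tickets is worth exactly $1 whatever happens.
    """
    total = price_a + price_not_a
    # total > 1: the opponent sells the bettor both tickets;
    #            the bettor pays `total` and collects 1.
    # total < 1: the opponent buys both tickets from the bettor;
    #            the bettor collects `total` and pays out 1.
    return abs(total - 1.0)


# Incoherent prices: a sure loss of about 20 cents per pair of tickets.
print(sure_loss(0.6, 0.6))

# Coherent prices (they sum to 1): no sure loss is available.
print(sure_loss(0.6, 0.4))
```

The sure loss only appears once prices on two related events are on the table. A single quoted price, taken alone, leaves the opponent nothing to combine, which is the sense in which the argument needs a collection of bets.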
Burdzy provides his own foundation for probability. His idea is that probability should be a science, not a philosophy, and that, as such, it should be falsifiable. Allow me to make an analogy. Open any elementary book on quantum mechanics and you will find a set of axioms. These axioms can be used to make very specific predictions. If the predictions were wrong (and they never have been), then the axioms would be rejected. But to use the axioms, one must inject some specifics. In particular, one must supply the Hamiltonian for the problem. If the resulting predictions fail to agree with reality, we can reject that Hamiltonian.
To make probability scientific, Burdzy proposes laws that lead to certain predictions that are vulnerable to falsification. More importantly, the specific probability assignments we make are open to being falsified. Before stating his laws, let me emphasize a crucial aspect of Burdzy's approach. Probability, he claims, is the search for certainty; hence the title of the book. That might seem counter to how we think of probability but I think his idea is correct. In frequentist theory, we make deterministic predictions about limits of sequences. In subjectivist theory, we make the deterministic claim that if we assign probabilities consistent with the rules of probability then we are certain to be immune to a Dutch book. A philosophy of probability, according to Burdzy, is the search for what claims we can make for certain.
Burdzy's proposal is to have laws -- not axioms -- of probability. Axioms, he points out, merely encode facts we regard as uncontroversial. Laws, instead, are proposals for a scientific theory that are open to falsification. Here are his five proposed laws (paraphrased):
(L1) Probabilities are numbers between 0 and 1.
(L2) If A and B are disjoint then P(A or B) = P(A) + P(B).
(L3) If A and B are physically independent then they are mathematically independent, meaning that P(A and B) = P(A)P(B).
(L4) If there exists a symmetry on the space of possible outcomes which maps an event A onto an event B then P(A)=P(B).
(L5) P(A)=0 if and only if A cannot occur. P(A)=1 if and only if it must occur.
Some comments are in order. (L1) and (L2) are standard of course. (L4) refers to ideas like independent and identically distributed sequences, or exchangeability. It is not an appeal to the principle of indifference. Quite the opposite. Burdzy argues that introducing symmetry requires information, not lack of information.
(L3) and (L4) are taught in every probability course as add-ons. But in fact they are central to how we actually construct probabilities in practice. The author asks: Why treat them as follow-up ideas? They are so central to how we use probability that we should elevate them to the status of fundamental laws.
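As a small illustration of how (L3) and (L4) do the constructive work, here is a Python sketch that builds the probability that two dice sum to 7. The setup is an assumption made for the example, not taken from the book: the dice are physically symmetric and the two rolls are physically independent.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# (L4) symmetry: for a physically symmetric die (an assumption here),
# relabeling the faces maps any face onto any other, so all six faces
# get equal probability. Together with (L2) and the fact that some
# face must come up, each face gets 1/6.
p_face = {f: Fraction(1, 6) for f in faces}

# (L3) physically independent rolls are mathematically independent,
# so the probability of an ordered pair is the product.
p_pair = {(a, b): p_face[a] * p_face[b] for a, b in product(faces, faces)}

# (L2) additivity over the disjoint pairs that sum to 7.
p_seven = sum(p for (a, b), p in p_pair.items() if a + b == 7)

print(p_seven)  # 1/6
```

Every number in the calculation comes from a symmetry or independence judgment about the physical setup, which is the information that has to be supplied before the laws produce anything.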
(L5) is what makes the theory testable. Here is how it works. Based on our probability assignments, we can construct events A that have probability very close to 0 or 1. For example, A could be the event that the proportion of heads in many tosses is within .00001 of 1/2. If this doesn't happen, then we have falsified the probability assignment. Of course P(A) will rarely be exactly 0 or 1; rather, it will be close to 0 or 1. But this is precisely what happens in all sciences. We can test predictions of general relativity or quantum mechanics to a level of essential certainty, but never exact certainty. Thus Burdzy's approach puts probability on the same level as other scientific theories.
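To get a feel for how many tosses "many" has to be in the coin example, here is a rough Python sketch. It uses Hoeffding's inequality, which is an assumption of this sketch and not something taken from the book, and it assumes independent tosses of a fair coin. The function names and the tolerance `delta` are made up for the illustration.

```python
import math


def hoeffding_tail(n: int, eps: float) -> float:
    """Upper bound on P(|heads/n - 1/2| >= eps) for n independent fair tosses.

    Hoeffding's inequality gives 2 * exp(-2 * n * eps**2); the bound is
    capped at 1 because it is vacuous for small n.
    """
    return min(1.0, 2.0 * math.exp(-2.0 * n * eps**2))


def tosses_needed(eps: float, delta: float) -> int:
    """Smallest n for which the Hoeffding bound above is at most delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps**2))


eps = 1e-5                          # the ".00001" from the example
n = tosses_needed(eps, delta=1e-9)  # delta is a hypothetical tolerance
print(n, hoeffding_tail(n, eps))
```

Under these assumptions, n comes out on the order of 10^11 tosses. With that many tosses, the event "the proportion of heads is more than .00001 away from 1/2" has probability at most about one in a billion under the fair-coin assignment, so observing it would count as falsifying that assignment in the sense of (L5). The bound is conservative, so somewhat fewer tosses would do.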
To summarize, Burdzy's approach is to treat probability as a scientific theory. It has rules for making probability assignments and the resulting probabilities can be falsified. Not only is this simple, it is devoid of the murkiness of subjectivism and the weakness of von Mises' frequentism. And, perhaps most importantly, it reflects how we use probability. It also happens to be easy to teach. My only criticism is that I think the implications of (L1)-(L5) could be fleshed out in more detail. It seems to me that they work well for providing a foundation for testable frequency probability. That is, they provide a convincing link between probability and frequency. But that could reflect my own bias towards frequency probability. More detail would have been nice.
My short summary of this book does not do justice to the author's arguments. In particular, there is much more to his critique of subjective probability than I have presented in this review. The best thing about this book is that it will offend and annoy both frequentists and subjectivists. I implore my friends on both sides of the philosophical divide to read the book with an open mind.
My reply:
1. Whatever von Mises's merits (or lack thereof) in general, I can't take him seriously as a philosopher of statistical practice (see pages 3-4 of this article).
2. As I wrote earlier, Burdzy's comments about subjectivism may or may not be accurate, but they have nothing to do with the Bayesian data analysis that I do. In that sense, I don't think that Larry's comment about "both sides of the philosophical divide" is particularly helpful. I see no reason to choose between two discredited philosophies, and in fact in chapter 1 of BDA we are very clear about the position we take, which indeed is completely consistent with Popper's ideas of refutation and falsifiability.
As I wrote before, "My guess is that Burdzy would differ very little from Christian Robert or myself when it comes to statistical practice. . . . but I suppose that different styles of presentation will be effective with different audiences." Larry's review suggests that there are such audiences out there.