Abstract
This paper discusses conditional probability \(P(A{\vert }B)\), or the probability of A given B. When \(P(B)>0\), the ratio formula determines \(P(A {\vert } B)\). When \(P(B)=0\), the ratio formula breaks down. The Borel–Kolmogorov paradox suggests that conditional probabilities in such cases are indeterminate or ill-posed. To analyze the paradox, I explore the relation between probability and intensionality. I argue that the paradox is a Frege case, similar to those that arise in many probabilistic and non-probabilistic contexts. The paradox vividly illustrates how an agent’s way of representing an entity can rationally influence her credal assignments. I deploy my analysis to defend Kolmogorov’s relativistic treatment of conditional probability.
Notes
As Easwaran (2008) observes, decomposition of the sphere into meridians is not technically a partition, because any two meridians share the North and South Pole. There are various technical fixes here, such as associating the North and South Poles with one privileged meridian. I will ignore this issue. Taking it into account would muddy the exposition without affecting any essential features of my argument.
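The two conditional densities at the heart of the paradox can be illustrated with a short Monte Carlo sketch (my own numerical illustration, with arbitrarily chosen variable names, not part of the original text). Conditioning on a meridian via shrinking longitude wedges yields the density \(\frac{1}{2}\cos \theta \) over latitude, whereas conditioning on the equator via shrinking latitude bands yields a uniform density over longitude, even though both conditioning events are great circles:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000

# Uniform points on the sphere: longitude phi in (-pi, pi], latitude theta.
phi = rng.uniform(-np.pi, np.pi, n)
theta = np.arcsin(rng.uniform(-1.0, 1.0, n))

eps = 0.01

# Condition on the meridian phi = 0 via a shrinking longitude wedge:
# the limiting density over latitude is (1/2) cos(theta), not uniform.
on_meridian = np.abs(phi) < eps
frac_mid = np.mean(np.abs(theta[on_meridian]) < np.pi / 6)
print(frac_mid)  # ≈ sin(pi/6) = 0.5, as the cosine density predicts

# Condition on the equator theta = 0 via a shrinking latitude band:
# the limiting density over longitude is uniform, 1/(2 pi).
on_equator = np.abs(theta) < eps
frac_quarter = np.mean((phi[on_equator] >= 0) & (phi[on_equator] < np.pi / 2))
print(frac_quarter)  # ≈ 1/4 under the uniform density
```

The asymmetry between the two limits is precisely what the paradox trades on: the same circle, described two ways, supports two different conditional densities.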
I follow Billingsley’s (1995, pp. 427–440) exposition of Kolmogorov’s theory.
Following standard practice, I use “\(X\in B\)” as shorthand for “\(\{\omega : X(\omega ) \in B\}\)” and “\(\{X=x\}\)” as shorthand for “\(\{\omega : X(\omega )=x\}\)”.
The example recalls van Fraassen’s box factory. For related discussion, see Bangu (2010).
Random variable \(\Psi \) differs from the random variable \(\Theta \) from Sect. 3. \(\Theta \) measures latitude in the same coordinate system as \(\Phi \), while \(\Psi \) measures latitude in a second coordinate system. Arc \(C\) is a meridian in the first coordinate system, while \(C\) occupies half the equator in the second coordinate system. Lines of constant \(\Theta \) are parallels, while lines of constant \(\Psi \) are half-parallels. \(\mathrm{range}(\Theta )=\left[ {-\frac{\pi }{2},\frac{\pi }{2}} \right] \), while \(\mathrm{range}(\Psi ) = (-\pi , \pi ]\). These changes ensure that some equation \(\Psi =\psi _{0}\) defines \(C\). It is natural to introduce an additional random variable \(\Gamma \) with \(\mathrm{range}(\Gamma ) = [0, \pi )\), where each equation \(\Gamma =\upgamma \) defines a great circle through the North Pole of the second coordinate system. I will describe \(\Gamma \) as measuring longitude, although strictly speaking a point that satisfies \(\Gamma =\upgamma \) may have either longitude \(\upgamma \) or longitude \(\upgamma +\pi \). One can show that \(p(\upgamma {\vert } \Psi =\psi )=\frac{1}{\pi }\).
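The identity \(p(\upgamma {\vert } \Psi =\psi )=\frac{1}{\pi }\) can be checked numerically. The sketch below is my own illustration, not from the paper; it exploits rotation invariance of the uniform measure to work directly in the second coordinate system, and it simplifies by conditioning on a full parallel rather than a half-parallel (reducing longitude mod \(\pi \) yields the same uniform density on \([0, \pi )\)):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000_000

# By rotation invariance of the uniform measure we may sample directly in
# the second coordinate system: psi is latitude there, phi2 is longitude.
phi2 = rng.uniform(-np.pi, np.pi, n)
psi = np.arcsin(rng.uniform(-1.0, 1.0, n))

# Condition on a thin band around Psi = psi_0 (psi_0 = 0.3 is arbitrary)
# and reduce longitude mod pi to obtain Gamma-values in [0, pi).
psi_0, eps = 0.3, 0.01
band = np.abs(psi - psi_0) < eps
gamma = np.mod(phi2[band], np.pi)

# If p(gamma | Psi = psi_0) = 1/pi, each third of [0, pi) receives mass 1/3.
frac = np.mean(gamma < np.pi / 3)
print(frac)  # ≈ 1/3
```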
Some authors claim that conditional probabilities depend not just upon the conditioning event but also upon the way one learns that the conditioning event occurred (Easwaran 2008, pp. 88–89), (Lindley 1997, p. 184), (Shafer 1985, p. 262). In my view, the key explanatory factor is not the way one learns that the conditioning event occurred but rather the way one represents the conditioning event. One can learn through many diverse avenues that an event, represented in a single fixed way, occurred. In some cases, a fixed way of representing the conditioning event may have constitutive ties to certain canonical verification procedures. But non-canonical verification procedures can also establish that the event, so represented, occurred. For example, one can learn that \(\Phi =\varphi _{0}\) through direct measurement, deductive reasoning, testimony, abductive inference, or various other avenues. These variations in learning method do not seem relevant to conditionalization. What matters is simply that one represents the conditioning event in a certain way: namely, as the event \(\Phi =\varphi _{0}\). The content of one’s knowledge, not its etiology or its justificatory basis, is the relevant factor.
For ease of exposition, I will frequently attribute various doctrines and assumptions to Kolmogorov. However, I do not suggest that my treatment explicates Kolmogorov’s intentions. I claim only that it constitutes one fruitful way of relating Kolmogorov’s mathematical framework to the non-mathematical realm. Kolmogorov favored a frequentist viewpoint (1933/1956, p. 3), so presumably he would not have endorsed my subjectivist approach.
When I use the phrase “a constant density over longitude” here and in subsequent passages, I mean the conditional pdf \(p(\upgamma {\vert } \Psi =\psi )=\frac{1}{\pi }\) described in note 7.
My discussion involves normative claims about how ideal agents should update their credences. It thereby differs significantly from the idealizations employed in natural science, which simplify complex reality without making normative claims (Wimsatt 2007, pp. 15–25). For discussion of ideal agents as an epistemological tool, see (Shaffer 2007). For discussion of idealization in scientific modeling, see (Shaffer 2012) and (Wimsatt 2007, pp. 3–4, 26–36, 94–132, 152–154).
Thanks to an anonymous referee for suggesting that I discuss this example.
References
Arnold, B. C., & Robertson, C. A. (2002). The conditional distribution of X given X = Y can be almost anything! In N. Balakrishnan (Ed.), Advances on theoretical and methodological aspects of probability and statistics. New York: Taylor and Francis.
Arntzenius, F., Elga, A., & Hawthorne, J. (2004). Bayesianism, infinite decisions, and binding. Mind, 113, 251–283.
Bangu, S. (2010). On Bertrand’s paradox. Analysis, 70, 30–35.
Bertrand, J. (1889). Calcul des probabilités. Paris: Gauthier-Villars.
Billingsley, P. (1995). Probability and measure (3rd ed.). New York: Wiley.
Borel, E. (1909/1956). Elements of the theory of probability (J. Freund, Trans.). Englewood Cliffs: Prentice-Hall.
Burge, T. (2009). Five theses on de re states and attitudes. In J. Almog & P. Leonardi (Eds.), The philosophy of David Kaplan. Oxford: Oxford University Press.
Chalmers, D. (2011). Frege’s puzzle and the objects of credence. Mind, 120, 587–635.
Davidson, D. (2004). Problems of rationality. Oxford: Oxford University Press.
de Finetti, B. (1972). Probability, induction, and statistics. New York: Wiley.
Dubins, L. (1975). Finitely additive conditional probabilities, conglomerability and disintegrations. Annals of Probability, 3, 89–99.
Easwaran, K. (2008). The foundations of conditional probability. PhD dissertation, University of California, Berkeley. Ann Arbor: ProQuest/UMI (Publication No. 3331592).
Easwaran, K. (2011). Varieties of conditional probability. In P. Bandyopadhyay & M. Forster (Eds.), Philosophy of statistics. Burlington: Elsevier.
Field, H. (2001). Truth and the absence of fact. Oxford: Oxford University Press.
Gorroochurn, P. (2012). Classic problems of probability. Hoboken: Wiley.
Hájek, A. (2003). What conditional probability could not be. Synthese, 137, 273–323.
Hájek, A. (2011). Conditional probability. In P. Bandyopadhyay & M. Forster (Eds.), Philosophy of statistics. Burlington: Elsevier.
Hill, B. (1980). On some statistical paradoxes and non-conglomerability. Trabajos de Estadistica de Investigacion Operativa, 31, 39–66.
Howson, C. (2014). Finite additivity, another lottery paradox, and conditionalization. Synthese, 191, 989–1012.
Jaynes, E. T. (2003). Probability theory: The logic of science. Cambridge: Cambridge University Press.
Kadane, J., Schervish, M., & Seidenfeld, T. (1986). Statistical implications of finitely additive probability. In P. Goel & A. Zellner (Eds.), Bayesian inference and decision techniques. Amsterdam: North-Holland.
Kadane, J., Schervish, M., & Seidenfeld, T. (2001). Improper regular conditional distributions. Annals of Probability, 29, 1612–1624.
Kolmogorov, A. N. (1933/1956). Foundations of the theory of probability (2nd English ed.) (N. Morrison, Trans.). New York: Chelsea.
Lindley, D. (1982). The Bayesian approach to statistics. In J. T. de Oliveira & B. Epstein (Eds.), Some recent advances in statistics. New York: Academic Press.
Lindley, D. (1997). Some comments on Bayes factors. Journal of Statistical Planning and Inference, 61, 181–189.
Myrvold, W. (2014). You can’t always get what you want: Some considerations regarding conditional probabilities. Erkenntnis. doi:10.1007/s10670-014-9656-3.
Pfanzagl, J. (1979). Conditional distributions as derivatives. Annals of Probability, 7, 1046–1050.
Popper, K. (1959). The logic of scientific discovery. London: Hutchinson.
Proschan, M., & Presnell, B. (1998). Expect the unexpected from conditional expectation. The American Statistician, 52, 248–252.
Rao, M. M. (2005). Conditional measures and applications (2nd ed.). Boca Raton: CRC Press.
Rényi, A. (1955). On a new axiomatic theory of probability. Acta Mathematica Academiae Scientiarum Hungarica, 6, 285–335.
Shafer, G. (1985). Conditional probability. International Statistical Review, 53, 261–277.
Shaffer, M. (2007). Bealer on the autonomy of philosophical and scientific knowledge. Metaphilosophy, 38, 44–54.
Shaffer, M. (2012). Counterfactuals and scientific realism. New York: Palgrave Macmillan.
Tjur, T. (1980). Probability based on Radon measures. New York: Wiley.
van Fraassen, B. (1989). Laws and symmetry. Oxford: Clarendon Press.
von Mises, R. (1957). Probability, statistics, and truth (2nd English ed.). London: Allen and Unwin.
Wimsatt, W. (2007). Re-engineering philosophy for limited beings. Cambridge: Harvard University Press.
Acknowledgments
I am indebted to Tim Butzer, Kenny Easwaran, Greg Gandenberger, Colin Howson, Agustín Rayo, Frederick Paik Schoenberg, and two anonymous referees for this journal for helpful comments on this paper. I also thank Melanie Schoenberg for assistance in preparing Figs. 1, 2, 3, 4, 5, 6, and 7 and RJ Duran for creating the final Mathematica versions of the figures.
Rescorla, M. Some epistemological ramifications of the Borel–Kolmogorov paradox. Synthese 192, 735–767 (2015). https://doi.org/10.1007/s11229-014-0586-z