Bayesian probability: Difference between revisions

Added detail
 
imported>WeyerStudentOfAgrippa
 
{{Short description|Interpretation of probability}}
{{broader|Bayesian statistics}}
{{Bayesian statistics}}
 
'''Bayesian probability''' ({{IPAc-en|ˈ|b|eɪ|z|i|ə|n}} {{respell|BAY|zee|ən}} or {{IPAc-en|ˈ|b|eɪ|ʒ|ən}} {{respell|BAY|zhən}}){{refn|{{MerriamWebsterDictionary|=2023-08-12|Bayesian}}}} is an [[Probability interpretations|interpretation of the concept of probability]], in which, instead of [[frequentist probability|frequency]] or [[propensity probability|propensity]] of some phenomenon, probability is interpreted as reasonable expectation<ref>{{Cite journal |last=Cox |first=R.T. |author-link=Richard Threlkeld Cox |doi=10.1119/1.1990764 |title=Probability, Frequency, and Reasonable Expectation |journal=American Journal of Physics |volume=14 |issue=1 |pages=1–10 |year=1946 |bibcode=1946AmJPh..14....1C }}</ref> representing a state of knowledge<ref name="ghxaib">{{cite book |author=Jaynes, E.T. |year=1986 |contribution=Bayesian Methods: General Background |title=Maximum-Entropy and Bayesian Methods in Applied Statistics |editor=Justice, J. H. |location=Cambridge |publisher=Cambridge University Press|bibcode=1986mebm.conf.....J |citeseerx=10.1.1.41.1055 }}</ref> or as quantification of a personal belief.<ref name="Finetti, B. 1974">{{cite book |last1=de Finetti |first1=Bruno |title=Theory of Probability: A critical introductory treatment |year=2017 |publisher=John Wiley & Sons Ltd. |location=Chichester|isbn=9781119286370}}</ref>


The Bayesian interpretation of probability can be seen as an extension of [[propositional logic]] that enables reasoning with [[Hypothesis|hypotheses]];<ref name="Hailperin, T. 1996">{{cite book |last1=Hailperin |first1=Theodore |title=Sentential Probability Logic: Origins, Development, Current Status, and Technical Applications |year=1996 |publisher=Associated University Presses|location=London|isbn=0934223459}}</ref><ref>{{cite book |first=Colin |last=Howson |chapter=The Logic of Bayesian Probability |pages=137–159 |editor-first=D. |editor-last=Corfield |editor2-first=J. |editor2-last=Williamson |title=Foundations of Bayesianism |location=Dordrecht |publisher=Kluwer |year=2001 |isbn=1-4020-0223-8 }}</ref> that is, with propositions whose [[truth value|truth or falsity]] is unknown. In the Bayesian view, a probability is assigned to a hypothesis, whereas under [[frequentist inference]], a hypothesis is typically tested without being assigned a probability.
{{Main|History of statistics#Bayesian statistics}}


The term ''Bayesian'' derives from [[Thomas Bayes]] (1702–1761), who proved a special case of what is now called [[Bayes' theorem]] in a paper titled "[[An Essay Towards Solving a Problem in the Doctrine of Chances]]".<ref>{{cite book |author=McGrayne, Sharon Bertsch |year=2011 |title=The Theory that Would not Die |url=https://archive.org/details/theorythatwouldn0000mcgr |url-access=registration |at={{Google books|_Kx5xVGuLRIC|&nbsp;|page=[https://archive.org/details/theorythatwouldn0000mcgr/page/10 10]}} }}</ref> In that special case, the prior and posterior distributions were [[beta distribution]]s and the data came from [[Bernoulli trial]]s. It was [[Pierre-Simon Laplace]] (1749–1827) who introduced a general version of the theorem and used it to approach problems in [[celestial mechanics]], medical statistics, [[Reliability (statistics)|reliability]], and [[jurisprudence]].<ref>{{cite book |author=Stigler, Stephen M. |year=1986 |title=The History of Statistics |chapter-url=https://archive.org/details/historyofstatist00stig |chapter-url-access=registration |publisher=Harvard University Press |chapter=Chapter&nbsp;3|isbn=9780674403406 }}</ref> Early Bayesian inference, which used uniform priors following Laplace's [[principle of insufficient reason]], was called "[[inverse probability]]" (because it [[Inductive reasoning|infer]]s backwards from observations to parameters, or from effects to causes).<ref name=Fienberg2006>{{cite journal |author=Fienberg, Stephen E. |year=2006 |url=http://ba.stat.cmu.edu/journal/2006/vol01/issue01/fienberg.pdf |title=When did Bayesian Inference become "Bayesian"? |archive-url=https://web.archive.org/web/20140910070556/http://ba.stat.cmu.edu/journal/2006/vol01/issue01/fienberg.pdf |archive-date=10 September 2014 |journal=Bayesian Analysis |volume=1 |issue=1 |pages=5, 1–40|doi=10.1214/06-BA101 |doi-access=free |bibcode=2006BayAn...1BA101F }}</ref> After the 1920s, "inverse probability" was largely supplanted by a collection of methods that came to be called [[frequentist statistics]].<ref name=Fienberg2006/>
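
As a hedged illustration of the special case described above (the symbols ''θ'', ''s'', ''n'', ''α'' and ''β'' are notation assumed here, not taken from the cited sources): if the unknown success probability ''θ'' of a sequence of Bernoulli trials is given a beta prior, then observing ''s'' successes in ''n'' trials yields a beta posterior,

<math display="block">\theta \sim \operatorname{Beta}(\alpha, \beta) \quad\Longrightarrow\quad \theta \mid s, n \sim \operatorname{Beta}(\alpha + s,\ \beta + n - s).</math>

Under a uniform prior, which corresponds to ''α'' = ''β'' = 1, the posterior is Beta(''s'' + 1, ''n'' − ''s'' + 1), with mean (''s'' + 1)/(''n'' + 2).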


In the 20th century, the ideas of Laplace developed in two directions, giving rise to ''objective'' and ''subjective'' currents in Bayesian practice.


===Axiomatic approach===
[[Richard Threlkeld Cox|Richard T. Cox]] showed that Bayesian updating follows from several axioms, including two [[functional equations]] and a hypothesis of differentiability.<ref name = "vkdmsn" /><ref>{{cite book |first1=C. Ray |last1=Smith |first2=Gary |last2=Erickson |chapter=From Rationality and Consistency to Bayesian Probability |pages=29–44 |title=Maximum Entropy and Bayesian Methods |editor-first=John |editor-last=Skilling |location=Dordrecht |publisher=Kluwer |year=1989 |isbn=0-7923-0224-9 |doi=10.1007/978-94-015-7860-8_2 }}</ref> The assumption of differentiability or even continuity is controversial; Halpern found a counterexample based on his observation that the Boolean algebra of statements may be finite.<ref>{{cite journal |author=Halpern, J. |title=A counterexample to theorems of Cox and Fine |journal=Journal of Artificial Intelligence Research |volume=10 |pages=67–85|url=https://www.cs.cornell.edu/info/people/halpern/papers/cox.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://www.cs.cornell.edu/info/people/halpern/papers/cox.pdf |archive-date=2022-10-09 |url-status=live|doi=10.1613/jair.536 |year=1999 |s2cid=1538503 |doi-access=free }}</ref> Other axiomatizations have been suggested by various authors with the purpose of making the theory more rigorous.<ref name="rbp">{{cite journal |author1=Dupré, Maurice J. |author2=Tipler, Frank J. |url=http://projecteuclid.org/download/pdf_1/euclid.ba/1340369856 |title=New axioms for rigorous Bayesian probability |journal=Bayesian Analysis |volume=4 |year=2009 |issue=3 |pages=599–606|doi=10.1214/09-BA422 |citeseerx=10.1.1.612.3036 }}</ref>
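
In outline, and in the usual modern notation (an assumption made here for illustration; the symbols are not Cox's own), the rules recovered by such axiomatizations are the product rule and the sum rule. Bayes' theorem then follows from the product rule, because the conjunction of ''A'' and ''B'' is symmetric in its two arguments:

<math display="block">\begin{align}
P(A \land B \mid C) &= P(A \mid C)\, P(B \mid A \land C), \\
P(A \mid C) + P(\lnot A \mid C) &= 1, \\
P(A \mid B \land C) &= \frac{P(B \mid A \land C)\, P(A \mid C)}{P(B \mid C)} \quad \text{when } P(B \mid C) > 0.
\end{align}</math>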


===Dutch book approach===
A [[statistical decision theory|decision-theoretic]] justification of the use of Bayesian inference (and hence of Bayesian probabilities) was given by [[Abraham Wald]], who proved that every [[admissible decision rule|admissible]] statistical procedure is either a Bayesian procedure or a limit of Bayesian procedures.<ref>{{cite book |author=Wald, Abraham |title=Statistical Decision Functions |publisher=Wiley |year=1950}}</ref> Conversely, every Bayesian procedure is [[admissible decision rule|admissible]].<ref>{{cite book |author1=Bernardo, José M. |author2=Smith, Adrian F.M. |title=Bayesian Theory |publisher=John Wiley |year=1994 |isbn=0-471-92416-4}}</ref>


==Personal probabilities and objective methods for constructing priors==
<!--'Subjective probability' redirects here-->
{{Anchor|subjective}}
Following the work on [[expected utility]] [[optimal decision|theory]] of [[Frank P. Ramsey|Ramsey]] and [[John von Neumann|von Neumann]], decision-theorists have accounted for [[optimal decision|rational behavior]] using a probability distribution for the [[Agent-based model|agent]]. [[Johann Pfanzagl]] completed the ''[[Theory of Games and Economic Behavior]]'' by providing an axiomatization of '''subjective probability'''<!--boldface per WP:R#PLA--> and utility, a task left uncompleted by von Neumann and [[Oskar Morgenstern]]: their original theory supposed that all the agents had the same probability distribution, as a convenience.<ref>Pfanzagl (1967, 1968)</ref> Pfanzagl's axiomatization was endorsed by Oskar Morgenstern: "Von Neumann and I have anticipated ... [the question whether probabilities] might, perhaps more typically, be subjective and have stated specifically that in the latter case axioms could be found from which could derive the desired numerical utility together with a number for the probabilities (cf. p. 19 of The Theory of Games and Economic Behavior). We did not carry this out; it was demonstrated by Pfanzagl ... with all the necessary rigor".<ref>Morgenstern (1976, page 65)</ref>


Ramsey and [[Leonard Jimmie Savage|Savage]] noted that the individual agent's probability distribution could be objectively studied in experiments. Procedures for [[statistical hypothesis testing|testing hypotheses]] about probabilities (using finite samples) are due to [[Frank P. Ramsey|Ramsey]] (1931) and [[Bruno de Finetti|de Finetti]] (1931, 1937, 1964, 1970). Both [[Bruno de Finetti]]<ref>{{Cite journal |last=Galavotti |first=Maria Carla|author-link= Maria Carla Galavotti |date=1989-01-01 |title=Anti-Realism in the Philosophy of Probability: Bruno de Finetti's Subjectivism |journal=Erkenntnis |volume=31 |issue=2/3 |pages=239–261 |doi=10.1007/bf01236565 |jstor=20012239 |s2cid=170802937 |df=dmy-all}}</ref><ref name=":0">{{Cite journal |last=Galavotti |first=Maria Carla|author-link= Maria Carla Galavotti |date=1991-12-01 |title=The notion of subjective probability in the work of Ramsey and de Finetti |journal=Theoria |language=en |volume=57 |issue=3 |pages=239–259 |doi=10.1111/j.1755-2567.1991.tb00839.x |issn=1755-2567 |df=dmy-all}}</ref> and [[Frank P. Ramsey]]<ref name=":0" /><ref name=":1">{{Cite book |title=Frank Ramsey: Truth and Success |last1=Dokic |first1=Jérôme |last2=Engel |first2=Pascal |publisher=Routledge |year=2003 |isbn=9781134445936}}</ref> acknowledge their debts to [[pragmatic philosophy]], particularly (for Ramsey) to [[Charles Sanders Peirce|Charles S. Peirce]].<ref name=":0" /><ref name=":1" />
* {{cite book |author=Winkler, R.L. |title=Introduction to Bayesian Inference and Decision |publisher=Probabilistic |year=2003 |isbn=978-0-9647938-4-2 |edition=2nd |quote=Updated classic textbook. Bayesian theory clearly presented}}
{{divcol end}}
 
{{Statistics|inference}}
[[Category:Bayesian statistics|Probability]]
[[Category:Justification (epistemology)]]