Geometric distribution
Template:Infobox probability distribution 2 In probability theory and statistics, the geometric distribution is either one of two discrete probability distributions:
- The probability distribution of the number Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X} of Bernoulli trials needed to get one success, supported on Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N} = \{1,2,3,\ldots\}} ;
- The probability distribution of the number Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Y=X-1} of failures before the first success, supported on Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0 = \{0, 1, 2, \ldots \} } .
These two different geometric distributions should not be confused with each other. Often, the name shifted geometric distribution is adopted for the former one (distribution of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X} ); however, to avoid ambiguity, it is considered wise to indicate which is intended, by mentioning the support explicitly.
The geometric distribution gives the probability that the first occurrence of success requires Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k} independent trials, each with success probability Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} . If the probability of success on each trial is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} , then the probability that the Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k} -th trial is the first success is
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \Pr(X = k) = (1-p)^{k-1}p}
for Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k=1,2,3,4,\dots}
The above form of the geometric distribution is used for modeling the number of trials up to and including the first success. By contrast, the following form of the geometric distribution is used for modeling the number of failures until the first success:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \Pr(Y=k) = \Pr(X=k+1) = (1 - p)^k p}
for Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k=0,1,2,3,\dots}
The geometric distribution gets its name because its probabilities follow a geometric sequence. It is sometimes called the Furry distribution after Wendell H. Furry.[1]: 210
Definition
[edit | edit source]The geometric distribution is the discrete probability distribution that describes when the first success in an infinite sequence of independent and identically distributed Bernoulli trials occurs. Its probability mass function depends on its parameterization and support. When supported on Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}} , the probability mass function is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle P(X = k) = (1 - p)^{k-1} p} where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k = 1, 2, 3, \dotsc} is the number of trials and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} is the probability of success in each trial.[2]: 260–261
The support may also be Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} , defining Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Y=X-1} . This alters the probability mass function into Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle P(Y = k) = (1 - p)^k p} where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k = 0, 1, 2, \dotsc} is the number of failures before the first success.[3]: 66
An alternative parameterization of the distribution gives the probability mass function Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle P(Y = k) = \left(\frac{P}{Q}\right)^k \left(1-\frac{P}{Q}\right)} where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle P = \frac{1-p}{p}} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Q = \frac{1}{p}} .[1]: 208–209
An example of a geometric distribution arises from rolling a six-sided die until a "1" appears. Each roll is independent with a Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 1/6} chance of success. The number of rolls needed follows a geometric distribution with Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p=1/6} .
Properties
[edit | edit source]Memorylessness
[edit | edit source]The geometric distribution is the only memoryless discrete probability distribution.[4] It is the discrete version of the same property found in the exponential distribution.[1]: 228 The property asserts that the number of previously failed trials does not affect the number of future trials needed for a success.
Because there are two definitions of the geometric distribution, there are also two definitions of memorylessness for discrete random variables.[5] Expressed in terms of conditional probability, the two definitions are Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \Pr(X>m+n\mid X>n)=\Pr(X>m),} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \Pr(Y>m+n\mid Y\geq n)=\Pr(Y>m),}
where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle m} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n} are natural numbers, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X} is a geometrically distributed random variable defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}} , and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Y} is a geometrically distributed random variable defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} . Note that these definitions are not equivalent for discrete random variables; Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Y} does not satisfy the first equation and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X} does not satisfy the second.
Moments and cumulants
[edit | edit source]The expected value and variance of a geometrically distributed random variable Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X} defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}} is[2]: 261 Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \operatorname{E}(X) = \frac{1}{p}, \qquad \operatorname{var}(X) = \frac{1-p}{p^2}.} With a geometrically distributed random variable Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Y} defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} , the expected value changes into Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \operatorname{E}(Y) = \frac{1-p} p,} while the variance stays the same.[6]: 114–115
For example, when rolling a six-sided die until landing on a "1", the average number of rolls needed is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{1}{1/6} = 6} and the average number of failures is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{1 - 1/6}{1/6} = 5} .
The moment generating function of the geometric distribution when defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N} } and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} respectively is[7][6]: 114 Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} M_X(t) &= \frac{pe^t}{1-(1-p)e^t} \\ M_Y(t) &= \frac{p}{1-(1-p)e^t}, t < -\ln(1-p) \end{align}} The moments for the number of failures before the first success are given by Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \mathrm{E}(Y^n) & {} =\sum_{k=0}^\infty (1-p)^k p\cdot k^n \\ & {} =p \operatorname{Li}_{-n}(1-p) & (\text{for }n \neq 0) \end{align} }
where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \operatorname{Li}_{-n}(1-p) } is the polylogarithm function.[8]
The cumulant generating function of the geometric distribution defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} is[1]: 216 The cumulants Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \kappa_r} satisfy the recursionFailed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \kappa_{r+1} = q \frac{\delta\kappa_r}{\delta q}, r=1,2,\dotsc} where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle q = 1-p} , when defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} .[1]: 216
Proof of expected value
[edit | edit source]Consider the expected value Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{E}(X)} of X as above, i.e. the average number of trials until a success. The first trial either succeeds with probability Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} , or fails with probability Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 1-p} . If it fails, the remaining mean number of trials until a success is identical to the original mean - this follows from the fact that all trials are independent.
From this we get the formula:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \operatorname \mathrm{E}(X) = p + (1-p)(1 + \mathrm{E}[X]) ,}
which, when solved for Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{E}(X) } , gives:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \operatorname E(X) = \frac{1}{p}.}
The expected number of failures Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Y} can be found from the linearity of expectation, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{E}(Y) = \mathrm{E}(X-1) = \mathrm{E}(X) - 1 = \frac 1 p - 1 = \frac{1-p}{p}} . It can also be shown in the following way:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \operatorname E(Y) & =p\sum_{k=0}^\infty(1-p)^k k \\ & = p (1-p) \sum_{k=0}^\infty (1-p)^{k-1} k\\ & = p (1-p) \left(-\sum_{k=0}^\infty \frac{d}{dp}\left[(1-p)^k\right]\right) \\ & = p (1-p) \left[\frac{d}{dp}\left(-\sum_{k=0}^\infty (1-p)^k\right)\right] \\ & = p(1-p)\frac{d}{dp}\left(-\frac{1}{p}\right) \\ & = \frac{1-p}{p}. \end{align} }
The interchange of summation and differentiation is justified by the fact that convergent power series converge uniformly on compact subsets of the set of points where they converge.
Summary statistics
[edit | edit source]The mean of the geometric distribution is its expected value which is, as previously discussed in § Moments and cumulants, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{1}{p}} or Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{1-p}{p}} when defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}} or Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} respectively.
The median of the geometric distribution is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \left\lceil -\frac{\log 2}{\log(1-p)} \right\rceil} when defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}} [9] and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \left\lfloor-\frac{\log 2}{\log(1-p)}\right\rfloor} when defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} .[3]: 69
The mode of the geometric distribution is the first value in the support set. This is 1 when defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}} and 0 when defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} .[3]: 69
The skewness of the geometric distribution is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{2-p}{\sqrt{1-p}}} .[6]: 115
The kurtosis of the geometric distribution is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 9 + \frac{p^2}{1-p}} .[6]: 115 The excess kurtosis of a distribution is the difference between its kurtosis and the kurtosis of a normal distribution, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 3} .[10]: 217 Therefore, the excess kurtosis of the geometric distribution is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 6 + \frac{p^2}{1-p}} . Since Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{p^2}{1-p} \geq 0} , the excess kurtosis is always positive so the distribution is leptokurtic.[3]: 69 In other words, the tail of a geometric distribution has more mass and decays slower than a Gaussian's.[11]
Entropy and Fisher's information
[edit | edit source]Entropy (geometric distribution, failures before success)
[edit | edit source]Entropy is a measure of uncertainty in a probability distribution. For the geometric distribution that models the number of failures before the first success, the probability mass function is:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle P(X = k) = (1 - p)^k p, \quad k = 0, 1, 2, \dots}
The entropy Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle H(X)} for this distribution is defined as:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} H(X) &= - \sum_{k=0}^\infty P(X = k) \ln P(X = k) \\ &= - \sum_{k=0}^\infty (1 - p)^k p \ln \left( (1 - p)^k p \right) \\ &= - \sum_{k=0}^\infty (1 - p)^k p \left[ k \ln(1 - p) + \ln p \right] \\ &= -\log p - \frac{1 - p}{p} \log(1 - p) \end{align}}
The entropy increases as the probability Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} decreases, reflecting greater uncertainty as success becomes rarer.
Fisher's information (geometric distribution, failures before success)
[edit | edit source]Fisher information measures the amount of information that an observable random variable Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X} carries about an unknown parameter Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} . For the geometric distribution (failures before the first success), the Fisher information with respect to Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} is given by:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle I(p) = \frac{1}{p^2(1 - p)}}
Proof:
- The likelihood function for a geometric random variable Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X} is: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle L(p; X) = (1 - p)^X p}
- The log-likelihood function is: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \ln L(p; X) = X \ln(1 - p) + \ln p}
- The score function (first derivative of the log-likelihood w.r.t. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} ) is: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{\partial}{\partial p} \ln L(p; X) = \frac{1}{p} - \frac{X}{1 - p}}
- The second derivative of the log-likelihood function is: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{\partial^2}{\partial p^2} \ln L(p; X) = -\frac{1}{p^2} - \frac{X}{(1 - p)^2}}
- Fisher information is calculated as the negative expected value of the second derivative: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} I(p) &= -E\left[\frac{\partial^2}{\partial p^2} \ln L(p; X)\right] \\ &= - \left(-\frac{1}{p^2} - \frac{1 - p}{p (1 - p)^2} \right) \\ &= \frac{1}{p^2(1 - p)} \end{align}}
Fisher information increases as Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} decreases, indicating that rarer successes provide more information about the parameter Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} .
Entropy (geometric distribution, trials until success)
[edit | edit source]For the geometric distribution modeling the number of trials until the first success, the probability mass function is:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle P(X = k) = (1 - p)^{k - 1} p, \quad k = 1, 2, 3, \dots}
The entropy Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle H(X)} for this distribution is the same as that of version modeling trials until failure,
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} H(X) &= - \log p - \frac{1 - p}{p} \log(1 - p) \end{align}}
Fisher's information (geometric distribution, trials until success)
[edit | edit source]Fisher information for the geometric distribution modeling the number of trials until the first success is given by:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle I(p) = \frac{1}{p^2(1 - p)}}
Proof:
- The likelihood function for a geometric random variable Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X} is:
- Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle L(p; X) = (1 - p)^{X - 1} p}
- The log-likelihood function is:
- Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \ln L(p; X) = (X - 1) \ln(1 - p) + \ln p}
- The score function (first derivative of the log-likelihood w.r.t. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} ) is:
- Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{\partial}{\partial p} \ln L(p; X) = \frac{1}{p} - \frac{X - 1}{1 - p}}
- The second derivative of the log-likelihood function is:
- Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{\partial^2}{\partial p^2} \ln L(p; X) = -\frac{1}{p^2} - \frac{X - 1}{(1 - p)^2}}
- Fisher information is calculated as the negative expected value of the second derivative:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} I(p) &= -E\left[\frac{\partial^2}{\partial p^2} \ln L(p; X)\right] \\ &= - \left(-\frac{1}{p^2} - \frac{1 - p}{p (1 - p)^2} \right) \\ &= \frac{1}{p^2(1 - p)} \end{align}}
General properties
[edit | edit source]- The probability generating functions of geometric random variables Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X } and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Y } defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N} } and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0 } are, respectively,[6]: 114–115 Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} G_X(s) & = \frac{s\,p}{1-s\,(1-p)}, \\[10pt] G_Y(s) & = \frac{p}{1-s\,(1-p)}, \quad |s| < (1-p)^{-1}. \end{align}}
- The characteristic function Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \varphi(t)} is equal to Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle G(e^{it})} so the geometric distribution's characteristic function, when defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N} } and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0 } respectively, is[12]: 1630 Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \varphi_X(t) &= \frac{pe^{it}}{1-(1-p)e^{it}},\\[10pt] \varphi_Y(t) &= \frac{p}{1-(1-p)e^{it}}. \end{align}}
- The entropy of a geometric distribution with parameter Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} is[13]Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle -\frac{p \log_2 p + (1-p) \log_2 (1-p)}{p}}
- Given a mean, the geometric distribution is the maximum entropy probability distribution of all discrete probability distributions. The corresponding continuous distribution is the exponential distribution.[14]
- The geometric distribution defined on Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0 } is infinitely divisible, that is, for any positive integer Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n} , there exist Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n} independent identically distributed random variables whose sum is also geometrically distributed. This is because the negative binomial distribution can be derived from a Poisson-stopped sum of logarithmic random variables.[12]: 606–607
- The decimal digits of the geometrically distributed random variable Y are a sequence of independent (and not identically distributed) random variables.[citation needed] For example, the hundreds digit D has this probability distribution: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \Pr(D=d) = {q^{100d} \over 1 + q^{100} + q^{200} + \cdots + q^{900}},} where q = 1 − p, and similarly for the other digits, and, more generally, similarly for numeral systems with other bases than 10. When the base is 2, this shows that a geometrically distributed random variable can be written as a sum of independent random variables whose probability distributions are indecomposable.
- Golomb coding is the optimal prefix code[clarification needed] for the geometric discrete distribution.[13]
Related distributions
[edit | edit source]- The sum of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle r} independent geometric random variables with parameter Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} is a negative binomial random variable with parameters Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle r} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} .[15] The geometric distribution is a special case of the negative binomial distribution, with Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle r=1} .
- The geometric distribution is a special case of discrete compound Poisson distribution.[12]: 606
- The minimum of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n} geometric random variables with parameters Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p_1, \dotsc, p_n} is also geometrically distributed with parameter Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 1 - \prod_{i=1}^n (1-p_i)} .[16]
- Suppose 0 < r < 1, and for k = 1, 2, 3, ... the random variable Xk has a Poisson distribution with expected value rk/k. Then Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sum_{k=1}^\infty k\,X_k} has a geometric distribution taking values in Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} , with expected value r/(1 − r).[citation needed]
- The exponential distribution is the continuous analogue of the geometric distribution. Applying the floor function to the exponential distribution with parameter Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \lambda} creates a geometric distribution with parameter Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p=1-e^{-\lambda}} defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} .[3]: 74 This can be used to generate geometrically distributed random numbers as detailed in § Random variate generation.
- If p = 1/n and X is geometrically distributed with parameter p, then the distribution of X/n approaches an exponential distribution with expected value 1 as n → ∞, sinceFailed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \Pr(X/n>a)=\Pr(X>na) & = (1-p)^{na} = \left(1-\frac 1 n \right)^{na} = \left[ \left( 1-\frac 1 n \right)^n \right]^{a} \\ & \to [e^{-1}]^{a} = e^{-a} \text{ as } n\to\infty. \end{align} } More generally, if p = λ/n, where λ is a parameter, then as n→ ∞ the distribution of X/n approaches an exponential distribution with rate λ:Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \Pr(X>nx)=\lim_{n \to \infty}(1-\lambda /n)^{nx}=e^{-\lambda x}} therefore the distribution function of X/n converges to Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 1-e^{-\lambda x}} , which is that of an exponential random variable.[citation needed]
- The index of dispersion of the geometric distribution is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{1}{p}} and its coefficient of variation is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{1}{\sqrt{1-p}}} . The distribution is overdispersed.[1]: 216
Statistical inference
[edit | edit source]The true parameter Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} of an unknown geometric distribution can be inferred through estimators and conjugate distributions.
Method of moments
[edit | edit source]Provided they exist, the first Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle l} moments of a probability distribution can be estimated from a sample Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_1, \dotsc, x_n} using the formulaFailed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle m_i = \frac{1}{n} \sum_{j=1}^n x^i_j} where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle m_i} is the Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle i} th sample moment and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 1 \leq i \leq l} .[17]: 349–350 Estimating Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{E}(X)} with Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle m_1} gives the sample mean, denoted Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \bar{x} } . Substituting this estimate in the formula for the expected value of a geometric distribution and solving for Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p } gives the estimators Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \hat{p} = \frac{1}{\bar{x}} } and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \hat{p} = \frac{1}{\bar{x}+1} } when supported on Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} respectively. These estimators are biased since Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{E}\left(\frac{1}{\bar{x}}\right) > \frac{1}{\mathrm{E}(\bar{x})} = p} as a result of Jensen's inequality.[18]: 53–54
Maximum likelihood estimation
[edit | edit source]The maximum likelihood estimator of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} is the value that maximizes the likelihood function given a sample.[17]: 308 By finding the zero of the derivative of the log-likelihood function when the distribution is defined over Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}} , the maximum likelihood estimator can be found to be Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \hat{p} = \frac{1}{\bar{x}}} , where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \bar{x}} is the sample mean.[19] If the domain is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} , then the estimator shifts to Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \hat{p} = \frac{1}{\bar{x}+1}} . As previously discussed in § Method of moments, these estimators are biased.
Regardless of the domain, the bias is equal to
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle b \equiv \operatorname{E}\bigg[\;(\hat p_\mathrm{mle} - p)\;\bigg] = \frac{p\,(1-p)}{n} }
which yields the bias-corrected maximum likelihood estimator,[citation needed]
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \hat{p\,}^*_\text{mle} = \hat{p\,}_\text{mle} - \hat{b\,} }
Bayesian inference
[edit | edit source]In Bayesian inference, the parameter Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} is a random variable from a prior distribution with a posterior distribution calculated using Bayes' theorem after observing samples.[18]: 167 If a beta distribution is chosen as the prior distribution, then the posterior will also be a beta distribution and it is called the conjugate distribution. In particular, if a Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{Beta}(\alpha,\beta)} prior is selected, then the posterior, after observing samples Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k_1, \dotsc, k_n \in \mathbb{N}} , is[20]Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p \sim \mathrm{Beta}\left(\alpha+n,\ \beta+\sum_{i=1}^n (k_i-1)\right). \!} Alternatively, if the samples are in Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbb{N}_0} , the posterior distribution is[21]Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p \sim \mathrm{Beta}\left(\alpha+n,\beta+\sum_{i=1}^n k_i\right).} Since the expected value of a Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathrm{Beta}(\alpha,\beta)} distribution is Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \frac{\alpha}{\alpha+\beta}} ,[12]: 145 as Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \alpha} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \beta} approach zero, the posterior mean approaches its maximum likelihood estimate.
Random variate generation
[edit | edit source]The geometric distribution can be generated experimentally from i.i.d. standard uniform random variables by finding the first such random variable to be less than or equal to Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} . However, the number of random variables needed is also geometrically distributed and the algorithm slows as Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} decreases.[22]: 498
Random generation can be done in constant time by truncating exponential random numbers. An exponential random variable Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle E} can become geometrically distributed with parameter Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} through Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \lceil -E/\log(1-p) \rceil} . In turn, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle E} can be generated from a standard uniform random variable Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle U} altering the formula into Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \lceil \log(U) / \log(1-p)\rceil} .[22]: 499–500 [23]
Applications
[edit | edit source]The geometric distribution is used in many disciplines. In queueing theory, the M/M/1 queue has a steady state following a geometric distribution.[24] In stochastic processes, the Yule Furry process is geometrically distributed.[25] The distribution also arises when modeling the lifetime of a device in discrete contexts.[26] It has also been used to fit data including modeling patients spreading COVID-19.[27]
See also
[edit | edit source]- Hypergeometric distribution
- Coupon collector's problem
- Compound Poisson distribution
- Negative binomial distribution
References
[edit | edit source]- ↑ 1.0 1.1 1.2 1.3 1.4 1.5 Johnson, Norman L.; Kemp, Adrienne W.; Kotz, Samuel (2005-08-19). Univariate Discrete Distributions. Wiley Series in Probability and Statistics (1 ed.). Wiley. doi:10.1002/0471715816. ISBN 978-0-471-27246-5.
- ↑ 2.0 2.1 Nagel, Werner; Steyer, Rolf (2017-04-04). Probability and Conditional Expectation: Fundamentals for the Empirical Sciences. Wiley Series in Probability and Statistics (1st ed.). Wiley. doi:10.1002/9781119243496. ISBN 978-1-119-24352-6.
- ↑ 3.0 3.1 3.2 3.3 3.4 Chattamvelli, Rajan; Shanmugam, Ramalingam (2020). Discrete Distributions in Engineering and the Applied Sciences. Synthesis Lectures on Mathematics & Statistics. Cham: Springer International Publishing. doi:10.1007/978-3-031-02425-2. ISBN 978-3-031-01297-6.
- ↑ Dekking, Frederik Michel; Kraaikamp, Cornelis; Lopuhaä, Hendrik Paul; Meester, Ludolf Erwin (2005). A Modern Introduction to Probability and Statistics. Springer Texts in Statistics. London: Springer London. p. 50. doi:10.1007/1-84628-168-7. ISBN 978-1-85233-896-1.
- ↑ Weisstein, Eric W. "Memoryless". mathworld.wolfram.com. Retrieved 2024-07-25.
- ↑ 6.0 6.1 6.2 6.3 6.4 Forbes, Catherine; Evans, Merran; Hastings, Nicholas; Peacock, Brian (2010-11-29). Statistical Distributions (1st ed.). Wiley. doi:10.1002/9780470627242. ISBN 978-0-470-39063-4.
- ↑ Bertsekas, Dimitri P.; Tsitsiklis, John N. (2008). Introduction to Probability. Optimization and Computation Series (2nd ed.). Belmont: Athena Scientific. p. 235. ISBN 978-1-886529-23-6.
- ↑ Weisstein, Eric W. "Geometric Distribution". MathWorld. Retrieved 2024-07-13.
- ↑ Aggarwal, Charu C. (2024). Probability and Statistics for Machine Learning: A Textbook. Cham: Springer Nature Switzerland. p. 138. doi:10.1007/978-3-031-53282-5. ISBN 978-3-031-53281-8.
- ↑ Chan, Stanley (2021). Introduction to Probability for Data Science (1st ed.). Michigan Publishing. ISBN 978-1-60785-747-1.
- ↑ Westfall, Peter (2014-08-11). "Kurtosis as Peakedness, 1905 – 2014. R.I.P." The American Statistician. 68. doi:10.1080/00031305.2014.917055.
- ↑ 12.0 12.1 12.2 12.3 Lovric, Miodrag, ed. (2011). International Encyclopedia of Statistical Science (1st ed.). Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-04898-2. ISBN 978-3-642-04897-5.
- ↑ 13.0 13.1 Gallager, R.; van Voorhis, D. (March 1975). "Optimal source codes for geometrically distributed integer alphabets (Corresp.)". IEEE Transactions on Information Theory. 21 (2): 228–230. doi:10.1109/TIT.1975.1055357. ISSN 0018-9448.
- ↑ Lisman, J. H. C.; Zuylen, M. C. A. van (March 1972). "Note on the generation of most probable frequency distributions". Statistica Neerlandica. 26 (1): 19–23. doi:10.1111/j.1467-9574.1972.tb00152.x. ISSN 0039-0402.
- ↑ Pitman, Jim (1993). Probability. New York, NY: Springer New York. p. 372. doi:10.1007/978-1-4612-4374-8. ISBN 978-0-387-94594-1.
- ↑ Ciardo, Gianfranco; Leemis, Lawrence M.; Nicol, David (1 June 1995). "On the minimum of independent geometrically distributed random variables". Statistics & Probability Letters. 23 (4): 313–326. doi:10.1016/0167-7152(94)00130-Z. hdl:2060/19940028569. S2CID 1505801.
- ↑ 17.0 17.1 Evans, Michael; Rosenthal, Jeffrey (2023). Probability and Statistics: The Science of Uncertainty (2nd ed.). Macmillan Learning. ISBN 978-1429224628.
- ↑ 18.0 18.1 Held, Leonhard; Sabanés Bové, Daniel (2020). Likelihood and Bayesian Inference: With Applications in Biology and Medicine. Statistics for Biology and Health. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-662-60792-3. ISBN 978-3-662-60791-6.
- ↑ Siegrist, Kyle (2020-05-05). "7.3: Maximum Likelihood". Statistics LibreTexts. Retrieved 2024-06-20.
- ↑ Template:Cite CiteSeerX
- ↑ "3. Conjugate families of distributions" (PDF). Archived (PDF) from the original on 2010-04-08.
- ↑ 22.0 22.1 Devroye, Luc (1986). Non-Uniform Random Variate Generation. New York, NY: Springer New York. doi:10.1007/978-1-4613-8643-8. ISBN 978-1-4613-8645-2.
- ↑ Knuth, Donald Ervin (1997). The Art of Computer Programming. 2 (3rd ed.). Reading, Mass: Addison-Wesley. p. 136. ISBN 978-0-201-89683-1.
- ↑ Daskin, Mark S. (2021). Bite-Sized Operations Management. Synthesis Lectures on Operations Research and Applications. Cham: Springer International Publishing. p. 127. doi:10.1007/978-3-031-02493-1. ISBN 978-3-031-01365-2.
- ↑ Madhira, Sivaprasad; Deshmukh, Shailaja (2023). Introduction to Stochastic Processes Using R. Singapore: Springer Nature Singapore. p. 449. doi:10.1007/978-981-99-5601-2. ISBN 978-981-99-5600-5.
- ↑ Gupta, Rakesh; Gupta, Shubham; Ali, Irfan (2023), Garg, Harish (ed.), "Some Discrete Parametric Markov–Chain System Models to Analyze Reliability", Advances in Reliability, Failure and Risk Analysis, Singapore: Springer Nature Singapore, pp. 305–306, doi:10.1007/978-981-19-9909-3_14, ISBN 978-981-19-9908-6, retrieved 2024-07-13
- ↑ Polymenis, Athanase (2021-10-01). "An application of the geometric distribution for assessing the risk of infection with SARS-CoV-2 by location". Asian Journal of Medical Sciences. 12 (10): 8–11. doi:10.3126/ajms.v12i10.38783. ISSN 2091-0576.
- Articles with unsourced statements from May 2012
- Wikipedia articles needing clarification from May 2012
- Articles with unsourced statements from July 2024
- Discrete distributions
- Exponential family distributions
- Infinitely divisible probability distributions
- Articles with example R code
- Pages with math errors
- Pages with math render errors