TheBestLinks.com

Negative binomial distribution

From TheBestLinks.com

In probability theory, the negative binomial distribution is a family of discrete probability distributions describing how long a sequence of independent trials must run before a fixed number of successes is reached.

Two or three discrepant conventions

Three conventions are found in the literature. The first two conflict with each other:

  • Under the first convention, a negative-binomially distributed random variable is the number of independent trials needed to get r successes.

  • Under the second convention, it is the number of failures before the rth success.

  • A third convention, explained below, generalizes the second rather than conflicting with it: it allows r to be a non-integer.

The probability distributions called "negative binomial" according to the second and third conventions are infinitely divisible. Those called "negative binomial" according to the first convention have the virtue that everyone's intuitive guess about their expected value is correct. For example, if you repeatedly throw an ordinary six-sided die until you get a "1", it takes six trials on average; if you keep trying until you get two "1"s, it takes 12 trials on average; and so on.

Example

Suppose we repeatedly throw a die, and consider a "1" to be a "success". The probability of success on each trial is 1/6. The number of trials needed to get three successes belongs to the infinite set { 3, 4, 5, 6, ... }. That number of trials is a negative-binomially distributed random variable according to the first convention. The number of failures before the third success belongs to the infinite set { 0, 1, 2, 3, ... }. That number of failures is a negative-binomially distributed random variable according to the second convention.
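The two conventions in this example can be compared with a small simulation. This is only a sketch: the function name `roll_until_successes`, the seed, and the sample size are illustrative choices, not part of the article.

```python
import random

def roll_until_successes(r, rng):
    """Roll a fair die until r ones appear; return (trials, failures)."""
    trials = 0
    successes = 0
    while successes < r:
        trials += 1
        if rng.randint(1, 6) == 1:  # a "1" counts as a success
            successes += 1
    # first convention counts trials, second convention counts failures
    return trials, trials - r

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
samples = [roll_until_successes(3, rng) for _ in range(20000)]
mean_trials = sum(t for t, _ in samples) / len(samples)
mean_failures = sum(f for _, f in samples) / len(samples)
print(mean_trials, mean_failures)
```

In every sample the two counts differ by exactly r = 3, so the second-convention variable is the first-convention variable shifted down by r. The sample means should land near 18 trials and 15 failures.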

Parametrization

The family of negative binomial distributions is parametrized by two parameters: the fixed number r of successes and the probability p of success on each trial. The first parameter r is a positive integer; the second parameter p is a real number between 0 and 1. If r = 1, then we have a geometric distribution. (In the "third convention" referred to above, r will not necessarily be an integer.)

Formulas

In this section we adhere to the first convention above: the negative binomial distribution is the number of independent trials needed to get r successes, with probability p of success on each trial.

Parameters : r (number of successes) is an integer where 1 ≤ r; the special case r = 1 is the geometric distribution.
p = probability of success on each trial is a real number where 0 < p < 1.
Support (domain where probability mass > 0) = set of all integers ≥ r.
Probability mass function f(x) = P(X = x) = the probability that the rth success occurs on the xth trial is given by
<math>f(x)={x-1 \choose r-1} p^r (1-p)^{x-r}</math>

(see binomial coefficient).

Cumulative distribution function F(x) = P(X ≤ x) = probability that the rth success occurs on or before the xth trial: no simple closed form exists, but it can be computed via the regularized incomplete beta function, as with the binomial distribution: F(x) = I_p(r, x − r + 1).
Expected value E(X) = r/p.
Variance var(X) = σ^2 = r(1 − p)/p^2.
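The formulas above can be checked numerically with a few lines of Python. This is a minimal sketch under the first convention. The name `nb_pmf` and the truncation point of the infinite sum are assumptions made for illustration.

```python
from math import comb

def nb_pmf(x, r, p):
    """P(X = x): the r-th success occurs on trial x (first convention)."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

r, p = 3, 1 / 6          # three "1"s with a fair die
xs = range(r, 2000)      # arbitrary cutoff; the remaining tail is negligible here
total = sum(nb_pmf(x, r, p) for x in xs)
mean = sum(x * nb_pmf(x, r, p) for x in xs)
var = sum((x - mean) ** 2 * nb_pmf(x, r, p) for x in xs)
print(total, mean, var)  # compare with 1, r/p and r(1 - p)/p^2
```

For these parameters r/p = 18 and r(1 − p)/p^2 = 90, which the truncated sums should reproduce to within floating-point error.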

Properties

If X_r is a random variable following the negative binomial distribution with parameters r and p, then X_r is a sum of r independent variables following the geometric distribution with parameter p. By the central limit theorem, X_r is therefore approximately normal for sufficiently large r.
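The sum-of-geometrics statement can be verified exactly on a truncated support by convolving the geometric probability mass function with itself. The helper names `geometric_pmf` and `convolve` and the cutoff `n_max` are illustrative assumptions.

```python
from math import comb

def geometric_pmf(x, p):
    """P(first success occurs on trial x), for x = 1, 2, ..."""
    return p * (1 - p) ** (x - 1) if x >= 1 else 0.0

def convolve(a, b):
    """Distribution of the sum of two independent variables (lists indexed by value)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

p, r, n_max = 1 / 6, 3, 60
geom = [geometric_pmf(x, p) for x in range(n_max + 1)]  # index = number of trials
dist = geom
for _ in range(r - 1):
    dist = convolve(dist, geom)  # after the loop: sum of r geometric variables

# Entries up to n_max are exact, because larger summands cannot contribute to them.
for x in (3, 10, 18, 60):
    print(x, dist[x], comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r))
```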

Furthermore, if Y_s is a random variable following the binomial distribution with parameters s and p, then

<math>\operatorname{Pr}\left(X_r \leq s\right) = \operatorname{Pr}\left(Y_s \geq r\right)</math>

= Pr(after s trials, there are at least r successes).

In this sense, the negative binomial distribution is the "inverse" of the binomial distribution.
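The identity between the two tail probabilities can be checked directly. It also gives a practical way to evaluate the cumulative distribution function. The function names below are illustrative, and the parameter triples are arbitrary test values.

```python
from math import comb

def nb_cdf(s, r, p):
    """P(X_r <= s): the r-th success occurs on or before trial s."""
    return sum(comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)
               for x in range(r, s + 1))

def binom_upper_tail(s, r, p):
    """P(Y_s >= r): at least r successes among s trials."""
    return sum(comb(s, k) * p**k * (1 - p) ** (s - k)
               for k in range(r, s + 1))

for r, s, p in [(3, 10, 1 / 6), (5, 8, 0.4), (2, 25, 0.75)]:
    print(r, s, p, nb_cdf(s, r, p), binom_upper_tail(s, r, p))
```

Each row should show the same value twice, up to rounding error.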

The sum of two independent negative-binomially distributed random variables with the same value of the parameter p but with "r-values" r_1 and r_2 is negative-binomially distributed with the same p but with "r-value" r_1 + r_2.
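A direct convolution illustrates this additivity under the first convention. The names `nb_pmf` and `sum_pmf` and the parameter values are assumptions chosen for the sketch.

```python
from math import comb

def nb_pmf(x, r, p):
    """P(X = x): the r-th success occurs on trial x."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

def sum_pmf(x, r1, r2, p):
    """P(X1 + X2 = x) for independent X1 ~ NB(r1, p) and X2 ~ NB(r2, p)."""
    return sum(nb_pmf(j, r1, p) * nb_pmf(x - j, r2, p)
               for j in range(r1, x - r2 + 1))

r1, r2, p = 2, 3, 0.3
for x in (5, 8, 15, 40):
    print(x, sum_pmf(x, r1, r2, p), nb_pmf(x, r1 + r2, p))
```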

The negative binomial distribution (in the failure-counting sense of the second and third conventions) also arises as a continuous mixture of Poisson distributions in which the Poisson parameter λ is itself drawn from a gamma distribution.
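The mixture statement can be checked by numerical integration. The sketch below assumes a gamma distribution with shape r and rate p/(1 − p). That parametrization is not given in the article but is the one that reproduces parameters r and p. The midpoint rule, step count, and upper limit are arbitrary choices, and the example keeps r > 1 so the integrand is bounded at zero.

```python
from math import exp, factorial, gamma

def mixture_pmf(k, r, p, steps=20000, upper=100.0):
    """Integrate Poisson(k; lam) times the gamma density of lam (midpoint rule)."""
    beta = p / (1 - p)  # assumed rate parameter of the gamma distribution
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        poisson = exp(-lam) * lam**k / factorial(k)
        gamma_density = beta**r * lam ** (r - 1) * exp(-beta * lam) / gamma(r)
        total += poisson * gamma_density
    return total * h

def nb_failures_pmf(k, r, p):
    """P(k failures before the r-th success), with r allowed to be non-integer."""
    return gamma(k + r) / (factorial(k) * gamma(r)) * p**r * (1 - p) ** k

r, p = 2.5, 0.4
for k in range(6):
    print(k, mixture_pmf(k, r, p), nb_failures_pmf(k, r, p))
```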

If we follow the convention that the negative binomial distribution is the probability distribution of the number of failures before the rth success, then any negative binomial distribution is infinitely divisible, i.e., if X has a negative binomial distribution, then for any positive integer n, there exist independent identically distributed random variables X_1, ..., X_n whose sum has the same distribution that X has. These will not be negative-binomially distributed in the sense defined above unless n is a divisor of r, but see the "third convention" below.

Explanation of the name

Suppose X is a random variable with a negative binomial distribution with parameters r and p. The statement that the sum from x = r to infinity of the probability Pr[X = x] is equal to 1 can be shown by a bit of algebra to be equivalent to the statement that (1 − p)^(−r) is what Newton's binomial theorem says it should be.

Suppose Y is a random variable with a binomial distribution with parameters n and p. The statement that the sum from y = 0 to n of the probability Pr[Y = y] is equal to 1 says that 1 = (p + (1 − p))^n is what the strictly finitary binomial theorem of high-school algebra says it should be.

Thus the negative binomial distribution bears the same relationship to the negative-integer-exponent case of the binomial theorem that the binomial distribution bears to the positive-integer-exponent case.

Mathematical details

Assume p + q = 1. Then the binomial theorem of elementary algebra implies that

<math>1=1^n=(p+q)^n=\sum_{x=0}^n {n \choose x} p^x q^{n-x}.</math>

This can be written in a way that may at first appear to some to be incorrect, and perhaps perverse even if correct:

<math>(p+q)^n=\sum_{x=0}^\infty {n \choose x} p^x q^{n-x},</math>

in which the upper bound of summation is infinite. If the binomial coefficient is defined by

<math>{n \choose x}={n! \over x!(n-x)!}</math>

then it does not make sense when x > n, since factorials of negative numbers are not defined. But one may also read it as

<math>{n \choose x}={n(n-1)(n-2)\cdots(n-x+1) \over x!}.</math>

In that case it is defined even when n is negative or is not an integer. But in our case of the binomial distribution it is zero when x > n. So why would we write the result in that form, with a seemingly needless sum of infinitely many zeros? The answer comes when we generalize the binomial theorem of elementary algebra to Newton's binomial theorem. Then we can say, for example

<math>(p+q)^{8.3}=\sum_{x=0}^\infty {8.3 \choose x} p^x q^{8.3-x}.</math>
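A short numerical check makes the non-integer case concrete. The sketch assumes p < q so that the series converges quickly. The names `gen_binom` and `newton_sum` and the number of terms kept are illustrative.

```python
def gen_binom(n, x):
    """n(n-1)...(n-x+1) / x! for real n and nonnegative integer x."""
    result = 1.0
    for i in range(x):
        result *= (n - i) / (i + 1)
    return result

def newton_sum(p, q, n, terms=200):
    """Partial sum of Newton's binomial series for (p + q)^n."""
    return sum(gen_binom(n, x) * p**x * q ** (n - x) for x in range(terms))

print(newton_sum(0.3, 0.7, 8.3), (0.3 + 0.7) ** 8.3)
print(newton_sum(0.2, 0.5, 8.3), (0.2 + 0.5) ** 8.3)
print(gen_binom(5, 7))  # integer n with x > n: the product contains a zero factor
```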

Now suppose r > 0 and we use a negative exponent:

<math>1=p^r p^{-r}=p^r (1-q)^{-r}=p^r\sum_{x=0}^\infty {-r \choose x} (-q)^x.</math>

Then all of the terms are positive, and the term

<math>p^r {-r \choose x} (-q)^x</math>

is just the probability that the number of failures before the rth success is equal to x, provided r is an integer. (If r is a negative non-integer, so that the exponent is a positive non-integer, then some of the terms in the sum above are negative, so we do not have a probability distribution on the set of all nonnegative integers.)

This brings us to the "third convention" mentioned above: Allow non-integer values of r. Then we have a generalized negative binomial distribution that coincides with the second convention above when r happens to be a positive integer.
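The generalized probability mass function can be sketched in code by evaluating the term p^r C(−r, x) (−q)^x with the product form of the binomial coefficient. The function names and the example value r = 2.5 are assumptions made for illustration.

```python
from math import comb

def gen_binom(n, x):
    """n(n-1)...(n-x+1) / x! for real n and nonnegative integer x."""
    result = 1.0
    for i in range(x):
        result *= (n - i) / (i + 1)
    return result

def nb_failures_pmf(x, r, p):
    """p^r * C(-r, x) * (-q)^x: probability of x failures, for any real r > 0."""
    q = 1 - p
    return p**r * gen_binom(-r, x) * (-q) ** x

# Integer r: agrees with the second-convention formula C(x + r - 1, x) p^r q^x.
print(nb_failures_pmf(4, 3, 1 / 6), comb(6, 4) * (1 / 6) ** 3 * (5 / 6) ** 4)

# Non-integer r: terms stay positive and sum to 1 (truncated at an arbitrary point).
terms = [nb_failures_pmf(x, 2.5, 0.4) for x in range(500)]
print(min(terms) > 0, sum(terms))
```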

Recall from above that

The sum of two independent negative-binomially distributed random variables with the same value of the parameter p but with "r-values" r_1 and r_2 is negative-binomially distributed with the same p but with "r-value" r_1 + r_2.

This property persists when the definition is thus generalized, and affords a quick way to see that the negative binomial distribution is infinitely divisible.

Example

(After a problem by Dr. Diane Evans, professor of mathematics at Rose-Hulman Institute of Technology)

Johnny, a sixth grader at Honey Creek Middle School in Terre Haute, Indiana, is required to sell candy bars in his neighborhood to raise money for the 6th grade field trip. There are thirty homes in his neighborhood, and his father has told him not to return home until he has sold five candy bars. So the boy goes door to door, selling candy bars. At each home he visits, he has a 0.4 probability of selling one candy bar and a 0.6 probability of selling nothing.

What's the probability mass function for selling the last candy bar at the xth house?

<math>f(x)={x-1 \choose 4}\,0.4^5\,(1-0.4)^{x-5}</math>

What's the probability that he finishes on the tenth house?

f(10) = 0.100

What's the probability that he finishes on or before reaching the eighth house?

Answer: To finish on or before the eighth house, he must finish at the fifth, sixth, seventh, or eighth house. Sum those probabilities:

f(5) = 0.0102, f(6) = 0.0307, f(7) = 0.0553, f(8) = 0.0774; sum(f(j), j=5..8) = 0.1737

What's the probability that he exhausts all houses in the neighborhood, gives up, and then goes to live on the streets?

<math>1-\sum_{j=5}^{30} f(j)=1-0.9985=0.0015</math>
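The three answers in this example can be recomputed with a short script, given here as a sketch. The function name `f` mirrors the notation of the example, and the variable names are illustrative.

```python
from math import comb

def f(x, r=5, p=0.4):
    """P(the fifth candy bar is sold at house x)."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

p_tenth = f(10)                                  # finishes exactly at house 10
p_by_eighth = sum(f(j) for j in range(5, 9))     # finishes at house 5, 6, 7 or 8
p_runs_out = 1 - sum(f(j) for j in range(5, 31)) # fewer than five sales in 30 houses
print(p_tenth, p_by_eighth, p_runs_out)
```

Rounded to the precision used in the text, the printed values should agree with 0.100, 0.1737 and 0.0015.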

Moral: Negative binomial distributions don't turn our children out on the streets; bad parenting does.



This page was last modified 22:34, 28 Aug 2004.
  Content is available under GNU Free Documentation License 1.2.