# Probability-generating function

In probability theory, the **probability-generating function** of a discrete random variable is a power series representation (the generating function) of the probability mass function of the random variable. Probability-generating functions are often employed for their succinct description of the sequence of probabilities $\Pr(X = i)$, and to make available the well-developed theory of power series with non-negative coefficients.

**Definition**

If $N$ is a discrete random variable taking values on some subset of the non-negative integers, $\{0, 1, \ldots\}$, then the *probability-generating function* of $N$ is defined as

:$G(x) = \operatorname{E}(x^N) = \sum_{n=0}^{\infty} f_N(n) x^n,$

where $f_N$ is the probability mass function of $N$. Note that the equivalent notation $G_N$ is sometimes used to distinguish between the probability-generating functions of several random variables.

**Properties**

**Power series**

Probability-generating functions obey all the rules of power series with non-negative coefficients. In particular, $G(1^-) = 1$, since the probabilities must sum to one, where $G(1^-) = \lim_{z \to 1} G(z)$ from below. So the radius of convergence of any probability-generating function must be at least 1, by Abel's theorem for power series with non-negative coefficients.

**Probabilities and expectations**

The following properties allow the derivation of various basic quantities related to $X$:

1. The probability mass function of $X$ is recovered by taking derivatives of $G$:

:$f(k) = \Pr(X = k) = \frac{G^{(k)}(0)}{k!}.$

2. It follows from Property 1 that if we have two random variables $X$ and $Y$, and $G_X = G_Y$, then $f_X = f_Y$. That is, if $X$ and $Y$ have identical probability-generating functions, then they are identically distributed.

3. The normalization of the probability mass function can be expressed in terms of the generating function by

:$\operatorname{E}(1) = G(1^-) = \sum_{i=0}^{\infty} f(i) = 1.$

The expectation of $X$ is given by

:$\operatorname{E}(X) = G'(1^-).$

More generally, the $k$th factorial moment, $\operatorname{E}(X(X-1)\cdots(X-k+1))$, of $X$ is given by

:$\operatorname{E}\left(\frac{X!}{(X-k)!}\right) = G^{(k)}(1^-), \quad k \geq 0.$

So the variance of $X$ is given by

:$\operatorname{Var}(X) = G''(1^-) + G'(1^-) - \left[G'(1^-)\right]^2.$

4. $G_X(e^t) = M_X(t)$, where $X$ is a random variable, $G_X$ is its probability-generating function and $M_X$ is its moment-generating function.

**Functions of independent random variables**

Probability-generating functions are particularly useful for dealing with functions of independent random variables. For example:

* If $X_1, X_2, \ldots, X_n$ is a sequence of independent (and not necessarily identically distributed) random variables, and

::$S_n = \sum_{i=1}^n a_i X_i,$

:where the $a_i$ are constants, then the probability-generating function is given by

::$G_{S_n}(z) = \operatorname{E}(z^{S_n}) = \operatorname{E}\left(z^{\sum_{i=1}^n a_i X_i}\right) = G_{X_1}(z^{a_1}) G_{X_2}(z^{a_2}) \cdots G_{X_n}(z^{a_n}).$

:For example, if

::$S_n = \sum_{i=1}^n X_i,$

:then the probability-generating function, $G_{S_n}(z)$, is given by

::$G_{S_n}(z) = G_{X_1}(z) G_{X_2}(z) \cdots G_{X_n}(z).$

:It also follows that the probability-generating function of the difference of two independent random variables $S = X_1 - X_2$ is

::$G_S(z) = G_{X_1}(z) G_{X_2}(1/z).$

* Suppose that $N$ is also an independent, discrete random variable taking values on the non-negative integers, with probability-generating function $G_N$. If the $X_1, X_2, \ldots, X_N$ are independent *and* identically distributed with common probability-generating function $G_X$, then the random sum $S_N = \sum_{i=1}^N X_i$ satisfies

::$G_{S_N}(z) = G_N(G_X(z)).$

:This can be seen as follows:

::$G_{S_N}(z) = \operatorname{E}(z^{S_N}) = \operatorname{E}\left(z^{\sum_{i=1}^N X_i}\right) = \operatorname{E}\big(\operatorname{E}(z^{\sum_{i=1}^N X_i} \mid N)\big) = \operatorname{E}\big((G_X(z))^N\big) = G_N(G_X(z)).$

:This last fact is useful in the study of Galton–Watson processes.

:Suppose again that $N$ is also an independent, discrete random variable taking values on the non-negative integers, with probability-generating function $G_N$ and probability mass function $f_i = \Pr(N = i)$. If the $X_1, X_2, \ldots, X_N$ are independent, but *not* identically distributed random variables, where $G_{X_i}$ denotes the probability-generating function of $X_i$, then it holds that

::$G_{S_N}(z) = \sum_{i \ge 0} f_i \prod_{k=1}^i G_{X_k}(z),$

:where the empty product for $i = 0$ equals 1.

:For identically distributed $X_i$ this simplifies to the identity stated before. The general case is sometimes useful to obtain a decomposition of $S_N$ by means of generating functions.

**Examples**

* The probability-generating function of a constant random variable, i.e. one with $\Pr(X = c) = 1$, is

::$G(z) = z^c.$

* The probability-generating function of a binomial random variable, the number of successes in $n$ trials, with probability $p$ of success in each trial, is

::$G(z) = \left[(1-p) + pz\right]^n.$

:Note that this is the $n$-fold product of the probability-generating function of a Bernoulli random variable with parameter $p$.

* The probability-generating function of a negative binomial random variable on $\{0, 1, 2, \ldots\}$, the number of failures before the $r$th success with probability of success in each trial $p$, is

::$G(z) = \left(\frac{p}{1 - (1-p)z}\right)^r.$

:Note that this is the $r$-fold product of the probability-generating function of a geometric random variable on $\{0, 1, 2, \ldots\}$ with success probability $p$.

* The probability-generating function of a Poisson random variable with rate parameter $\lambda$ is

::$G(z) = e^{\lambda(z - 1)}.$
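The properties and examples above can be checked on finitely supported distributions, where a PGF is simply a polynomial. The following is a minimal sketch in Python, not part of the original article; it assumes a PGF is stored as a list of exact `Fraction` coefficients (index $k$ holds $\Pr(X = k)$), and the helper names `pgf_product`, `pgf_compose`, `derivative_at_one` and `mean_and_variance` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def pgf_product(a, b):
    """Coefficients of G_a(z) * G_b(z): PGF of the sum of two independent variables."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def pgf_compose(outer, inner):
    """Coefficients of G_N(G_X(z)) (random sum), using Horner's rule on polynomials."""
    result = [outer[-1]]
    for coeff in reversed(outer[:-1]):
        result = pgf_product(result, inner)
        result[0] += coeff
    return result

def derivative_at_one(coeffs, k):
    """G^(k)(1), i.e. the k-th factorial moment, for a polynomial PGF."""
    total = Fraction(0)
    for n, c in enumerate(coeffs):
        if n >= k:
            total += c * (factorial(n) // factorial(n - k))
    return total

def mean_and_variance(coeffs):
    """E(X) = G'(1) and Var(X) = G''(1) + G'(1) - G'(1)^2."""
    g1 = derivative_at_one(coeffs, 1)
    g2 = derivative_at_one(coeffs, 2)
    return g1, g2 + g1 - g1 ** 2

# Binomial(4, 1/3) as the 4-fold product of a Bernoulli(1/3) PGF.
p = Fraction(1, 3)
bernoulli = [1 - p, p]
binomial = [Fraction(1)]
for _ in range(4):
    binomial = pgf_product(binomial, bernoulli)
mean, var = mean_and_variance(binomial)

# Random sum: N ~ Binomial(4, 1/3) terms, each Bernoulli(1/2).
compound = pgf_compose(binomial, [Fraction(1, 2), Fraction(1, 2)])
```

Because the coefficients of a polynomial PGF are the probabilities themselves, Property 1 needs no differentiation here: `binomial[k]` is already $G^{(k)}(0)/k!$. In this sketch `compound` should coincide with the PGF of a Binomial(4, 1/6) variable, since $G_N(G_X(z)) = (5/6 + z/6)^4$.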

**Example calculation: use of bivariate generating functions**

The following example illustrates a very common technique in the manipulation of PGFs: the use of bivariate super generating functions to compute the ordinary generating function (OGF) of the PGFs of a sequence of random variables.

Suppose you sample a system that can assume two states, $X$ and $Y$, $X$ with probability $p$ and $Y$ with probability $1 - p$, e.g. a coin being flipped, obtaining the sequence of samples

:$S_1, S_2, S_3, \ldots, S_n,$

where the system was sampled $n$ times and has no memory.

Define the random variable $M_n$ to be the number of changes from one sample to the next in a sequence of $n$ samples, i.e. how often $S_m$ was different from $S_{m-1}$. For example, the sequence

:$X, X, X, Y, X, X$

has two changes, as does

:$Y, X, X, X, X, X, Y, Y, Y, Y.$

We want to calculate the PGF of $M_n$, which we will do by using bivariate generating functions.

We introduce the bivariate GF $G(z, u)$ given by

:$G(z, u) = \sum_{n \ge 1} \operatorname{E}\left[u^{M_n}\right] z^n,$

i.e. $G(z, u)$ is the ordinary generating function of the PGFs of the $M_n$. This step is completely general and indeed the core of the method.

Now let $x_{n,k}$ be the probability of having $k$ changes in a sequence of $n$ samples, where the last sample was an $X$. Similarly, let $y_{n,k}$ be the probability of having $k$ changes in a sequence of $n$ samples, where the last sample was a $Y$, and put

:$X(z, u) = \sum_{n \ge 1,\, k \ge 0} x_{n,k}\, u^k z^n \quad \text{and} \quad Y(z, u) = \sum_{n \ge 1,\, k \ge 0} y_{n,k}\, u^k z^n$

so that

:$G(z, u) = X(z, u) + Y(z, u).$

Now we clearly have

:$x_{n,0} = p^n \quad \text{and} \quad y_{n,0} = (1-p)^n,$

because having zero changes means getting a sequence of all $X$s or all $Y$s.

For $k \ge 1$ we find

:$x_{n,k} = p\, y_{n-1,k-1} + p\, x_{n-1,k} \quad \text{and} \quad y_{n,k} = (1-p)\, x_{n-1,k-1} + (1-p)\, y_{n-1,k},$

because e.g. to have $k$ changes in a sequence of length $n$ that ends in $X$, we either append an $X$ to a sequence having $k - 1$ changes and ending in $Y$, or append an $X$ to a sequence having $k$ changes and ending in $X$.
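The recurrence can be iterated directly to tabulate the distribution of $M_n$ for small $n$. The sketch below is an illustration rather than part of the original derivation; the function name `changes_pmf` is hypothetical, and it assumes the starting values $x_{1,0} = p$, $y_{1,0} = 1 - p$ for a single sample (which cannot contain a change).

```python
from fractions import Fraction

def changes_pmf(n, p):
    """Return [Pr(M_n = 0), Pr(M_n = 1), ...] by iterating the recurrence.

    x[k] (resp. y[k]) is the probability of k changes in the samples seen
    so far, with the last sample being X (resp. Y).
    """
    q = 1 - p
    x, y = [p], [q]                      # one sample: zero changes
    for _ in range(2, n + 1):
        x_prev = x + [Fraction(0)]       # make room for one more change
        y_prev = y + [Fraction(0)]
        size = len(x_prev)
        # append an X: either no change (last was X) or one change (last was Y)
        x = [p * x_prev[k] + (p * y_prev[k - 1] if k >= 1 else 0)
             for k in range(size)]
        # append a Y: either no change (last was Y) or one change (last was X)
        y = [q * y_prev[k] + (q * x_prev[k - 1] if k >= 1 else 0)
             for k in range(size)]
    return [xk + yk for xk, yk in zip(x, y)]

pmf_fair = changes_pmf(3, Fraction(1, 2))
pmf_biased = changes_pmf(5, Fraction(1, 3))
```

For a fair coin and three samples, the eight equally likely sequences contain 0, 1 or 2 changes in the proportions 2 : 4 : 2, which is what `pmf_fair` is expected to reproduce.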

Summing these equations over $n$ and $k$ and writing $X$ for $X(z, u)$ and $Y$ for $Y(z, u)$, we obtain

:$X - \frac{pz}{1 - pz} = p u z\, Y + p z \left( X - \frac{pz}{1 - pz} \right)$

and

:$Y - \frac{(1-p)z}{1 - (1-p)z} = (1-p) u z\, X + (1-p) z \left( Y - \frac{(1-p)z}{1 - (1-p)z} \right).$

The solution of this system is

:$X = \frac{pz \left( 1 - (1-p)z + (1-p)uz \right)}{1 - z + p(1-p)(1-u^2) z^2}$

and

:$Y = \frac{(1-p)z \left( 1 - pz + puz \right)}{1 - z + p(1-p)(1-u^2) z^2}.$

We may now use the general identity

:$\sum_{n \ge 1} \operatorname{E}\left[M_n (M_n - 1) \cdots (M_n - r)\right] z^n = \left( \left(\frac{d}{du}\right)^{r+1} G(z, u) \right)_{u=1}$

to calculate the factorial moments of $M_n$. E.g. the OGF of the expectations is given by

:$\sum_{n \ge 1} \operatorname{E}[M_n] z^n = \left( \frac{d}{du} (X + Y) \right)_{u=1} = \frac{2 p (1-p) z^2}{(1-z)^2},$

from which we find (extracting coefficients) that

:$\operatorname{E}[M_n] = 2\, p (1-p)\, (n-1).$

An extensive discussion of this problem, as well as solutions by other methods, may be found on *Les-Mathematiques.net* (see external links).
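The closed form $\operatorname{E}[M_n] = 2p(1-p)(n-1)$ can be cross-checked without generating functions by enumerating all $2^n$ sample sequences for small $n$. This is a brute-force sketch for illustration only; the function name `expected_changes` and the choice $p = 1/3$ are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

def expected_changes(n, p):
    """E[M_n] computed by summing over every sequence of n samples."""
    total = Fraction(0)
    for seq in product("XY", repeat=n):
        # probability of this exact sequence (samples are independent)
        prob = Fraction(1)
        for s in seq:
            prob *= p if s == "X" else 1 - p
        # number of positions where a sample differs from its predecessor
        changes = sum(a != b for a, b in zip(seq, seq[1:]))
        total += prob * changes
    return total

p = Fraction(1, 3)
checks = [(n, expected_changes(n, p), 2 * p * (1 - p) * (n - 1))
          for n in range(1, 9)]
agree = all(brute == formula for _, brute, formula in checks)
```

The agreement is also what linearity of expectation predicts: each of the $n - 1$ adjacent pairs differs with probability $2p(1-p)$.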

**Example calculation: bivariate generating functions and differential equations**

Consider the following balls and urns problem: suppose we have an urn containing $n$ distinguishable balls, i.e. bearing labels from $1$ to $n$. We pick one of the balls at random and remove it from the urn. We also remove from the urn all balls whose labels are larger than the one we picked. E.g. if we picked ball number one, the urn is emptied after one operation. We repeat until the urn is empty. E.g. for an urn containing ten balls, the sequence of picks 6-3-1 would empty the urn in three operations. We introduce the random variable $X_n$, which gives the number of picks needed to empty the urn. Our goal is to compute all of its moments, and we will do so using exactly the same bivariate generating function as in the previous example, namely the OGF of the PGFs:

:$P(z, u) = \sum_{n \ge 1} \operatorname{E}\left[u^{X_n}\right] z^n.$

We let $p_{n,k}$ be the probability of emptying an urn containing $n$ balls with $k$ operations, so that

:$P(z, u) = \sum_{n \ge 1,\, k \ge 1} p_{n,k}\, u^k z^n.$

We find that

:$p_{n,1} = \frac{1}{n} \quad \text{and} \quad p_{n,k} = \frac{1}{n} \sum_{r=1}^{n-(k-1)} p_{n-r,k-1}, \quad k \ge 2,$

because to empty the urn with one operation, we must pick the ball labelled $1$. The remaining probabilities are computed recursively, e.g. we pick the ball with the largest label with probability $1/n$, leaving $n-1$ balls (this is $r=1$). We pick the ball with the next-to-largest label with probability $1/n$, leaving $n-2$ balls (this is $r=2$), etc. The upper bound for $r$ is $n-(k-1)$, because we must have $n - r \ge k - 1$ (we cannot e.g. empty an urn containing six balls using seven operations).
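The recursion for $p_{n,k}$ is straightforward to evaluate exactly for small urns. The following is a minimal memoized sketch, not taken from the original text; the function name `p_urn` is hypothetical, and it uses the convention that $p_{n,k} = 0$ whenever $n < k$, so the sum over $r$ can simply run up to $n - 1$.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def p_urn(n, k):
    """Probability that an urn with n balls is emptied in exactly k picks."""
    if k < 1 or n < k:
        return Fraction(0)               # impossible: more picks than balls
    if k == 1:
        return Fraction(1, n)            # must pick the ball labelled 1
    # the first pick leaves n - r balls, each r being equally likely (1/n);
    # the remaining balls must then be emptied in k - 1 picks
    total = sum((p_urn(n - r, k - 1) for r in range(1, n)), Fraction(0))
    return total / n

row = [p_urn(3, k) for k in range(1, 4)]
total_mass = sum(p_urn(10, k) for k in range(1, 11))
```

For three balls the recursion gives the probabilities $1/3$, $1/2$ and $1/6$ for one, two and three picks respectively, and for any $n$ the values $p_{n,1}, \ldots, p_{n,n}$ should sum to one.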

Next we set $p_{n,k} = 0$ for $n < k$, so that we may replace the recursion by

:$p_{n,k} = \frac{1}{n} \sum_{r=1}^{n-1} p_{n-r,k-1}, \quad k \ge 2.$

Using the coefficient-extraction operator for formal power series, we thus have

:$[z^{n-1} u^k] \frac{d}{dz} P(z, u) = [z^{n-1} u^{k-1}] \frac{1}{1-z} P(z, u) = [z^{n-1} u^k] \frac{u}{1-z} P(z, u), \quad k \ge 2.$

We note furthermore that

:$[z^{n-1} u] \frac{d}{dz} P(z, u) = [z^{n-1}] \frac{d}{dz} \sum_{n \ge 1} \frac{1}{n} z^n = [z^{n-1}] \frac{d}{dz} \log \frac{1}{1-z} = [z^{n-1}] \frac{1}{1-z} = 1$

and that

:$[z^{n-1} u] \frac{u}{1-z} P(z, u) = 0,$

where the first equation results from our "boundary condition" that the probability of emptying an urn with one operation is $1/n$.

Summing over $k \ge 1$ (there are two contributions to both sides of the equation, one for $k \ge 2$ and another one for $k = 1$), we obtain

:$\frac{d}{dz} P(z, u) = \frac{u}{1-z} + \frac{u}{1-z} P(z, u),$

e.g. through

:$\sum_{n \ge 1} z^{n-1} u\, [z^{n-1} u] \frac{d}{dz} P(z, u) = \sum_{n \ge 1} z^{n-1} u = \frac{u}{1-z}.$

The solution to the differential equation is

:$P(z, u) = -1 + C(u) \left( \frac{1}{1-z} \right)^u,$

with $C(u)$ a formal power series in $u$. We note

:$[z^0] P(z, u) = 0 = -1 + C(u),$

which follows from the formal series

:$\left( \frac{1}{1-z} \right)^u = \sum_{k \ge 0} \frac{u^k}{k!} \left( \log \frac{1}{1-z} \right)^k.$

Hence $C(u)$ is constant and equal to one, and we finally have

:$P(z, u) = -1 + \left( \frac{1}{1-z} \right)^u,$

which is, incidentally, the generating function of the Stirling numbers of the first kind.

The moments now follow trivially from the formula given in the first example. E.g. for the expectation, we find

:$\sum_{n \ge 1} \operatorname{E}[X_n] z^n = \left. \frac{d}{du} P(z, u) \right|_{u=1} = \left. \left( \frac{1}{1-z} \right)^u \log \frac{1}{1-z} \right|_{u=1} = \frac{1}{1-z} \log \frac{1}{1-z},$

which gives

:$\operatorname{E}[X_n] = [z^n] \frac{1}{1-z} \log \frac{1}{1-z} = H_n,$

the $n$th harmonic number, and we need about $\log n$ operations to empty the urn.

An extensive discussion of this problem, as well as solutions by other methods, may be found on *Les-Mathematiques.net* (see external links).
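The closed form can be checked numerically by expanding the coefficient of $z^n$ in $(1-z)^{-u}$, which by the binomial series equals the rising factorial $u(u+1)\cdots(u+n-1)$ divided by $n!$. The sketch below is an illustration only; the function names `urn_pgf` and `harmonic` are hypothetical, and the expansion via the rising factorial is a standard fact about the binomial series rather than a step spelled out in the text above.

```python
from fractions import Fraction
from math import factorial

def urn_pgf(n):
    """Coefficients in u of [z^n] (1 - z)^(-u) = u (u+1) ... (u+n-1) / n!.

    Entry k of the result is Pr(X_n = k).
    """
    poly = [Fraction(1)]
    for j in range(n):
        # multiply the current polynomial in u by (u + j)
        new = [Fraction(0)] * (len(poly) + 1)
        for k, c in enumerate(poly):
            new[k] += j * c
            new[k + 1] += c
        poly = new
    return [c / factorial(n) for c in poly]

def harmonic(n):
    """n-th harmonic number H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

pgf = urn_pgf(6)
mean = sum(k * c for k, c in enumerate(pgf))   # G'(1) for a polynomial PGF
```

The numerators of the coefficients before dividing by $n!$ are the unsigned Stirling numbers of the first kind, and in this sketch `mean` is expected to equal `harmonic(6)`, in line with $\operatorname{E}[X_n] = H_n$.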

**Related concepts**

The probability-generating function is occasionally called the z-transform of the probability mass function. It is an example of a generating function of a sequence (see formal power series).

Other generating functions of random variables include the moment-generating function and the characteristic function.

**External links**

* Riedel, Marko, *et al.* *Espérance*, in French. http://les-mathematiques.u-strasbg.fr/phorum/read.php?f=2&i=339442&t=339442

* Riedel, Marko, *et al.* *Variables aléatoires*, in French. http://les-mathematiques.u-strasbg.fr/phorum5/read.php?12,349295,349295#msg-349295

*Wikimedia Foundation.
2010.*
