﻿

# Empirical measure

In probability theory, an empirical measure is a random measure arising from a particular realization of a (usually finite) sequence of random variables. The precise definition is found below. Empirical measures are relevant to mathematical statistics.

The motivation for studying empirical measures is that it is often impossible to know the true underlying probability measure $P$. We collect observations $X_1, X_2, \dots, X_n$ and compute relative frequencies. We can estimate $P$, or a related distribution function $F$, by means of the empirical measure or the empirical distribution function, respectively. These are uniformly good estimates under certain conditions. Theorems in the area of empirical processes provide rates of this convergence.

Definition

Let $X_1, X_2, \dots$ be a sequence of independent identically distributed random variables with values in the state space $S$ with probability measure $P$.

Definition: The "empirical measure" $P_n$ is defined for measurable subsets of $S$ and given by

:$P_n(A) = \frac{1}{n} \sum_{i=1}^n I_A(X_i) = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}(A)$

where $I_A$ is the indicator function and $\delta_X$ is the Dirac measure.

For a fixed measurable set $A$, $nP_n(A)$ is a binomial random variable with mean $nP(A)$ and variance $nP(A)(1-P(A))$. In particular, $P_n(A)$ is an unbiased estimator of $P(A)$.
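The definition can be illustrated with a minimal sketch. Everything specific here is an assumption chosen for illustration and not part of the article: the helper name `empirical_measure`, the choice of $P$ as the uniform distribution on $[0,1)$, and the set $A=[0,0.3)$, for which $P(A)=0.3$.

```python
import random

def empirical_measure(sample, indicator):
    """Return P_n(A) = (1/n) * sum of I_A(X_i), where A is given by its indicator."""
    n = len(sample)
    return sum(1 for x in sample if indicator(x)) / n

# Assumed setting: X_i uniform on [0, 1) and A = [0, 0.3), so P(A) = 0.3.
rng = random.Random(0)
sample = [rng.random() for _ in range(10_000)]

def in_A(x):
    return x < 0.3

p_hat = empirical_measure(sample, in_A)
count = sum(1 for x in sample if in_A(x))  # n * P_n(A), a binomial count

print("P_n(A)   =", p_hat)
print("n P_n(A) =", count)
```

Under these assumptions the printed relative frequency should lie close to 0.3, with fluctuations of order $\sqrt{P(A)(1-P(A))/n}$.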

Definition: $\bigl(P_n(c)\bigr)_{c\in\mathcal{C}}$ is the "empirical measure" indexed by $\mathcal{C}$, a collection of measurable subsets of $S$.

To generalize this notion further, observe that the empirical measure $P_n$ maps measurable functions $f:S \to \mathbb{R}$ to their "empirical mean",

:$f \mapsto P_n f = \int_S f \, dP_n = \frac{1}{n}\sum_{i=1}^n f(X_i)$

In particular, the empirical measure of $A$ is simply the empirical mean of the indicator function, $P_n(A)=P_n I_A$.

For a fixed measurable function $f$, $P_nf$ is a random variable with mean $\mathbb{E}f$ and variance $\frac{1}{n}\mathbb{E}(f -\mathbb{E} f)^2$.
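As a rough numerical check of the mean and variance of $P_nf$, the following sketch repeats the sampling experiment many times. The setting is an assumption made only for illustration: $P$ is uniform on $[0,1)$ and $f(x)=x^2$, so that $\mathbb{E}f=1/3$ and $\mathbb{E}(f-\mathbb{E}f)^2=1/5-1/9=4/45$. The helper names `empirical_mean` and `square` are hypothetical.

```python
import random
import statistics

def empirical_mean(sample, f):
    """Return P_n f = (1/n) * sum of f(X_i)."""
    return sum(f(x) for x in sample) / len(sample)

def square(x):
    return x * x

# Assumed setting: X_i uniform on [0, 1), f(x) = x^2.
rng = random.Random(1)
n = 200
repetitions = 2_000

# Each entry is one realization of the random variable P_n f.
estimates = [
    empirical_mean([rng.random() for _ in range(n)], square)
    for _ in range(repetitions)
]

theory_mean = 1 / 3          # E f
theory_var = (4 / 45) / n    # (1/n) * E(f - E f)^2

print("mean of P_n f:    ", statistics.mean(estimates), "theory:", theory_mean)
print("variance of P_n f:", statistics.pvariance(estimates), "theory:", theory_var)
```

The simulated mean and variance are only estimates themselves, so they should agree with the theoretical values up to sampling error.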

By the strong law of large numbers, $P_n(A)$ converges to $P(A)$ almost surely for fixed $A$. Similarly, $P_nf$ converges to $\mathbb{E} f$ almost surely for a fixed measurable function $f$. The problem of uniform convergence of $P_n$ to $P$ was open until Vapnik and Chervonenkis solved it in 1968.

If the class of sets $\mathcal{C}$ (or a class $\mathcal{F}$ of measurable functions $f:S \to \mathbb{R}$) is Glivenko-Cantelli with respect to $P$, then $P_n$ converges to $P$ uniformly over $c\in\mathcal{C}$ (or $f\in \mathcal{F}$). In other words, with probability 1 we have

:$\|P_n-P\|_{\mathcal{C}}=\sup_{c\in\mathcal{C}}|P_n(c)-P(c)|\to 0,$

:$\|P_n-P\|_{\mathcal{F}}=\sup_{f\in\mathcal{F}}|P_nf-\mathbb{E}f|\to 0.$

Empirical distribution function

The "empirical distribution function" provides an example of empirical measures. For real-valued iid random variables $X_1,\dots,X_n$ it is given by

:$F_n(x)=P_n((-\infty,x])=P_nI_{(-\infty,x]}.$

In this case, empirical measures are indexed by the class $\mathcal{C}=\{(-\infty,x] : x\in\mathbb{R}\}.$ It has been shown that $\mathcal{C}$ is a uniform Glivenko-Cantelli class; in particular,

:$\sup_F\|F_n(x)-F(x)\|_\infty\to 0$

with probability 1.
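The convergence of $F_n$ to $F$ in the supremum norm can be observed numerically. The sketch below is illustrative only: the function names `ecdf` and `sup_distance` are hypothetical, and the sample is assumed to come from the uniform distribution on $[0,1)$, whose distribution function is $F(x)=x$ on that interval. For a continuous $F$, the supremum of $|F_n(x)-F(x)|$ is attained at the jump points of $F_n$, which is what `sup_distance` relies on.

```python
import bisect
import random

def ecdf(sample):
    """Return F_n as a function: F_n(x) = (number of X_i <= x) / n."""
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect.bisect_right(xs, x) / n

def sup_distance(sample, cdf):
    """Return sup_x |F_n(x) - F(x)|, assuming the cdf F is continuous.

    F_n jumps from i/n to (i+1)/n at the i-th order statistic (0-indexed),
    so it suffices to compare F with both values at each sample point.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )

def uniform_cdf(x):
    """Distribution function of the uniform distribution on [0, 1)."""
    return min(max(x, 0.0), 1.0)

rng = random.Random(2)
distances = {}
for n in (100, 10_000):
    sample = [rng.random() for _ in range(n)]
    distances[n] = sup_distance(sample, uniform_cdf)
    print(n, distances[n])
```

In this assumed setting the distance for the larger sample should typically be much smaller, in line with the almost sure convergence stated above.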

See also

* Empirical process
* Poisson random measure

References

* P. Billingsley, Probability and Measure, John Wiley and Sons, New York, third edition, 1995.
* M.D. Donsker, Justification and extension of Doob's heuristic approach to the Kolmogorov-Smirnov theorems, Annals of Mathematical Statistics, 23: 277-281, 1952.
* R.M. Dudley, Central limit theorems for empirical measures, Annals of Probability, 6(6): 899-929, 1978.
* R.M. Dudley, Uniform Central Limit Theorems, Cambridge Studies in Advanced Mathematics, 63, Cambridge University Press, Cambridge, UK, 1999.
* J. Wolfowitz, Generalization of the theorem of Glivenko-Cantelli, Annals of Mathematical Statistics, 25: 131-138, 1954.

Wikimedia Foundation. 2010.
