Probability Theory
- Probability Theory
- Definitions and Basic Terms
- Conditional Probability
- Independence
- Random Variables
- Important Discrete Probability Distributions
- Multiple Random Variables
- Estimating Probabilities
Probability Theory
Definitions and Basic Terms
💡 A discrete probability space is defined by the sample space/possibility space $\Omega = \{\omega_1, \omega_2, \ldots\}$ of elementary events. An (elementary) probability $\Pr[\omega_i]$ is assigned to every elementary event $\omega_i$, where we require $0 \le \Pr[\omega_i] \le 1$ and
$\sum_{\omega \in \Omega} \Pr[\omega] = 1.$
A set $E \subseteq \Omega$ is called an event. The probability of an event is defined by
$\Pr[E] := \sum_{\omega \in E} \Pr[\omega].$
If $E$ is an event, then we define $\bar{E} := \Omega \setminus E$ as the complementary event of $E$.
📖 For events $A, B$ the following hold:
- $\Pr[\emptyset] = 0$ and $\Pr[\Omega] = 1$
- $0 \le \Pr[A] \le 1$
- $\Pr[\bar{A}] = 1 - \Pr[A]$
- If $A \subseteq B$, then it follows that $\Pr[A] \le \Pr[B]$
📖 (Addition theorem).
If the events $A_1, \ldots, A_n$ are pairwise disjoint (i.e., if it holds for all pairs $i \ne j$ that $A_i \cap A_j = \emptyset$), then it holds that
$\Pr\left[\bigcup_{i=1}^{n} A_i\right] = \sum_{i=1}^{n} \Pr[A_i].$
For an infinite set of pairwise disjoint events $A_1, A_2, \ldots$ it holds analogously that
$\Pr\left[\bigcup_{i=1}^{\infty} A_i\right] = \sum_{i=1}^{\infty} \Pr[A_i].$
📖 (The sieve formula, Inclusion-Exclusion principle).
For events $A_1, \ldots, A_n$ the following holds:
$\Pr\left[\bigcup_{i=1}^{n} A_i\right] = \sum_{\emptyset \ne I \subseteq \{1,\ldots,n\}} (-1)^{|I|+1} \cdot \Pr\left[\bigcap_{i \in I} A_i\right].$
📖 (Boole's Inequality, Union Bound).
For events $A_1, \ldots, A_n$ the following holds:
$\Pr\left[\bigcup_{i=1}^{n} A_i\right] \le \sum_{i=1}^{n} \Pr[A_i].$
Analogously, the following holds for an infinite sequence of events $A_1, A_2, \ldots$:
$\Pr\left[\bigcup_{i=1}^{\infty} A_i\right] \le \sum_{i=1}^{\infty} \Pr[A_i].$
💡 (Principle of Laplace).
If nothing indicates otherwise, we assume that all elementary events are equally likely.
Therefore, $\Pr[\omega] = \frac{1}{|\Omega|}$ for all elementary events $\omega \in \Omega$.
It immediately follows for an arbitrary event $E$ that
$\Pr[E] = \frac{|E|}{|\Omega|}.$
We say the experiment modeled on $\Omega$ is uniformly distributed or equally distributed.
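As a quick sanity check of $\Pr[E] = |E|/|\Omega|$, here is a minimal Python sketch; the fair die, the event sets, and the helper `pr` are assumed examples, not from the text:

```python
from fractions import Fraction

# Sample space of a fair six-sided die; by the principle of Laplace
# every elementary event has probability 1/|Omega|.
omega = {1, 2, 3, 4, 5, 6}

def pr(event, omega):
    """Pr[E] = |E| / |Omega| for a uniformly distributed space."""
    return Fraction(len(event & omega), len(omega))

even = {2, 4, 6}
at_most_2 = {1, 2}

print(pr(even, omega))               # 1/2
print(pr(even | at_most_2, omega))   # 2/3
```

Using `Fraction` keeps the probabilities exact instead of introducing floating-point rounding.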
💡 In an information-theoretical sense such a probability space ($\Pr[\omega] = \frac{1}{|\Omega|}$ for all $\omega \in \Omega$) has the largest possible entropy. Every deviation from the uniform probability distribution requires that we put more information into the model (and thereby decrease the entropy).
Conditional Probability
💡 Let $A$ and $B$ be events with $\Pr[B] > 0$. The conditional probability $\Pr[A \mid B]$ of $A$ given $B$ is defined by
$\Pr[A \mid B] := \frac{\Pr[A \cap B]}{\Pr[B]}.$
💡 The conditional probabilities of the form $\Pr[\cdot \mid B]$ form a new probability space over $\Omega$ for an arbitrary event $B$ with $\Pr[B] > 0$.
The probabilities of the elementary events are calculated through $\Pr[\omega \mid B]$. Then
$\sum_{\omega \in \Omega} \Pr[\omega \mid B] = \sum_{\omega \in B} \frac{\Pr[\omega]}{\Pr[B]} = 1,$
so the definition of a discrete probability space is still fulfilled, and therefore all rules for probabilities still apply to conditional probabilities.
📖 (Multiplication theorem).
Let the events $A_1, \ldots, A_n$ be given. If $\Pr[A_1 \cap \cdots \cap A_n] > 0$, then the following holds:
$\Pr[A_1 \cap \cdots \cap A_n] = \Pr[A_1] \cdot \Pr[A_2 \mid A_1] \cdot \Pr[A_3 \mid A_1 \cap A_2] \cdots \Pr[A_n \mid A_1 \cap \cdots \cap A_{n-1}].$
📖 (Law of total probability).
Let the events $A_1, \ldots, A_n$ be pairwise disjoint with $\Pr[A_i] > 0$ and let $B \subseteq A_1 \cup \cdots \cup A_n$. Then it follows that
$\Pr[B] = \sum_{i=1}^{n} \Pr[B \mid A_i] \cdot \Pr[A_i].$
Analogously, for pairwise disjoint events $A_1, A_2, \ldots$ with $B \subseteq \bigcup_{i=1}^{\infty} A_i$ it follows that
$\Pr[B] = \sum_{i=1}^{\infty} \Pr[B \mid A_i] \cdot \Pr[A_i].$
📖 (Bayes' theorem).
Let the events $A_1, \ldots, A_n$ be pairwise disjoint with $\Pr[A_j] > 0$. Furthermore, let $B \subseteq A_1 \cup \cdots \cup A_n$ be an event with $\Pr[B] > 0$. Then it holds for an arbitrary $i \in \{1, \ldots, n\}$ that
$\Pr[A_i \mid B] = \frac{\Pr[A_i \cap B]}{\Pr[B]} = \frac{\Pr[B \mid A_i] \cdot \Pr[A_i]}{\sum_{j=1}^{n} \Pr[B \mid A_j] \cdot \Pr[A_j]}.$
Analogously, for pairwise disjoint events $A_1, A_2, \ldots$ with $B \subseteq \bigcup_{i=1}^{\infty} A_i$ it holds that
$\Pr[A_i \mid B] = \frac{\Pr[B \mid A_i] \cdot \Pr[A_i]}{\sum_{j=1}^{\infty} \Pr[B \mid A_j] \cdot \Pr[A_j]}.$
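A small exact computation of Bayes' theorem via the law of total probability; the urn scenario and the concrete numbers are an assumed example, not from the text:

```python
from fractions import Fraction as F

# Hypothetical example: three disjoint urns A_i chosen with prior Pr[A_i];
# Pr[B | A_i] is the chance of drawing a white ball from urn i.
prior      = [F(1, 2), F(1, 3), F(1, 6)]   # Pr[A_i], sums to 1
likelihood = [F(1, 4), F(1, 2), F(1, 1)]   # Pr[B | A_i]

# Law of total probability: Pr[B] = sum_i Pr[B | A_i] * Pr[A_i]
pr_b = sum(l * p for l, p in zip(likelihood, prior))

# Bayes' theorem: Pr[A_i | B] = Pr[B | A_i] * Pr[A_i] / Pr[B]
posterior = [l * p / pr_b for l, p in zip(likelihood, prior)]

print(pr_b)        # 11/24
print(posterior)   # [Fraction(3, 11), Fraction(4, 11), Fraction(4, 11)]
```

Note that the posteriors sum to 1, as they must for a probability distribution over the disjoint $A_i$.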
Independence
💡 The events $A$ and $B$ are called independent if
$\Pr[A \cap B] = \Pr[A] \cdot \Pr[B].$
💡 The events $A_1, \ldots, A_n$ are called independent if it holds for all subsets $I = \{i_1, \ldots, i_k\}$ with $I \subseteq \{1, \ldots, n\}$ that
$\Pr[A_{i_1} \cap \cdots \cap A_{i_k}] = \Pr[A_{i_1}] \cdots \Pr[A_{i_k}].$
An infinite family of events $A_i$ with $i \in \mathbb{N}$ is called independent if the above condition is met for all finite subsets $I \subseteq \mathbb{N}$.
📖 The events $A_1, \ldots, A_n$ are independent if and only if it holds for all $(s_1, \ldots, s_n) \in \{0, 1\}^n$ that
$\Pr[A_1^{s_1} \cap \cdots \cap A_n^{s_n}] = \Pr[A_1^{s_1}] \cdots \Pr[A_n^{s_n}],$
where $A_i^0 := \bar{A_i}$ and $A_i^1 := A_i$.
📖 Let $A$, $B$ and $C$ be independent events. Then $A \cap B$ and $C$, respectively $A \cup B$ and $C$, are independent.
Random Variables
💡 A random variable is a mapping $X \colon \Omega \to \mathbb{R}$, where $\Omega$ is the possibility space of a probability space.
💡 In discrete probability spaces the codomain $W_X := X(\Omega) = \{x \in \mathbb{R} \mid \exists \omega \in \Omega \text{ with } X(\omega) = x\}$ of a random variable $X$
is in all cases finite or countably infinite, depending on $\Omega$ being finite or countably infinite.
💡 When studying a random variable $X$ one is interested in the probabilities with which $X$ assumes a specific value. For $x \in W_X$ respectively for an arbitrary $A \subseteq W_X$ we regard the events $X^{-1}(x)$ respectively $X^{-1}(A)$. We abbreviate $\Pr[X^{-1}(x)]$ as $\Pr[X = x]$.
Therefore, we can define
$\Pr[X = x] = \sum_{\omega \in \Omega \colon X(\omega) = x} \Pr[\omega].$
💡 The function
$f_X \colon \mathbb{R} \to [0, 1], \quad x \mapsto \Pr[X = x]$
is called the density (function) of $X$.
💡 The function
$F_X \colon \mathbb{R} \to [0, 1], \quad x \mapsto \Pr[X \le x] = \sum_{x' \in W_X \colon x' \le x} \Pr[X = x']$
is called the distribution (function) of $X$.
Expected Value
💡 For a random variable $X$ we define the expected value $\mathbb{E}[X]$ as
$\mathbb{E}[X] := \sum_{x \in W_X} x \cdot \Pr[X = x],$
if that sum converges absolutely. Otherwise the expected value is said to be undefined.
📖 If $X$ is a random variable then the following holds:
$\mathbb{E}[X] = \sum_{\omega \in \Omega} X(\omega) \cdot \Pr[\omega].$
📖 Let $X$ be a random variable with $W_X \subseteq \mathbb{N}_0$. Then it holds that
$\mathbb{E}[X] = \sum_{i=1}^{\infty} \Pr[X \ge i].$
💡 Conditional Random Variables
Let $X$ be a random variable and $A$ an event with $\Pr[A] > 0$. By $X \mid A$ we denote that we calculate the probabilities with which the random variable $X$ assumes specific values with respect to the probabilities conditioned on $A$. It thus holds that
$f_{X \mid A}(x) := \Pr[X = x \mid A] = \frac{\Pr[\{X = x\} \cap A]}{\Pr[A]}.$
📖 Let $X$ be a random variable. For pairwise disjoint events $A_1, \ldots, A_n$ with $A_1 \cup \cdots \cup A_n = \Omega$ and $\Pr[A_1], \ldots, \Pr[A_n] > 0$ it holds that
$\mathbb{E}[X] = \sum_{i=1}^{n} \mathbb{E}[X \mid A_i] \cdot \Pr[A_i].$
For pairwise disjoint events $A_1, A_2, \ldots$ with $\bigcup_{i=1}^{\infty} A_i = \Omega$ and $\Pr[A_i] > 0$ it holds analogously that
$\mathbb{E}[X] = \sum_{i=1}^{\infty} \mathbb{E}[X \mid A_i] \cdot \Pr[A_i],$
provided the sum converges absolutely.
💡 Linearity of the Expected Value
Assume we have defined $n$ random variables:
$X_1, \ldots, X_n \colon \Omega \to \mathbb{R}.$
For an $\omega \in \Omega$ we thus receive $n$ real numbers $X_1(\omega), \ldots, X_n(\omega)$. When we define a function $f \colon \mathbb{R}^n \to \mathbb{R}$, we immediately see that the composition $X := f(X_1, \ldots, X_n)$ in turn is also a random variable, for it holds that:
$X(\omega) = f(X_1(\omega), \ldots, X_n(\omega)).$
This holds for arbitrary functions $f$, in particular for affine linear functions:
$f(x_1, \ldots, x_n) = a_1 x_1 + \cdots + a_n x_n + b,$
where $a_1, \ldots, a_n, b$ are arbitrary real numbers. In this case we usually denote the random variable explicitly as
$X = a_1 X_1 + \cdots + a_n X_n + b.$
📖 (Linearity of the Expected Value).
For random variables $X_1, \ldots, X_n$ and $X := a_1 X_1 + \cdots + a_n X_n + b$ with $a_1, \ldots, a_n, b \in \mathbb{R}$ it holds that
$\mathbb{E}[X] = a_1 \mathbb{E}[X_1] + \cdots + a_n \mathbb{E}[X_n] + b.$
💡 For an event $A$ the corresponding indicator variable $I_A$ is defined by:
$I_A := \begin{cases} 1 & \text{if } A \text{ occurs,} \\ 0 & \text{otherwise.} \end{cases}$
For the expected value of $I_A$ it holds that: $\mathbb{E}[I_A] = 1 \cdot \Pr[A] + 0 \cdot \Pr[\bar{A}] = \Pr[A]$.
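The indicator trick combines with linearity of expectation in a classic way: the expected number of fixed points of a uniform random permutation is $\sum_{k=1}^{n} \Pr[\pi(k) = k] = n \cdot \frac{1}{n} = 1$. The worked example below (not from the text) verifies this exactly by full enumeration:

```python
from itertools import permutations
from fractions import Fraction

def expected_fixed_points(n):
    """E[number of fixed points] of a uniform permutation of n elements.

    X = I_1 + ... + I_n with E[I_k] = Pr[pi(k) = k] = 1/n, so by
    linearity E[X] = n * (1/n) = 1; here checked by enumerating all n!
    permutations exactly.
    """
    perms = list(permutations(range(n)))
    total = sum(sum(1 for k, v in enumerate(p) if k == v) for p in perms)
    return Fraction(total, len(perms))

print(expected_fixed_points(4))  # 1
```

The answer is 1 independent of $n$, even though the indicators $I_k$ are not independent; linearity needs no independence.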
Variance
💡 For a random variable $X$ with $\mu := \mathbb{E}[X]$ we define the variance $\operatorname{Var}[X]$ as
$\operatorname{Var}[X] := \mathbb{E}[(X - \mu)^2] = \sum_{x \in W_X} (x - \mu)^2 \cdot \Pr[X = x].$
The quantity $\sigma := \sqrt{\operatorname{Var}[X]}$ is called the standard deviation of $X$.
📖 For an arbitrary random variable $X$ it holds that
$\operatorname{Var}[X] = \mathbb{E}[X^2] - \mathbb{E}[X]^2.$
📖 For an arbitrary random variable $X$ and $a, b \in \mathbb{R}$ it holds that
$\operatorname{Var}[a \cdot X + b] = a^2 \cdot \operatorname{Var}[X].$
💡 For a random variable $X$ we call $\mathbb{E}[X^k]$ the $k$-th moment and $\mathbb{E}[(X - \mathbb{E}[X])^k]$ the $k$-th central moment.
The expected value is therefore identical to the first moment, and the variance identical to the second central moment.
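An exact check that the second central moment and the shortcut $\operatorname{Var}[X] = \mathbb{E}[X^2] - \mathbb{E}[X]^2$ agree; the fair die is an assumed example, not from the text:

```python
from fractions import Fraction as F

# Fair die: compare Var[X] computed as the 2nd central moment with the
# shortcut E[X^2] - E[X]^2, both exactly.
values = [F(i) for i in range(1, 7)]
p = F(1, 6)

ex  = sum(v * p for v in values)       # E[X]   = 7/2  (first moment)
ex2 = sum(v * v * p for v in values)   # E[X^2] = 91/6 (second moment)

var_central  = sum((v - ex) ** 2 * p for v in values)  # 2nd central moment
var_shortcut = ex2 - ex ** 2

print(ex, var_central, var_shortcut)  # 7/2 35/12 35/12
```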
Important Discrete Probability Distributions
💡 Recall the probability density function $f_X(x) = \Pr[X = x]$ and the probability distribution function $F_X(x) = \Pr[X \le x]$.
Bernoulli Distribution
💡 A random variable $X$ with $W_X = \{0, 1\}$ and density
$f_X(x) = \begin{cases} p & \text{for } x = 1, \\ 1 - p & \text{for } x = 0 \end{cases}$
is called Bernoulli distributed. The parameter $p$ is called the probability of success of the Bernoulli distribution.
If a random variable $X$ is Bernoulli distributed with parameter $p$, then it is denoted by
$X \sim \operatorname{Bernoulli}(p).$
For a Bernoulli distributed random variable $X \sim \operatorname{Bernoulli}(p)$ the following hold:
$\mathbb{E}[X] = p \quad \text{and} \quad \operatorname{Var}[X] = p(1 - p).$
Binomial Distribution
💡 A random variable $X$ with $W_X = \{0, 1, \ldots, n\}$ and density
$f_X(x) = \binom{n}{x} p^x (1 - p)^{n - x} \quad \text{for } x \in W_X$
is called binomially distributed. The parameter $n$ is called the number of trials, the parameter $p$ is called the probability of success of the binomial distribution.
If a random variable $X$ is binomially distributed with parameters $n$ and $p$, then it is denoted by
$X \sim \operatorname{Bin}(n, p).$
For a binomially distributed random variable $X \sim \operatorname{Bin}(n, p)$ the following hold:
$\mathbb{E}[X] = np \quad \text{and} \quad \operatorname{Var}[X] = np(1 - p).$
Geometric Distribution
💡 A random variable $X$ with $W_X = \mathbb{N}$ and density
$f_X(i) = p(1 - p)^{i - 1} \quad \text{for } i \in \mathbb{N}$
is called geometrically distributed. The parameter $p$ is called the probability of success of the geometric distribution.
If a random variable $X$ is geometrically distributed with parameter $p$, then it is denoted by
$X \sim \operatorname{Geo}(p).$
For a geometrically distributed random variable $X \sim \operatorname{Geo}(p)$ the following hold:
$\mathbb{E}[X] = \frac{1}{p} \quad \text{and} \quad \operatorname{Var}[X] = \frac{1 - p}{p^2}.$
📖 (Memorylessness). If $X \sim \operatorname{Geo}(p)$, then for all $k, y \in \mathbb{N}$ the following holds:
$\Pr[X > y + k \mid X > k] = \Pr[X > y].$
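Since $\Pr[X > k] = (1 - p)^k$ for $X \sim \operatorname{Geo}(p)$, memorylessness can be checked exactly with rational arithmetic; the parameter $p = 1/3$ and the ranges below are an assumed example:

```python
from fractions import Fraction as F

# For Geo(p): Pr[X > k] = (1 - p)^k, i.e. "no success in the first k
# trials", so the memorylessness identity can be verified exactly.
p = F(1, 3)

def pr_gt(k):
    """Pr[X > k] for X ~ Geo(p)."""
    return (1 - p) ** k

for k in range(5):
    for y in range(1, 5):
        # Pr[X > y+k | X > k] = Pr[X > y+k] / Pr[X > k]
        assert pr_gt(y + k) / pr_gt(k) == pr_gt(y)
print("memorylessness verified")
```

Intuitively: having already waited $k$ trials without success tells us nothing about the remaining waiting time.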
Waiting for the $n$-th Success - Negative Binomial Distribution
💡 Let $Z$ be the random variable that counts how often we have to repeat an experiment with probability of success $p$ until the $n$-th success. For $n = 1$, $Z \sim \operatorname{Geo}(p)$. For $n \ge 1$, $Z$ is called negatively binomially distributed with order $n$.
The density of $Z$ is
$f_Z(z) = \binom{z - 1}{n - 1} p^n (1 - p)^{z - n} \quad \text{for } z \ge n.$
Let $X_i$ denote the number of experiments strictly after the $(i-1)$-st success up until (including) the $i$-th success. Then each of the $X_i$ is geometrically distributed with parameter $p$, and $Z = X_1 + \cdots + X_n$.
If a random variable $Z$ is negatively binomially distributed with parameters $n$ and $p$, then it is denoted by
$Z \sim \operatorname{NegBin}(n, p).$
Thus, by linearity of the expected value, it holds for the expected value that
$\mathbb{E}[Z] = \mathbb{E}[X_1] + \cdots + \mathbb{E}[X_n] = \frac{n}{p}.$
Application: Coupon-Collector Problem
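A sketch of the coupon-collector expectation under the decomposition above (the numeric instances are assumed examples): after collecting $i$ distinct coupons out of $n$, the waiting time for a new one is $\operatorname{Geo}\!\left(\frac{n-i}{n}\right)$, so by linearity the expected total number of draws is $\sum_{i=0}^{n-1} \frac{n}{n-i} = n \cdot H_n$.

```python
from fractions import Fraction

def expected_draws(n):
    """Expected number of draws to collect all n coupon types.

    After i distinct coupons, a new type appears with probability
    (n - i)/n, and a Geo(p) waiting time has expectation 1/p; summing
    these geometric expectations gives n * H_n exactly.
    """
    return sum(Fraction(n, n - i) for i in range(n))

print(expected_draws(2))  # 3
print(expected_draws(6))  # 147/10
```

For example, collecting all 6 faces of a die takes $6 \cdot H_6 = 14.7$ draws in expectation.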
Poisson Distribution
💡 A random variable $X$ with $W_X = \mathbb{N}_0$ and density
$f_X(i) = \frac{e^{-\lambda} \lambda^i}{i!} \quad \text{for } i \in \mathbb{N}_0$
is called Poisson distributed. The parameter $\lambda \ge 0$ is equal to the mean and variance of the Poisson distribution.
If a random variable $X$ is Poisson distributed with parameter $\lambda$, then it is denoted by
$X \sim \operatorname{Po}(\lambda).$
For a Poisson distributed random variable $X \sim \operatorname{Po}(\lambda)$ the following hold:
$\mathbb{E}[X] = \lambda \quad \text{and} \quad \operatorname{Var}[X] = \lambda.$
Poisson Distribution as Limit of Binomial Distribution
💡 The binomial distribution $\operatorname{Bin}(n, p_n)$ with $p_n = \frac{\lambda}{n}$ converges towards the Poisson distribution $\operatorname{Po}(\lambda)$ for $n \to \infty$.
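This convergence can be observed numerically by comparing the two densities; the choice $\lambda = 2$ and the grid of $n$ values are assumed for illustration:

```python
from math import comb, exp, factorial

# Compare the Bin(n, lambda/n) density with the Po(lambda) density: the
# maximal pointwise difference shrinks as n grows.
lam = 2.0

def binom_pmf(i, n, p):
    return comb(n, i) * p**i * (1 - p)**(n - i)

def poisson_pmf(i, lam):
    return exp(-lam) * lam**i / factorial(i)

errs = {}
for n in (10, 100, 1000):
    errs[n] = max(abs(binom_pmf(i, n, lam / n) - poisson_pmf(i, lam))
                  for i in range(10))
    print(n, errs[n])  # error decreases as n increases
```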
Multiple Random Variables
💡 We are interested in random variables $X$ and $Y$ and probabilities of the form
$\Pr[X = x, Y = y] := \Pr[\{\omega \in \Omega \mid X(\omega) = x \text{ and } Y(\omega) = y\}].$
💡 The function
$f_{X,Y}(x, y) := \Pr[X = x, Y = y]$
is called the joint density of the random variables $X$ and $Y$.
If the joint density is given, one can extract the densities of the random variables themselves using
$f_X(x) = \sum_{y \in W_Y} f_{X,Y}(x, y) \quad \text{and} \quad f_Y(y) = \sum_{x \in W_X} f_{X,Y}(x, y).$
The functions $f_X$ and $f_Y$ are called marginal densities.
💡 The function
$F_{X,Y}(x, y) := \Pr[X \le x, Y \le y] = \sum_{x' \le x} \sum_{y' \le y} f_{X,Y}(x', y')$
is called the joint distribution of the random variables $X$ and $Y$.
If the joint distribution is given, one can extract the distributions of the random variables themselves using
$F_X(x) = \lim_{y \to \infty} F_{X,Y}(x, y) \quad \text{and} \quad F_Y(y) = \lim_{x \to \infty} F_{X,Y}(x, y).$
The functions $F_X$ and $F_Y$ are called marginal distributions.
Independence of Random Variables
💡 Random variables $X_1, \ldots, X_n$ are called independent if and only if it holds for all $(x_1, \ldots, x_n) \in W_{X_1} \times \cdots \times W_{X_n}$ that
$\Pr[X_1 = x_1, \ldots, X_n = x_n] = \Pr[X_1 = x_1] \cdots \Pr[X_n = x_n].$
📖 For independent random variables $X_1, \ldots, X_n$ and arbitrary sets $S_1, \ldots, S_n$ with $S_i \subseteq W_{X_i}$ it holds that
$\Pr[X_1 \in S_1, \ldots, X_n \in S_n] = \Pr[X_1 \in S_1] \cdots \Pr[X_n \in S_n].$
📖 For independent random variables $X_1, \ldots, X_n$ and any index set $\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, n\}$, the random variables $X_{i_1}, \ldots, X_{i_k}$ are also independent.
📖 Let $f_1, \ldots, f_n$ be real-valued functions ($f_i \colon \mathbb{R} \to \mathbb{R}$ for $i = 1, \ldots, n$). If the random variables $X_1, \ldots, X_n$ are independent, then so are $f_1(X_1), \ldots, f_n(X_n)$.
Composite Random Variables
📖 For two independent random variables $X$ and $Y$ let $Z := X + Y$. It holds that
$f_Z(z) = \sum_{x \in W_X} f_X(x) \cdot f_Y(z - x).$
💡 The expression $\sum_{x \in W_X} f_X(x) \cdot f_Y(z - x)$ is called the convolution of $f_X$ and $f_Y$, analogously to the corresponding term for power series.
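A direct implementation of the convolution formula; the two fair dice and the helper `convolve` are an assumed example, not from the text:

```python
from fractions import Fraction

# Density of Z = X + Y for independent X, Y via the convolution formula
# f_Z(z) = sum_x f_X(x) * f_Y(z - x); here the sum of two fair dice.
die = {i: Fraction(1, 6) for i in range(1, 7)}

def convolve(fx, fy):
    """Convolution of two discrete densities given as dicts value -> prob."""
    fz = {}
    for x, px in fx.items():
        for y, py in fy.items():
            fz[x + y] = fz.get(x + y, 0) + px * py
    return fz

two_dice = convolve(die, die)
print(two_dice[7])              # 1/6 (the most likely total)
print(sum(two_dice.values()))   # 1
```

The resulting density is the familiar triangle shape over $\{2, \ldots, 12\}$ peaking at 7.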
Moments of Composite Random Variables
📖 (Linearity of the Expected Value).
For random variables $X_1, \ldots, X_n$ and $X := a_1 X_1 + \cdots + a_n X_n$ with $a_1, \ldots, a_n \in \mathbb{R}$ it holds that
$\mathbb{E}[X] = a_1 \mathbb{E}[X_1] + \cdots + a_n \mathbb{E}[X_n].$
📖 (Multiplicativity of the Expected Value).
For independent random variables $X_1, \ldots, X_n$ it holds that
$\mathbb{E}[X_1 \cdots X_n] = \mathbb{E}[X_1] \cdots \mathbb{E}[X_n].$
📖 For independent random variables $X_1, \ldots, X_n$ and $X := X_1 + \cdots + X_n$ it holds that
$\operatorname{Var}[X] = \operatorname{Var}[X_1] + \cdots + \operatorname{Var}[X_n].$
Waldβs Identity
📖 (Wald's Identity).
Let $N$ and $X$ be two independent random variables, where $W_N \subseteq \mathbb{N}$ holds for the codomain of $N$. Furthermore, let
$Z := X_1 + \cdots + X_N,$
where $X_1, X_2, \ldots$ are independent copies of $X$. Then the following holds:
$\mathbb{E}[Z] = \mathbb{E}[N] \cdot \mathbb{E}[X].$
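Since $N$ is independent of the $X_i$, conditioning on $N$ gives $\mathbb{E}[Z \mid N = n] = n \cdot \mathbb{E}[X]$, and the law of total expectation then yields Wald's identity. The exact check below uses an assumed distribution for $N$ and a fair die for $X$ (neither is from the text):

```python
from fractions import Fraction as F

# Check E[Z] = E[N] * E[X] by conditioning on N:
# E[Z] = sum_n Pr[N = n] * E[X_1 + ... + X_n] = sum_n Pr[N = n] * n * E[X].
ex_die = F(7, 2)                                # E[X] for a fair die
pr_n = {1: F(1, 2), 2: F(1, 3), 3: F(1, 6)}     # hypothetical density of N

ez = sum(p * n * ex_die for n, p in pr_n.items())   # law of total expectation
en = sum(p * n for n, p in pr_n.items())            # E[N] = 5/3

print(ez, en * ex_die)  # 35/6 35/6
```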
Estimating Probabilities
Inequalities of Markov and Chebyshev
📖 (Markov's Inequality).
Let $X$ be a random variable that only assumes non-negative values. Then it holds for all $t \in \mathbb{R}$ with $t > 0$ that
$\Pr[X \ge t] \le \frac{\mathbb{E}[X]}{t}.$
Or equivalently: $\Pr[X \ge t \cdot \mathbb{E}[X]] \le \frac{1}{t}$.
📖 (Chebyshev's Inequality).
Let $X$ be a random variable and $t \in \mathbb{R}$ with $t > 0$. It then holds that
$\Pr[|X - \mathbb{E}[X]| \ge t] \le \frac{\operatorname{Var}[X]}{t^2}.$
Or equivalently:
$\Pr[|X - \mathbb{E}[X]| \ge t \cdot \sqrt{\operatorname{Var}[X]}] \le \frac{1}{t^2}.$
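To see how tight (or loose) Chebyshev's bound is, one can compare it against an exactly computable tail; $X \sim \operatorname{Bin}(100, 1/2)$ below is an assumed example:

```python
from math import comb

# Chebyshev: Pr[|X - E[X]| >= t] <= Var[X]/t^2, compared with the exact
# two-sided tail for X ~ Bin(100, 1/2): E[X] = 50, Var[X] = 25.
n, p = 100, 0.5
mu, var = n * p, n * p * (1 - p)

pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

results = {}
for t in (5, 10, 15):
    exact = sum(pr for k, pr in enumerate(pmf) if abs(k - mu) >= t)
    bound = var / t**2
    results[t] = (exact, bound)
    print(t, exact, bound)   # exact tail is always below the bound
```

The bound holds for every distribution with finite variance, which is why it is much weaker than the true binomial tail here.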
Chernoffβs Inequality
📖 (Chernoff-Bounds).
Let $X_1, \ldots, X_n$ be independent Bernoulli distributed random variables with $\Pr[X_i = 1] = p_i$ and $\Pr[X_i = 0] = 1 - p_i$.
For $X := \sum_{i=1}^{n} X_i$ and $\mu := \mathbb{E}[X] = \sum_{i=1}^{n} p_i$ the following hold:
- $\Pr[X \ge (1 + \delta)\mu] \le \left(\frac{e^{\delta}}{(1 + \delta)^{1 + \delta}}\right)^{\mu}$ for all $\delta > 0$,
- $\Pr[X \le (1 - \delta)\mu] \le e^{-\mu \delta^2 / 2}$ for all $0 < \delta \le 1$,
- $\Pr[X \ge t] \le 2^{-t}$ for $t \ge 2e\mu$.
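A numeric check of the first (upper-tail) Chernoff bound against the exact tail; the choice of $n = 100$ fair coin flips and the values of $\delta$ are an assumed example:

```python
from math import comb, exp

# Upper-tail Chernoff bound Pr[X >= (1+d)mu] <= (e^d / (1+d)^(1+d))^mu
# for X ~ Bin(100, 1/2), i.e. mu = 50, compared with the exact tail.
n = 100
mu = n / 2
pmf = [comb(n, k) * 0.5**n for k in range(n + 1)]

results = {}
for d in (0.1, 0.2, 0.4):
    exact = sum(pr for k, pr in enumerate(pmf) if k >= (1 + d) * mu)
    bound = (exp(d) / (1 + d) ** (1 + d)) ** mu
    results[d] = (exact, bound)
    print(d, exact, bound)   # exact tail stays below the Chernoff bound
```

Unlike Chebyshev's bound, the Chernoff bound decays exponentially in $\mu$, which is what makes it the tool of choice for sums of many independent Bernoulli variables.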