Several common concentration inequalities, compiled (copied and pasted) from Wikipedia and machine-learning textbooks.

Markov’s inequality

statement

If $X$ is a nonnegative random variable and $a > 0$, then the probability that $X$ is at least $a$ is at most the expectation of $X$ divided by $a$:

$$\Pr(X \ge a) \le \frac{\mathbb{E}[X]}{a}.$$

intuition

A nonnegative random variable cannot place much probability mass far above its mean: if $\Pr(X \ge a)$ were larger than $\mathbb{E}[X]/a$, then the region $\{X \ge a\}$ alone would already contribute more than $\mathbb{E}[X]$ to the expectation.

proof

For any event $E$, let $I_E$ be the indicator random variable of $E$, that is, $I_E = 1$ if $E$ occurs and $I_E = 0$ otherwise.
Using this notation, we have $I_{(X \ge a)} = 1$ if the event $X \ge a$ occurs, and $I_{(X \ge a)} = 0$ if $X < a$. Then, given $a > 0$,

$$a\, I_{(X \ge a)} \le X,$$

which is clear if we consider the two possible values of $I_{(X \ge a)}$. If $X < a$, then $I_{(X \ge a)} = 0$, and so $a\, I_{(X \ge a)} = 0 \le X$.
Otherwise, we have $X \ge a$, for which $I_{(X \ge a)} = 1$ and so $a\, I_{(X \ge a)} = a \le X$.
Since $\mathbb{E}$ is a monotonically increasing function, taking the expectation of both sides of an inequality cannot reverse it. Therefore,

$$\mathbb{E}\bigl[a\, I_{(X \ge a)}\bigr] \le \mathbb{E}[X].$$

Now, using linearity of expectation, the left side of this inequality is the same as

$$a\, \mathbb{E}\bigl[I_{(X \ge a)}\bigr] = a\bigl(1 \cdot \Pr(X \ge a) + 0 \cdot \Pr(X < a)\bigr) = a \Pr(X \ge a).$$

Thus we have

$$a \Pr(X \ge a) \le \mathbb{E}[X],$$

and since $a > 0$, we can divide both sides by $a$.

corollaries

If $\varphi$ is a nondecreasing nonnegative function, $X$ is a (not necessarily nonnegative) random variable, and $\varphi(a) > 0$, then

$$\Pr(X \ge a) \le \Pr\bigl(\varphi(X) \ge \varphi(a)\bigr) \le \frac{\mathbb{E}[\varphi(X)]}{\varphi(a)}.$$
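As a quick sanity check of Markov's inequality (not part of the original notes), the sketch below draws from an Exponential(1) distribution; the distribution is an arbitrary illustrative choice, and since the bound is checked against the empirical mean, the inequality holds exactly for the empirical measure as well.

```python
import random

random.seed(0)

# Empirical sanity check of Markov's inequality on draws from
# Exponential(1), a nonnegative variable with E[X] = 1.
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]
mean = sum(samples) / n

for a in (1.0, 2.0, 5.0):
    empirical = sum(x >= a for x in samples) / n
    bound = mean / a
    # Markov's inequality also holds exactly for the empirical measure:
    assert empirical <= bound
    print(f"a={a}: Pr(X >= a) ~ {empirical:.4f} <= E[X]/a ~ {bound:.4f}")
```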

Chebyshev’s inequality

statement

Let $X$ be a random variable with finite expected value $\mu$ and finite nonzero variance $\sigma^2$. Then for any $k > 0$,

$$\Pr(|X - \mu| \ge k) \le \frac{\sigma^2}{k^2}.$$

proof

for any $k > 0$,

$$\Pr(|X - \mu| \ge k) = \Pr\bigl((X - \mu)^2 \ge k^2\bigr) \le \frac{\mathbb{E}\bigl[(X - \mu)^2\bigr]}{k^2} = \frac{\sigma^2}{k^2},$$

because Markov's inequality applies to the nonnegative random variable $(X - \mu)^2$.

This argument can be summarized (where "MI" indicates use of Markov's inequality):

$$\Pr(|X - \mu| \ge k) = \Pr\bigl((X - \mu)^2 \ge k^2\bigr) \overset{\mathrm{MI}}{\le} \frac{\mathbb{E}\bigl[(X - \mu)^2\bigr]}{k^2} = \frac{\sigma^2}{k^2}.$$
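A matching sanity check (an addition, not from the source): since Chebyshev is just Markov applied to $(X-\mu)^2$, it also holds exactly for the empirical distribution of any sample. Uniform(0, 1) is an arbitrary illustrative choice with $\sigma^2 = 1/12$.

```python
import random

random.seed(0)

# Empirical sanity check of Chebyshev's inequality on Uniform(0, 1)
# draws (mu = 1/2, sigma^2 = 1/12 ~ 0.0833).
n = 100_000
samples = [random.random() for _ in range(n)]
mu = sum(samples) / n
var = sum((x - mu) ** 2 for x in samples) / n

for k in (0.3, 0.4, 0.45):
    empirical = sum(abs(x - mu) >= k for x in samples) / n
    bound = var / k ** 2
    # Holds exactly for the empirical measure (Markov on (X - mu)^2):
    assert empirical <= bound
    print(f"k={k}: Pr(|X - mu| >= k) ~ {empirical:.4f} <= {bound:.4f}")
```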

Hoeffding’s inequality

Let $X_1, \dots, X_m$ be independent random variables with $X_i$ taking values in $[a_i, b_i]$ for all $i \in [m]$. Then for any $\epsilon > 0$, the following inequalities hold for $S_m = \sum_{i=1}^m X_i$:

$$\Pr\bigl[S_m - \mathbb{E}[S_m] \ge \epsilon\bigr] \le e^{-2\epsilon^2 / \sum_{i=1}^m (b_i - a_i)^2}$$
$$\Pr\bigl[S_m - \mathbb{E}[S_m] \le -\epsilon\bigr] \le e^{-2\epsilon^2 / \sum_{i=1}^m (b_i - a_i)^2}$$
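A Monte Carlo sanity check (an addition to the notes): with $X_i \sim \mathrm{Uniform}(0,1)$, an arbitrary illustrative choice, we have $a_i = 0$, $b_i = 1$, $\mathbb{E}[S_m] = m/2$, and $\sum_i (b_i - a_i)^2 = m$.

```python
import math
import random

random.seed(0)

# Monte Carlo sanity check of Hoeffding's inequality for
# S_m = X_1 + ... + X_m with X_i ~ Uniform(0, 1).
m, trials, eps = 50, 20_000, 5.0
deviations = sum(
    sum(random.random() for _ in range(m)) - m / 2 >= eps
    for _ in range(trials)
)
empirical = deviations / trials
bound = math.exp(-2 * eps ** 2 / m)   # sum_i (b_i - a_i)^2 = m
assert empirical <= bound
print(f"Pr[S_m - E[S_m] >= {eps}] ~ {empirical:.4f} <= {bound:.4f}")
```

The empirical frequency is typically far below the bound: Hoeffding is loose for any fixed distribution but holds uniformly over all bounded ones.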

Hoeffding’s lemma

statement

Let $X$ be a random variable with $\mathbb{E}[X] = 0$ and $a \le X \le b$ with $b > a$. Then, for any $t > 0$, the following inequality holds:

$$\mathbb{E}\bigl[e^{tX}\bigr] \le e^{\frac{t^2 (b - a)^2}{8}}.$$

Proof

By the convexity of $x \mapsto e^{tx}$, for all $a \le x \le b$, the following holds:

$$e^{tx} \le \frac{b - x}{b - a}\, e^{ta} + \frac{x - a}{b - a}\, e^{tb}.$$

Thus, using $\mathbb{E}[X] = 0$,

$$\mathbb{E}\bigl[e^{tX}\bigr] \le \frac{b}{b - a}\, e^{ta} + \frac{-a}{b - a}\, e^{tb} = e^{\phi(t)},$$

where

$$\phi(t) = \log\left(\frac{b}{b - a}\, e^{ta} + \frac{-a}{b - a}\, e^{tb}\right) = ta + \log\left(\frac{b}{b - a} + \frac{-a}{b - a}\, e^{t(b - a)}\right).$$

For any $t > 0$, the first and second derivatives of $\phi$ are given below:

$$\phi'(t) = a - \frac{a\, e^{t(b - a)}}{(1 - \rho) + \rho\, e^{t(b - a)}}, \qquad
\phi''(t) = (b - a)^2\, u (1 - u), \quad u = \frac{\rho\, e^{t(b - a)}}{(1 - \rho) + \rho\, e^{t(b - a)}},$$

where $\rho$ denotes $\frac{-a}{b - a}$. Note that $\phi(0) = 0$ and $\phi'(0) = 0$. Since $u$ is in $[0, 1]$, $u(1 - u)$ is upper bounded by $\frac{1}{4}$ and $\phi''(t) \le \frac{(b - a)^2}{4}$. Thus, by the second-order expansion of the function $\phi$, there exists $\theta \in (0, t)$ such that:

$$\phi(t) = \phi(0) + t\, \phi'(0) + \frac{t^2}{2}\, \phi''(\theta) \le \frac{t^2 (b - a)^2}{8}.$$
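The lemma can be verified numerically (an illustration added to the notes) on the extremal two-point variable $X \in \{a, b\}$ with probabilities chosen so that $\mathbb{E}[X] = 0$; the values $a = -1$, $b = 3$ are arbitrary, and the moment generating function is computed exactly.

```python
import math

# Exact check of Hoeffding's lemma for a two-point variable:
# X = b with probability p, X = a with probability 1 - p, where p is
# chosen so that E[X] = 0.
a, b = -1.0, 3.0
p = -a / (b - a)                 # then E[X] = p*b + (1-p)*a = 0
assert abs(p * b + (1 - p) * a) < 1e-12

for t in (0.1, 0.5, 1.0):
    mgf = p * math.exp(t * b) + (1 - p) * math.exp(t * a)  # E[e^{tX}]
    bound = math.exp(t ** 2 * (b - a) ** 2 / 8)
    assert mgf <= bound
    print(f"t={t}: E[e^(tX)] = {mgf:.4f} <= e^(t^2(b-a)^2/8) = {bound:.4f}")
```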

proof

Applying Chernoff's bounding technique to $S_m - \mathbb{E}[S_m]$ and using Hoeffding's lemma for each independent zero-mean term $X_i - \mathbb{E}[X_i]$ (which takes values in an interval of length $b_i - a_i$), for any $t > 0$:

$$\Pr\bigl[S_m - \mathbb{E}[S_m] \ge \epsilon\bigr] \le e^{-t\epsilon}\, \mathbb{E}\bigl[e^{t(S_m - \mathbb{E}[S_m])}\bigr] = e^{-t\epsilon} \prod_{i=1}^m \mathbb{E}\bigl[e^{t(X_i - \mathbb{E}[X_i])}\bigr] \le e^{-t\epsilon}\, e^{t^2 \sum_{i=1}^m (b_i - a_i)^2 / 8}.$$

Choosing $t = 4\epsilon / \sum_{i=1}^m (b_i - a_i)^2$ to minimize the exponent yields the first inequality; the second follows by applying the same argument to $-X_1, \dots, -X_m$.

McDiarmid’s inequality

statement

Let $X_1, \dots, X_m \in \mathcal{X}$ be a set of $m \ge 1$ independent random variables and assume that there exist $c_1, \dots, c_m > 0$ such that $f: \mathcal{X}^m \to \mathbb{R}$ satisfies the following conditions:

$$\bigl|f(x_1, \dots, x_i, \dots, x_m) - f(x_1, \dots, x_i', \dots, x_m)\bigr| \le c_i,$$

for all $i \in [m]$ and any points $x_1, \dots, x_m, x_i' \in \mathcal{X}$. Let $f(S)$ denote $f(X_1, \dots, X_m)$, then, for all $\epsilon > 0$, the following inequalities hold:

$$\Pr\bigl[f(S) - \mathbb{E}[f(S)] \ge \epsilon\bigr] \le \exp\left(\frac{-2\epsilon^2}{\sum_{i=1}^m c_i^2}\right)$$
$$\Pr\bigl[f(S) - \mathbb{E}[f(S)] \le -\epsilon\bigr] \le \exp\left(\frac{-2\epsilon^2}{\sum_{i=1}^m c_i^2}\right)$$
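As an illustrative check (added, not from the source), take $f$ to be the sample mean of $m$ Uniform(0, 1) variables: changing one coordinate moves $f$ by at most $c_i = 1/m$, so the bound becomes $\exp(-2\epsilon^2 / (m \cdot (1/m)^2)) = \exp(-2 m \epsilon^2)$.

```python
import math
import random

random.seed(0)

# Monte Carlo sanity check of McDiarmid's inequality with
# f = sample mean of m Uniform(0, 1) variables (c_i = 1/m).
m, trials, eps = 50, 20_000, 0.1
deviations = sum(
    sum(random.random() for _ in range(m)) / m - 0.5 >= eps  # E[f(S)] = 1/2
    for _ in range(trials)
)
empirical = deviations / trials
bound = math.exp(-2 * m * eps ** 2)
assert empirical <= bound
print(f"Pr[f(S) - E[f(S)] >= {eps}] ~ {empirical:.4f} <= {bound:.4f}")
```

For the sample mean this reduces to Hoeffding's inequality, as expected, since McDiarmid generalizes it from sums to arbitrary bounded-difference functions.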

Definition Martingale Difference

A sequence of random variables $V_1, V_2, \dots$ is a martingale difference sequence with respect to $X_1, X_2, \dots$ if for all $i > 0$, $V_i$ is a function of $X_1, \dots, X_i$ and

$$\mathbb{E}\bigl[V_{i+1} \mid X_1, \dots, X_i\bigr] = 0.$$

Lemma

Let $V$ and $Z$ be random variables satisfying $\mathbb{E}[V \mid Z] = 0$ and, for some function $f$ and constant $c \ge 0$, the inequalities:

$$f(Z) \le V \le f(Z) + c.$$

Then, for all $t > 0$, the following upper bound holds:

$$\mathbb{E}\bigl[e^{tV} \mid Z\bigr] \le e^{t^2 c^2 / 8}.$$

Theorem Azuma’s inequality

Let $V_1, V_2, \dots$ be a martingale difference sequence with respect to the random variables $X_1, X_2, \dots$, and assume that for all $i > 0$ there is a constant $c_i \ge 0$ and a
random variable $Z_i$, which is a function of $X_1, \dots, X_{i-1}$, that satisfy

$$Z_i \le V_i \le Z_i + c_i.$$

Then, for all $\epsilon > 0$ and $m$, the following inequalities hold:

$$\Pr\left[\sum_{i=1}^m V_i \ge \epsilon\right] \le \exp\left(\frac{-2\epsilon^2}{\sum_{i=1}^m c_i^2}\right)$$
$$\Pr\left[\sum_{i=1}^m V_i \le -\epsilon\right] \le \exp\left(\frac{-2\epsilon^2}{\sum_{i=1}^m c_i^2}\right)$$

proof

For any $k \in [m]$, let $S_k = \sum_{i=1}^k V_i$. Then, using Chernoff's bounding technique, for any $t > 0$, we can write

$$\begin{aligned}
\Pr[S_m \ge \epsilon] &\le e^{-t\epsilon}\, \mathbb{E}\bigl[e^{t S_m}\bigr] \\
&= e^{-t\epsilon}\, \mathbb{E}\bigl[e^{t S_{m-1}}\, \mathbb{E}[e^{t V_m} \mid X_1, \dots, X_{m-1}]\bigr] \\
&\le e^{-t\epsilon}\, \mathbb{E}\bigl[e^{t S_{m-1}}\bigr]\, e^{t^2 c_m^2 / 8} \\
&\le e^{-t\epsilon}\, e^{t^2 \sum_{i=1}^m c_i^2 / 8} \\
&= e^{-2\epsilon^2 / \sum_{i=1}^m c_i^2},
\end{aligned}$$

where the third step applies the preceding lemma conditionally to $V_m$, the fourth iterates the same argument for $V_{m-1}, \dots, V_1$, and in the last step we chose $t = 4\epsilon / \sum_{i=1}^m c_i^2$ to minimize the upper bound. This proves the first statement of the theorem, and the second statement is shown in a similar way.
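To make the theorem concrete (an added illustration), the simplest martingale difference sequence is independent signs $V_i = \pm 1$, for which $Z_i = -1$ and $c_i = 2$ satisfy the hypotheses, giving $\sum_i c_i^2 = 4m$.

```python
import math
import random

random.seed(0)

# Monte Carlo sanity check of Azuma's inequality for independent
# V_i = +/-1 (a martingale difference sequence with Z_i = -1, c_i = 2).
m, trials, eps = 100, 20_000, 25.0
deviations = sum(
    sum(random.choice((-1, 1)) for _ in range(m)) >= eps
    for _ in range(trials)
)
empirical = deviations / trials
bound = math.exp(-2 * eps ** 2 / (4 * m))   # sum_i c_i^2 = 4m
assert empirical <= bound
print(f"Pr[sum V_i >= {eps}] ~ {empirical:.4f} <= {bound:.4f}")
```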

proof

Define a sequence of random variables $V_k$, $k \in [m]$, as follows: $V = f(S) - \mathbb{E}[f(S)]$, and for $k \in [m]$,

$$V_k = \mathbb{E}[V \mid X_1, \dots, X_k] - \mathbb{E}[V \mid X_1, \dots, X_{k-1}],$$

where $\mathbb{E}[V \mid X_1, \dots, X_0]$ is understood to mean $\mathbb{E}[V] = 0$. Note that $V = \sum_{k=1}^m V_k$. Furthermore, the random variable $\mathbb{E}[V \mid X_1, \dots, X_k]$ is a function of $X_1, \dots, X_k$. Conditioning on $X_1, \dots, X_{k-1}$ and taking its expectation is therefore:

$$\mathbb{E}\bigl[\mathbb{E}[V \mid X_1, \dots, X_k] \mid X_1, \dots, X_{k-1}\bigr] = \mathbb{E}[V \mid X_1, \dots, X_{k-1}],$$

which implies $\mathbb{E}[V_k \mid X_1, \dots, X_{k-1}] = 0$. Thus, the sequence $(V_k)_{k \in [m]}$ is a martingale difference sequence. Next, observe that, since $\mathbb{E}[f(S)]$ is a scalar, $V_k$ can be expressed as follows:

$$V_k = \mathbb{E}[f(S) \mid X_1, \dots, X_k] - \mathbb{E}[f(S) \mid X_1, \dots, X_{k-1}].$$

Now define the random variables $W_k$ and $U_k$ for each $k \in [m]$:

$$W_k = \sup_x\, \mathbb{E}[f(S) \mid X_1, \dots, X_{k-1}, x] - \mathbb{E}[f(S) \mid X_1, \dots, X_{k-1}],$$
$$U_k = \inf_x\, \mathbb{E}[f(S) \mid X_1, \dots, X_{k-1}, x] - \mathbb{E}[f(S) \mid X_1, \dots, X_{k-1}].$$

Since $X_1, \dots, X_m$ are independent of each other, conditioning on $X_k = x$ does not affect the probabilities of the other variables, so these are equal to the expressions

$$W_k = \sup_x\, \mathbb{E}\bigl[f(X_1, \dots, X_{k-1}, x, X_{k+1}, \dots, X_m) \mid X_1, \dots, X_{k-1}\bigr] - \mathbb{E}[f(S) \mid X_1, \dots, X_{k-1}],$$
$$U_k = \inf_x\, \mathbb{E}\bigl[f(X_1, \dots, X_{k-1}, x, X_{k+1}, \dots, X_m) \mid X_1, \dots, X_{k-1}\bigr] - \mathbb{E}[f(S) \mid X_1, \dots, X_{k-1}].$$

By the bounded-differences condition on $f$, $W_k - U_k \le c_k$, and $U_k \le V_k \le W_k$ by taking $x = X_k$ inside the sup and inf. $U_k$ is a random variable that is a function of $X_1, \dots, X_{k-1}$;
thus, $U_k \le V_k \le U_k + c_k$. In view of these inequalities, we can apply Azuma's inequality to $V = \sum_{k=1}^m V_k$, which yields the statement of the theorem.