A few common concentration inequalities, collected and pasted together from Wikipedia and machine-learning textbooks.
Markov’s inequality
statement
If $X$ is a nonnegative random variable and $a > 0$, then the probability that $X$ is at least $a$ is at most the expectation of $X$ divided by $a$:
$$\Pr(X \ge a) \le \frac{\mathbb{E}[X]}{a}.$$
intuition
A nonnegative random variable cannot place much probability mass far above its mean: if $\Pr(X \ge a)$ exceeded $\mathbb{E}[X]/a$, that mass alone would contribute more than $\mathbb{E}[X]$ to the expectation.
proof
For any event $E$, let $I_E$ be the indicator random variable of $E$; that is, $I_E = 1$ if $E$ occurs and $I_E = 0$ otherwise. Using this notation, we have $I_{(X \ge a)} = 1$ if the event $X \ge a$ occurs, and $I_{(X \ge a)} = 0$ if $X < a$. Then, given $a > 0$,
$$a\, I_{(X \ge a)} \le X,$$
which is clear if we consider the two possible values of $I_{(X \ge a)}$. If $X < a$, then $I_{(X \ge a)} = 0$, and so $a\, I_{(X \ge a)} = 0 \le X$. Otherwise, we have $X \ge a$, for which $I_{(X \ge a)} = 1$ and so $a\, I_{(X \ge a)} = a \le X$.
Since expectation is monotone, taking the expectation of both sides of an inequality cannot reverse it. Therefore,
$$\mathbb{E}\bigl[a\, I_{(X \ge a)}\bigr] \le \mathbb{E}[X].$$
Now, using linearity of expectation, the left side of this inequality is the same as
$$a\, \mathbb{E}\bigl[I_{(X \ge a)}\bigr] = a\bigl(1 \cdot \Pr(X \ge a) + 0 \cdot \Pr(X < a)\bigr) = a \Pr(X \ge a).$$
Thus we have
$$a \Pr(X \ge a) \le \mathbb{E}[X],$$
and since $a > 0$, we can divide both sides by $a$ to obtain $\Pr(X \ge a) \le \mathbb{E}[X]/a$.
corollaries
If $\varphi$ is a nondecreasing nonnegative function, $X$ is a (not necessarily nonnegative) random variable, and $\varphi(a) > 0$, then
$$\Pr(|X| \ge a) \le \frac{\mathbb{E}\bigl[\varphi(|X|)\bigr]}{\varphi(a)}.$$
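As a quick sanity check (an illustrative Monte Carlo sketch, not part of the source; the exponential distribution is an arbitrary choice of nonnegative variable), one can compare an empirical tail probability against the Markov bound:

```python
import random

# Monte Carlo check of Markov's inequality: P(X >= a) <= E[X] / a
# for a nonnegative random variable X.  Here X ~ Exponential(1),
# an illustrative choice (mean 1, nonnegative).
random.seed(0)
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]

a = 3.0
empirical_tail = sum(x >= a for x in samples) / n  # estimate of P(X >= a)
mean = sum(samples) / n                            # estimate of E[X]
markov_bound = mean / a

print(empirical_tail, markov_bound)
```

For this distribution the true tail is $e^{-3} \approx 0.05$ while the bound is about $1/3$, illustrating that Markov's inequality holds but is typically loose.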
Chebyshev’s inequality
statement
Let $X$ be a random variable with finite expectation $\mu = \mathbb{E}[X]$ and finite nonzero variance $\sigma^2 = \operatorname{Var}(X)$. Then for any $a > 0$,
$$\Pr\bigl(|X - \mu| \ge a\bigr) \le \frac{\sigma^2}{a^2}.$$
proof
For any $a > 0$,
$$\Pr\bigl(|X - \mu| \ge a\bigr) = \Pr\bigl((X - \mu)^2 \ge a^2\bigr) \le \frac{\mathbb{E}\bigl[(X - \mu)^2\bigr]}{a^2} = \frac{\sigma^2}{a^2},$$
because Markov's inequality applies to the nonnegative random variable $(X - \mu)^2$.
This argument can be summarized (where "MI" indicates use of Markov's inequality):
$$\Pr\bigl(|X - \mu| \ge a\bigr) = \Pr\bigl((X - \mu)^2 \ge a^2\bigr) \overset{\text{MI}}{\le} \frac{\mathbb{E}\bigl[(X - \mu)^2\bigr]}{a^2} = \frac{\operatorname{Var}(X)}{a^2}.$$
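The same kind of sanity check works for Chebyshev's inequality (again a hypothetical sketch; the uniform distribution is an arbitrary choice with known mean and variance):

```python
import random

# Monte Carlo check of Chebyshev's inequality:
# P(|X - mu| >= a) <= Var(X) / a^2, here for X ~ Uniform(0, 1),
# with mu = 1/2 and Var(X) = 1/12.
random.seed(0)
n = 100_000
samples = [random.random() for _ in range(n)]

mu, var, a = 0.5, 1 / 12, 0.4
empirical = sum(abs(x - mu) >= a for x in samples) / n
chebyshev_bound = var / a**2  # (1/12) / 0.16 ~= 0.52

print(empirical, chebyshev_bound)
```

The true probability here is $0.2$, against a bound of roughly $0.52$; the bound holds but, as with Markov, is not tight.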
Hoeffding’s inequality
Let $X_1, \dots, X_m$ be independent random variables, with $X_i$ taking values in $[a_i, b_i]$ for all $i \in [1, m]$. Then for any $\epsilon > 0$, the following inequalities hold for $S_m = \sum_{i=1}^{m} X_i$:
$$\Pr\bigl[S_m - \mathbb{E}[S_m] \ge \epsilon\bigr] \le \exp\!\left(\frac{-2\epsilon^2}{\sum_{i=1}^{m}(b_i - a_i)^2}\right),$$
$$\Pr\bigl[S_m - \mathbb{E}[S_m] \le -\epsilon\bigr] \le \exp\!\left(\frac{-2\epsilon^2}{\sum_{i=1}^{m}(b_i - a_i)^2}\right).$$
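A Monte Carlo sketch of the upper-tail bound (an illustrative example, not from the source; Bernoulli(1/2) variables give $a_i = 0$, $b_i = 1$):

```python
import math
import random

# Monte Carlo check of Hoeffding's inequality for S_m = sum of m
# independent Bernoulli(1/2) variables (a_i = 0, b_i = 1):
# P(S_m - E[S_m] >= eps) <= exp(-2 eps^2 / m).
random.seed(0)
m, trials, eps = 100, 20_000, 10.0

def deviation_once():
    s = sum(random.random() < 0.5 for _ in range(m))  # Binomial(m, 1/2)
    return s - m / 2                                  # S_m - E[S_m]

empirical = sum(deviation_once() >= eps for _ in range(trials)) / trials
hoeffding_bound = math.exp(-2 * eps**2 / m)  # exp(-2) ~= 0.135

print(empirical, hoeffding_bound)
```

Here $\epsilon = 10$ is two standard deviations of $S_m$, so the true tail is near $0.03$, comfortably below $e^{-2} \approx 0.135$.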
Hoeffding’s lemma
statement
Let $X$ be a random variable with $\mathbb{E}[X] = 0$ and $a \le X \le b$ with $b > a$. Then, for any $t > 0$, the following inequality holds:
$$\mathbb{E}\bigl[e^{tX}\bigr] \le e^{\frac{t^2 (b - a)^2}{8}}.$$
proof
By the convexity of $x \mapsto e^{tx}$, for all $x \in [a, b]$, the following holds:
$$e^{tx} \le \frac{b - x}{b - a}\, e^{ta} + \frac{x - a}{b - a}\, e^{tb}.$$
Thus, using $\mathbb{E}[X] = 0$,
$$\mathbb{E}\bigl[e^{tX}\bigr] \le \frac{b}{b - a}\, e^{ta} + \frac{-a}{b - a}\, e^{tb} = e^{\phi(t)},$$
where
$$\phi(t) = \log\!\left(\frac{b}{b - a}\, e^{ta} + \frac{-a}{b - a}\, e^{tb}\right) = ta + \log\!\left(\frac{b}{b - a} + \frac{-a}{b - a}\, e^{t(b - a)}\right).$$
For any $t$, the first and second derivatives of $\phi$ are given below:
$$\phi'(t) = a + \frac{-a\, e^{t(b - a)}}{\frac{b}{b - a} + \frac{-a}{b - a}\, e^{t(b - a)}}, \qquad \phi''(t) = \rho(1 - \rho)(b - a)^2,$$
where $\rho$ denotes $\dfrac{-a\, e^{t(b - a)}}{b - a\, e^{t(b - a)}}$. Note that $\phi(0) = \phi'(0) = 0$ and that $\rho \in [0, 1]$. Since $\rho$ is in $[0, 1]$, $\rho(1 - \rho)$ is upper bounded by $\frac{1}{4}$ and $\phi''(t) \le \frac{(b - a)^2}{4}$. Thus, by the second-order expansion of the function $\phi$, there exists $\theta \in [0, t]$ such that:
$$\phi(t) = \phi(0) + t\,\phi'(0) + \frac{t^2}{2}\,\phi''(\theta) \le \frac{t^2 (b - a)^2}{8}.$$
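The lemma can be checked numerically on a concrete zero-mean variable (a hypothetical example, not from the source): take $X \in \{-1, 2\}$ with $\Pr(X = -1) = 2/3$, so that $\mathbb{E}[X] = 0$, $a = -1$, $b = 2$.

```python
import math

# Numeric check of Hoeffding's lemma: for a zero-mean X with
# a <= X <= b,  E[exp(t X)] <= exp(t^2 (b - a)^2 / 8).
# Example: P(X = -1) = 2/3, P(X = 2) = 1/3, so E[X] = 0.
a, b = -1.0, 2.0
p_neg = 2 / 3  # (2/3)(-1) + (1/3)(2) = 0

for t in (0.1, 0.5, 1.0):
    mgf = p_neg * math.exp(t * a) + (1 - p_neg) * math.exp(t * b)
    lemma_bound = math.exp(t**2 * (b - a) ** 2 / 8)
    assert mgf <= lemma_bound
    print(t, mgf, lemma_bound)
```

For small $t$ the two sides are very close (at $t = 0.1$ the moment-generating function is about $1.0103$ against a bound of about $1.0113$), reflecting that the lemma is derived from a second-order expansion around $t = 0$.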
proof
Using Chernoff's bounding technique and applying Hoeffding's lemma to each $X_i - \mathbb{E}[X_i]$, for any $t > 0$:
$$\Pr\bigl[S_m - \mathbb{E}[S_m] \ge \epsilon\bigr] = \Pr\bigl[e^{t(S_m - \mathbb{E}[S_m])} \ge e^{t\epsilon}\bigr] \le e^{-t\epsilon}\, \mathbb{E}\bigl[e^{t(S_m - \mathbb{E}[S_m])}\bigr]$$
$$= e^{-t\epsilon} \prod_{i=1}^{m} \mathbb{E}\bigl[e^{t(X_i - \mathbb{E}[X_i])}\bigr] \le e^{-t\epsilon} \prod_{i=1}^{m} e^{t^2 (b_i - a_i)^2 / 8} = e^{-t\epsilon}\, e^{t^2 \sum_{i=1}^{m} (b_i - a_i)^2 / 8} \le e^{-2\epsilon^2 / \sum_{i=1}^{m}(b_i - a_i)^2},$$
where the product step uses the independence of the $X_i$, and the last inequality follows from choosing $t = 4\epsilon / \sum_{i=1}^{m}(b_i - a_i)^2$, which minimizes the upper bound. The second statement is proved in a similar way.
McDiarmid’s inequality
statement
Let $X_1, \dots, X_m \in \mathcal{X}^m$ be a set of $m \ge 1$ independent random variables and assume that there exist $c_1, \dots, c_m > 0$ such that $f \colon \mathcal{X}^m \to \mathbb{R}$ satisfies the following conditions:
$$\bigl|f(x_1, \dots, x_i, \dots, x_m) - f(x_1, \dots, x_i', \dots, x_m)\bigr| \le c_i,$$
for all $i \in [1, m]$ and any points $x_1, \dots, x_m, x_i' \in \mathcal{X}$. Let $f(S)$ denote $f(X_1, \dots, X_m)$; then, for all $\epsilon > 0$, the following inequalities hold:
$$\Pr\bigl[f(S) - \mathbb{E}[f(S)] \ge \epsilon\bigr] \le \exp\!\left(\frac{-2\epsilon^2}{\sum_{i=1}^{m} c_i^2}\right),$$
$$\Pr\bigl[f(S) - \mathbb{E}[f(S)] \le -\epsilon\bigr] \le \exp\!\left(\frac{-2\epsilon^2}{\sum_{i=1}^{m} c_i^2}\right).$$
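A Monte Carlo sketch (an illustrative example, not from the source): the empirical mean of $m$ variables in $[0, 1]$ has bounded differences $c_i = 1/m$, so $\sum_i c_i^2 = 1/m$ and the bound becomes $\exp(-2\epsilon^2 m)$.

```python
import math
import random

# Monte Carlo check of McDiarmid's inequality for
# f(x_1,...,x_m) = (1/m) * sum(x_i) with x_i in [0, 1]:
# changing one coordinate moves f by at most c_i = 1/m, so
# P(f - E[f] >= eps) <= exp(-2 eps^2 / sum(c_i^2)) = exp(-2 eps^2 m).
random.seed(0)
m, trials, eps = 50, 20_000, 0.1

def f_once():
    return sum(random.random() for _ in range(m)) / m

expected = 0.5  # E[f] for Uniform(0, 1) coordinates
empirical = sum(f_once() - expected >= eps for _ in range(trials)) / trials
mcdiarmid_bound = math.exp(-2 * eps**2 * m)  # exp(-1) ~= 0.37

print(empirical, mcdiarmid_bound)
```

Hoeffding's inequality is the special case $f(x_1, \dots, x_m) = \sum_i x_i$ with $c_i = b_i - a_i$.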
Definition Martingale Difference
A sequence of random variables $V_1, V_2, \dots$ is a martingale difference sequence with respect to $X_1, X_2, \dots$ if, for all $i > 0$, $V_i$ is a function of $X_1, \dots, X_i$ and
$$\mathbb{E}\bigl[V_{i+1} \mid X_1, \dots, X_i\bigr] = 0.$$
Lemma
Let $V$ and $Z$ be random variables satisfying $\mathbb{E}[V \mid Z] = 0$ and, for some function $f$ and constant $c \ge 0$, the inequalities:
$$f(Z) \le V \le f(Z) + c.$$
Then, for all $t > 0$, the following upper bound holds:
$$\mathbb{E}\bigl[e^{tV} \mid Z\bigr] \le e^{t^2 c^2 / 8}.$$
Theorem Azuma’s inequality
Let $V_1, V_2, \dots$ be a martingale difference sequence with respect to the random variables $X_1, X_2, \dots$, and assume that for all $i > 0$ there are a constant $c_i \ge 0$ and a
random variable $Z_i$, which is a function of $X_1, \dots, X_{i-1}$, that satisfy
$$Z_i \le V_i \le Z_i + c_i.$$
Then, for all $\epsilon > 0$ and $m$, the following inequalities hold:
$$\Pr\!\left[\sum_{i=1}^{m} V_i \ge \epsilon\right] \le \exp\!\left(\frac{-2\epsilon^2}{\sum_{i=1}^{m} c_i^2}\right),$$
$$\Pr\!\left[\sum_{i=1}^{m} V_i \le -\epsilon\right] \le \exp\!\left(\frac{-2\epsilon^2}{\sum_{i=1}^{m} c_i^2}\right).$$
proof
For any $k \in [1, m]$, let $S_k = \sum_{i=1}^{k} V_i$. Then, using Chernoff's bounding technique and the previous lemma, for any $t > 0$, we can write
$$\Pr[S_m \ge \epsilon] \le e^{-t\epsilon}\, \mathbb{E}\bigl[e^{tS_m}\bigr] = e^{-t\epsilon}\, \mathbb{E}\Bigl[e^{tS_{m-1}}\, \mathbb{E}\bigl[e^{tV_m} \mid X_1, \dots, X_{m-1}\bigr]\Bigr]$$
$$\le e^{-t\epsilon}\, \mathbb{E}\bigl[e^{tS_{m-1}}\bigr]\, e^{t^2 c_m^2 / 8} \le e^{-t\epsilon}\, e^{t^2 \sum_{i=1}^{m} c_i^2 / 8} = e^{-2\epsilon^2 / \sum_{i=1}^{m} c_i^2},$$
where we chose $t = 4\epsilon / \sum_{i=1}^{m} c_i^2$ to minimize the upper bound. This proves the first statement of the theorem, and the second statement is shown in a similar way.
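A Monte Carlo sketch of Azuma's inequality (an illustrative example, not from the source): the simple $\pm 1$ random walk has increments $V_i \in \{-1, +1\}$ with conditional mean zero, so $Z_i = -1$ and $c_i = 2$.

```python
import math
import random

# Monte Carlo check of Azuma's inequality for the simple +/-1 random
# walk: V_i in {-1, +1} with conditional mean 0 is a martingale
# difference sequence with Z_i = -1 and c_i = 2, so
# P(sum V_i >= eps) <= exp(-2 eps^2 / (4 m)).
random.seed(0)
m, trials, eps = 100, 20_000, 20.0

def walk_once():
    return sum(random.choice((-1, 1)) for _ in range(m))

empirical = sum(walk_once() >= eps for _ in range(trials)) / trials
azuma_bound = math.exp(-2 * eps**2 / (4 * m))  # exp(-2) ~= 0.135

print(empirical, azuma_bound)
```

Since the walk has standard deviation $\sqrt{m} = 10$, the event $\sum_i V_i \ge 20$ is a two-standard-deviation deviation, with true probability near $0.03$, below the bound $e^{-2} \approx 0.135$.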
proof
Define a sequence of random variables $V_k$, $k \in [1, m]$, as follows: $V = f(S) - \mathbb{E}[f(S)]$, $V_1 = \mathbb{E}[V \mid X_1] - \mathbb{E}[V]$, and for $k > 1$,
$$V_k = \mathbb{E}\bigl[V \mid X_1, \dots, X_k\bigr] - \mathbb{E}\bigl[V \mid X_1, \dots, X_{k-1}\bigr].$$
Note that $V = \sum_{k=1}^{m} V_k$. Furthermore, the random variable $\mathbb{E}[V \mid X_1, \dots, X_k]$ is a function of $X_1, \dots, X_k$. Conditioning on $X_1, \dots, X_{k-1}$ and taking its expectation therefore gives:
$$\mathbb{E}\Bigl[\mathbb{E}\bigl[V \mid X_1, \dots, X_k\bigr] \,\Big|\, X_1, \dots, X_{k-1}\Bigr] = \mathbb{E}\bigl[V \mid X_1, \dots, X_{k-1}\bigr],$$
which implies $\mathbb{E}[V_k \mid X_1, \dots, X_{k-1}] = 0$. Thus, the sequence $(V_k)$ is a martingale difference sequence. Next, observe that, since $\mathbb{E}[f(S)]$ is a scalar, $V_k$ can be expressed as follows:
$$V_k = \mathbb{E}\bigl[f(S) \mid X_1, \dots, X_k\bigr] - \mathbb{E}\bigl[f(S) \mid X_1, \dots, X_{k-1}\bigr].$$
Now define the random variables $W_k$ and $U_k$ for each $k \in [1, m]$:
$$W_k = \sup_{x}\, \mathbb{E}\bigl[f(S) \mid X_1, \dots, X_{k-1}, x\bigr] - \mathbb{E}\bigl[f(S) \mid X_1, \dots, X_{k-1}\bigr],$$
$$U_k = \inf_{x}\, \mathbb{E}\bigl[f(S) \mid X_1, \dots, X_{k-1}, x\bigr] - \mathbb{E}\bigl[f(S) \mid X_1, \dots, X_{k-1}\bigr].$$
Since $X_1, \dots, X_m$ are independent of each other, conditioning on $X_k = x$ does not affect the distribution of the other variables, so these conditional expectations are just expectations over $X_{k+1}, \dots, X_m$ with $X_k$ fixed to $x$; by the bounded-difference condition on $f$, this gives $W_k - U_k \le c_k$.
$U_k$ is a random variable that is a function of $X_1, \dots, X_{k-1}$; thus, $U_k \le V_k \le U_k + c_k$. In view of these inequalities, we can apply Azuma's inequality to $V = \sum_{k=1}^{m} V_k$, which yields both statements of McDiarmid's inequality.