Denoising Diffusion Probabilistic Models
1. Forward Diffusion Process
All vectors are column vectors; multi-dimensional tensors can be flattened into column vectors.
In the forward process, we gradually transform the data distribution $q(x_0)$ into a distribution which is close to $\mathcal{N}(0, I)$.
Given noise schedule $\beta_1, \dots, \beta_T$, the forward transition is
$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\; \sqrt{1 - \beta_t}\, x_{t-1},\; \beta_t I\right),$$
or equivalently, let $\alpha_t = 1 - \beta_t$, we have
$$x_t = \sqrt{\alpha_t}\, x_{t-1} + \sqrt{1 - \alpha_t}\, \epsilon_t, \quad \epsilon_t \sim \mathcal{N}(0, I).$$
1.1. Reparameterization
By mathematical induction, we can prove that
$$x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon, \quad \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s, \quad \epsilon \sim \mathcal{N}(0, I).$$
That is, multiple noise additions can be collapsed into a single noise addition. Moreover, since $\bar{\alpha}_t \to 0$ as the number of noise additions increases, the data distribution is gradually transformed into a standard normal distribution.
import torch
n_steps = 500
betas = torch.linspace(0.0001, 0.02, n_steps)
alphas = 1 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)  # \bar{alpha}_t
expectation = alphas_cumprod.sqrt()            # coefficient of x_0 in x_t
variance = 1 - alphas_cumprod                  # variance of the accumulated noise
expectation_rounded = [round(x, 3) for x in expectation[::10].numpy().tolist()]
variance_rounded = [round(x, 3) for x in variance[::10].numpy().tolist()]
print(f"expectation: {expectation_rounded}")
print(f"variance: {variance_rounded}")

Output:
expectation: [1.0, 0.998, 0.995, 0.989, 0.982, 0.972, 0.961, 0.948, 0.934, 0.917, 0.9, 0.88, 0.86, 0.838, 0.815, 0.791, 0.767, 0.742, 0.716, 0.689, 0.662, 0.635, 0.608, 0.581, 0.554, 0.527, 0.501, 0.474, 0.449, 0.423, 0.399, 0.375, 0.352, 0.329, 0.308, 0.287, 0.267, 0.248, 0.23, 0.213, 0.196, 0.181, 0.166, 0.153, 0.14, 0.128, 0.116, 0.106, 0.096, 0.087]
variance: [0.0, 0.003, 0.01, 0.021, 0.036, 0.054, 0.076, 0.101, 0.128, 0.159, 0.191, 0.225, 0.261, 0.298, 0.335, 0.374, 0.412, 0.45, 0.488, 0.525, 0.561, 0.596, 0.63, 0.662, 0.693, 0.722, 0.749, 0.775, 0.799, 0.821, 0.841, 0.859, 0.876, 0.891, 0.905, 0.918, 0.929, 0.938, 0.947, 0.955, 0.961, 0.967, 0.972, 0.977, 0.98, 0.984, 0.986, 0.989, 0.991, 0.992]
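As a numerical sanity check of the reparameterization (a sketch), iterative noising and the one-shot closed form should produce matching statistics. We use many copies of a scalar "data point" so that sample mean and standard deviation can be compared:

```python
import torch

torch.manual_seed(0)
n_steps = 500
betas = torch.linspace(0.0001, 0.02, n_steps)
alphas = 1 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)

x0 = torch.full((100_000,), 2.0)  # many copies of the same scalar data point

# iterative: apply x_t = sqrt(alpha_t) x_{t-1} + sqrt(1 - alpha_t) eps_t step by step
x = x0.clone()
for t in range(n_steps):
    x = alphas[t].sqrt() * x + (1 - alphas[t]).sqrt() * torch.randn_like(x)

# one-shot: x_T = sqrt(bar_alpha_T) x_0 + sqrt(1 - bar_alpha_T) eps
x_oneshot = alphas_cumprod[-1].sqrt() * x0 + (1 - alphas_cumprod[-1]).sqrt() * torch.randn_like(x0)

print(x.mean().item(), x_oneshot.mean().item())  # both approx sqrt(bar_alpha_T) * 2
print(x.std().item(), x_oneshot.std().item())    # both approx sqrt(1 - bar_alpha_T)
```

Both routes land on the same Gaussian $\mathcal{N}(\sqrt{\bar{\alpha}_T}\, x_0,\; (1 - \bar{\alpha}_T) I)$, up to Monte Carlo error.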
2. Training Process
Train a neural network $\epsilon_\theta(x_t, t)$ to predict the noise $\epsilon$ in
$$x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon.$$
The objective to minimize:
$$L = \mathbb{E}_{t,\, x_0,\, \epsilon}\left[\left\| \epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\; t\right) \right\|^2\right].$$
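A single training step can be sketched as follows. The model `eps_model` here is a hypothetical stand-in (a toy MLP fed a normalized timestep feature) for the U-Net used in practice:

```python
import torch

n_steps = 500
betas = torch.linspace(0.0001, 0.02, n_steps)
alphas_cumprod = torch.cumprod(1 - betas, dim=0)

# toy stand-in for epsilon_theta: input is (x_t, t / n_steps), output predicts eps
eps_model = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2)
)
opt = torch.optim.Adam(eps_model.parameters(), lr=1e-3)

x0 = torch.randn(128, 2)               # a batch of (toy) clean data
t = torch.randint(0, n_steps, (128,))  # a random timestep per sample
eps = torch.randn_like(x0)             # the target noise
a_bar = alphas_cumprod[t].unsqueeze(-1)
x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps  # forward process in one step

inp = torch.cat([x_t, t.unsqueeze(-1) / n_steps], dim=-1)
loss = ((eps - eps_model(inp)) ** 2).mean()         # epsilon-prediction MSE
opt.zero_grad()
loss.backward()
opt.step()
```

In practice this step is repeated over minibatches of real data, with $t$ sampled uniformly each time.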
3. Sampling Process
First, estimate a clean data point from $x_t$:
$$\hat{x}_0 = \frac{1}{\sqrt{\bar{\alpha}_t}}\left(x_t - \sqrt{1 - \bar{\alpha}_t}\, \epsilon_\theta(x_t, t)\right).$$
Then, we can use the following conditional distribution to sample $x_{t-1}$ (the data at the previous time step):
$$q(x_{t-1} \mid x_t, x_0) = \mathcal{N}\left(x_{t-1};\; \tilde{\mu}_t(x_t, x_0),\; \tilde{\beta}_t I\right),$$
where
$$\tilde{\mu}_t(x_t, x_0) = \frac{\sqrt{\bar{\alpha}_{t-1}}\, \beta_t}{1 - \bar{\alpha}_t}\, x_0 + \frac{\sqrt{\alpha_t}\,(1 - \bar{\alpha}_{t-1})}{1 - \bar{\alpha}_t}\, x_t, \qquad \tilde{\beta}_t = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\, \beta_t.$$
In practice, the sampling variance $\sigma_t^2$ is usually set to $\beta_t$.
Substituting $\hat{x}_0$ into the mean above, we have
$$\tilde{\mu}_t = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}}\, \epsilon_\theta(x_t, t)\right).$$
Therefore,
$$x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}}\, \epsilon_\theta(x_t, t)\right) + \sigma_t z, \quad z \sim \mathcal{N}(0, I).$$
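The full sampling loop can be sketched as follows, assuming a trained noise predictor; `eps_model` here is a hypothetical stand-in that returns zeros, so the loop only demonstrates the update rule, not real generation:

```python
import torch

n_steps = 500
betas = torch.linspace(0.0001, 0.02, n_steps)
alphas = 1 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)

def eps_model(x, t):
    # stand-in for a trained epsilon_theta(x_t, t)
    return torch.zeros_like(x)

x = torch.randn(16, 2)  # x_T ~ N(0, I)
for t in reversed(range(n_steps)):
    z = torch.randn_like(x) if t > 0 else torch.zeros_like(x)  # no noise at t = 1
    eps = eps_model(x, t)
    # x_{t-1} = (x_t - beta_t / sqrt(1 - bar_alpha_t) * eps) / sqrt(alpha_t) + sigma_t z
    mean = (x - betas[t] / (1 - alphas_cumprod[t]).sqrt() * eps) / alphas[t].sqrt()
    x = mean + betas[t].sqrt() * z  # sigma_t^2 = beta_t
```

Note that the noise term $z$ is dropped at the final step, so the last update is deterministic.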
4. Useful Formulas
By mathematical induction, we can prove that for a linear Gaussian recursion
$$x_t = a_t x_{t-1} + b_t \epsilon_t, \quad \epsilon_t \sim \mathcal{N}(0, I) \text{ independent},$$
the marginal is
$$x_t = \left(\prod_{s=1}^{t} a_s\right) x_0 + \sqrt{\sum_{s=1}^{t} \left(\prod_{r=s+1}^{t} a_r^2\right) b_s^2}\;\epsilon, \quad \epsilon \sim \mathcal{N}(0, I).$$
If $a_t^2 + b_t^2 = 1$ for all $t$, then the sum telescopes:
$$\sum_{s=1}^{t} \left(\prod_{r=s+1}^{t} a_r^2\right) b_s^2 = \sum_{s=1}^{t} \left(\prod_{r=s+1}^{t} a_r^2 - \prod_{r=s}^{t} a_r^2\right) = 1 - \prod_{s=1}^{t} a_s^2.$$
Then
$$x_t = \left(\prod_{s=1}^{t} a_s\right) x_0 + \sqrt{1 - \prod_{s=1}^{t} a_s^2}\;\epsilon.$$
In particular, for DDPM, we have $a_t = \sqrt{\alpha_t}$ and $b_t = \sqrt{1 - \alpha_t} = \sqrt{\beta_t}$, which recovers $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon$.
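The telescoping identity can be checked numerically (a sketch with arbitrary coefficients satisfying $a_t^2 + b_t^2 = 1$):

```python
import torch

torch.manual_seed(0)
t = 20
a = torch.rand(t) * 0.2 + 0.8  # arbitrary a_s in (0.8, 1.0)
b = (1 - a**2).sqrt()          # enforce a_s^2 + b_s^2 = 1

# direct sum: sum_s (prod_{r>s} a_r^2) * b_s^2
direct = sum((a[s + 1:] ** 2).prod() * b[s] ** 2 for s in range(t))
# closed form from the telescoping argument
closed = 1 - (a**2).prod()

print(direct.item(), closed.item())  # the two agree up to float precision
```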