The basic idea of diffusion models is this: given some image, we progressively destroy it over a series of timesteps. At each step, we add some Gaussian noise to the image. By the end of this sequence of steps, we are left with a completely random image, indistinguishable from Gaussian noise.
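In the "Denoising Diffusion Probabilistic Models" paper by Ho et al. [1], the forward process is defined as:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{1 - \beta_t}\, x_{t-1},\ \beta_t \mathbf{I}\right)$$
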
If this seems complicated, don't worry! We'll build up to this definition step by step.
Ignoring the formal definition, let's try to write a simple function that takes an image and adds some Gaussian noise to it.
Let's simplify the definition of the forward process to:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ a\, x_{t-1},\ b^2 \mathbf{I}\right)$$

Where we sample from a normal distribution with mean $a\, x_{t-1}$ and standard deviation $b$.

If you're familiar with the "reparameterization trick" [2], you might recognize that we can rewrite the above equation as:

$$x_t = a\, x_{t-1} + b\, \epsilon, \quad \epsilon \sim \mathcal{N}(0, \mathbf{I})$$

which translates directly into code:

    import torch

    def add_noise(x, a, b):
        # Sample noise with the same shape as the image, then mix it in.
        noise = torch.randn_like(x)
        return (a * x) + (b * noise)

Try it out by running:

    python part_a_very_simple_diffusion.py

You can also try changing the image, a, b, and steps arguments to see how that affects the output:

    python part_a_very_simple_diffusion.py \
        --image data/mandrill.png \
        --a 0.5 \
        --b 0.1 \
        --steps 4

Using the defaults (image=data/mandrill.png, a=0.5, b=0.1, steps=4), our output looks like this:

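As a quick sanity check on the reparameterized form, the sketch below draws many noisy copies of a single pixel value and compares the sample mean and standard deviation against $a \cdot x$ and $b$. The helper name `empirical_stats`, the pixel value 0.8, and the sample count are illustrative assumptions, not part of the tutorial scripts.

```python
import torch


def add_noise(x, a, b):
    # Same update as above: scale the image by a, add Gaussian noise scaled by b.
    noise = torch.randn_like(x)
    return (a * x) + (b * noise)


def empirical_stats(pixel_value, a, b, n_samples=200_000):
    """Hypothetical helper: apply one noise step to many copies of one pixel."""
    x = torch.full((n_samples,), pixel_value)
    noisy = add_noise(x, a, b)
    return noisy.mean().item(), noisy.std().item()


torch.manual_seed(0)
mean, std = empirical_stats(pixel_value=0.8, a=0.5, b=0.1)
print(f"mean={mean:.3f} (expected about {0.5 * 0.8:.3f})")
print(f"std={std:.3f} (expected about {0.1:.3f})")
```

The sample mean should land near $a \cdot x = 0.4$ and the sample standard deviation near $b = 0.1$, which is what sampling from a normal distribution with that mean and standard deviation would give.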
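Notice we also wrote two functions to normalize and denormalize the image. This is because we'd like our images to be scaled to the range $[-1, 1]$ rather than $[0, 1]$, so that pixel values are centered around zero, like the Gaussian noise we add.

    def normalize(x):
        # Map pixel values from [0, 1] to [-1, 1].
        return 2 * x - 1

    def denormalize(x):
        # Map pixel values from [-1, 1] back to [0, 1].
        return (x + 1) / 2

Let's modify our process to be more in line with the formal definition of the forward process: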
- Instead of separately defining $a$ and $b$, we'll use a single value, and define $a = 1 - b$. This way our output is more like a weighted sum of the old image and the noise. We'll rename these to $\alpha$ and $\beta$.
- In the paper, $\beta$ refers to the variance, but we've been using the standard deviation. The formula for variance is $\text{var} = \text{std}^2$, so our standard deviation is $\sqrt{\beta}$. Likewise, the image is now scaled by $\sqrt{\alpha} = \sqrt{1 - \beta}$.
Now our function looks like this:

    import math
    import torch

    def add_noise(x, beta):
        # beta is the noise variance, so the noise is scaled by sqrt(beta).
        noise = torch.randn_like(x)
        return math.sqrt(1 - beta) * x + math.sqrt(beta) * noise

Try it out by running:

    python part_b_fixed_beta_values.py

You can also try changing the image, beta, and steps arguments to see how that affects the output:

    python part_b_fixed_beta_values.py \
        --image data/mandrill.png \
        --beta 0.3 \
        --steps 24

Using the defaults (image=data/mandrill.png, beta=0.3, steps=24), our output looks like this:

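One property of the square-root form is easy to check numerically: if the input has unit variance, the output does too, because $(1 - \beta) \cdot 1 + \beta = 1$. With a plain weighted sum $(1 - b)\,x + b\,\epsilon$, the variance would instead be $(1 - b)^2 + b^2$, which is below 1 for any $0 < b < 1$. The sketch below uses plain Python floats in place of tensors, and unit-variance random values as a stand-in for pixels; both are assumptions made to keep the check small, and the function names are hypothetical.

```python
import math
import random


def step_sqrt(x, beta, rng):
    # Update used in this part: sqrt(1 - beta) * x + sqrt(beta) * noise
    return math.sqrt(1 - beta) * x + math.sqrt(beta) * rng.gauss(0.0, 1.0)


def step_linear(x, b, rng):
    # Hypothetical alternative for comparison: (1 - b) * x + b * noise
    return (1 - b) * x + b * rng.gauss(0.0, 1.0)


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


rng = random.Random(0)
beta = 0.3
# Unit-variance stand-in for pixel values (an assumption for this check).
pixels = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

var_sqrt = variance([step_sqrt(x, beta, rng) for x in pixels])
var_linear = variance([step_linear(x, beta, rng) for x in pixels])

print(f"sqrt form:   variance ~ {var_sqrt:.3f} (theory: 1.000)")
print(f"linear form: variance ~ {var_linear:.3f} (theory: {(1 - beta) ** 2 + beta ** 2:.3f})")
```

Real images scaled to $[-1, 1]$ do not have exactly unit variance, so this is only an idealized check, but it shows why the overall scale of the values stays stable from step to step.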
One thing we quickly notice is that using a fixed $\beta$ forces a trade-off:
- If $\beta$ is too small, the original image will still be slightly visible even after many steps.
- But if we raise $\beta$ to compensate, many of the intermediate steps will be too noisy.
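To put rough numbers on this trade-off: after $t$ steps with a fixed $\beta$, the original image is scaled by $(1 - \beta)^{t/2}$, since each step multiplies it by $\sqrt{1 - \beta}$. The sketch below tabulates that factor. The function name `signal_scale` and the specific $\beta$ values are arbitrary choices for illustration.

```python
import math


def signal_scale(beta, t):
    """Factor multiplying the original image after t steps with a fixed beta."""
    return math.sqrt(1 - beta) ** t


for beta in (0.05, 0.3):
    scales = [signal_scale(beta, t) for t in (1, 4, 12, 24)]
    print(beta, [round(s, 3) for s in scales])
```

With $\beta = 0.05$, roughly 54% of the original amplitude survives 24 steps, while with $\beta = 0.3$ it is already down to 49% after only 4 steps.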
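To address the issue of using a fixed $\beta$, we can use a schedule that assigns a different value $\beta_t$ to each timestep $t$. Here we use a linear schedule: $\beta_t$ starts at a small value and increases in equal increments to a larger final value, so early steps add little noise and later steps add more.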
Let's implement this schedule in our code. We'll use a class, and initialize the schedule values in our constructor:

    import math
    import torch

    class NoiseScheduler:
        def __init__(self, steps=24, beta_start=1e-4, beta_end=0.6):
            super(NoiseScheduler, self).__init__()
            self.steps = steps
            self.beta_start = beta_start
            self.beta_end = beta_end
            # Linear schedule: beta grows evenly from beta_start to beta_end.
            self.beta = torch.linspace(beta_start, beta_end, steps)

        def add_noise(self, x, t):
            """
            Adds a single step of noise
            :param x: image we are adding noise to
            :param t: step number, 0 indexed (0 <= t < steps)
            :return: image with noise added
            """
            beta = self.beta[t]
            noise = torch.randn_like(x)
            return math.sqrt(1 - beta) * x + math.sqrt(beta) * noise

Try it out by running:

    python part_c_schedule_for_beta.py

You can also try changing the image, steps, beta_start, and beta_end arguments to see how that affects the output:

    python part_c_schedule_for_beta.py \
        --image data/mandrill.png \
        --beta-start 1e-4 \
        --beta-end 0.6 \
        --steps 24

Using the defaults (image=data/mandrill.png, beta_start=1e-4, beta_end=0.6, steps=24), our output looks like this:

We can already see that even with the same number of steps, using a schedule for $\beta$ gives a more gradual transition from the original image to noise than a fixed value does.
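To compare the linear schedule with a fixed $\beta$ numerically, the sketch below tracks how strongly the original image still contributes after each step, which is the running product of $\sqrt{1 - \beta_t}$. It uses plain Python lists instead of tensors. The helper names `linear_schedule` and `signal_scales` are hypothetical, and the fixed value 0.3 is borrowed from the earlier defaults.

```python
import math


def linear_schedule(beta_start, beta_end, steps):
    # Pure-Python stand-in for torch.linspace(beta_start, beta_end, steps).
    gap = (beta_end - beta_start) / (steps - 1)
    return [beta_start + gap * i for i in range(steps)]


def signal_scales(betas):
    """Factor multiplying the original image after each successive step."""
    scales, scale = [], 1.0
    for beta in betas:
        scale *= math.sqrt(1 - beta)
        scales.append(scale)
    return scales


steps = 24
scheduled = signal_scales(linear_schedule(1e-4, 0.6, steps))
fixed = signal_scales([0.3] * steps)

for t in (0, 3, 11, 23):
    print(f"t={t:2d}  scheduled={scheduled[t]:.3f}  fixed={fixed[t]:.3f}")
```

Under these settings the scheduled version keeps over 90% of the image amplitude after the first four steps, where the fixed version has already dropped to 49%, yet both end up with only a small fraction of the original by the last step.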
So far, we've only shown how to progress through the forward process one step at a time. But what if we want to get an image with noise at a specific timestep? If we had to go through all the intermediate steps, it would be very inefficient. When we begin training, we'll need to generate many samples at arbitrary timesteps for our training data.

One nice quality of the Gaussian distribution is that the sum of two independent Gaussian random variables is also Gaussian. Since at each step, we sample some noise from a Gaussian distribution, there's a Gaussian distribution that we can sample directly for step $t$.

In the paper, they use the following trick. Given our previous definition of $\alpha$, now with one value per timestep:

$$\alpha_t = 1 - \beta_t$$

We can define a new variable $\bar{\alpha}_t$ as the cumulative product of the $\alpha$ values up to step $t$:

$$\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$$

And then the new formula for the forward process becomes:

$$q(x_t \mid x_0) = \mathcal{N}\left(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t)\mathbf{I}\right)$$

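For readers who want the intermediate step, here is a sketch of why this works, unrolling two steps of the reparameterized form with $\alpha_t = 1 - \beta_t$. It relies on the fact that independent Gaussian noise terms with variances $\sigma_1^2$ and $\sigma_2^2$ add up to a single Gaussian with variance $\sigma_1^2 + \sigma_2^2$. Here $\epsilon_t$, $\epsilon_{t-1}$ and $\bar{\epsilon}$ all denote standard normal noise.

$$
\begin{aligned}
x_t &= \sqrt{\alpha_t}\, x_{t-1} + \sqrt{1 - \alpha_t}\, \epsilon_t \\
    &= \sqrt{\alpha_t}\left(\sqrt{\alpha_{t-1}}\, x_{t-2} + \sqrt{1 - \alpha_{t-1}}\, \epsilon_{t-1}\right) + \sqrt{1 - \alpha_t}\, \epsilon_t \\
    &= \sqrt{\alpha_t \alpha_{t-1}}\, x_{t-2} + \sqrt{\alpha_t (1 - \alpha_{t-1}) + (1 - \alpha_t)}\, \bar{\epsilon} \\
    &= \sqrt{\alpha_t \alpha_{t-1}}\, x_{t-2} + \sqrt{1 - \alpha_t \alpha_{t-1}}\, \bar{\epsilon}
\end{aligned}
$$

Repeating the same substitution all the way down to $x_0$ collects the product of all the $\alpha$ values, giving $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \bar{\epsilon}$.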
Let's implement this in our code:

    import math
    import torch

    class NoiseScheduler:
        def __init__(self, steps=24, beta_start=1e-4, beta_end=0.6):
            super(NoiseScheduler, self).__init__()
            self.steps = steps
            self.beta_start = beta_start
            self.beta_end = beta_end
            self.beta = torch.linspace(beta_start, beta_end, steps)
            self.alpha = 1. - self.beta
            # alpha_bar[t] is the product of alpha[0] through alpha[t].
            self.alpha_bar = torch.cumprod(self.alpha, 0)

        def add_noise(self, x0, t):
            """
            Adds arbitrary noise to an image
            :param x0: initial image
            :param t: step number, 0 indexed (0 <= t < steps)
            :return: image with noise at step t
            """
            alpha_bar = self.alpha_bar[t]
            noise = torch.randn_like(x0)
            return math.sqrt(alpha_bar) * x0 + math.sqrt(1 - alpha_bar) * noise

We can try it out by running:

    python part_d_alpha_bar_trick.py

You can also try changing the image, steps, beta_start, and beta_end arguments to see how that affects the output:

    python part_d_alpha_bar_trick.py \
        --image data/mandrill.png \
        --beta-start 1e-4 \
        --beta-end 0.6 \
        --steps 24

Using the defaults (image=data/mandrill.png, beta_start=1e-4, beta_end=0.6, steps=24), our output looks like this:

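To check that jumping directly to step $t$ matches stepping through the schedule one step at a time, the sketch below noises a single pixel value both ways many times and compares the resulting mean and standard deviation. It uses plain Python floats in place of tensors. The helper names, the pixel value 0.8, and the choice of step 5 are assumptions for illustration.

```python
import math
import random


def linear_schedule(beta_start, beta_end, steps):
    # Pure-Python stand-in for torch.linspace(beta_start, beta_end, steps).
    gap = (beta_end - beta_start) / (steps - 1)
    return [beta_start + gap * i for i in range(steps)]


def noise_iteratively(x0, betas, t, rng):
    # Apply steps 0..t one at a time, as in the earlier scheduler.
    x = x0
    for beta in betas[: t + 1]:
        x = math.sqrt(1 - beta) * x + math.sqrt(beta) * rng.gauss(0.0, 1.0)
    return x


def noise_directly(x0, betas, t, rng):
    # Jump straight to step t using alpha_bar, the running product of (1 - beta).
    alpha_bar = math.prod(1 - beta for beta in betas[: t + 1])
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1 - alpha_bar) * rng.gauss(0.0, 1.0)


def mean_and_std(values):
    mean = sum(values) / len(values)
    return mean, math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


rng = random.Random(0)
betas = linear_schedule(1e-4, 0.6, 24)
x0, t, n = 0.8, 5, 20_000  # one pixel value, one step index, sample count

stepwise = mean_and_std([noise_iteratively(x0, betas, t, rng) for _ in range(n)])
direct = mean_and_std([noise_directly(x0, betas, t, rng) for _ in range(n)])
print("step by step:", stepwise)
print("direct:      ", direct)
```

The individual samples differ, since the noise is random, but the two sets of statistics should agree up to sampling error.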
This produces results similar to our previous scheduler, but now each step is computed independently from the original image, making it much more efficient to sample arbitrary timesteps.
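For training, it is usually convenient to noise a whole batch at once, with a different timestep for each image. The sketch below shows one way this could be done. The function name `add_noise_batch`, the `(batch, channels, height, width)` layout, and returning the noise alongside the noisy images are assumptions for illustration, not part of the tutorial scripts.

```python
import torch


def add_noise_batch(x0, t, alpha_bar):
    """Hypothetical batched variant of NoiseScheduler.add_noise.

    x0: images of shape (batch, channels, height, width)
    t: integer tensor of shape (batch,), one 0-indexed step per image
    alpha_bar: tensor of shape (steps,), cumulative product of 1 - beta
    """
    # Reshape to (batch, 1, 1, 1) so each image broadcasts with its own coefficient.
    a_bar = alpha_bar[t].view(-1, 1, 1, 1)
    noise = torch.randn_like(x0)
    noisy = torch.sqrt(a_bar) * x0 + torch.sqrt(1 - a_bar) * noise
    return noisy, noise


torch.manual_seed(0)
steps = 24
beta = torch.linspace(1e-4, 0.6, steps)
alpha_bar = torch.cumprod(1.0 - beta, dim=0)

x0 = torch.rand(8, 3, 16, 16) * 2 - 1  # stand-in batch already scaled to [-1, 1]
t = torch.randint(0, steps, (8,))      # a random step for each image
noisy, noise = add_noise_batch(x0, t, alpha_bar)
print(noisy.shape, t.tolist())
```

Using `torch.sqrt` on tensors instead of `math.sqrt` on scalars is what lets every image in the batch use its own $\bar{\alpha}_t$.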
1. Ho, J., Jain, A., & Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. arXiv preprint arXiv:2006.11239.
2. Jayakody, D. (2023). The Reparameterization Trick - Clearly Explained. Retrieved from https://dilithjay.com/blog/the-reparameterization-trick-clearly-explained
