ImageGenerationDiffusionModels

Documentation for ImageGenerationDiffusionModels.

ImageGenerationDiffusionModels.apply_noise (Method)
apply_noise(img; num_noise_steps = 500, beta_min = 0.0001, beta_max = 0.02)

Applies forward noising to an image. This function adds Gaussian noise to an image over multiple steps, which corresponds to the forward process in diffusion models.

Arguments

  • img: The input image
  • num_noise_steps: Number of steps over which noise is added to the image (500 by default)
  • beta_min: Minimum beta value (0.0001 by default)
  • beta_max: Maximum beta value (0.02 by default)

Returns

  • An image with noise
source
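As an illustration of the forward process, here is a minimal, self-contained sketch assuming a linear beta schedule from beta_min to beta_max and the standard DDPM closed form x_t = sqrt(alphabar_t) * x_0 + sqrt(1 - alphabar_t) * eps; function and variable names here are hypothetical, not the package's internals:

```julia
# Hypothetical sketch of forward noising with a linear beta schedule.
# Uses the closed-form DDPM identity at the final step; not the package's code.
function apply_noise_sketch(img; num_noise_steps = 500,
                            beta_min = 1f-4, beta_max = 0.02f0)
    betas = range(beta_min, beta_max; length = num_noise_steps)
    alphabar = cumprod(1 .- betas)[end]   # cumulative product of (1 - beta)
    eps = randn(Float32, size(img))       # standard Gaussian noise
    return sqrt(alphabar) .* img .+ sqrt(1 - alphabar) .* eps
end

noisy = apply_noise_sketch(zeros(Float32, 28, 28))
```

With the cumulative alpha close to zero after 500 steps, the output is close to pure Gaussian noise.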
ImageGenerationDiffusionModels.build_unet (Function)
build_unet(in_ch::Int=1, out_ch::Int=1, time_dim::Int=256)

Builds a time-conditioned U-Net model for image denoising and generation in diffusion models

Arguments

  • in_ch::Int=1: Number of input channels (1 for grayscale)
  • out_ch::Int=1: Number of output channels (1 for grayscale)
  • time_dim::Int=256: Dimensionality of the time embedding vector used for conditioning

Returns

  • A callable function (x, t_vec) -> output, where:
    • x: Input image
    • t_vec: time step vector
    • output: Tensor with out_ch channels and the same spatial dimensions as x
source
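To make the (x, t_vec) -> output contract concrete, here is a toy, self-contained sketch of the encoder/decoder wiring such a callable implements; the real model uses learned convolutions and conditions on t_vec, whereas this stand-in only mirrors the shapes and skip connections:

```julia
# Toy stand-ins: strided slicing as "downsampling", nearest-neighbor repeat
# as "upsampling". Illustrative only; the package's blocks are learned layers.
toy_down(x) = (x[1:2:end, 1:2:end, :, :], x)              # (downsampled, skip)
toy_up(x, skip) = repeat(x, inner = (2, 2, 1, 1)) .+ skip

function toy_unet(x, t_vec)           # t_vec is ignored by this toy model
    d1, s1 = toy_down(x)
    d2, s2 = toy_down(d1)
    u1 = toy_up(d2, s2)
    return toy_up(u1, s1)             # same spatial size as the input
end

y = toy_unet(randn(Float32, 28, 28, 1, 4), ones(Float32, 4))
```

The output has the same spatial size and batch size as the input, matching the contract documented above.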
ImageGenerationDiffusionModels.denoise_image (Method)
denoise_image(noisy_img)

Denoises a noisy image using the trained neural network model. Given a single input noisy_img::Matrix{<:Real}, this function produces a denoised version of that image.

Arguments

  • noisy_img::Matrix{<:Real}: noisy image

Returns

  • A denoised version of the image
source
ImageGenerationDiffusionModels.down_block (Method)
down_block(in_ch, out_ch, time_dim)

Creates a downsampling block for the U-Net

Arguments

  • in_ch::Int: Number of input channels
  • out_ch::Int: Number of output channels
  • time_dim::Int: Dimensionality of the time embedding vector used for conditioning

Returns

  • A callable function (x, t_emb) -> (down, skip), where:
    • x: Input feature map
    • t_emb: Time embedding vector for the current step
    • down: Downsampled feature map for the next layer
    • skip: Intermediate feature map passed to the matching up-block through the skip connection
source
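The (x, t_emb) -> (down, skip) contract can be illustrated with a self-contained stand-in that uses 2×2 mean pooling for downsampling and returns the unpooled map as the skip. The real block is a learned, time-conditioned convolution; names here are illustrative:

```julia
using Statistics

# Stand-in for a down block: 2x2 mean pooling halves the spatial dims;
# the unpooled input is returned as the skip connection. t_emb is unused here.
function toy_down_block(x, t_emb)
    h, w, C, N = size(x)
    down = [mean(x[i:i+1, j:j+1, c, n])
            for i in 1:2:h-1, j in 1:2:w-1, c in 1:C, n in 1:N]
    return down, x
end

down, skip = toy_down_block(ones(Float32, 8, 8, 1, 2), zeros(Float32, 16))
```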
ImageGenerationDiffusionModels.pad_or_crop (Method)
pad_or_crop(x, ref)

Pads or crops the input tensor x so that its dimensions match those of ref

Arguments

  • x: A 4D tensor, typically shaped (C, H, W, N)
  • ref: A reference tensor whose spatial size (H, W) x should match

Returns

  • A tensor with the same number of channels and batch size as x, but with height and width adjusted to match ref
source
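Assuming the (C, H, W, N) layout stated above, zero padding, and cropping from the top-left corner, the behavior can be sketched as follows (a hypothetical reimplementation, not the package's exact code):

```julia
# Pad with zeros or crop from the top-left so dims 2 (H) and 3 (W) match ref.
function pad_or_crop_sketch(x, ref)
    C, H, W, N = size(x)
    Hr, Wr = size(ref, 2), size(ref, 3)
    out = zeros(eltype(x), C, Hr, Wr, N)     # zero-padded canvas at ref's size
    h, w = min(H, Hr), min(W, Wr)
    out[:, 1:h, 1:w, :] = x[:, 1:h, 1:w, :]  # copy the overlapping region
    return out
end

out = pad_or_crop_sketch(ones(Float32, 2, 5, 5, 3), zeros(Float32, 2, 7, 4, 3))
```

Here the height is padded from 5 to 7 and the width cropped from 5 to 4, while channels and batch size follow x.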
ImageGenerationDiffusionModels.sinusoidal_embedding (Method)
sinusoidal_embedding(t::Vector{Float32}, dim::Int)

Generates sinusoidal positional embeddings from a vector of scalar inputs, typically used to encode time steps or sequence positions

Arguments

  • t::Vector{Float32}: A vector of time or position values
  • dim::Int: The desired embedding dimensionality

Returns

  • A matrix of shape (length(t), dim) where each row is the embedding of one time step
source
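A standard transformer-style construction matching the documented (length(t), dim) shape looks like the sketch below; the 10 000 frequency base and the sines-then-cosines ordering are assumptions, as the package's exact scaling may differ (dim is assumed even):

```julia
# Sinusoidal embedding sketch: geometric frequency ladder, sines then cosines.
function sinusoidal_embedding_sketch(t::Vector{Float32}, dim::Int)
    half = dim ÷ 2
    freqs = exp.(-(0:half-1) .* (log(10_000f0) / half))  # 1 down to 1/10_000
    args = t * freqs'                    # (length(t), dim ÷ 2)
    return hcat(sin.(args), cos.(args))  # (length(t), dim)
end

E = sinusoidal_embedding_sketch(Float32[0, 1, 2], 8)
```

Each row is the embedding of one time step; nearby time steps get similar rows, which is what lets the U-Net condition smoothly on t.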
ImageGenerationDiffusionModels.up_block (Method)
up_block(in_ch, out_ch, time_dim)

Creates an upsampling block used in the U-Net

Arguments

  • in_ch::Int: Number of input channels to the block
  • out_ch::Int: Number of output channels after the convolutions
  • time_dim::Int: Dimensionality of the time embedding vector

Returns

  • A callable function (x, skip, t_emb) -> output, where:
    • x: The upsampled feature map from the previous layer
    • skip: The skip connection feature map from the encoder
    • t_emb: The time embedding vector for the current step
    • output: Feature map with out_ch channels
source
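The (x, skip, t_emb) -> output contract can be illustrated with a self-contained stand-in that upsamples by nearest-neighbor repetition and concatenates the skip map along the channel dimension; the real block uses learned layers, and all names here are illustrative:

```julia
# Stand-in for an up block: repeat doubles the spatial dims, then the skip
# map is concatenated along the channel dimension (dim 3). t_emb is unused here.
function toy_up_block(x, skip, t_emb)
    upsampled = repeat(x, inner = (2, 2, 1, 1))
    return cat(upsampled, skip; dims = 3)
end

out = toy_up_block(ones(Float32, 4, 4, 2, 1), zeros(Float32, 8, 8, 3, 1),
                   zeros(Float32, 16))
```

Concatenation (rather than addition) is the usual U-Net choice: it preserves both the decoder features and the encoder detail, letting the following convolution learn how to mix them.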