ImageGenerationDiffusionModels
Documentation for ImageGenerationDiffusionModels.
ImageGenerationDiffusionModels.apply_noise
ImageGenerationDiffusionModels.build_unet
ImageGenerationDiffusionModels.denoise_image
ImageGenerationDiffusionModels.down_block
ImageGenerationDiffusionModels.generate_grid
ImageGenerationDiffusionModels.generate_image_from_noise
ImageGenerationDiffusionModels.get_data
ImageGenerationDiffusionModels.pad_or_crop
ImageGenerationDiffusionModels.sinusoidal_embedding
ImageGenerationDiffusionModels.up_block
ImageGenerationDiffusionModels.apply_noise — Method

apply_noise(img; num_noise_steps = 500, beta_min = 0.0001, beta_max = 0.02)

Applies forward noise to an image. This function adds Gaussian noise to an image over multiple steps, which corresponds to the forward process in diffusion models.
Arguments
- img: The input image
- num_noise_steps: Number of steps over which noise is added to the image (500 by default)
- beta_min: Minimum beta value (0.0001 by default)
- beta_max: Maximum beta value (0.02 by default)
Returns
- An image with noise
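The forward process described above can be sketched in closed form: with a linear beta schedule, the noisy image at the final step is `sqrt(ᾱ)·x₀ + sqrt(1 − ᾱ)·ε`. The function name and schedule details below are assumptions for illustration, not the package's actual code.

```julia
# Hedged sketch of the forward (noising) process behind apply_noise.
function forward_noise_sketch(img::AbstractMatrix{<:Real};
                              num_noise_steps::Int = 500,
                              beta_min::Float64 = 1e-4,
                              beta_max::Float64 = 0.02)
    betas = range(beta_min, beta_max; length = num_noise_steps)  # linear schedule
    alphabar = prod(1 .- betas)                # ᾱ accumulated over all steps
    eps = randn(size(img)...)                  # Gaussian noise ε ~ N(0, I)
    return sqrt(alphabar) .* img .+ sqrt(1 - alphabar) .* eps
end

noisy = forward_noise_sketch(zeros(8, 8))      # size (8, 8)
```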
ImageGenerationDiffusionModels.build_unet — Function

build_unet(in_ch::Int=1, out_ch::Int=1, time_dim::Int=256)

Builds a time-conditioned U-Net model for image denoising and generation in diffusion models.
Arguments
- in_ch::Int=1: Number of input channels (1 for grayscale)
- out_ch::Int=1: Number of output channels (1 for grayscale)
- time_dim::Int=256: Dimensionality of the time embedding vector used for conditioning
Returns
- A callable function (x, t_vec) -> output, where:
  - x: Input image
  - t_vec: Time step vector
  - output: Tensor of the same spatial dimensions as x, with out_ch channels
ImageGenerationDiffusionModels.denoise_image — Method

denoise_image(noisy_img)

Denoises a noisy image using the trained neural network model. Given a single input noisy_img::Matrix{<:Real}, this function produces a denoised version of that input image.
Arguments
- noisy_img::Matrix{<:Real}: The noisy input image
Returns
- A denoised version of the image
ImageGenerationDiffusionModels.down_block — Method

down_block(in_ch, out_ch, time_dim)

Creates a downsampling block for the U-Net.
Arguments
- in_ch::Int: Number of input channels
- out_ch::Int: Number of output channels
- time_dim::Int: Dimensionality of the time embedding vector used for conditioning
Returns
- A callable function (x, t_emb) -> (down, skip), where:
  - x: Input feature map
  - t_emb: Time embedding vector for the current step
  - down: Downsampled feature map for the next layer
  - skip: Intermediate feature map
ImageGenerationDiffusionModels.generate_grid — Method

generate_grid()

Loads the digits data and generates a grid of images.
ImageGenerationDiffusionModels.generate_image_from_noise — Method

generate_image_from_noise()

Generates a new image from random noise and denoises it.
ImageGenerationDiffusionModels.get_data — Method

get_data(batch_size)

Helper function that loads MNIST images and returns a data loader.
Arguments
- batch_size::Int: Size of each batch
ImageGenerationDiffusionModels.pad_or_crop — Method

pad_or_crop(x, ref)

Pads or crops the input tensor x so that its dimensions match those of ref.
Arguments
- x: A 4D tensor, typically shaped (C, H, W, N)
- ref: A reference tensor whose spatial size (H, W) x should match
Returns
- A tensor with the same number of channels and batch size as x, but with height and width adjusted to match ref
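The behavior above can be sketched as follows, using the documented (C, H, W, N) layout; zero-padding and this exact layout are assumptions about the implementation, not confirmed details.

```julia
# Hedged sketch of pad_or_crop: resize the spatial dims of x to match ref.
function pad_or_crop_sketch(x::AbstractArray{T,4}, ref::AbstractArray{<:Any,4}) where {T}
    H, W = size(ref, 2), size(ref, 3)                 # target spatial size
    out = zeros(T, size(x, 1), H, W, size(x, 4))      # zero canvas, same C and N
    h = min(H, size(x, 2))
    w = min(W, size(x, 3))
    out[:, 1:h, 1:w, :] .= x[:, 1:h, 1:w, :]          # copy the overlapping region
    return out
end

y = pad_or_crop_sketch(ones(2, 5, 5, 1), zeros(1, 3, 7, 1))   # size (2, 3, 7, 1)
```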
ImageGenerationDiffusionModels.sinusoidal_embedding — Method

sinusoidal_embedding(t::Vector{Float32}, dim::Int)

Generates sinusoidal positional embeddings from a vector of scalar inputs, typically used to encode time steps or sequence positions.
Arguments
- t::Vector{Float32}: A vector of time or position values
- dim::Int: The desired embedding dimensionality
Returns
- A matrix of shape (length(t), dim), where each row is the embedding of one time step
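A minimal sketch of such an embedding, producing the documented (length(t), dim) shape; the frequency base of 10,000 and the even-`dim` assumption are illustrative choices, not the package's confirmed parameters.

```julia
# Hedged sketch: pair each time value with a bank of geometric frequencies,
# then take sin for the first half of the embedding and cos for the second.
function sinusoidal_embedding_sketch(t::Vector{Float32}, dim::Int)
    half = dim ÷ 2
    freqs = Float32.(exp.(-log(10_000.0) .* (0:half-1) ./ half))  # geometric frequencies
    angles = t * freqs'                       # outer product, (length(t), half)
    return hcat(sin.(angles), cos.(angles))   # (length(t), dim)
end

emb = sinusoidal_embedding_sketch(Float32[0, 1, 2], 8)   # size (3, 8)
```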
ImageGenerationDiffusionModels.up_block — Method

up_block(in_ch, out_ch, time_dim)

Creates an upsampling block used in the U-Net.
Arguments
- in_ch::Int: Number of input channels to the block
- out_ch::Int: Number of output channels after the convolutions
- time_dim::Int: Dimensionality of the time embedding vector
Returns
- A callable function (x, skip, t_emb) -> output, where:
  - x: The upsampled feature map from the previous layer
  - skip: The skip connection feature map from the encoder
  - t_emb: The time embedding vector for the current step
  - output: A feature map with out_ch channels
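The decoder pattern this block describes, upsampling the incoming features and merging the encoder's skip tensor along the channel dimension, can be illustrated generically. This is a toy sketch with an assumed (H, W, C, N) layout and nearest-neighbour upsampling, not the package's implementation.

```julia
# Toy illustration of how an up_block-style layer consumes a skip connection.
upsample2x(x) = repeat(x; inner = (2, 2, 1, 1))            # nearest-neighbour 2x
merge_skip(x, skip) = cat(upsample2x(x), skip; dims = 3)   # channel concatenation

x    = randn(Float32, 7, 7, 8, 1)     # decoder features at low resolution
skip = randn(Float32, 14, 14, 4, 1)   # encoder skip at the target resolution
merged = merge_skip(x, skip)          # size (14, 14, 12, 1)
```

After the concatenation, the block's convolutions would reduce the merged channels down to out_ch.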