Researchers Attempt to Uncover the Origins of Creativity in Diffusion Models
Briefly

Researchers propose a mathematical model to explain how diffusion models generate creative images through a denoising process. These models transform Gaussian noise into images by iteratively removing noise, guided by a score function. To produce novel outputs beyond the training data, a model must fail to learn the ideal score function exactly. The researchers identify two inductive biases, translational equivariance and locality, that shape the models' creative capacity. They develop a mathematical framework that optimizes the score function subject to these biases, furthering the understanding of creativity in diffusion models.
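To make the point about the ideal score function concrete, here is a minimal sketch under stated assumptions, not the researchers' code. It assumes a toy training set of three 2-D points, a noise level sigma that shrinks geometrically, and simple Euler steps; the names `ideal_score`, `denoise`, `train`, and `sigmas` are hypothetical. For a finite training set blurred with Gaussian noise, the exact score is a softmax-weighted pull toward the training points, so denoising with it should land back on a training example rather than on anything new.

```python
import numpy as np

def ideal_score(x, train, sigma):
    """Exact score of the training set blurred with Gaussian noise of std sigma."""
    diffs = train - x                                   # (n, d) offsets to each training point
    logits = -np.sum(diffs**2, axis=1) / (2 * sigma**2)
    weights = np.exp(logits - logits.max())             # numerically stable softmax
    weights /= weights.sum()
    return weights @ diffs / sigma**2

def denoise(x, train, sigmas):
    """Deterministic denoising: Euler steps from high noise to low noise."""
    for s_hi, s_lo in zip(sigmas[:-1], sigmas[1:]):
        x = x + (s_hi - s_lo) * s_hi * ideal_score(x, train, s_hi)
    return x

# Toy setup (assumed for illustration only).
train = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
sigmas = np.geomspace(10.0, 1e-3, 200)

rng = np.random.default_rng(0)
start = sigmas[0] * rng.standard_normal(2)              # pure Gaussian noise
sample = denoise(start, train, sigmas)
nearest = np.linalg.norm(train - sample, axis=1).min()
print(sample, nearest)                                  # distance to the closest training point
```

In this toy setting the sample ends up next to one of the three training points, which illustrates why a model that learned the ideal score exactly would only reproduce its training data.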
The research proposes that the creativity of diffusion models stems from a deterministic process rooted in how they use denoising to create images.
Diffusion models work by iteratively removing Gaussian noise, guided by a learned score function that points toward regions of higher image probability.
An ideal score function reproduces the training examples perfectly, so to generate novel, creative outputs a model must deviate from this ideal score.
Translational equivariance and locality are identified as crucial inductive biases affecting how diffusion models generate novel samples beyond their training images.
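The two inductive biases named above can be illustrated with a small sketch. It assumes a 1-D signal and a three-tap circular convolution standing in for a network layer; `local_filter` and the kernel values are hypothetical and are not taken from the research. Translational equivariance means shifting the input shifts the output by the same amount; locality means each output value depends only on nearby input values.

```python
import numpy as np

def local_filter(x, kernel):
    """Circular convolution with a small kernel: out[i] mixes only neighbors of x[i]."""
    r = len(kernel) // 2
    return sum(w * np.roll(x, r - j) for j, w in enumerate(kernel))

rng = np.random.default_rng(0)
x = rng.standard_normal(32)
kernel = np.array([0.25, 0.5, 0.25])    # assumed example kernel

# Translational equivariance: filter(shift(x)) equals shift(filter(x)).
shift = 5
equivariant = np.allclose(local_filter(np.roll(x, shift), kernel),
                          np.roll(local_filter(x, kernel), shift))

# Locality: changing x[20] only affects outputs at positions 19, 20, 21.
x_changed = x.copy()
x_changed[20] += 1.0
changed = np.flatnonzero(~np.isclose(local_filter(x, kernel),
                                     local_filter(x_changed, kernel)))
print(equivariant, changed)
```

A denoiser restricted in this way cannot see the whole image at once, which is the kind of constraint the summary describes as steering models away from the ideal score.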
Read at InfoQ