MIT researchers make top AI image generators 30 times faster

MIT scientists have developed a technique that accelerates popular AI image generators by roughly 30 times. The new framework distills generative AI systems such as DALL·E 3 and Stable Diffusion into smaller, faster models while maintaining image quality.


In their study uploaded on Dec. 5, 2023, to the preprint server arXiv, scientists introduced a technique named “distribution matching distillation” (DMD). This method enables new AI models to mimic established image generators, such as DALL·E 3 and Stable Diffusion, resulting in faster image generation without sacrificing quality.

“Our work is a novel method that accelerates current diffusion models such as Stable Diffusion and DALL·E 3 by 30 times,” stated Tianwei Yin, co-lead author of the study and a doctoral student at MIT. “This advancement not only significantly reduces computational time but also retains, if not surpasses, the quality of the generated visual content.”

Diffusion models typically involve a multi-stage process in which the AI learns image context and meaning from descriptive text captions and metadata during training. It is this multi-step generation pipeline that the MIT approach shortens, yielding the up-to-30-times speedup.

In practice, these models utilize “forward diffusion” to encode images with random noise, followed by up to 100 steps of “reverse diffusion” to produce a clear image based on text prompts, as explained by AI scientist Jay Alammar.
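The sketch below illustrates that two-phase idea in PyTorch: noise is mixed into an image (forward diffusion), and a denoising network is then applied step by step to turn pure noise into a sample (reverse diffusion). The tiny stand-in network, the 100-step linear noise schedule, and the tensor shapes are illustrative assumptions, not the actual Stable Diffusion or DALL·E 3 architecture.

```python
# Minimal illustration of forward and reverse diffusion (not the real models).
import torch

steps = 100                                        # up to ~100 reverse steps, per the article
betas = torch.linspace(1e-4, 0.02, steps)          # assumed linear noise schedule
alphas_cum = torch.cumprod(1.0 - betas, dim=0)     # cumulative signal retention

# Stand-in denoiser: predicts the noise present in a noisy image.
denoiser = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(16, 3, 3, padding=1),
)

def forward_diffusion(x0, t):
    """Encode an image with random noise at step t (forward diffusion)."""
    noise = torch.randn_like(x0)
    signal = alphas_cum[t].sqrt()
    spread = (1.0 - alphas_cum[t]).sqrt()
    return signal * x0 + spread * noise, noise

@torch.no_grad()
def reverse_diffusion(shape):
    """Iteratively denoise pure noise back toward an image (reverse diffusion)."""
    x = torch.randn(shape)                          # start from random noise
    for t in reversed(range(steps)):                # one denoising pass per step
        predicted_noise = denoiser(x)
        alpha_t = 1.0 - betas[t]
        x = (x - betas[t] / (1.0 - alphas_cum[t]).sqrt() * predicted_noise) / alpha_t.sqrt()
        if t > 0:
            x = x + betas[t].sqrt() * torch.randn_like(x)  # re-inject a little noise
    return x

noisy, _ = forward_diffusion(torch.zeros(1, 3, 64, 64), t=50)   # forward pass demo
sample = reverse_diffusion((1, 3, 64, 64))                      # untrained net, but shows the loop
print(noisy.shape, sample.shape)
```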

By implementing DMD, the number of “reverse diffusion” steps is reduced to one, significantly cutting down image generation time. For instance, using Stable Diffusion v1.5, image generation time dropped from approximately 2,590 milliseconds to 90 ms, making it 28.8 times faster.
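A rough way to see why the step count dominates latency is to time the same pipeline at different step budgets. The sketch below uses the Hugging Face diffusers library with the Stable Diffusion v1.5 checkpoint named in the article; the prompt, step counts, and hardware assumptions are illustrative. Note that simply running the base model for one step produces a poor image; DMD trains a separate one-step model to reach comparable quality, so this is only a latency illustration.

```python
# Timing sketch: cost of many reverse-diffusion steps vs. a single step.
import time
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a photo of a red fox in the snow"   # assumed example prompt

def timed_generation(num_steps: int) -> float:
    """Generate one image with the given number of reverse-diffusion steps."""
    start = time.perf_counter()
    pipe(prompt, num_inference_steps=num_steps)
    return (time.perf_counter() - start) * 1000.0   # milliseconds

many_steps_ms = timed_generation(50)   # typical multi-step sampling
one_step_ms = timed_generation(1)      # per-pass cost a one-step generator would pay
print(f"50 steps: {many_steps_ms:.0f} ms, 1 step: {one_step_ms:.0f} ms")
```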

The DMD technique incorporates two key components: “regression loss” organizes images based on similarity during training, facilitating faster learning, while “distribution matching loss” ensures generated images correspond to real-world probabilities, minimizing unrealistic outcomes.
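The toy training loop below shows how those two ingredients can be combined into a single objective. The "student", "teacher output", and score networks, the loss weighting, and the vector shapes are all simplified assumptions for illustration; the actual DMD objective is defined over diffusion score functions and paired teacher outputs as described in the paper.

```python
# Simplified two-term distillation objective in the spirit of DMD (toy sketch).
import torch
import torch.nn.functional as F

def tiny_net():
    return torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.ReLU(), torch.nn.Linear(64, 16))

student = tiny_net()                    # one-step generator being distilled
teacher_output = torch.randn(8, 16)     # stand-in for the teacher's multi-step result
real_score = tiny_net()                 # frozen score estimate of the "real" distribution
fake_score = tiny_net()                 # score estimate of the student's own outputs

opt = torch.optim.Adam(student.parameters(), lr=1e-4)
noise = torch.randn(8, 16)              # the same noise the teacher started from

for _ in range(10):                     # a few illustrative updates
    x = student(noise)

    # "Regression loss": pull the one-step output toward the teacher's result
    # for the same starting noise, which speeds up and stabilizes learning.
    reg_loss = F.mse_loss(x, teacher_output)

    # "Distribution matching loss" (toy stand-in): nudge the student so its
    # samples look more plausible under the real-image score than the fake one,
    # discouraging unrealistic outputs.
    with torch.no_grad():
        grad_direction = real_score(x) - fake_score(x)
    dm_loss = -(x * grad_direction).mean()

    loss = reg_loss + 0.1 * dm_loss     # 0.1 is an assumed weighting
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(loss))
```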

“Decreasing the number of iterations has been the Holy Grail in diffusion models since their inception,” said Fredo Durand, co-lead author and professor at MIT. “We are very excited to finally enable single-step image generation, which will dramatically reduce compute costs and accelerate the process.”

This innovative approach drastically reduces computational power requirements, making image generation more efficient, particularly in industries where speed is critical.
