The Latest AI Image Generator Outpaces OpenAI's Top Model by 8x

South Korean researchers have developed an artificial intelligence (AI) tool that can produce images in less than two seconds without requiring high-end hardware. The advance relies on a technique known as knowledge distillation, which the team used to compress an existing open-source image generation model, Stable Diffusion XL, from 2.56 billion parameters to 700 million, a reduction of roughly 3.7 times. The resulting model, dubbed “KOALA,” is considerably more efficient and runs quickly on modest hardware.

Knowledge distillation transfers the capabilities of a larger AI model (the “teacher”) to a smaller one (the “student”) without a significant loss of performance. Because the smaller model, in this case KOALA, has far fewer parameters, it can carry out the same tasks faster and with less computational demand. As a result, KOALA can run on affordable graphics processing units (GPUs) with only about 8GB of memory, unlike its predecessors, which required more powerful and costly GPUs.
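
To make the idea concrete, here is a minimal Python sketch of distillation in its simplest form: a small "student" is trained to reproduce the outputs of a fixed "teacher." This is not KOALA's training recipe. The toy teacher function, the two-parameter student, and the hyperparameters are all assumptions chosen purely for illustration.

```python
import random


def teacher(x):
    # Stand-in for a large pretrained model (assumed for illustration).
    # It is fixed and never updated during distillation.
    return 3.0 * x + 0.5 * x * x - 1.0


def student(x, w, b):
    # A much smaller model with only two parameters.
    return w * x + b


def distill(steps=500, lr=0.05, seed=0):
    """Fit the student to the teacher's outputs by gradient descent on MSE."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(64)]
    targets = [teacher(x) for x in xs]  # teacher outputs act as the labels

    def mse(w, b):
        return sum((student(x, w, b) - t) ** 2 for x, t in zip(xs, targets)) / len(xs)

    w, b = 0.0, 0.0
    initial_loss = mse(w, b)
    for _ in range(steps):
        errors = [student(x, w, b) - t for x, t in zip(xs, targets)]
        grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / len(xs)
        grad_b = sum(2 * e for e in errors) / len(xs)
        w -= lr * grad_w
        b -= lr * grad_b
    return initial_loss, mse(w, b), (w, b)


if __name__ == "__main__":
    initial_loss, final_loss, params = distill()
    print("loss before:", initial_loss)
    print("loss after: ", final_loss)
    print("student parameters (w, b):", params)
```

The student cannot match the teacher exactly because it has less capacity, but its error drops sharply after training. That is the trade-off distillation aims for: a small loss in fidelity in exchange for a much cheaper model.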

The research, detailed in a study published on December 7, 2023, in the preprint database arXiv and shared on the Hugging Face open-source AI repository, showcases the potential of this technology. The team behind the project, from the Electronics and Telecommunications Research Institute (ETRI), has developed five variants of the model: three versions of KOALA, which generates images from text prompts, and two versions of “Ko-LLaVA,” a conversational model that can answer questions about images or videos.

In a demonstration, KOALA generated an image from the prompt “a picture of an astronaut reading a book under the moon on Mars” in just 1.6 seconds. OpenAI’s DALL·E 2 and DALL·E 3 took 12.3 seconds and 13.7 seconds, respectively, to generate an image from the same prompt, making KOALA roughly 7.7 and 8.6 times faster.

The team is now looking to incorporate this technology into various applications, including existing image generation services, educational tools, content creation, and other business ventures, highlighting the wide-ranging potential of this efficient and accessible AI model.