Stability AI Releases SDXL Turbo

Stability AI has released SDXL Turbo, a new text-to-image model built for speed. Using a new distillation technique, the model can generate high-quality images in a single step, reducing the required step count from 50 down to one.

Gone are the days of waiting while a model slowly refines an image over dozens of sampling steps. SDXL Turbo produces its result in a single pass thanks to a combination of adversarial training and score distillation, as outlined in the accompanying research paper.

Eager to try it yourself? Download the model weights and code on Hugging Face, and take SDXL Turbo for a spin on Clipdrop, Stability AI’s real-time image editing platform.

SDXL Turbo Details

At the core of SDXL Turbo is a new distillation technique that enables high-quality image generation in a single step. To assess the model, our research team compared multiple model variants on two criteria: prompt relevance and image quality.

The models tested included StyleGAN-T++, OpenMUSE, IF-XL, SDXL, and LCM-XL. Human evaluators were shown two outputs side-by-side and tasked with choosing which one better fit a given prompt and which had higher quality.

In these blind tests, SDXL Turbo, using just a single step, beat a 4-step configuration of the state-of-the-art LCM-XL model. With only 4 steps, it also surpassed a 50-step configuration of the SDXL model.
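A pairwise preference test like the one described above is usually summarized as a win rate per model. The sketch below shows one simple way to tally it; the function name, the model labels, and the votes are all made up for illustration and are not data from the paper.

```python
from collections import Counter


def win_rates(votes):
    """Return each model's share of the pairwise comparisons it won.

    `votes` is a list of (model_a, model_b, winner) tuples, where `winner`
    is whichever of the two models the evaluator preferred.
    """
    wins = Counter()
    appearances = Counter()
    for model_a, model_b, winner in votes:
        if winner not in (model_a, model_b):
            raise ValueError(f"winner {winner!r} is not part of the pair")
        appearances[model_a] += 1
        appearances[model_b] += 1
        wins[winner] += 1
    return {model: wins[model] / appearances[model] for model in appearances}


# Hypothetical votes, for illustration only (not results from the paper).
example_votes = [
    ("turbo-1-step", "lcm-xl-4-step", "turbo-1-step"),
    ("turbo-1-step", "lcm-xl-4-step", "turbo-1-step"),
    ("turbo-1-step", "lcm-xl-4-step", "lcm-xl-4-step"),
    ("turbo-1-step", "lcm-xl-4-step", "turbo-1-step"),
]
rates = win_rates(example_votes)
print(rates)
```

In practice the two questions (prompt fit and image quality) would be tallied separately, with one vote list per question.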

By combining adversarial training and score distillation, SDXL Turbo generates images with more photorealistic detail and less noise than was previously possible in a single inference pass.
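The combination of adversarial training and score distillation mentioned above can be written schematically as a weighted sum of two loss terms. The notation below is chosen for this sketch and is an assumption, not a quotation: the student's prediction is written as a function of a noised input x_s at timestep s, φ stands for the discriminator's parameters, ψ for the frozen teacher diffusion model, and λ for the weight balancing the two terms. See the research paper for the exact formulation.

```latex
\mathcal{L}
  = \mathcal{L}_{\mathrm{adv}}\bigl(\hat{x}_\theta(x_s, s),\, \phi\bigr)
  + \lambda \, \mathcal{L}_{\mathrm{distill}}\bigl(\hat{x}_\theta(x_s, s),\, \psi\bigr)
```

Read this way, the adversarial term pushes single-step outputs toward looking like real images, while the distillation term keeps the student close to what the multi-step teacher would produce.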

The efficiency gains are large: going from 50 sampling steps to four or fewer cuts the number of denoising passes by more than 10x, while the outputs were still preferred by evaluators in the tests above. This distillation method is a real step forward for AI image generation.
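The step counts quoted in the blind tests (50 steps for SDXL versus one or four for SDXL Turbo) give a quick way to sanity-check the size of the saving. The sketch below computes the raw step ratio and then a rough end-to-end figure. The helper names and the millisecond timings are hypothetical, and the second function assumes every step costs the same for both models, which is a simplification.

```python
def step_reduction(baseline_steps, distilled_steps):
    """Ratio of denoising steps between a baseline and a distilled sampler."""
    if baseline_steps <= 0 or distilled_steps <= 0:
        raise ValueError("step counts must be positive")
    return baseline_steps / distilled_steps


def end_to_end_speedup(baseline_steps, distilled_steps, per_step_ms, overhead_ms):
    """Speedup once fixed per-image costs (text encoding, decoding) are included."""
    baseline = overhead_ms + baseline_steps * per_step_ms
    distilled = overhead_ms + distilled_steps * per_step_ms
    return baseline / distilled


print(step_reduction(50, 4))  # 12.5
print(step_reduction(50, 1))  # 50.0

# Hypothetical timings, chosen only to show the effect of fixed overhead.
print(round(end_to_end_speedup(50, 1, per_step_ms=60, overhead_ms=140), 1))
```

Whenever there is fixed per-image overhead, the end-to-end gain comes out smaller than the raw step ratio, so the step ratio is best read as an upper bound.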

The details behind this technique are discussed in more depth in our research paper. In summary, SDXL Turbo sets a new bar for fast, high-fidelity text-to-image generation.

Limitations

While SDXL Turbo represents a major leap forward in AI image generation, there are still some limitations to be aware of:

The generated images have a fixed resolution of 512×512 pixels. Higher resolutions are on the roadmap but not yet supported.
Photorealism, while greatly improved, is still not perfect. Some generated images may have minor defects or uncanny elements.
The model cannot render legible text, so any text that appears in a generated image will be unreadable.
Faces and people may not always be generated properly, and results can vary with the prompt.
The autoencoding part of the model is lossy, meaning that image edits made in Clipdrop may not be perfectly preserved when regenerating or expanding the image.

For now, being aware of these caveats can help set accurate expectations when exploring the current model’s capabilities.

The rapid pace of advancement in this field gives us confidence that SDXL Turbo is just the beginning. As models continue to improve, so too will the fidelity, control, and flexibility of AI-generated images.

Trying Out SDXL Turbo Yourself

You can try the demo yourself using Clipdrop. Note that you will need an account.
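If you would rather run the downloaded weights locally, one possible route is the Hugging Face diffusers library. This is an assumption on our part, since the announcement itself does not prescribe a toolkit. The sketch below assumes the `stabilityai/sdxl-turbo` repository on Hugging Face, a CUDA GPU, and installed `torch` and `diffusers` packages. The helper names are hypothetical, and the 1-to-4 step limit simply mirrors the step counts discussed in this article.

```python
def turbo_call_kwargs(prompt, steps=1):
    """Build call arguments for a few-step, guidance-free generation."""
    # Assumption: restrict to the 1-4 step range discussed in the article.
    if not 1 <= steps <= 4:
        raise ValueError("this sketch expects between 1 and 4 steps")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": 0.0,  # classifier-free guidance disabled
        "height": 512,          # fixed 512x512 output, per the limitations
        "width": 512,
    }


def generate(prompt, steps=1, device="cuda"):
    """Load the pipeline and return one PIL image (downloads weights on first use)."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
    )
    pipe = pipe.to(device)
    return pipe(**turbo_call_kwargs(prompt, steps)).images[0]
```

Calling `generate("a cinematic photo of a red fox in the snow").save("fox.png")` would then write a single-step result to disk. Reloading the pipeline on every call is wasteful, so for real-time use you would load it once and reuse it.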

Final Thoughts on SDXL Turbo

The release of SDXL Turbo sparks an interesting debate – is the tradeoff of lower resolution and imperfect photorealism worth the massive speed gains?

It’s true that, compared side by side with SDXL, the image quality is slightly diminished. This leaves some questioning whether it is better to wait a minute or two for a higher-fidelity image from SDXL or to get SDXL Turbo’s output in a fraction of a second.

However, SDXL Turbo enables unprecedented productivity. You can generate hundreds of images, picking only the best ones for upscaling. And with rapid advances in upscalers, starting from a 512px image is less of a hindrance.
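The generate-many, keep-the-best workflow described above can be sketched as a small selection loop. Everything here is a stand-in: `generate` and `score` are hypothetical callables you would replace with a real image generator and a real ranking method (manual review, an aesthetic scorer, and so on).

```python
import heapq


def best_candidates(generate, score, prompt, n_candidates, keep):
    """Generate one candidate per seed and return the `keep` highest-scoring ones.

    Returns a list of (seed, candidate) pairs, best first.
    """
    candidates = [(seed, generate(prompt, seed)) for seed in range(n_candidates)]
    return heapq.nlargest(keep, candidates, key=lambda item: score(item[1]))


# Stand-ins for illustration only: a fake "image" and an arbitrary score.
def fake_generate(prompt, seed):
    return {"prompt": prompt, "seed": seed}


def fake_score(image):
    return (image["seed"] * 7) % 10


top = best_candidates(fake_generate, fake_score, "a castle at dusk", 10, 3)
print([seed for seed, _ in top])
```

Keeping the seed alongside each candidate matters: it lets you regenerate or upscale exactly the images you picked.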

The real-time prompting experience is incredibly fast and responsive. And for many applications, having good-enough placeholders to lay out a scene or concept is more valuable than a long wait for perfection.

As with any new technology, the use cases will evolve over time. There are certainly situations where SDXL’s quality is worth the wait. But for rapid iteration, SDXL Turbo can’t be beat. This newest addition to Stability AI’s lineup expands the creative possibilities, offering both quality and speed to suit varying needs. You can also check out the weights on Hugging Face.

And if the pace of innovation continues, today’s tradeoffs may be eliminated entirely as future models combine the best of both worlds. But for now, SDXL Turbo offers a tempting balance of quality and velocity for AI-assisted art creation.