
Stability AI Launches Advanced Image-Generating Model, Stable Diffusion XL 1.0

Stability AI introduces Stable Diffusion XL 1.0, its most sophisticated text-to-image model yet. The company says the model produces more vibrant and accurate colors than its predecessor.


Stability AI, an AI startup, has launched its latest image-generating model, Stable Diffusion XL 1.0, as it continues to advance its generative AI models in the face of increasing competition and ethical challenges. Touted as the company's most advanced release so far, the model is now available as open source on GitHub, as well as through Stability's API and its consumer apps, Clipdrop and DreamStudio. According to the company, it offers more vibrant and accurate colors and better contrast, shadows, and lighting than its predecessor.

Joe Penna, Stability AI’s head of applied machine learning, explained that Stable Diffusion XL 1.0, which has 3.5 billion parameters, can generate full 1-megapixel images in seconds across multiple aspect ratios. Compared with the previous Stable Diffusion model, it offers higher-resolution images without requiring excessive computational power.
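To make "1 megapixel across multiple aspect ratios" concrete, here is a minimal arithmetic sketch in Python. It picks a width and height that match a requested aspect ratio while keeping the total pixel count near 1024 x 1024. The function name `dims_for_aspect` and the rounding to multiples of 64 are assumptions made for illustration, not something the article or Stability AI specifies.

```python
import math
from typing import Tuple


def dims_for_aspect(aspect_w: int, aspect_h: int,
                    target_pixels: int = 1024 * 1024,
                    multiple: int = 64) -> Tuple[int, int]:
    """Return (width, height) near target_pixels at the given aspect ratio.

    Rounding each side to a multiple of 64 is an illustrative assumption.
    """
    ratio = aspect_w / aspect_h
    # Solve width * height = target_pixels with width / height = ratio.
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for aspect in [(1, 1), (4, 3), (16, 9)]:
        w, h = dims_for_aspect(*aspect)
        print(f"{aspect[0]}:{aspect[1]} -> {w}x{h} ({w * h / 1e6:.2f} MP)")
```

Because of the rounding, the pixel count stays only approximately constant as the aspect ratio changes; the exact sizes the model accepts may differ.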

Penna elaborated that Stable Diffusion XL 1.0 is customizable and ready for fine-tuning to specific concepts and styles. It can also produce complex designs from simple natural-language prompts. Moreover, the model excels at rendering text within images, an area where many text-to-image models struggle. It can generate advanced text and keep it legible, which is particularly challenging for logos, calligraphy, and fonts.

As reported by SiliconAngle and VentureBeat, the new model supports inpainting, outpainting, and "image-to-image" prompts. These features allow users to reconstruct missing parts of an image, extend existing images, and create more detailed variations of an input image based on text prompts. The model can also understand complicated instructions given in short prompts, unlike previous versions that required longer text prompts.
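The difference between inpainting and outpainting comes down to which pixels the model is asked to fill. The Python sketch below shows one way the input for outpainting could be prepared: the original image is placed on a larger canvas, and a mask marks the new border region for generation. The helper name `prepare_outpaint` and the convention that 1 means "generate" and 0 means "keep" are assumptions for illustration; actual tools may use different conventions, and the generation step itself is not shown.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def prepare_outpaint(image: List[List[Pixel]],
                     pad_left: int, pad_right: int,
                     pad_top: int, pad_bottom: int,
                     fill: Pixel = (0, 0, 0)):
    """Return (canvas, mask) for a hypothetical outpainting request.

    canvas: the original image centered on a larger, padded canvas.
    mask:   1 where new content should be generated, 0 where the
            original pixels are kept (an assumed convention).
    """
    height, width = len(image), len(image[0])
    new_w = width + pad_left + pad_right
    new_h = height + pad_top + pad_bottom

    canvas = [[fill] * new_w for _ in range(new_h)]
    mask = [[1] * new_w for _ in range(new_h)]

    # Copy the original pixels and mark them as "keep".
    for y in range(height):
        for x in range(width):
            canvas[y + pad_top][x + pad_left] = image[y][x]
            mask[y + pad_top][x + pad_left] = 0
    return canvas, mask


if __name__ == "__main__":
    tiny = [[(255, 0, 0), (0, 255, 0)],
            [(0, 0, 255), (255, 255, 255)]]
    canvas, mask = prepare_outpaint(tiny, 1, 1, 1, 1)
    for row in mask:
        print(row)
```

Inpainting uses the same idea with the mask set inside the original image rather than around it, so the missing region is reconstructed while the surrounding pixels are kept.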

Because the model could be misused to generate harmful content such as nonconsensual deepfakes, Stability AI has taken steps to mitigate these risks. It has filtered the model's training data for unsafe imagery, added warnings for problematic prompts, and blocked as many individual problematic terms in the tool as possible.

Legal challenges have also arisen over the model's training set, which includes artwork from artists who object to the use of their work as training data. Although Stability AI believes the fair use doctrine shields it from legal liability, it respects artists' "opt-out" requests and continues to incorporate them.

To accompany the launch of Stable Diffusion XL 1.0, Stability AI is releasing a fine-tuning feature in beta for its API. This feature will allow users to specialize generation on specific people, products, and more with as few as five images. The company is also bringing Stable Diffusion XL 1.0 to Bedrock, Amazon’s cloud platform for hosting generative AI models.