Photo Credit: Hugging Face/Stability AI
Stability AI on Wednesday released a smaller version of its Stable Diffusion 3 (SD3) artificial intelligence (AI) model. The company introduced the smaller text-to-image model, dubbed Stable Diffusion 3 Medium, as its most advanced image generation model. While retaining all the functionality of the larger generative AI model, the latest tool has lower GPU requirements and consumes less power than previous models. The open weights have also been made available on Hugging Face, and the company says that this AI model can run efficiently on consumer PCs and laptops.
While the Stable Diffusion 3 model (which is now being called the Stable Diffusion 3 Large) became publicly available in April, its high GPU and compute requirements made it difficult for most people with a consumer-grade PC or laptop to run efficiently. The company is solving this problem by offering Stable Diffusion 3 Medium, which can run on most laptops and PCs.
According to a report by VentureBeat, the minimum requirement for the AI model is 5GB of GPU VRAM and the recommended requirement is 16GB of GPU VRAM. Notably, the Nvidia GeForce RTX 3090 features 24GB of GDDR6X VRAM.
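As a rough illustration of how those two thresholds could be applied, the Python sketch below compares a GPU's VRAM against the reported 5GB minimum and 16GB recommended figures. The function name `classify_vram` and the 8GB laptop GPU are hypothetical and used only for illustration; the thresholds are the ones cited in the report, not an official compatibility check from Stability AI.

```python
# Thresholds as reported by VentureBeat (in GB of GPU VRAM).
MIN_VRAM_GB = 5
RECOMMENDED_VRAM_GB = 16


def classify_vram(vram_gb: float) -> str:
    """Classify a GPU's VRAM against the reported requirements.

    Hypothetical helper for illustration; not part of any Stability AI tool.
    """
    if vram_gb < MIN_VRAM_GB:
        return "below minimum"
    if vram_gb < RECOMMENDED_VRAM_GB:
        return "meets minimum"
    return "meets recommended"


if __name__ == "__main__":
    # The RTX 3090 figure comes from the article; the 8GB GPU is made up.
    examples = [("Nvidia GeForce RTX 3090", 24), ("hypothetical 8GB laptop GPU", 8)]
    for name, vram in examples:
        print(f"{name} ({vram}GB): {classify_vram(vram)}")
```

By this reading, the RTX 3090's 24GB sits comfortably above the recommended figure, while a card with 8GB would clear only the minimum.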
Despite its smaller size of two billion parameters (compared with SD3 Large's eight billion), Stability AI said in a newsroom post that Stable Diffusion 3 Medium will be able to show a similar level of efficiency as its larger counterpart. The latest image generation model will deliver detailed photorealistic outputs as well as high-quality outputs in flexible styles. To improve realism in hands and faces, the AI firm is using a 16-channel VAE (Variational Autoencoder).
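To give a sense of why the parameter count matters for hardware, the back-of-the-envelope Python sketch below estimates the memory needed just to hold the model weights. It assumes 16-bit weights (two bytes per parameter), which is an assumption and not a figure from Stability AI, and it ignores the text encoders, the VAE, and the working memory used during generation, so real-world usage would be higher. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Estimate memory (in GB, 1 GB = 1e9 bytes) needed to store model weights.

    Assumes 2 bytes per parameter (16-bit precision) by default. This is an
    illustrative assumption, not a published specification.
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # Parameter counts as stated in the article.
    sd3_medium_params = 2_000_000_000
    sd3_large_params = 8_000_000_000

    print(f"SD3 Medium weights: ~{weight_memory_gb(sd3_medium_params):.0f}GB")
    print(f"SD3 Large weights:  ~{weight_memory_gb(sd3_large_params):.0f}GB")
```

Under these assumptions, the two-billion-parameter model needs roughly a quarter of the weight memory of the eight-billion-parameter one, which is in line with the lower GPU requirements the company describes.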
Prompt adherence will also be at the same level as the larger model. SD3 Medium can understand complex prompts that include spatial reasoning, compositional elements, actions, and styles. Further, typography, which has been a common pitfall of image generation models, has also been improved in the latest AI model, added the company.
Stable Diffusion 3 Medium has been made generally available via the company's Fireworks AI-powered API (Application Programming Interface). The text-to-image AI model can also be accessed via the Stable Assistant platform or the Stable Artisan Discord server. Further, the open weights have been made available with a non-commercial licence on Hugging Face. To use the model for commercial purposes, users will have to get a creator licence from the company.