Image source: Generated by Unbounded AI
Stability AI, the developer of Stable Diffusion (SD), an open-source image generation model, today announced several new enhancements to its Stable Diffusion platform. Not only do these updates offer exciting new text-to-image capabilities, but they also tap into the world of 3D content creation.
The most notable enhancement is the all-new Stable 3D model. Until now, Stable Diffusion has focused primarily on two-dimensional (2D) image generation. Stable 3D changes that, supporting the creation of virtually any type of 3D content for fields ranging from graphic design to video game development.
For graphic designers, digital artists, and game developers, 3D content creation can be one of the most complex and time-consuming tasks, often taking hours (and sometimes days) to create a moderately complex 3D object.
Stable 3D can generate concept-quality textured 3D objects from images, illustrations, or text prompts, removing much of that complexity and allowing non-experts to produce a draft-quality 3D model in minutes.
Objects created with Stable 3D use the standard .obj file format, so they can be further edited and refined in 3D tools such as Blender and Maya, or imported into game engines such as Unreal Engine 5 or Unity, dramatically reducing creators' workloads.
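The .obj format mentioned here is a plain-text Wavefront format that any tool can read. As a minimal sketch (assuming a simple object containing only vertex and face records, not the full format), parsing one looks like this:

```python
# Minimal sketch of reading a Wavefront .obj file, the plain-text format
# Stable 3D exports. Handles only "v" (vertex) and "f" (face) records.
def parse_obj(text):
    """Return (vertices, faces) parsed from .obj source text."""
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":      # vertex position: v x y z
            vertices.append(tuple(float(p) for p in parts[1:4]))
        elif parts[0] == "f":    # face: f i j k (1-based vertex indices)
            faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:]))
    return vertices, faces

# A single triangle expressed as .obj text:
sample = """\
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 0.0 1.0 0.0
f 1 2 3
"""
verts, faces = parse_obj(sample)
print(len(verts), faces[0])  # 3 (0, 1, 2)
```

Because the format is this simple, tools like Blender and Unity can exchange Stable 3D output without any conversion step.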
Stable 3D provides a fast creative environment for independent designers, artists, and developers, enabling them to create thousands of 3D objects per day at a fraction of the cost.
Stable 3D is currently available only as a private preview; access can be requested from Stability AI.
In addition to its foray into 3D content generation, Stability AI has also launched the Sky Replacer tool, which is designed to do exactly what the name suggests – to replace the sky background in 2D images.
The Stable Diffusion platform now also offers Stable Fine-Tuning, which is designed to help businesses speed up the process of image fine-tuning for specific use cases.
In addition, the company will integrate an invisible watermark for content authentication in images generated by the Stability AI API. As generative AI becomes increasingly part of common workflows, these new updates are all designed to help businesses with creative development.
Emad Mostaque, CEO of Stability AI, said in an interview: "It's about giving creative storytellers the tools they need to have extra control over their images."
Stability AI’s advancements come at a time when the text-to-image generation market is becoming more competitive.
Adobe has targeted this market with Firefly, an AI tool that is tightly integrated with the company's design software. Midjourney is constantly adding new features to its technology to help designers generate images. Not to be overlooked, OpenAI recently released DALL-E 3, now integrated natively into ChatGPT, which improves the model's ability to render text within images.
Mostaque is well aware of the competition and is committed to helping Stability AI stand out in a number of ways. In particular, he emphasized that the company is moving beyond just offering models to providing a channel for ideas: the new Sky Replacer and fine-tuning capabilities are additional steps built on top of the core base model used to generate images.
Sky Replacer is more than just a creative feature; it is squarely aimed at business use cases.
The concept of replacing backgrounds in images is not a new one. In non-generative AI applications, backgrounds can often be replaced by techniques such as green screen and chroma key.
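The classic chroma-key technique referenced here replaces any pixel close to a known key color (typically green) with the corresponding background pixel. A toy sketch of the idea, with pixels as RGB tuples and a hypothetical tolerance parameter:

```python
# Toy chroma-key: swap in background pixels wherever the foreground pixel
# is within a tolerance of the key color (e.g. green-screen green).
def chroma_key(fg_pixels, key, bg_pixels, tol=30):
    """Return fg_pixels with key-colored pixels replaced by bg_pixels."""
    out = []
    for fg, bg in zip(fg_pixels, bg_pixels):
        # Manhattan distance in RGB space to the key color
        dist = sum(abs(c - k) for c, k in zip(fg, key))
        out.append(bg if dist <= tol else fg)
    return out

fg  = [(0, 255, 0), (200, 30, 40)]        # green-screen pixel + subject pixel
sky = [(120, 180, 250), (120, 180, 250)]  # replacement sky color
print(chroma_key(fg, key=(0, 255, 0), bg_pixels=sky))
# → [(120, 180, 250), (200, 30, 40)]
```

A generative tool like Sky Replacer goes further than this pixel-matching rule (it has to segment real skies without a uniform key color), but the replacement step it automates is conceptually the same.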
Mostaque says Stability AI is building and automating workflows on top of these classic techniques to make the process fast and efficient for business users. Changing the sky in an image isn't just about adding creative flair; it's a feature with a very specific and practical use case.
“Sky Replacer, for example, is very useful for real estate.”
Mostaque points out that users want to be able to have different backgrounds and different lighting effects. Fundamentally, he stressed, it’s all about providing control, as organizations have their own workflows to generate images and content. What Stability AI is doing is building optimized workflows to help achieve the control needed for different use cases.
“Sky Replacer is the first in a series of products that we’ll be launching that are very industry- and enterprise-specific, building on our experience over the last 6 to 12 months.”
The new Stable 3D model works by extending the diffusion model used in Stable Diffusion to include additional 3D datasets and vectorization.
“I’m really excited to be able to create the whole world in 3D.”
Mostaque explained that Stable 3D builds on Stable Diffusion and on Stability AI's work with Objaverse-XL, one of the world's largest open 3D datasets. Building and rendering 3D assets has long been a resource-intensive process, but Mostaque is optimistic that Stable 3D will be more efficient than traditional 3D creation methods. He stressed that these are still early days for the technology, but he expects it to steadily evolve and expand over time. Stable 3D is initially available in private preview.
"This is very efficient compared to traditional 3D model creation. What used to take a long time to build is now getting the first approvals very quickly. ”
The Biden administration issued an executive order (EO) on artificial intelligence this week, one component of which calls for watermarking generated content.
Stability AI is now integrating invisible watermarks and Content Credentials into its API. Content Credentials is a multi-vendor industry effort, backed by Adobe and others, to attach provenance and authorship information to content. Mostaque says adding invisible watermarks and Content Credentials is the responsible thing to do, and it is part of Stability AI's broader effort to bring authenticity to the content it generates.
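Stability AI has not published how its watermark works, so the following is only a classic illustration of the general idea: hiding a bit string in the least-significant bits of pixel values, where the change is too small to see but can be read back by a verifier.

```python
# Toy "invisible" watermark via least-significant-bit (LSB) embedding.
# This is NOT Stability AI's actual scheme (which is not public); it only
# illustrates why such marks are imperceptible yet machine-readable.
def embed(pixels, bits):
    """Overwrite the LSB of each pixel value with one watermark bit."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)]

def extract(pixels, n):
    """Read back the first n embedded watermark bits."""
    return [p & 1 for p in pixels[:n]]

image = [200, 13, 77, 54, 129, 240]  # grayscale pixel values, 0-255
mark  = [1, 0, 1, 1, 0, 1]           # watermark payload bits
stamped = embed(image, mark)
print(extract(stamped, 6))            # → [1, 0, 1, 1, 0, 1]
# Each pixel changes by at most 1 out of 255, so the mark is invisible:
print(max(abs(a - b) for a, b in zip(image, stamped)))  # → 1
```

Production watermarks are far more robust than this (they must survive compression and cropping), but the trade-off is the same: imperceptible to viewers, detectable by tools.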
"We're really rolling out a series of initiatives and additional research around this issue, because we want to know what's true and what's fake," Mostaque said. "It also helps with some of the attribution and other mechanisms that we're building for future releases."