Intel shows no sign of slowing down: after confirming its commitment to Germany with the chip factory in Magdeburg, the Intel Labs division and Blockade Labs have announced the Latent Diffusion Model for 3D (LDM3D), a new generative AI model that creates 360-degree 3D visual content from a text prompt. This solution has the potential to revolutionize content creation, metaverse applications, and digital experiences, transforming a broad range of industries, from entertainment and gaming to architecture and design.
Intel unveils LDM3D: how does it work and what is it for?
LDM3D was trained on a set of roughly 10,000 samples from the LAION-400M database, which contains over 400 million image-caption pairs. Using the Dense Prediction Transformer (DPT), the model obtains a highly accurate relative depth value for each pixel in an image; by combining color and per-pixel depth information, it can then propose a 360-degree three-dimensional reconstruction of an environment.
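To make the idea of per-pixel depth concrete, here is a minimal sketch of how an RGB image paired with a depth map can be back-projected into a colored 3D point cloud. This is an illustration only, not Intel's pipeline: the pinhole camera model, the field-of-view value, and the function name are all assumptions introduced for the example.

```python
import numpy as np

def rgbd_to_point_cloud(rgb, depth, fov_deg=60.0):
    """Back-project an RGB image plus per-pixel depth map into a
    colored 3D point cloud, assuming a simple pinhole camera.

    rgb     : (H, W, 3) uint8 color image
    depth   : (H, W) float array of per-pixel depth values
    fov_deg : assumed horizontal field of view (hypothetical value)
    """
    h, w = depth.shape
    # Focal length in pixels, derived from the assumed field of view
    f = (w / 2) / np.tan(np.radians(fov_deg) / 2)
    # Pixel coordinate grids, centered on the image's principal point
    u, v = np.meshgrid(np.arange(w) - w / 2, np.arange(h) - h / 2)
    # Pinhole back-projection: x = u*z/f, y = v*z/f, z = depth
    x = u * depth / f
    y = v * depth / f
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    colors = rgb.reshape(-1, 3)
    return points, colors

# Toy example: a flat 4x4 gray "image" at constant depth 2.0
rgb = np.full((4, 4, 3), 128, dtype=np.uint8)
depth = np.full((4, 4), 2.0)
points, colors = rgbd_to_point_cloud(rgb, depth)
print(points.shape, colors.shape)  # (16, 3) (16, 3)
```

A real system would use calibrated camera intrinsics and stitch many such views to cover a full 360-degree environment, but the core step of turning color plus depth into geometry is the same.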
For Intel, LDM3D is a step toward the democratization of AI, as it will allow broader access to the benefits of AI across more industries. The first concrete application of the model is DepthFusion, an application that leverages standard 2D RGB photos and depth maps to create immersive, interactive 360-degree viewing experiences. Using TouchDesigner, a node-based visual programming language for real-time interactive media, DepthFusion turns text prompts into rich digital experiences.
This paves the way for further advances in multi-view generative AI and computer vision. As Intel Labs AI/ML researcher Vasudev Lal notes, LDM3D's generative AI saves considerable time when developing entire three-dimensional scenes with realistic depth. The model currently runs on Intel AI supercomputers powered by Intel Xeon processors and Intel Habana Gaudi AI accelerators, but since it is open source, it is clearly positioned to reach more environments and more companies in the future.