
Meta 3D Gen

Authors :
Bensadoun, Raphael
Monnier, Tom
Kleiman, Yanir
Kokkinos, Filippos
Siddiqui, Yawar
Kariya, Mahendra
Harosh, Omri
Shapovalov, Roman
Graham, Benjamin
Garreau, Emilien
Karnewar, Animesh
Cao, Ang
Azuri, Idan
Makarov, Iurii
Le, Eric-Tuan
Toisoul, Antoine
Novotny, David
Gafni, Oran
Neverova, Natalia
Vedaldi, Andrea
Publication Year :
2024

Abstract

We introduce Meta 3D Gen (3DGen), a new state-of-the-art, fast pipeline for text-to-3D asset generation. 3DGen offers 3D asset creation with high prompt fidelity and high-quality 3D shapes and textures in under a minute. It supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications. Additionally, 3DGen supports generative retexturing of previously generated (or artist-created) 3D shapes using additional textual inputs provided by the user. 3DGen integrates key technical components, Meta 3D AssetGen and Meta 3D TextureGen, that we developed for text-to-3D and text-to-texture generation, respectively. By combining their strengths, 3DGen represents 3D objects simultaneously in three ways: in view space, in volumetric space, and in UV (or texture) space. The integration of these two techniques achieves a win rate of 68% with respect to the single-stage model. We compare 3DGen to numerous industry baselines, and show that it outperforms them in terms of prompt fidelity and visual quality for complex textual prompts, while being significantly faster.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2407.02599
Document Type :
Working Paper