The Future of 3D GenAI: from Assets to World Building

Interview with Tejas Kulkarni

Tejas Kulkarni is the CEO of Common Sense Machines (a.k.a. CSM). He provides an in-depth look at the current state and future potential of AI in the creation of entire 3D worlds. From the challenges of using AI to create 3D models to the prospects of AI in democratizing world building, Tejas offers valuable insights into how AI technologies are about to transform industries ranging from gaming to architectural design.

The question that everybody's asking is, how far are we from using AI-generated 3D models for production work?

Tejas Kulkarni: That's a very good question. 3D turned out to be the hardest modality. We started with text, then images, and now we're at the stage of videos and 3D. The challenge with 3D is its complexity – combining assets, dynamics, rigging, and physics. For production readiness, the major hurdle is achieving high-quality geometry and topology. We're already seeing good results for prototyping and indie workflows. By 2024, we might see more production-ready applications, where artists won't need to rewire polygons and textures as much, allowing them to start from where the AI leaves off.

Could you elaborate on why achieving production readiness in 3D modeling is so challenging?

Tejas Kulkarni: 3D modeling is incredibly complex because it involves so many different elements coming together. For a production-ready workflow, especially in AAA games, you need to have optimized assets, dynamics, rigging, and physics for object interactions. At the lowest level, creating a textured polygonal mesh is challenging enough. The big jump needed for production readiness involves getting the geometry and topology right so artists can start working directly with AI-generated models without extensive rework.

What is the best input modality for generating 3D models with AI?

Tejas Kulkarni: It depends on the speed and control you want. For quick generation, text combined with images is effective. For more control, sketching is ideal, as it aligns with how we naturally conceptualize objects. Our platform supports multiple inputs – text, images, and sketches – to cater to different creative processes. Regardless of the input, we ground everything in an image to ensure clarity and reduce errors, making the process intuitive and reliable.

AI in 3D modeling is advancing rapidly. How do you keep up with the pace of innovation?

Tejas Kulkarni: We thrive on the rapid progress. Viewing the community as an “extended research team” helps us stay motivated and collaborative. Not all innovations are game-changers; we rigorously test new ideas to filter out the noise. With our head start and a solid research foundation, we quickly implement and evaluate new methods. We're also investing in larger models and infrastructure, staying ahead by continuously improving and scaling our technology.

Do you foresee specialization within the 3D generative AI market?

Tejas Kulkarni: Absolutely. There will be distinct verticals like VFX, industrial scanning, CAD, gaming, and more, each with specialized tools and applications. Some companies will dominate end-to-end solutions in specific areas, while others might focus on infrastructure or niche applications. The complexity of 3D creation requires this specialization, as different domains have unique requirements and challenges. We aim to be a leader in gaming and interactive experiences, but we recognize the potential for diverse applications across various industries.

What do you find most exciting about the potential of generative AI?

Tejas Kulkarni: I'm really excited about the ability to quickly create immersive and dynamic 3D worlds. Imagine being able to prototype a small game world in minutes, complete with interactive characters and environments. This rapid prototyping capability will revolutionize game development, making it accessible to more people. Our goal is to push towards creating full mini game worlds where assets are not just static but can interact, move, and even communicate with users, bringing a new level of creativity and immersion to game development.

Looking ahead, what impact do you envision CSM having on the industry?

Tejas Kulkarni: I'm excited about making 3D creation accessible and dynamic. Imagine quickly prototyping a mini-game world or animating interactive characters with ease. We're working towards that by integrating rigging, animation, and interaction capabilities into our assets. Soon, users will be able to chat with and control their 3D models, bringing them to life. This aligns with our vision of creating virtual characters and worlds with common sense and dynamic behaviors, making 3D content creation intuitive and engaging for everyone.