Are 3D Gen AI tools ready for production?

In this Episode

Today, we’re joined by Ethan Hu, the CEO of Meshy, a company at the forefront of 3D generative AI. In this episode, Ethan delves into the intricacies of creating 3D models with AI, discussing the challenges and breakthroughs in texturing and UV unwrapping. Join us to explore the latest advancements in 3D modeling, where AI is rapidly transforming production workflows and pushing the boundaries of digital creativity and efficiency.

Interview with Ethan

Host: When will we be able to create Midjourney-level 3D models generated by AI?

Ethan: That's a common question. I’d divide it into the market and technology aspects. For market success, there must be real user need, and at the moment large-scale consumer use cases for 3D models are limited. A significant market shift might happen with the mass adoption of VR and XR headsets, creating a need for interactive 3D models. From a technology perspective, we've only solved about 10% of the challenges. Proper UV unwrapping, topology, control, and poly count reduction are areas that still need improvement. But I'm optimistic. Given the current pace of advancement, we could see major progress in the next few years.


Host: You mentioned the importance of quality, diversity, and speed in a 3D generative AI product. Can you elaborate on that?

Ethan: Sure. Quality is paramount. Users are willing to wait or input more text for high-quality models with good textures, proper poly count, and neat UV unwrapping. Diversity is also crucial. A competitive 3D generative system should create a wide range of objects, not just limited categories like chairs or vases. Speed matters too. It’s essential to provide quick previews, even if the final high-quality model takes longer to generate.

Host: With Meshy, how are you addressing the challenge of UV unwrapping in generative AI?

Ethan: We’re currently using automatic UV unwrapping systems. If a model doesn’t come with UVs, we unwrap it using existing technology. We're exploring ways to improve this, as there's significant user demand for generative-AI-driven UV unwrapping and topology, but it’s a work in progress with no major breakthroughs yet.
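For readers unfamiliar with the automatic unwrapping step Ethan mentions, here is a minimal sketch using the open-source xatlas library. It only illustrates what "existing technology" for automatic UV unwrapping looks like; it is not necessarily the pipeline Meshy runs, and the file paths are placeholders.

```python
import trimesh
import xatlas

# Load a mesh that has no texture coordinates (file path is illustrative).
mesh = trimesh.load("model.obj", force="mesh")

# xatlas computes a UV atlas automatically:
#   vmapping - index into the original vertex array for each new vertex
#   indices  - re-triangulated faces referencing the new vertices
#   uvs      - per-vertex texture coordinates in [0, 1]
vmapping, indices, uvs = xatlas.parametrize(mesh.vertices, mesh.faces)

# Write the unwrapped mesh so a texture can be painted or generated onto it.
xatlas.export("model_unwrapped.obj", mesh.vertices[vmapping], indices, uvs)
```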

Host: Can you tell us about Meshy’s unique text-to-texture feature?

Ethan: Our text-to-texture system is something we're really proud of. For instance, you can provide a basic 3D model to Meshy, type in a style or texture description, and it generates different textured styles for the model. This feature supports high resolutions like 4K, adding significant clarity and detail. We’re also working on speeding up this process.
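As a rough sketch of how a text-to-texture service like this is typically consumed, the snippet below submits a base mesh plus a style prompt and polls for the finished texture. The endpoint, field names, and job states are hypothetical placeholders for illustration only, not Meshy's documented API.

```python
import time
import requests

API_KEY = "YOUR_API_KEY"                  # placeholder credential
BASE_URL = "https://api.example.com/v1"   # hypothetical endpoint, not Meshy's real API
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Submit a texturing job: an untextured base model plus a style description.
job = requests.post(
    f"{BASE_URL}/text-to-texture",
    headers=HEADERS,
    json={
        "model_url": "https://example.com/assets/robot.glb",  # base 3D model
        "style_prompt": "weathered bronze armor, scratched, PBR, 4K",
    },
).json()

# Texture generation is slow, so services usually run it asynchronously:
# poll (or use a webhook) until the job finishes, then fetch the result.
while True:
    status = requests.get(f"{BASE_URL}/text-to-texture/{job['id']}", headers=HEADERS).json()
    if status["status"] in ("succeeded", "failed"):
        break
    time.sleep(5)

if status["status"] == "succeeded":
    print("Textured model:", status["textured_model_url"])
```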

Host: How does Meshy handle the creation of textures on unseen parts of a model when only a front view image is provided?

Ethan: Using AI to imagine different views of an object based on a single image is probably the most challenging aspect. Our AI combines 2D and 3D knowledge to rotate the camera and imagine unseen parts. But it’s a complex process involving view consistency and AI imagination, based on extensive data. The results can add unexpected value to the model, although they can also be unpredictable.

Host: What are your thoughts on control and iterability in generative AI for 3D modeling?

Ethan: Control is a critical aspect. Users want AI to closely follow their instructions. One idea we’re exploring is a ChatGPT-style 3D text modeler for more direct control over the model’s features. However, improving control in the initial modeling stage is our current focus. Users can describe specific features in their text prompts, but achieving consistent results remains a challenge.

Host: When will 3D models generated by AI be suitable for production use?

Ethan: Currently, text-to-texture is most suitable for production, particularly for less prominent assets like environment objects. For characters and key assets, AI isn’t quite there yet, especially for animation purposes. AI-generated models tend to be all triangles, which poses challenges for animation. However, for static objects or less important NPCs in games, AI-generated models are quite useful.
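To make the all-triangles point concrete, here is a small sketch that inspects a generated asset and decimates it for use as a static prop. It assumes the trimesh library with its optional mesh-simplification backend installed, and the file names and face budget are illustrative.

```python
import trimesh

# Load a generated asset (file name is illustrative).
mesh = trimesh.load("generated_prop.glb", force="mesh")

# AI-generated meshes are typically dense, all-triangle surfaces with no
# quad topology or edge loops, which is why rigging and animating them is hard.
print(f"faces: {len(mesh.faces)}  vertices: {len(mesh.vertices)}  watertight: {mesh.is_watertight}")

# For a static environment object or background NPC, reducing the poly count
# with quadric decimation is often good enough; a hero character would still
# need manual retopology into animation-friendly quads.
reduced = mesh.simplify_quadric_decimation(face_count=5000)
reduced.export("generated_prop_lowpoly.glb")
```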


Host: Do you see users refining AI-generated 3D models for their needs?

Ethan: Right now, AI-generated 3D models aren’t very user-friendly for adjustments due to issues with UV unwrapping and topology. Our goal is to have AI complete everything so users don’t need to make further adjustments. But better UV and topology would definitely add value, especially for professional users.

Host: What’s the role of API integrations in spreading the use of generative AI tools like Meshy?

Ethan: API integrations are crucial for reaching a wider audience and being part of different ecosystems. For instance, our collaboration with an MMORPG allowed users to generate unique looks for their characters using our API. Another integration with Snapchat’s Lens Studio helps AR creators find or create 3D assets more efficiently. Our focus is on improving quality, speed, and diversity while being open to API integrations.

Host: Looking ahead, what’s your prediction for the advancement of 3D generative AI?

Ethan: I’m optimistic. The pace of technological advancement in this field is incredible. By around 2026 or 2027, I think we might see a 3D equivalent of Midjourney-level AI models, at least for niche uses like 3D printing. But for wider adoption, the number of users requiring 3D assets needs to grow, likely driven by VR and XR advancements.