Nvidia shows off AI model that turns a few dozen snapshots into a 3D-rendered scene
From 2D to 3D with the help of AI. | Image: Nvidia
Nvidia’s latest AI demo is pretty impressive: a tool that quickly turns a “few dozen” 2D snapshots into a 3D-rendered scene. In the video below you can see the method in action, with a model dressed like Andy Warhol holding an old-fashioned Polaroid camera. (Don’t overthink the Warhol connection: it’s just a bit of PR scene dressing.)
The tool is called Instant NeRF, referring to “neural radiance fields” — a technique developed by researchers from UC Berkeley, Google Research, and UC San Diego in 2020. If you want a detailed explainer of neural radiance fields, you can read one here, but in short, the method maps the color and light intensity of different 2D shots, then generates data to connect these images from different vantage points and render a finished 3D scene. In addition to the images themselves, the system requires data about the camera position for each shot.
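To make that a little more concrete, here’s a minimal sketch of the volume-rendering step at the heart of the NeRF technique: for each pixel, you march a ray through the scene, query the trained network for color and density at sample points along it, and blend those samples into a final color. This is not Nvidia’s code — `field_fn` below is a hypothetical stand-in for the trained neural network, and the sampling scheme is deliberately simplified.

```python
# A minimal sketch of NeRF-style volume rendering along one camera ray.
# field_fn is a hypothetical placeholder for the trained radiance field:
# it maps 3D sample points (plus a view direction) to RGB color and density.
import numpy as np

def render_ray(origin, direction, field_fn, near=0.0, far=4.0, n_samples=64):
    """Estimate the color of one pixel by compositing samples along a ray."""
    # Sample points at evenly spaced depths between the near and far planes.
    t = np.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction          # shape (n_samples, 3)

    # Query the (assumed) radiance field: per-sample color and density sigma.
    rgb, sigma = field_fn(points, direction)          # (n, 3) and (n,)

    # Distances between adjacent samples; the last delta is effectively infinite.
    deltas = np.concatenate([np.diff(t), [1e10]])

    # alpha_i = 1 - exp(-sigma_i * delta_i): chance the ray terminates here.
    alpha = 1.0 - np.exp(-sigma * deltas)

    # T_i: probability the ray reaches sample i without terminating earlier.
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))

    # Composite: each sample's color weighted by how visible it is.
    weights = transmittance * alpha
    return (weights[:, None] * rgb).sum(axis=0)       # final RGB for the pixel
```

Repeating this for every pixel of a virtual camera produces a rendered view from a vantage point that was never photographed — which is why the camera position for each input shot matters so much during training.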
Researchers have been improving this sort of 2D-to-3D model for a couple of years now, adding more detail to finished renders and increasing rendering speed. Nvidia says its new Instant NeRF model is one of the fastest yet developed, cutting rendering time from a few minutes down to a process that finishes “almost instantly.”
As the technique becomes quicker and easier to implement, it could be used for all sorts of tasks, says Nvidia in a blog post describing the work.
“Instant NeRF could be used to create avatars or scenes for virtual worlds, to capture video conference participants and their environments in 3D, or to reconstruct scenes for 3D digital maps,” writes Nvidia’s Isha Salian. “The technology could be used to train robots and self-driving cars to understand the size and shape of real-world objects by capturing 2D images or video footage of them. It could also be used in architecture and entertainment to rapidly generate digital representations of real environments that creators can modify and build on.” (Sounds like the metaverse is calling.)
Unfortunately, Nvidia didn’t share details on its method, so we don’t know exactly how many 2D images are required or how long it takes to render the finished 3D scene (which would also depend on the power of the computer doing the rendering). Still, it seems the technology is progressing quickly and could start having a real-world impact in the years to come.