Last week, Nvidia launched a new version of Nvidia Broadcast, the deep learning and AI-powered software that can do noise suppression, background removal/replacement, camera framing, and now… Eye Contact. That last one is currently in beta, and… should probably stay in beta.
AI and deep learning have been in the news a lot lately, for good reason. Stuff like DALL-E, Midjourney, and Stable Diffusion is creating art from text, often with rather striking results. Of course, at other times you end up with mangled mutant creatures with two and a half heads and too many limbs. On the text side, ChatGPT is churning out legible writing that many fear sounds the death knell for English essays and journalism (and no, it did not write this news post).
The idea behind Eye Contact is simple enough: When you’re on a webcast or meeting, often you look away from the camera. In fact, there’s a real chance you’re always looking away from the camera — because it’s sitting at the top of the screen and the things you want to look at are on the screen. But what if there was a way to look like you’re looking at your camera without looking at your camera?
What if you could train an AI model on faces and teach it to correct images where someone isn’t looking straight into the lens? Get millions of appropriately tagged images, feed them into the network, and out pops an amazing tool, right?
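The recipe the paragraph gestures at, paired examples plus a loss that shrinks as the model learns, can be illustrated with a toy one-parameter model. Everything below is an invented stand-in: the fake "gaze offset" data, the 0.2 correction factor, and the single-weight model have nothing to do with Nvidia's actual (unpublished) training setup. It only shows the supervised-learning shape of the idea.

```python
# Toy sketch of supervised training on paired examples.
# All data and parameters here are made-up stand-ins; Nvidia's
# real Eye Contact model and dataset are not public.
import random

random.seed(0)

# Fake dataset: each input is a "gaze offset"; the target is that
# offset scaled toward zero (a stand-in for re-aimed gaze).
xs = [random.uniform(-1, 1) for _ in range(100)]
data = [(x, 0.2 * x) for x in xs]

w = 1.0   # model: predicted correction = w * offset; starts uncorrected
lr = 0.1  # learning rate for plain gradient descent

first_loss = last_loss = None
for epoch in range(200):
    # Mean-squared error over the dataset, and its gradient w.r.t. w
    loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad
    if first_loss is None:
        first_loss = loss
    last_loss = loss

print(round(w, 2))  # converges toward 0.2, the "true" correction factor
```

The real feature replaces the one-weight model with a deep network and the scalar offsets with millions of face images, but the loop, data in, prediction out, loss down, is the same basic shape.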
Implementing it is not quite so simple; Nvidia has been talking about its Eye Contact feature for well over a year, and it’s only now going into public (beta) release. Differences between the myriad faces around the world make it a tough problem to “solve,” and even now the results are… imperfect (and that’s putting it nicely).
I went ahead and tested it anyway, on a system with an RTX 3090 Ti:
One of the things I noticed in testing is that often the live video feed would oscillate between me looking at the camera and me looking elsewhere, even though my focus stayed in the same spot. I guess this could be intentional, because having someone staring directly into the camera throughout an entire video chat would be a little creepy — but if it is, some adjustments to timing need to be made.
What’s more difficult to say is whether this sort of effect is even beneficial in the first place. If you want to look like you’re looking at the camera, you should probably learn to look… at the camera. Solving human error through AI might just end up encouraging bad habits — what happens if you end up on a video feed that doesn’t correct eye contact?
Regardless, Nvidia Broadcast with Eye Contact is now available for RTX owners to test. I tested it with an RTX 3090 Ti, but Nvidia lists the RTX 2060 as the entry point (and this should include mobile RTX 3050 GPUs, as far as I know). Long-term, I suspect Nvidia will end up with some AI models that are more complex and require faster hardware than an RTX 2060, just like how DLSS 3’s Frame Generation feature requires an RTX 40-series graphics card, but for now any RTX GPU made in the past four years can power this feature.
Do you like the effect, hate it, find it creepy, or something else? Let us know in the comments, along with other effects you’d rather see. I’m personally looking forward to the time when we can all have virtual cartoon avatars like Toy Jensen talking in place of real people, perhaps reading articles that were written by AI, with the videos and articles both being consumed by AI.
It’s bots all the way down from there!