I love the technology behind this stuff, but the number of applications with negative social impact seems to outweigh the rest.
"Neural Head Avatar" is not a good name, but it sure beats "deepfake".
I wonder which Hollywood actors will come pre-licensed in Unreal Engine 6?
There's something almost humorous about the last video being narrated by a text-to-speech system - hearing a system that clones human speech describe a system that clones human motion really adds a surrealist touch to the whole thing.
Was not prepared for the emotional reaction to seeing Mona Lisa looking around and really smiling.
This is so impressive it actually scares me.
This is seriously incredible. Coolest thing I have seen in a very long time. Curious how long it takes to render one of the short example clips shown.
I'm really curious how well this works on highly stylized sources like anime, where landmarks aren't equivalent and in some cases may not even exist.
As an aside, this would be sick for realtime apps: imagine you get a good professional photo or two taken and then drive them with your webcam. It'd be like making a VTuber of yourself.
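A minimal sketch of what the driving side of that idea could look like, assuming you use OpenCV and MediaPipe for per-frame landmark capture; drive_avatar() is a hypothetical hook standing in for whatever model actually animates the photo:

```python
import cv2
import mediapipe as mp


def drive_avatar(points):
    # Hypothetical placeholder: in a real app this would feed the tracked
    # landmarks to the talking-head model as the driving signal per frame.
    print(f"tracked {len(points)} landmarks")


mp_face_mesh = mp.solutions.face_mesh

cap = cv2.VideoCapture(0)  # default webcam
with mp_face_mesh.FaceMesh(max_num_faces=1,
                           refine_landmarks=True,
                           min_detection_confidence=0.5,
                           min_tracking_confidence=0.5) as face_mesh:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV delivers BGR.
        results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_face_landmarks:
            # 478 normalized (x, y, z) points when refine_landmarks=True.
            pts = [(lm.x, lm.y, lm.z)
                   for lm in results.multi_face_landmarks[0].landmark]
            drive_avatar(pts)
cap.release()
```

The landmark stream is cheap to compute in real time; the expensive part would be the generator that turns it into photorealistic frames.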
This is incredible! When I look at all the advances in computer vision and NLP in the last five years, I can't believe the pace. I have stopped saying "AI can't do ____ in our lifetime" to my friends.
The deepfake industry just got a lot bigger.
This is perhaps the most impressive "AI" demo I've seen, and that's saying a lot. Interesting to read about the Moscow-based "Samsung AI Center" that seems to be producing this work: https://research.samsung.com/aicenter_moscow.
Now Muggles get animated photos as well.
I wonder if this type of tech could be used for animating video game characters. Instead of motion capture rigs or anything like that, just record an actor making facial expressions and have those drive the 3D model. It seems like they could achieve extremely realistic results.
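That's roughly how markerless facial mocap already works: track landmarks on the actor each frame, convert them to blendshape weights, and let the engine deform the rig. A toy sketch of one such mapping, with the caveat that the landmark indices assume MediaPipe's face mesh and the scale constant is an arbitrary tuning value:

```python
import numpy as np


def mouth_open_weight(landmarks, neutral, upper=13, lower=14, scale=25.0):
    """Map the inner-lip gap, relative to a neutral-face capture, to a 0-1
    'jaw open' blendshape weight. Indices 13/14 are MediaPipe's inner-lip
    landmarks; `scale` is a made-up tuning constant."""
    gap = np.linalg.norm(landmarks[upper] - landmarks[lower])
    rest = np.linalg.norm(neutral[upper] - neutral[lower])
    return float(np.clip((gap - rest) * scale, 0.0, 1.0))


# Toy usage with fake data: a neutral face vs. a frame with the mouth open.
neutral = np.zeros((478, 3))
frame = neutral.copy()
frame[14, 1] += 0.03  # lower lip drops by 3% of image height
print(mouth_open_weight(frame, neutral))  # ~0.75
```

The interesting part of the linked work is that it skips the rig entirely and synthesizes the pixels directly, but a weight mapping like this is how you'd bridge it to an existing game character.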
Holy shit
Damn, that's so good. When does it come out as a Zoom AR layer?
Pretty neat. I wonder if the AI could predict the voice of the person in the picture and make it talk.
How does the algorithm decide what the side (and back) of a head looks like?
I wonder how this could be used by a state actor for manipulation.
When will something like this be available to the average user?
There are still some issues to be worked out, such as how the head shape distorts in some examples, but overall, this is very, very impressive work.
Back in the old days, Disney and other animation studios rotoscoped actors' performances by drawing over the original footage frame by frame, by hand. It won't be long before you just have an artist create a few pieces of concept art and then film the actors' performances without much or any special setup, other than maybe a tracking suit.
How many years away are we from the point where you can just type in a script (or just put in some writing prompts and have an AI generate a script), describe the direction for the actors ("bend over and pick up the bucket", "exit stage left"), and then just churn out a movie?
If you pick up just a little bit of skill with animation, compositing, and such, you're a one-person movie studio. Crazy times. This is not what I imagined the future would look like, but it will be entertaining.