AI video you can watch and interact with, in real-time

  • That felt so wrong AND someone is cheating here. This felt really suspicious...

    I got to the graffiti world and there were some stairs right next to me, so I started going up them. It felt like I was walking forward while the stairs slid under me, until I just got stuck. So I turned to go back down, and halfway around everything morphed and I ended up back at the ground level where I had started. I was teleported. That's why I feel like something is cheating here. If this were mode collapse, I'm not sure how the model could completely recover the entire environment, unless it is building mini worlds with boundaries. It was like the out-of-bounds teleportation you get in some games, but way more fever-dream-like. That's not what we want from these systems: we don't want to just build a giant, poorly compressed videogame, we want continuous generation. If you have mode collapse and recover, it should recover to somewhere new, not where you've been. At least this is what makes me highly suspicious.

  • Note that it isn't being created from whole cloth, it is trained on videos of the places and then it is generating the frames:

    "To improve autoregressive stability for this research preview, what we’re sharing today can be considered a narrow distribution model: it's pre-trained on video of the world, and post-trained on video from a smaller set of places with dense coverage. The tradeoff of this post-training is that we lose some generality, but gain more stable, long-running autoregressive generation."

    https://odyssey.world/introducing-interactive-video

  • Well, that felt like entering a dream on my phone. Fuzzy virtual environments generated by "a mind" based on its memory of real environments...

    I wonder if it'd break our brains more if the environment changes as the viewpoint changes, but doesn't change back (e.g. if there's a horse, you pan left, pan back right, and the horse is now a tiger).

  • This is pretty much the same thing as those models that baked dust2 into a diffusion model then used the last few frames as context to continue generating - same failure modes and everything.

    https://diamond-wm.github.io/

  • This is similar to the Minecraft version of this from a few months back [0], but it does seem to have a better time keeping a memory of what you've already seen, at least for a bit. Spinning in circles doesn't lose your position quite as easily, but I did find that exiting a room and then turning back around and re-entering leaves you with a totally different room than you exited.

    [0] Minecraft with object impermanence (229 points, 146 comments) https://news.ycombinator.com/item?id=42762426

  • This seems like a staggeringly inefficient way to develop what is essentially a FPS engine.

  • I feel like we're so close to remaking the classic Rob Schneider full motion video game "A Fork in the Tale"

    https://m.youtube.com/watch?v=YXPIv7pS59o

  • It’s super cool. I keep thinking it kind of feels like dream logic. It looks amazing at first, but I’m not sure I’d want to stay in a world like that for too long. I actually like it when things have limits, when the world pushes back a bit and gives you rules to work with.

  • Related (and quite cool): Minecraft generated on the fly, which you can interact with: https://news.ycombinator.com/item?id=42014650

  • Thank you for this experience. Feels like you are exploring a dream.

    I LOVE dreamy AI content. That stuff where everything turned into dogs for example.

    As AI is maturing, we are slowly losing that in favor of boring realism and coherence.

  • I found an interesting glitch where you could never actually reach a parked car: as you moved forward, the car moved too. It looked a lot like traffic moving through Google Street View.

  • Hi HN, I hope you enjoy our research preview of interactive video!

    We think it's a glimpse of a totally new medium of entertainment, where models imagine compelling experiences in real-time and stream them to any screen.

    Once you've taken the research preview for a whirl, you can learn a lot more about our technical work behind this here (https://odyssey.world/introducing-interactive-video).

  • Can you say where the underground cellar with the red painting is? It's compelling.

  • This is amazing! I think AI will completely replace the way we currently create and consume media. A well-written story paired with an amazing graphics-generation AI can be both interactive and surprising every time you watch it again.

  • I'm unable to navigate anywhere. I'm on a laptop with a touchscreen and a trackpad. I clicked, double clicked, scrolled, and tried everything I could think of and the views just hovered around the same spot.

  • This is cool. I think there is a good chance that this is the future of videogames.

  • To me, this is evidence we're not in a simulation. Even with a gazillion H100s, the model runs out of memory just (very roughly) simulating a 50'x50' space over a few seconds.

  • Do you personally feel like scaling this approach is going to be the end game for generating navigable worlds?

    I.e., as opposed to first generating a 3D environment and then doing some sort of img2img on top of it?

  • In playing with this, it was unclear to me how it differs from a pre-programmed 3D world with bitmapped walls. What is the AI adding that I wouldn't get otherwise?

  • It’s pointless to do this with real world places. Why not do it for TV shows or a photograph? You could walk around inside and explore the scenes.

  • Would be more interesting with people in it.

  • Feels like the Mist (Myst?) game.

  • Interactive ads and interactive porn are the AI killer apps we miss so much.

  • Am I the only one stuck with a black screen as the audio plays?

  • This reminds me of the scapes in Diaspora (by Greg Egan).

  • very cool - what was the hardest part of building this?

  • Seems like it ingested Google Street View.

  • I do not get the "interactive" part. I expect to be able to manipulate objects or at least move them, you know, "interact" with the "video". Right now it is a cheap walking simulator, without narration or any plot. Lamp posts disappearing when you get near them should not be considered an interaction either. Maybe you should take a slightly different approach to interactive video and, let's say, build a tech review video for some gadget or device, where the viewer could interrupt the host by voice and ask them questions, skip to some part, have something repeated in more detail, get a concept explained, or even compare it to other devices.

  • Exploring the Library of Babel!

  • Love the atmosphere.

  • Now this is an Assassin’s Creed memory machine that I can get behind

  • going outside breaks the ai lol

  • I think this step towards a more immersive virtual reality can actually be dangerous. A lot of intellectual types might disagree, but I do think that creating such immersion is dangerous because it will reduce the value people place on the real world, and especially the natural world, making them even less likely to care if big corporations screw it up with biospheric degradation.

    It seems like it has a high chance of leading to even more narcissism as well because we are reducing our dependence on others to such a degree that we will care about others less and less, which is something that has already started happening with increasingly advanced interactive technology like AI.