Ahhh, Sloot's digital coding system [1] is finally here ;).
[1] https://en.m.wikipedia.org/wiki/Sloot_Digital_Coding_System
How fast is this and how big is the decoder/encoder? The model weights are not accessible.
From the description, it looks like it's only being tested with 128x128 frames, which implies that the speed is very low.
> It can be observed that our model outperforms them at low bitrates
It can? Maybe I'm misunderstanding the graphs, but it doesn't look like it to me.
Back in 2005, a colleague at my first job was writing video format conversion software. He was considered a genius and the stereotype of an introverted software developer. He claimed that one day an entire movie could be compressed onto a single floppy disk. Everybody laughed and thought he was weird. He might be right after all.
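For a sense of scale, here is a rough back-of-the-envelope sketch of what "a movie on a floppy" would demand. The 90-minute runtime and 24 fps are my own assumptions, not figures from the anecdote, and the capacity is the usual 1440 KiB of a "1.44 MB" disk.

```python
# Assumption: a standard "1.44 MB" floppy, i.e. 1440 KiB.
FLOPPY_BYTES = 1440 * 1024


def floppy_bitrate(duration_s, capacity_bytes=FLOPPY_BYTES):
    """Average bits per second available if the whole video must fit on the disk."""
    return capacity_bytes * 8 / duration_s


def bytes_per_frame(duration_s, fps, capacity_bytes=FLOPPY_BYTES):
    """Average byte budget per frame at a given frame rate."""
    return capacity_bytes / (duration_s * fps)


# Hypothetical movie: 90 minutes at 24 fps.
duration = 90 * 60
print(f"{floppy_bitrate(duration):.0f} bit/s")
print(f"{bytes_per_frame(duration, 24):.1f} bytes per frame")
```

Under those assumptions the budget works out to roughly 2 kbit/s, or about 11 bytes per frame, audio included.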
Here's the research behind this: https://arxiv.org/html/2402.08934v1
As a casual non-scholar, non-AI person trying to parse this though, it's infuriatingly convoluted. I was expecting a table of "given source file X, we got file size Y with quality loss Z", but while quality (SSIM/LPIPS) is compared to standard codecs like H.264, for the life of me I can't find any measure of how efficient the compression is here.
Applying AI to image compression has been tried before, though, with distinctly mediocre results. Some may recall the Xerox debacle about 10 years ago, when it turned out copiers were helpfully "optimizing" images by replacing digits with other digits in invoices, architectural drawings, etc.
https://www.theverge.com/2013/8/6/4594482/xerox-copiers-rand...
It’s uncanny how much of the current stuff was predicted by the sitcom “Silicon Valley”.
It's important to remember that any compression gains must include the size of the decompressor which, I assume, will include an enormous diffusion model.
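A minimal sketch of that accounting, with made-up numbers: the decoder is counted once and amortized over however many videos share it. The function name and every size below are hypothetical, not measurements of this model.

```python
def effective_ratio(raw_bytes, compressed_bytes, decoder_bytes, n_videos=1):
    """Compression ratio when the decoder ships once alongside n_videos files.

    All sizes are per-video except decoder_bytes, which is paid a single time.
    """
    total_raw = raw_bytes * n_videos
    total_sent = compressed_bytes * n_videos + decoder_bytes
    return total_raw / total_sent


# Hypothetical sizes: 1 GB raw video, 1 MB compressed, 4 GB of model weights.
RAW, COMPRESSED, DECODER = 10**9, 10**6, 4 * 10**9

for n in (1, 100, 10_000):
    print(n, round(effective_ratio(RAW, COMPRESSED, DECODER, n), 2))
```

With these illustrative sizes, a single video comes out larger than the original once the weights are counted, and the ratio only becomes attractive when the same decoder serves thousands of videos.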
Does anyone remember the Sloot Digital Coding System? https://en.wikipedia.org/wiki/Sloot_Digital_Coding_System
Can you share example videos?
> Extreme video compression with prediction using pre-trained diffusion models
Is this more extreme than YouTube?
I wonder how a speed-focused variant would compare in quality against H.264, H.265, and AV1.
Middle-out.
Extreme compression will be when you put in a movie and get back a Sora prompt that regenerates something close enough to the movie.