The VM vs container debate is fascinating. They are separate yet slowly merging concepts that are becoming more blurred as technology becomes cheaper and faster. If the real bottleneck to scale is adaptable code, then it is foolish to dismiss the VM as outdated tech when it can be completely rehomed in 2 seconds. That megabyte of Python code managing your containers would still be busy checking its dependencies in that same timeframe.
Unmentioned: there are serious security issues with memory cloning code not designed for it.
For example, an SSL library might have pre-calculated the random nonce for the next incoming SSL connection.
If you clone the VM containing a process using that library, both child VMs will now use the same nonce. Some cryptographic schemes are completely broken if a nonce is reused.
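To make the nonce reuse problem concrete, here is a minimal sketch. It assumes a toy stream cipher built from SHA-256 (not a real SSL library, and not anything from the article), and the names `keystream`, `encrypt` and `precomputed_nonce` are made up for illustration. Two "clones" that resume from the same memory image encrypt different messages with the same key and nonce, and an eavesdropper without the key can then XOR the two ciphertexts to cancel the keystream.

```python
import hashlib
import os


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Stream cipher encryption: plaintext XOR keystream. Decryption is the same call."""
    return xor_bytes(plaintext, keystream(key, nonce, len(plaintext)))


# State that already sat in process memory when the snapshot was taken.
key = os.urandom(32)
precomputed_nonce = os.urandom(12)

# Both clones resume from the same memory image, so both use the same nonce.
msg_a = b"clone A: transfer 100 to alice"
msg_b = b"clone B: transfer 999 to mallo"
ct_a = encrypt(key, precomputed_nonce, msg_a)
ct_b = encrypt(key, precomputed_nonce, msg_b)

# Without the key, XORing the ciphertexts yields the XOR of the plaintexts.
leak = xor_bytes(ct_a, ct_b)
assert leak == xor_bytes(msg_a, msg_b)

# Knowing (or guessing) one plaintext then reveals the other.
recovered_b = xor_bytes(leak, msg_a)
assert recovered_b == msg_b
```

Real protocols differ in how badly they fail, but the cancellation shown here is the basic reason a repeated nonce is dangerous for stream-style encryption.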
I'm starting to see a pattern here. This describes a technology that rapidly deploys "VM" instances in the cloud which support things like Lambda and single-process containers. At what point do we scale this all back to a more rudimentary OS that provides security and process management across multiple physical machines? Or is there already a Linux distro that does this?
I ask because watching cloud providers like AWS slowly reinvent mainframes just seems like the painful way around.
In the Minecraft example video, we are shown a person quitting the server, after which the server stops and restarts (that 2 second clone of the VM).
But what if I have a service like, let's say, a normal Minecraft server such as Hypixel? They can't afford a 2 second delay. Maybe we would have to use proxies in that case.
I am genuinely interested by this tech.
Currently, I am much in favour of tinykvm and its snapshotting because it's even lighter than Firecracker (I think). I really like the dev behind tinykvm as well.
> How to handle network and IP duplicates on cloned VMs
That is indeed what I would love to read the most! Because no matter what you do, it gets complex. If you tear down the network stack of the "old" VM, applications (like Minecraft) might head into unstable territory when the listener socket disappears, and the "new" VM has to go through the entire DHCP flow, which may easily take a second or more. And if you just do the equivalent of S3 sleep (suspend to RAM), the first "new" VM will have everything working as expected, but any further VM spawned from the template will run into duplicate IP/MAC address usage.
Interesting read—thanks! One question: in the CoW example, if VM A modifies the data post-fork, what does VM B see when it later copies that data? Does it get the original data from the time of the fork, or VM A’s modified version?
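For what it's worth, here is how plain copy-on-write behaves at the OS level, as a minimal sketch. It assumes two private copy-on-write mappings of one immutable snapshot file, using Python's `mmap` with `ACCESS_COPY`. Whether the article's parent/child setup behaves the same way is not something this confirms, and `vm_a` / `vm_b` are just labels for the two mappings.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE

with tempfile.TemporaryFile() as snapshot:
    # The "memory snapshot" taken at fork time: one page filled with 'o' bytes.
    snapshot.write(b"o" * PAGE)
    snapshot.flush()

    # Two private copy-on-write mappings of the same snapshot.
    vm_a = mmap.mmap(snapshot.fileno(), PAGE, access=mmap.ACCESS_COPY)
    vm_b = mmap.mmap(snapshot.fileno(), PAGE, access=mmap.ACCESS_COPY)

    # "VM A" writes after the fork; the write lands in A's private copy of the page.
    vm_a[0:8] = b"modified"

    # "VM B" reads the same offset later and still gets the fork-time data.
    a_view = bytes(vm_a[0:8])
    b_view = bytes(vm_b[0:8])

    vm_a.close()
    vm_b.close()

assert a_view == b"modified"
assert b_view == b"oooooooo"
```

In this model the writer gets its own copy of the touched page, and everyone else keeps seeing the data as it was when the snapshot was taken.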
Related:
We clone a running VM in 2 seconds - https://news.ycombinator.com/item?id=38651805 - Dec 2023 (10 comments)
They tried running Minecraft, but I wonder if a similar (or better) cloning is possible for a mission-critical workload, like a database consuming a huge amount of memory. Neon uses QEMU to achieve this, for example: https://neon.tech/docs/reference/glossary#live-migration but is that the only way?
This is an increasingly important area with LLM-generated code, and I am curious about people's experiences with codesandbox vs e2b vs daytona.
Different proposal:
Let's say we have 2 Linux machines. Identical hardware, identical libs.
I'd like to run a simple program on one machine, and then during mid-calculation, would like to transfer the running program to the other machine.
Is this doable?
Cool article! The stack (and results) are impressive, but I also appreciate the article in itself, starting from basics and getting to the point in a clear and slowly expanding way. Easy to follow and appreciate.
On a bit of a tangent rant, this kind of writing is slowly going away, taken over by LLM slop (and I'm a huge fan of LLMs, just not the people who write those kinds of articles). I was recently looking for real world benchmarks for vllm/sglang deployments of DeepSeek3 on an 8x 96GB pod, to see if the model fits into the amount of RAM, with kv cache and context length, what numbers people get, etc.
Of the ~20 articles that Google surfaced for various keyword attempts, none were what I was looking for. The excerpts seemed promising, some even offered tables & stuff related to ds3 and RAM usage, but all were LLM crap. All were written in that simple style - intro - bla bla - conclusion, and some even had RAM requirements that made no sense (running a model trained in FP8 in 16bit, something no one would do, etc.)
Has anybody tried running ollama and Open WebUI in firecracker instead of full VMs? I assume this should work, but not sure about GPU (single and multi) passthrough.
> "Virtual machines are often seen as slow, expensive, bloated and outdated. "
By who, exactly? Citations needed
What problem is this supposed to solve?
Needs [2022] in the title
(2022)
[2022]
>Virtual machines are often seen as slow, expensive, bloated and outdated.
by whom?
I tend to loathe Firecracker posts because they're all just thinly veiled ads for Amazon services.
Firecracker is not included in the standard Linux KVM/QEMU duo and has sparse documentation. You cannot deploy a Firecracker image like a traditional VM. In fact there are no tools to assist in creating a Firecracker VM, and the filesystem for the VM must be ext4.
TL;DR: this is all fun stuff if you're 200% cloud, but most companies run a ton of on-prem VMs as well.
Oh wow! Unexpected and cool to see this post on Hacker News! Since then we have evolved our VM infra a bit, and I've written two more posts about this.
First, we started cloning VMs using userfaultfd, which allows us to bypass the disk and let children read memory directly from parent VMs [1].
And we also moved to saving memory snapshots in compressed form. To keep VM boots fast, we need to decompress on the fly as VMs read from the snapshot, so we chunk up snapshots into 4 KB to 8 KB pieces that are zstd compressed [2].
Happy to answer any questions here!
[1]: https://codesandbox.io/blog/cloning-microvms-using-userfault...
[2]: https://codesandbox.io/blog/how-we-scale-our-microvm-infrast...
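For anyone wondering why chunking matters for on-the-fly decompression, here is a rough sketch of the idea. It is not the actual CodeSandbox implementation: it uses `zlib` from the Python standard library as a stand-in for zstd, and `compress_snapshot`, `read_range` and the fixed 4 KB chunk size are assumptions made for illustration. Compressing each chunk independently means a read only has to decompress the chunks it touches instead of the whole snapshot.

```python
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration


def compress_snapshot(memory: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Compress each fixed-size chunk independently so any chunk can be decoded alone."""
    return [
        zlib.compress(memory[offset:offset + chunk_size])
        for offset in range(0, len(memory), chunk_size)
    ]


def read_range(chunks: list[bytes], offset: int, length: int,
               chunk_size: int = CHUNK_SIZE) -> bytes:
    """Return memory[offset:offset + length], decompressing only the chunks it touches."""
    if length <= 0:
        return b""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    data = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return data[start:start + length]


# Fake 64 KiB of "guest memory" with enough structure to compress well.
memory = b"".join(i.to_bytes(4, "little") * 256 for i in range(64))
chunks = compress_snapshot(memory)

# A read that crosses a chunk boundary only decompresses two chunks.
assert read_range(chunks, 4090, 12) == memory[4090:4102]
assert len(chunks) == len(memory) // CHUNK_SIZE
```

Smaller chunks reduce the work per page fault but usually compress worse, which is presumably the trade-off behind picking a chunk size of a few kilobytes.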