How Unikraft Cloud reduces serverless cold starts to milliseconds with unikernels and microVMs
In my previous post, 'Why Cloudflare does not use containers in their Workers platform,' I discussed the V8 isolate architecture that lets them achieve sub-millisecond serverless latency while supporting a very large number of tenants who run their workloads independently at the edge without sharing memory or state. If you haven't read it yet, I recommend it.
This is a follow-up to that post. Here I discuss how Unikraft Cloud, a serverless platform, achieves millisecond cold starts and runs a relatively large number of workloads on a single server by leveraging unikernels and microVMs.
After my Cloudflare post, I got a few comments and messages asking why Cloudflare designed isolates instead of using unikernels. In the concluding part of this post, I'll discuss the differences between the two approaches and why isolates are a better fit for Cloudflare than unikernels.
With that being said, let's get on with it.
What are unikernels?
Unikernels are single-purpose OS images in which only the necessary OS features are bundled with the application code to form a minimal, lightweight, highly optimized runtime image.
Unikernels contain only the OS features and libraries that the application code needs to run. They are compiled into a single binary that can run either directly on bare metal or on a hypervisor, without an underlying general-purpose OS.
Compared to traditional VMs or containers, which typically rely on a general-purpose OS, unikernels are more lightweight, with a stripped-down version of the OS designed to run a specific application or service. Getting rid of all the unnecessary features cuts down on memory usage, boot time and CPU cycles, which improves performance significantly.
How are the unnecessary OS components stripped out?
This is achieved through an architectural approach called the library operating system (libOS), which packages OS functionality such as networking, file I/O and memory management as modular libraries.
With this, we can selectively pick OS features that our application code requires and package them together to create single binaries, cutting overhead and allowing us to build minimal, application-specific runtimes without the bloat.
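To make the selection step concrete, here is a minimal Python sketch of picking only the components an application needs, along with their transitive dependencies. The catalog and component names are hypothetical and made up for illustration; they are not taken from Unikraft or any other unikernel project, and a real build system does far more than this.

```python
# Hypothetical catalog of library-OS components and their dependencies.
# All names here are illustrative assumptions, not a real project's modules.
CATALOG = {
    "http_server": ["tcp_stack", "allocator"],
    "tcp_stack": ["net_driver", "allocator"],
    "net_driver": [],
    "allocator": [],
    "filesystem": ["block_driver", "allocator"],
    "block_driver": [],
    "scheduler": ["allocator"],
}


def select_components(required, catalog=CATALOG):
    """Return the components needed by `required`, including transitive dependencies."""
    selected = set()
    stack = list(required)
    while stack:
        name = stack.pop()
        if name in selected:
            continue
        if name not in catalog:
            raise KeyError(f"unknown component: {name}")
        selected.add(name)
        # Queue this component's own dependencies for inspection.
        stack.extend(catalog[name])
    return selected


# An application that only serves HTTP pulls in the network path and an allocator.
image = select_components(["http_server"])
print("included:", sorted(image))
print("left out:", sorted(set(CATALOG) - image))
```

In this toy catalog, the filesystem, block driver and scheduler never make it into the image because nothing the application asked for depends on them.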
Unikernels are single-process systems
Unikernels are typically single-process systems. This means they are designed to run a single application or a service as a single process with a minimal OS layer. This obviates the need for traditional multi-process environments that enable multiple applications or services to run concurrently. So, no context switching between processes.
Furthermore, there is typically no separation between user space and kernel space in a unikernel. This simplifies memory management and improves performance by removing the switches between the two spaces.
Running unikernels on microVMs
Once the unikernel image is ready, it could run directly on bare metal or a VM. However, it is more commonly run on a microVM leveraging a lightweight virtualization technology like Firecracker.
A microVM is a lightweight, stripped-down virtual machine designed to contain only the essential elements of virtualization, cutting down the overhead that comes with traditional VMs.
These are optimized for environments where minimal startup time and reduced resource usage are required, like in serverless computing.
Firecracker, developed by AWS, is a minimalist VM monitor that enables us to deploy workloads in microVMs. It's used by AWS Lambda and Fargate to run isolated workloads with minimal resource overhead.
Running unikernels on microVMs combines the benefits of both, making the setup well suited to serverless and edge use cases. A large number of multi-tenant workloads can run on the same physical server, with strong isolation because each unikernel instance runs in its own microVM.
Since unikernels and microVMs are lightweight, modern physical servers can run thousands of microVMs in parallel, managed by a VM monitor like Firecracker with minimal overhead.
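A rough back-of-the-envelope calculation shows why a small memory footprint translates into density. The Python sketch below is bounded by memory only and ignores CPU, network and storage limits. Every figure in it is an assumption I picked for illustration, not a measurement from Unikraft Cloud or Firecracker.

```python
def max_instances(host_memory_mib, reserved_mib, per_instance_mib):
    """Number of instances that fit in the memory left after the host's reserve."""
    if per_instance_mib <= 0:
        raise ValueError("per_instance_mib must be positive")
    usable = max(host_memory_mib - reserved_mib, 0)
    return usable // per_instance_mib


# Illustrative figures only (assumptions, not measurements):
HOST_MIB = 256 * 1024      # a server with 256 GiB of RAM
RESERVED_MIB = 16 * 1024   # kept back for the host OS and the VM monitor

print(max_instances(HOST_MIB, RESERVED_MIB, 2048))  # 2 GiB per traditional VM -> 120
print(max_instances(HOST_MIB, RESERVED_MIB, 32))    # 32 MiB per unikernel microVM -> 7680
```

With these assumed numbers, shrinking the per-instance footprint from 2 GiB to 32 MiB moves the memory-bound ceiling from about a hundred instances to several thousand.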
Handling a request
When a request arrives, the VM monitor launches a new microVM with the unikernel already loaded, within milliseconds. The serverless cold start time is negligible.
Furthermore, if a unikernel has a lengthy initialization time because of the application's complexity, a snapshot of its 'ready-to-serve' state is taken in memory. For subsequent requests, the unikernel is restored directly from this snapshot, which cuts down on the loading time.
The ready-to-serve state snapshot includes the entire memory state of the application at the point it’s fully prepared to start handling requests.
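The snapshot idea can be sketched as a toy in Python: run the costly initialization once, keep the resulting state, and give every later instance a private copy instead of initializing again. This is only an analogy. A real microVM snapshot captures the guest's memory at the VM level, and the names below (initialize, SnapshotStore, handle) are hypothetical.

```python
import copy


def initialize():
    """Stand-in for a costly start-up; returns the ready-to-serve state."""
    return {"ready": True, "routes": {f"/item/{i}": i * i for i in range(1000)}}


class SnapshotStore:
    """Keeps one ready-to-serve snapshot and hands out private copies of it."""

    def __init__(self):
        self._snapshot = None
        self.cold_inits = 0  # how many times the costly path actually ran

    def start_instance(self):
        if self._snapshot is None:
            # First start: pay the initialization cost once and keep the result.
            self.cold_inits += 1
            self._snapshot = initialize()
        # Every instance gets its own copy, so instances never share state.
        return copy.deepcopy(self._snapshot)


def handle(state, path):
    """Serve a request from an instance's state."""
    return state["routes"].get(path)


store = SnapshotStore()
first = store.start_instance()    # runs initialize()
second = store.start_instance()   # restored from the snapshot
second["routes"]["/item/12"] = -1  # a change in one instance stays local to it

print(handle(first, "/item/12"))  # 144
print(store.cold_inits)           # 1
```

The deep copy stands in for restoring a fresh instance from the snapshot: each instance starts from the same ready state but cannot affect the others.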
Comparing Unikernels with Cloudflare isolates
Unikernels and isolates differ in architecture and use cases. Unikernels achieve isolation by running as separate VMs or on bare metal, relying on hardware-level isolation. In contrast, Cloudflare's isolates provide isolation at the language runtime level within a shared process, offering lighter-weight and faster context switching.
Unikernels, while efficient, are more resource-intensive than isolates.
Furthermore, Cloudflare's isolates primarily support JavaScript and languages that compile to WebAssembly. Unikernels can be built using various programming languages, depending on the unikernel framework used.
Unikernels focus on producing single binaries that run directly on hardware or microVMs, which provides stronger isolation than a shared-process runtime.
The choice between them depends on the application requirements.
If you wish to delve deeper into how Firecracker technology works, do go through this research paper.
Check out Unikraft Cloud as well for more info.
Resources to upskill on cloud computing and systems programming
To learn the fundamentals of cloud computing, including concepts like FaaS, containers, VMs, deployment infrastructure and more, do check out my cloud course. It's a part of the zero to system architecture bundle, which takes you from zero to mastering the fundamentals of system architecture.
Furthermore, to learn to code distributed systems like Git, Redis, Kafka, and more from the bare bones in the programming language of your choice, check out CodeCrafters.
With their interactive, hands-on exercises, you'll develop a good understanding of how these systems work, augmenting your domain knowledge and helping you become a better engineer. If you decide to make a purchase, you can use my unique link to get 40% off (affiliate).
Recommended reads on systems programming
I've implemented a single-threaded and multithreaded TCP/IP server in Java. You can go through the linked posts to understand how servers function.
I am writing a crash course on building a distributed message broker like Kafka from the bare bones. You can read the first post here. The following post on this is lined up and dropping pretty soon.
If you haven't subscribed yet, do subscribe to have my posts slide into your inbox as soon as they are published.
If you found this post insightful, consider sharing it with your friends for more reach. You can find me on LinkedIn & X and can chat with me on Substack chat as well. I'll see you in the next post.
Until then, Bye!