It’s really exciting to see this on the front page. The project was actually started by my colleague Hussein during a SUSE Hackweek. It was initially envisioned as a "Kubernetes version of k3d," but it evolved into something more ambitious and eventually became a real product. We’ve always been big believers in the power of open source. For the current default "shared" mode, we even experimented with Virtual Kubelet, another CNCF project, during development.
I’ll be hanging around the thread today, so if you have any questions about the history, the tech stack, or where we're headed next, feel free to ask!
Since I do have context: the original Rancher Labs CTO created k3s, one of the earliest severely stripped-down distributions of Kubernetes, which bundles all of the required executables into a single multi-call binary so that Kubernetes can run on a Raspberry Pi. Along the lines of kind, k3d was released to run k3s in Docker containers instead of on full Linux hosts. The main use case is testing. We used it extensively in the early days of the Air Force and IC cloud migrations, which insisted that we rehost all systems in Kubernetes, so developers could have local targets to work with. Rancher eventually rebuilt its Kubernetes engine when Docker fell out of favor and based rke2 on k3s, but with the Kubernetes components running as static pods instead of being embedded in the multi-call binary, and with kubelet and containerd extracted from an embedded virtual filesystem onto the host the first time rke2 runs.
When KubeVirt came out, Rancher also released an HCI product built on it, Harvester, running on top of rke2 and Rancher's storage project Longhorn. This runs a full virtual machine manager with virtualized networking and storage, à la ESXi, vSAN, and vSphere, with Multus and the bridge CNI plugin providing the networking (it now has Kube-OVN as well).
Harvester relies on being imported into and managed by Rancher for things like SSO, Rancher's multi-cluster RBAC, and the node provisioners that let Harvester run guest clusters. A whole lot of customers migrating off of VMware since the Broadcom acquisition want all of that, but without necessarily running an external Rancher. Early on, Harvester offered an experimental vCluster addon that created a guest cluster, installed Rancher on it, and had that Rancher automatically manage Harvester.
This had a lot of problems. I'm not going to rehash them because I don't want to come across as bashing vCluster, but it was not a tenable long-term option, and it crashed hard for most who tried to use it. Since Rancher already had k3d, it was a pretty natural step to create its own virtualized Kubernetes that runs in Kubernetes by adapting k3d into k3k, which runs k3s in Kubernetes rather than in Docker. Now you can get a guest cluster to install Rancher onto, with the full suite of Rancher features and a much better experience than the bare Harvester UI, without needing to run full VMs.
Why not just install Rancher directly onto the same rke2 cluster that runs Harvester itself? Because that cluster already has one. It was considered an implementation detail that the developers used to bootstrap Harvester without duplicating work that had already been done, and it was never meant to be exposed to users. If you try to install a second Rancher to actually use, it will conflict with a whole bunch of resources that already exist, and it won't work.
It's a tangled mess of confusing layers, but that's the world we live in. It's why we still have IPv4, VLANs, VXLAN, virtual terminals, and discretionary access control in Linux. We build on top of what is already there instead of rebuilding from scratch in a saner way. This isn't just how software works. It's why city designs rarely make sense. It's why life itself has vestigial anti-features. Cruft rarely disappears. It just gets buried underneath whatever comes next.
The only trade-off is that K3s currently requires privileged mode to operate. We are actively exploring ways to address this limitation and improve security, such as user namespaces or microVMs.
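For anyone wondering what "privileged" means concretely: the container running the nested k3s server carries a privileged security context. Roughly, sketched with the Python kubernetes client (the container name, image tag, and command here are just placeholders for illustration, not the exact values k3k uses):

    from kubernetes import client

    # Illustrative only: a nested k3s server container with the privileged
    # security context that k3s currently needs in order to operate.
    k3s_server = client.V1Container(
        name="k3s-server",                  # placeholder name
        image="rancher/k3s:<version>",      # placeholder image tag
        command=["k3s", "server"],
        security_context=client.V1SecurityContext(
            privileged=True,
        ),
    )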
I understood that, from the host cluster’s perspective, you won’t see the child cluster’s pods. And what about nodes?
Can you have, say, a host cluster running on host nodes, where the host cluster has control over spawning separate physical nodes that contain the child cluster (API server) plus its workload pods?
But I don’t fully understand what you meant by content addressed :)
Maybe one has to ensure in the host cluster that the image pull policy is set to Always, or that all image references use the sha256 digest rather than tags.
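If I'm reading it right, that's what content addressed means here: a tag is a mutable pointer, while a digest names the exact image content. Roughly, with the Python kubernetes client (registry, image name, and digest are made up):

    from kubernetes import client

    # Tag reference: mutable, the registry can serve different content for the
    # same tag over time, so pullPolicy Always is needed to re-resolve it.
    by_tag = client.V1Container(
        name="app",
        image="registry.example.com/myapp:v1.2.3",
        image_pull_policy="Always",
    )

    # Digest reference: content-addressed, always resolves to the same image
    # manifest, so a cached copy can be trusted even with IfNotPresent.
    by_digest = client.V1Container(
        name="app",
        image="registry.example.com/myapp@sha256:<digest>",  # placeholder digest
        image_pull_policy="IfNotPresent",
    )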
However, clusters themselves often become the new pets. Many orgs do not reach a level of operational maturity where they can blow away and recreate whole clusters without downtime and toil.
A meta-pattern has emerged where higher-order tooling manages a whole fleet of clusters. This is an implementation of that meta-pattern which uses K8s itself as the higher-order tool to manage other clusters.
It's not a new idea, just a new implementation of the pattern.
Also it's a fun idea. Sandbox in a sandbox.