
Switching a Kubespray cluster’s CNI to Cilium at Dectris

31. May 2023

In this blog post, we’ll focus on CNI, which stands for Container Network Interface. It is the standard interface that connects container runtimes to a variety of networking implementations, each with its own features and capabilities. A basic CNI implementation attaches newly created pods to the Kubernetes network and releases their resources on deletion. This is the baseline. On top of that, many CNI implementations add features like encryption, load balancing, or traffic monitoring, to name only a few.

Flannel is one of the many CNIs available today. It comes as a small, single-binary agent running on each node in the cluster. Its main responsibility is allocating a subnet lease to each host out of a pre-configured address space. Flannel provides IP addresses for each host in the cluster, but it does not control the networking flow between containers and hosts; it only focuses on transporting data between hosts in the cluster. If you need to control traffic flow between pods, you can extend Flannel’s functionality with Calico network policies.

Cilium, on the other hand, is a newer player in this game, and we are a proud partner of Isovalent, the company behind Cilium. According to the project’s official website:

Cilium is an open source, cloud native solution for providing, securing, and observing network connectivity between workloads, fueled by the revolutionary Kernel technology eBPF

At the foundation of Cilium is a Linux kernel technology called eBPF, which enables the dynamic insertion of powerful security, visibility, and networking control logic into the Linux kernel. eBPF provides high-performance networking, multi-cluster and multi-cloud capabilities, advanced load balancing, transparent encryption, and much more.

Let’s say that the above Cilium description sounds appealing to you and you would like to start using it in your organization. What should you do? Should you deploy a new cluster or update an existing one to start using it? In this article, I would like to focus on the latter option: how can we switch CNIs on an already existing cluster? How much work does it require, and what are the options? And finally, how did we achieve it?

And we did just that: we migrated a production-grade cluster for one of our highly valued customers – Dectris Ltd – and we will tell you how we implemented the migration.

So, what are the options?

Performing drastic changes to infrastructure is not new in the IT world, and some questions will always come up when planning them – how long will it take? Will there be any downtime?

Those are reasonable concerns. Most business applications out there are actually doing something worthwhile, and we cannot just take them down for a while. So, the first question is: will there be any downtime?

And that gives us two main paths to follow – online and offline migration. One with very short or no downtime, the other with some downtime. To evaluate which option is better, we defined a set of criteria to measure both methods against. In our case, those were:

  • Overall complexity – if the method is overcomplicated, maybe that is a sign to avoid it.
  • Time – how long will business applications be unavailable? How long does the procedure to switch CNIs take?
  • Possibility for automation – this is stressful, open-heart surgery, and we are only human after all. We make mistakes, especially under pressure. Maybe we can lower the risk of “human errors” by automating some tasks.

Later in this article, I will describe each method in more detail. If you are reading this article just to find out which option we chose – it was the offline one. Why? Let’s first go through both options and see the pros and cons of each.

We set up an additional cluster with Kubespray to test both methods, so we would not break anything in production while experimenting.

While exploring online options, we found an article by Josh Van Leeuwen from Jetstack. It is a great source of knowledge that we relied on while planning the online migration.

Offline migration

Let’s start with the offline migration since, as we’ll see later, online migration is essentially offline migration with extra steps.

After installing a new CNI on a running cluster, usually only newly deployed pods will join the network managed by the new CNI. Those pods will receive addresses from a different CIDR and will use different encapsulation protocols and network policies.

With that in mind, we installed Cilium on a testing cluster and observed what happened. The recommended way to install Cilium is with Helm, and I don’t see any reason to do otherwise. Since our test and production clusters were managed with Kubespray, we could have used Kubespray to install Cilium for us. However, we faced multiple issues when setting it up that way.

One of our requirements was to enable IPsec encryption between cluster nodes. When installing Cilium with Kubespray 2.20.0, we noticed a bug in the generated ConfigMap that was making Cilium unstable. Luckily, it was solved in the next version, 2.21.0. But then we were not able to access the Hubble UI; apparently, there was some issue with TLS. Eventually, we decided to set a generic CNI in the Kubespray configuration and install Cilium with the official Helm chart.
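
For reference, here is a rough sketch of how such an installation could look with the official chart, with IPsec enabled through the chart’s encryption values (the key-generation command follows the Cilium documentation; the exact values are illustrative, not our production configuration):

# Illustrative sketch – not our exact production values.
# Create the IPsec key secret that the Cilium agents expect.
kubectl create -n kube-system secret generic cilium-ipsec-keys \
    --from-literal=keys="3 rfc4106(gcm(aes)) $(dd if=/dev/urandom count=20 bs=1 2>/dev/null | xxd -p -c 64) 128"

# Install Cilium from the official Helm chart with IPsec encryption enabled.
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium \
    --version 1.12.6 \
    --namespace kube-system \
    --set encryption.enabled=true \
    --set encryption.type=ipsec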

Installing it with the Helm chart also gives us another advantage: we can use the latest version of Cilium before it gets adopted in the Kubespray playbooks.

When installing its agents, Cilium updates the /etc/cni/net.d directory, making itself the default CNI. So technically, all new pods should use it after a node reboot. But before rebooting, we wanted to clean up some leftovers – the Flannel configuration, runtime files, and network interface, to be precise – which can be done with a simple Ansible playbook.

---
- name: Clean and reboot node
  gather_facts: false
  hosts: all
  become: true
  tasks:
  - name: Install missing dependencies from apt
    ansible.builtin.apt:
      name: net-tools
      update_cache: yes

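  # Cilium renames the previous default CNI config to *.cilium_bak when it takes over; remove the leftover backup.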
  - name: Remove CNI configuration
    ansible.builtin.file:
      path: /etc/cni/net.d/10-flannel.conflist.cilium_bak
      state: absent

  - name: Remove Flannel runtime files
    ansible.builtin.file:
      path: /run/flannel/
      state: absent

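  # flannel.1 is the VXLAN interface created by Flannel; bring it down and delete it.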
  - name: Remove flannel.1 network interface 
    ansible.builtin.shell: ifconfig flannel.1 down && ip link delete flannel.1
    ignore_errors: true

  - name: Reboot Node 
    ansible.builtin.reboot:

Updating nodes was an easy task in our case. As soon as the nodes started coming back online, connectivity inside the cluster was restored – this time, with the Cilium CNI in place.
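
To confirm that the agents were healthy after the reboots, a quick check along these lines helps (a sketch using standard kubectl and Cilium agent commands, not our exact runbook):

# Sketch: verify that all Cilium agents are running and report a healthy status.
kubectl -n kube-system get pods -l k8s-app=cilium -o wide
kubectl -n kube-system exec ds/cilium -- cilium status --brief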

And that’s it; with a few simple steps, we could switch from Flannel to Cilium in an hour or even less.

Online migration

This approach is all about downtime – or rather, about avoiding it. We aim for a solution where the cluster is always available and we slowly roll out the necessary changes. That might require pods to use both CNIs simultaneously, which is possible with a tool called Multus. It is a CNI plugin that can attach multiple network interfaces to a single pod.
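
Multus itself is typically deployed as a DaemonSet from its upstream manifests, roughly like this (the manifest path in the multus-cni repository may differ between releases, so double-check the Multus documentation for your version):

# Illustrative – verify the manifest URL against the multus-cni release you intend to use.
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/master/deployments/multus-daemonset.yml
kubectl -n kube-system get pods | grep multus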

Installing Multus won’t take effect immediately; we first need to roll out all the pods so they pick up the new configuration. Afterward, we’re ready to install Cilium.
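
Rolling out the pods can be as simple as restarting the workloads namespace by namespace, for example (a sketch, assuming the usual Deployment/DaemonSet/StatefulSet controllers):

# Sketch: restart the workloads in a namespace so their pods are recreated with the new CNI setup.
kubectl -n <namespace> rollout restart deployment
kubectl -n <namespace> rollout restart daemonset
kubectl -n <namespace> rollout restart statefulset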

In the previously described method, this was the moment when the cluster was becoming unstable. We want to avoid that this time and keep the cluster operational throughout the migration.

After a short investigation, it became clear that a DaemonSet is responsible for installing Cilium on each node. How can we tell the Cilium DaemonSet to target only the nodes we want? It’s not a big reveal that we can use taints and labels: we can set the nodeSelector and tolerations properties to achieve exactly what we want.

helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium \
    --version 1.12.6 \
    --namespace kube-system \
    --values cilium.yml

Where the `cilium.yml` file contains the following values:

--- # cilium.yml
nodeSelector:
  cni-migration: cilium-flannel
tolerations:
- key: "cni-migration-node"
  operator: "Equal"
  value: "true"
  effect: "noSchedule"

With the Cilium DaemonSet deployed but not yet able to schedule its pods on any cluster node, we can start rolling out Cilium agents node by node.

kubectl taint nodes <node-name> cni-migration-node=true:NoSchedule
kubectl drain <node-name> --ignore-daemonsets
kubectl label --overwrite node <node-name> cni-migration=cilium-flannel

Finally, we had to update CNI configuration files so Multus knew how to attach both CNIs.
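
Once the Cilium agent is running on the node and the configuration is updated, the node presumably has to be brought back into service before moving on – roughly along these lines (a sketch, not our exact runbook):

# Sketch: return the node to service once the Cilium agent is up on it.
kubectl uncordon <node-name>
kubectl taint nodes <node-name> cni-migration-node=true:NoSchedule-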

This procedure needs to be repeated for each node. After rolling out all the nodes, you will have to perform a similar procedure, but this time to remove Flannel, and then one more time if you want to remove Multus.

If everything went as planned, you can congratulate yourself – only Cilium is now used by all pods in the cluster! Remember to update the Cilium Helm values and clean up the cluster from any leftover labels or taints.
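
That cleanup could look roughly like this (a sketch, assuming the label and taint names used above and a cilium.yml with the nodeSelector and tolerations removed):

# Sketch: drop the migration label and taint from all nodes, then re-apply Cilium values without them.
kubectl label nodes --all cni-migration-
kubectl taint nodes --all cni-migration-node:NoSchedule-
helm upgrade cilium cilium/cilium --namespace kube-system --values cilium.yml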

Migration/Conclusion

Unsurprisingly, the online method sounds just like the offline method but with extra steps. As a result, it takes more time and is way more complex.

In the end, we decided to go with the offline option. Why? Because it’s less complex and requires less time to finish. We don’t have to install Multus only to delete it afterward, so, in the end, we have to perform fewer steps.

So, how did it go?

Everything went surprisingly well. Ultimately, we didn’t face any issues, and the maintenance was over after 1–2 hours.

While preparing for this task, we were not only running multiple tests on a smaller cluster but also documenting our steps in code – or at least trying to document as much as possible. Those Ansible playbooks and Bash scripts, together with step-by-step instructions, gave us a well-defined process that we could blindly follow on migration day and avoid many “human errors” along the way.

Our customer Dectris Ltd approached us regarding the CNI migration towards Cilium. We want to thank Dectris Ltd for trusting us with this project and with running their Kubernetes clusters.

Download the presentation ‘Switch CNI to Cilium – Our journey’, which our colleagues Jan Müller and Norbert Gruszka gave during the Isovalent Workshop 2023 in Zürich.