May 3, 2019

Cilium User Survey March 2019 - The Results

Back in March, we asked our users to provide feedback via our first ever user survey. Many of you responded, and the results are in!

The survey was announced on our Slack channel and on Twitter. Participation was anonymous and did not require leaving any contact information. Most questions had a set of predefined answers plus a free-form field for additional answers. All questions were optional.

April 29, 2019

Cilium 1.5: Scaling to 5k nodes and 100k pods, BPF-based SNAT, and Rolling Key Updates for Transparent Encryption

We are excited to announce the Cilium 1.5 release. Cilium 1.5 is the first release primarily focused on scalability with respect to the number of nodes, pods, and services. Our goal was to scale to 5k nodes, 20k pods, and 10k services. We went well past that goal and now officially support 5k nodes, 100k pods, and 20k services. Along the way, we learned a lot, some of it expected, some of it not; this blog post dives into what we learned and how we improved.

Besides scalability, several significant features made their way into the release, including: BPF templating, rolling updates for transparent encryption keys, transparent encryption for direct routing, an improved BPF-based service load balancer with better fairness, BPF-based masquerading/SNAT support, Istio 1.1.3 integration, policy calculation optimizations, and several new Prometheus metrics to assist in operations and monitoring. For the full list of changes, see the 1.5 Release Notes.

March 18, 2019

Deep Dive into Cilium Multi-cluster

This is a deep dive into ClusterMesh, Cilium's multi-cluster implementation. In a nutshell, ClusterMesh provides:

  • Pod IP routing across multiple Kubernetes clusters at native performance via tunneling or direct-routing without requiring any gateways or proxies.

  • Transparent service discovery with standard Kubernetes services and coredns/kube-dns.

  • Network policy enforcement spanning multiple clusters. Policies can be specified as standard Kubernetes NetworkPolicy resources or via the extended CiliumNetworkPolicy CRD (see the sketch following this list).

  • Transparent encryption for all communication between nodes in the local cluster as well as across cluster boundaries.
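
To make the cross-cluster policy bullet concrete, below is a minimal sketch of such a policy applied with the official Kubernetes Python client. The cluster name (cluster1), the app labels, and the namespace are illustrative assumptions rather than values from a real deployment; the manifest structure follows the CiliumNetworkPolicy CRD.

    # Minimal sketch: allow local pods labeled app=backend to receive traffic
    # from app=frontend pods in the remote cluster "cluster1". All names and
    # labels here are hypothetical.
    from kubernetes import client, config

    config.load_kube_config()  # use load_incluster_config() when running in a pod

    cross_cluster_policy = {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": "allow-frontend-from-cluster1"},
        "spec": {
            "endpointSelector": {"matchLabels": {"app": "backend"}},
            "ingress": [{
                "fromEndpoints": [{
                    "matchLabels": {
                        "app": "frontend",
                        # Cilium attaches the source cluster name as a label,
                        # which is what lets a policy select remote-cluster pods.
                        "io.cilium.k8s.policy.cluster": "cluster1",
                    }
                }]
            }],
        },
    }

    client.CustomObjectsApi().create_namespaced_custom_object(
        group="cilium.io",
        version="v2",
        namespace="default",
        plural="ciliumnetworkpolicies",
        body=cross_cluster_policy,
    )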

February 12, 2019

Cilium 1.4: Multi-Cluster Service Routing, DNS Authorization, IPVLAN support, Transparent Encryption, Flannel Integration, Benchmarking other CNIs, ...

We are excited to announce the Cilium 1.4 release. The release introduces several new features as well as optimization and scalability work. The highlights include the addition of global services to provide Kubernetes service routing across multiple clusters, DNS request/response-aware authorization and visibility, transparent encryption (beta), IPVLAN support for better performance and latency (beta), integration with Flannel, GKE on COS support, and AWS metadata-based policy enforcement (alpha), as well as significant work to optimize memory and CPU usage.
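
As a rough illustration of the global services mentioned above: a standard Kubernetes Service is shared across connected clusters by annotating it with io.cilium/global-service. The sketch below applies that annotation with the Kubernetes Python client; the Service name and namespace are made up for the example.

    # Minimal sketch: mark an existing Service as "global" so Cilium load-balances
    # to backends of the same Service in all connected clusters.
    # The Service name and namespace below are hypothetical.
    from kubernetes import client, config

    config.load_kube_config()

    patch = {
        "metadata": {
            "annotations": {
                # With this annotation, identically named Services in other
                # clusters are treated as backends of this Service.
                "io.cilium/global-service": "true",
            }
        }
    }

    client.CoreV1Api().patch_namespaced_service(
        name="rebel-base", namespace="default", body=patch
    )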

December 10, 2018

Cilium 1.4 Preview: Multi-Cluster Service Routing, DNS Authorization, and Transparent Encryption

As we all enjoy a wonderful week at KubeCon 2018 US, we want to provide a preview of the upcoming Cilium 1.4 release. We are days away from 1.4.0-rc1, which will open up a lot of exciting new functionality for community testing. Some of the highlights:

  • Multi-Cluster service routing using standard Kubernetes services.
  • DNS Authorization with DNS request/response-aware security policy enforcement to restrict the DNS names a pod can look up, as well as to limit that pod's egress connectivity to the IPs returned in the DNS responses it received (a minimal sketch follows after this list).
  • Transparent encryption and authentication for all service-to-service communication using X.509 certificates.
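
To sketch what the DNS Authorization bullet can look like in practice, here is a minimal DNS-aware egress policy applied with the Kubernetes Python client. The pod label (app=crawler) and the allowed FQDN (api.example.com) are illustrative assumptions; the toFQDNs and dns rule fields follow the CiliumNetworkPolicy CRD.

    # Minimal sketch: restrict which DNS names a pod may look up and limit its
    # egress to the IPs returned for an allowed FQDN. Labels and the FQDN are
    # hypothetical.
    from kubernetes import client, config

    config.load_kube_config()

    dns_policy = {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": "fqdn-egress-only"},
        "spec": {
            "endpointSelector": {"matchLabels": {"app": "crawler"}},
            "egress": [
                {
                    # Allow DNS lookups via kube-dns, but only for the listed name.
                    "toEndpoints": [{
                        "matchLabels": {
                            "k8s:io.kubernetes.pod.namespace": "kube-system",
                            "k8s-app": "kube-dns",
                        }
                    }],
                    "toPorts": [{
                        "ports": [{"port": "53", "protocol": "ANY"}],
                        "rules": {"dns": [{"matchName": "api.example.com"}]},
                    }],
                },
                {
                    # Allow connections only to IPs returned for this FQDN.
                    "toFQDNs": [{"matchName": "api.example.com"}],
                },
            ],
        },
    }

    client.CustomObjectsApi().create_namespaced_custom_object(
        group="cilium.io",
        version="v2",
        namespace="default",
        plural="ciliumnetworkpolicies",
        body=dns_policy,
    )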

As always, we love hearing from you, so stop by our KubeCon booth and chat with us and other Cilium users.

December 3, 2018

Analyzing the CNI performance benchmark

First of all, huge shout-out to Alexis Ducastel for putting together a great CNI benchmark comparison. To be honest, there was definitely a moment of panic when we saw the article pop up. Did we just miss a major performance regression?

This blog post documents our investigation so far into what looked like a performance regression of HTTP/FTP traffic over pure TCP.

Alexis was super quick to share the scripts he used to collect the benchmark numbers. This not only allowed for quick verification but also lets us integrate them into our CI tests and run them alongside the existing benchmarks for better coverage.

November 20, 2018

Deep Dive into Facebook's BPF edge firewall

We covered Facebook's BPF-based load balancer with DDoS protection in a previous blog post: Why is the kernel community replacing iptables with BPF?. This post digs further into Facebook's use of BPF by covering Anant Deepak's talk at the BPF/networking microconference about Facebook's BPF-based edge firewall running in production.

The same conference also featured many other BPF-related talks, which we will cover in follow-up blog posts. Particularly interesting are Nikita V. Shirokov's (Facebook) talk XDP: 1.5 years in production. Evolution and lessons learned, where Nikita shows the impressive difference between IPVS and BPF under heavy load, and Vlad Dumitrescu's (Google) talk Scaling Linux Traffic Shaping with BPF, where Vlad and others share their experience deploying BPF to production to make traffic shaping scale.