DNS servers monitoring

A few months ago, I needed to assess the reliability of some internal DNS provider’s servers after a series of random, hard-to-track network issues, aka “It’s always DNS”. More specifically, I needed to know about the following:

- number of errors/timeouts
- capability to query over TCP or UDP
- capability to monitor multiple DNS servers at once
- return codes received in the answer (e.g. NOERROR, SERVFAIL, NXDOMAIN, you name it)

...
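To make the requirements above more concrete, here is a minimal sketch of such a check. It assumes `dig` is installed; the tool used in the full post may well be different, and the function names `dns_rcode` and `check_dns` are made up for this illustration.

```shell
# Extract the return code (NOERROR, SERVFAIL, NXDOMAIN, ...) from dig output
# read on stdin. dig prints it in its header line, for example:
#   ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 4242
dns_rcode() {
  sed -n 's/.*status: \([A-Z0-9]*\),.*/\1/p' | head -n 1
}

# Query one server over UDP (default) or TCP and print "server proto rcode".
# If no header comes back (timeout, unreachable server), report TIMEOUT.
check_dns() {
  local server=$1 name=$2 proto=${3:-udp} flag="+notcp" rcode
  [ "$proto" = "tcp" ] && flag="+tcp"
  rcode=$(dig "$flag" +time=2 +tries=1 "@$server" "$name" 2>/dev/null | dns_rcode)
  echo "$server $proto ${rcode:-TIMEOUT}"
}
```

Calling `check_dns <server> <name> tcp` in a loop over several servers, and counting the `TIMEOUT` lines, would cover the four points of the list in a crude way.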

31 July 2023 · 3 min · 612 words · Clément Nussbaumer

Minimal downtime when rebooting etcd nodes

Graceful leader changes

When you need to restart Kubernetes control-plane nodes on which etcd also happens to be running, you will want a graceful transfer of the etcd cluster leadership, to shorten the transition period that comes with a leader election. This can be achieved with the following script, provided you specify the appropriate environment variables in the /etc/profile.d/etcd-all file.

    # If this node is the etcd leader, hand leadership to another member first.
    set -o pipefail && \
    source /etc/profile.d/etcd-all && \
    AM_LEADER=$(etcdctl endpoint status | grep "$(hostname)" | cut -d ',' -f 5 | tr -d ' ') && \
    if [[ $AM_LEADER = "true" ]]
    then
      NEW_LEADER=$(etcdctl endpoint status | grep -v "$(hostname)" | cut -d ',' -f 2 | tr -d ' ' | tail -n '-1') && \
      etcdctl move-leader "$NEW_LEADER" && sleep 15
    fi

Info: the following environment variables need to be set, for example through a file such as: /etc/profile.d/etcd-all ...
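To see what the `grep`/`cut` pipeline in the script extracts, the sketch below runs the same parsing on a made-up sample. It assumes the comma-separated layout that the script's `cut` calls imply (member ID in field 2, "is leader" in field 5) and that each endpoint URL contains the node's host name. The host names, IDs, and the helper names `is_leader` and `pick_new_leader` are hypothetical.

```shell
# Hypothetical sample of `etcdctl endpoint status` output; all values are invented.
sample='https://node-a:2379, 8e9e05c52164694d, 3.5.9, 25 MB, true, false, 12, 3456, 3456,
https://node-b:2379, 91bc3c398fb3c146, 3.5.9, 25 MB, false, false, 12, 3456, 3456,
https://node-c:2379, fd422379fda50e48, 3.5.9, 25 MB, false, false, 12, 3456, 3456,'

# Field 5 of the line matching the given host says whether it is the leader.
is_leader() {
  grep "$1" | cut -d ',' -f 5 | tr -d ' '
}

# Field 2 is the member ID; keep the last member that is not the given host.
pick_new_leader() {
  grep -v "$1" | cut -d ',' -f 2 | tr -d ' ' | tail -n 1
}
```

With this sample, `printf '%s\n' "$sample" | is_leader node-a` prints `true`, and `printf '%s\n' "$sample" | pick_new_leader node-a` prints `fd422379fda50e48`, the ID that would be passed to `etcdctl move-leader`.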

7 July 2023 · 1 min · 154 words · Clément Nussbaumer

Kubernetes CNI — deconstructed

A few months ago, I had to understand in detail how the Container Network Interface (CNI) is implemented to, well, simply get a chaos testing solution working on a bare-metal installation of Kubernetes. At that time, I found a few resources that helped me understand the implementation, mainly Kubernetes’ official documentation on the topic and the official CNI specification. And yes, this specification simply consists of a Markdown document, which took a considerable amount of energy to digest and process. ...

29 March 2021 · 8 min · 1534 words · Clément Nussbaumer