Goal: redeploy the Kubernetes master node without redeploying the worker nodes, while keeping the same x509 certificates and all the Kubernetes configuration. No client/consumer action is required; this is solely a server-side operation.
The following has been tested in a single master Kubernetes deployment scenario (deployed with the
If you have tested this in a multi-master node deployment, please share your observations.
With a single master server, the impact should be minimal as long as internal DNS is not actively used, since DNS will be the only unavailable service (along with the Kubernetes API itself, of course).
Backup Kubernetes PKI
The x509 certificates in the /etc/kubernetes/pki directory are created when the Kubernetes cluster is built and are maintained so that they do not expire. These certificates are mainly used for identification and authorization, and also for securing the connections between core services such as the kubelets and etcd.
The Kubernetes API server certificate (signed by the Kubernetes CA) contains the DNS names and IP addresses of the Kubernetes master server(s). The same is true of the etcd server certificate, except that its DNS names and IP addresses are those of the server(s) running the etcd cluster.
ssh k8s-master "sudo tar czf - -C /etc/kubernetes pki" > pki.tar.gz
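Before destroying the master, it may be worth confirming that the archive actually holds the CA material. The following is a minimal sketch, not part of the original procedure: the function name verify_pki_backup is hypothetical, and it assumes the kubeadm default file names ca.crt and ca.key inside the pki directory.

```shell
# Hypothetical helper: check that a PKI backup archive contains the
# CA certificate and key (kubeadm default file names are assumed).
verify_pki_backup() {
    archive=$1
    # List the archive contents; fail if the archive is unreadable.
    listing=$(tar tzf "$archive") || return 1
    for f in pki/ca.crt pki/ca.key; do
        if ! printf '%s\n' "$listing" | grep -Fxq "$f"; then
            echo "missing from $archive: $f" >&2
            return 1
        fi
    done
    echo "$archive looks complete"
}

# Usage:
#   verify_pki_backup pki.tar.gz
```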
etcd stores all Kubernetes configuration and objects such as PodSecurityPolicy, Deployments, Pods, ServiceAccounts, Secrets, ... pretty much everything.
Make sure you have saved snapshot.db somewhere safe, as you will need it to restore your Kubernetes cluster.
curl -L https://github.com/coreos/etcd/releases/download/v3.1.11/etcd-v3.1.11-linux-amd64.tar.gz -o /tmp/etcd-v3.1.11-linux-amd64.tar.gz
tar xvf /tmp/etcd-v3.1.11-linux-amd64.tar.gz 'etcd-v3.1.11-linux-amd64/etcdctl' --strip-components=1
mv etcdctl /usr/bin/
ETCDCTL_API=3 etcdctl --endpoints http://127.0.0.1:2379 snapshot save /tmp/snapshot.db
exit
scp k8s-master:/tmp/snapshot.db .
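Since the whole restore depends on this one file, one option is to record a checksum right after copying the snapshot and to re-check it before restoring. This is only a sketch under assumptions: both function names are hypothetical, and it relies on sha256sum from GNU coreutils being available on the machine that holds the backup.

```shell
# Hypothetical helper: refuse empty snapshots and store a SHA-256
# checksum next to the snapshot file.
record_snapshot_checksum() {
    snapshot=$1
    if [ ! -s "$snapshot" ]; then
        echo "snapshot is missing or empty: $snapshot" >&2
        return 1
    fi
    sha256sum "$snapshot" > "$snapshot.sha256"
}

# Hypothetical helper: succeed only if the snapshot still matches
# the checksum recorded earlier.
verify_snapshot_checksum() {
    sha256sum -c "$1.sha256" >/dev/null 2>&1
}

# Usage:
#   record_snapshot_checksum snapshot.db   # right after the scp
#   verify_snapshot_checksum snapshot.db   # before restoring
```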
Now that you have backups of PKI & etcd, you can destroy and redeploy the
Restore Kubernetes PKI
Right before you install Kubernetes on your new k8s-master node, restore the previously backed-up x509 certificates:
ssh-keygen -R k8s-master
ssh k8s-master "sudo mkdir /etc/kubernetes"
cat pki.tar.gz | ssh k8s-master "sudo tar xvzf - -C /etc/kubernetes"
If you are using kubeadm to deploy Kubernetes, restore etcd right after kubeadm init has completed; otherwise kubeadm init will abort, complaining that the /var/lib/etcd directory is not empty.
ssh-keygen -R k8s-master
scp snapshot.db k8s-master:/tmp/
ssh k8s-master
sudo -i
curl -L https://github.com/coreos/etcd/releases/download/v3.1.11/etcd-v3.1.11-linux-amd64.tar.gz -o /tmp/etcd-v3.1.11-linux-amd64.tar.gz
tar xvf /tmp/etcd-v3.1.11-linux-amd64.tar.gz 'etcd-v3.1.11-linux-amd64/etcdctl' --strip-components=1
mv etcdctl /usr/bin/
docker stop k8s_etcd_etcd-k8s-master_kube-system_d0ddbed1539cb679a70d43b61fa403c5_0
rm -rf /var/lib/etcd
ETCDCTL_API=3 etcdctl snapshot restore /tmp/snapshot.db \
  --name k8s-master \
  --initial-cluster k8s-master=http://127.0.0.1:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls http://127.0.0.1:2380 \
  --data-dir=/var/lib/etcd
systemctl restart kubelet
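The container name passed to docker stop above embeds a pod hash, so it will most likely differ on your deployment. A possible way to look it up instead of typing it by hand is sketched below. The function names are hypothetical, and the k8s_etcd name prefix is an assumption taken from the container name shown above.

```shell
# Hypothetical helper: print the name of the running etcd container.
# Assumes the kubelet names it with the "k8s_etcd" prefix, as above.
etcd_container_name() {
    docker ps --filter name=k8s_etcd --format '{{.Names}}' | head -n 1
}

# Hypothetical helper: stop the etcd container found by the lookup.
stop_etcd_container() {
    name=$(etcd_container_name)
    if [ -z "$name" ]; then
        echo "no running etcd container found" >&2
        return 1
    fi
    docker stop "$name"
}

# Usage (as root on k8s-master, in place of the docker stop line):
#   stop_etcd_container
```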
Do not use docker start to start the etcd Docker container. It will be started automatically after you restart the
If your snapshot was taken with an older Kubernetes version and this redeployment has given you a newer one, you should upgrade your etcd using the kubeadm upgrade commands:
kubeadm upgrade plan
kubeadm upgrade apply vX.Y.Z
kubeadm might not initialize if the server gets a different IP, e.g. 10.0.0.4 instead of 10.0.0.5, which was previously written into the Kubernetes API server certificate.
In this case, you will see the following error in the logs:
# journalctl -u kubelet -f
Mar 11 11:47:12 k8s-master kubelet: E0311 11:47:12.362418 11330 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:465: Failed to list *v1.Service: Get https://10.0.0.4:6443/api/v1/services?limit=500&resourceVersion=0: x509: certificate is valid for 10.96.0.1, 10.0.0.5, not 10.0.0.4
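To confirm which IP addresses a certificate was issued for, you can inspect its Subject Alternative Name entries with openssl. The sketch below is an illustration rather than part of the original procedure: the function name san_has_ip is hypothetical, and the certificate path in the usage comment assumes the kubeadm default location of the API server certificate.

```shell
# Hypothetical helper: succeed if the given IP is listed among the
# certificate's SAN entries. Expects the text output of
# "openssl x509 -noout -text" on stdin.
san_has_ip() {
    # Put each comma-separated SAN entry on its own line, trim the
    # surrounding whitespace, then look for an exact match.
    tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | grep -Fxq "IP Address:$1"
}

# Usage (path is the assumed kubeadm default):
#   openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text \
#       | san_has_ip 10.0.0.4 || echo "10.0.0.4 is not in the certificate"
```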
While it is possible to regenerate the x509 certificate, re-sign everything that depends on it, and distribute the result across your environment, it is better to avoid that route and simply reuse the internal IP that the k8s-master instance had before.
To assign the previous IP address to your new instance, use the following commands:
openstack port create --network private1 --fixed-ip subnet=private1,ip-address=10.0.0.5 k8s-master-int-ip
openstack server create ... --port k8s-master-int-ip ...
# or "--nic port-id=k8s-master-int-ip", but without the "--network private1" flag.