2022-10-20
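Fixing a Kubernetes master that cannot join the etcd cluster
Background: the disk on one master filled up and took the k8s services down. After a reboot, kubelet would not start no matter what, so I decided to reset the node and join it again. The join then failed with the error below, which says the etcd cluster health check did not pass: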
error execution phase check-etcd: error syncing endpoints with etc: dial tcp 172.31.182.152:2379: connect: connection refused
Solution:
1. In kubeadm-config, delete the entry for the etcd node that no longer exists:
kubectl edit configmaps -n kube-system kubeadm-config
cn-hongkong.i-j6caps6av1mtyxyofmrw:
  advertiseAddress: 172.31.182.152
  bindPort: 6443
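Delete the entry shown above.
2. Because I built the cluster with kubeadm, etcd runs as a pod on every master node, one instance per control-plane node. When the k8s-001 node was removed, the etcd cluster did not automatically remove that node's etcd member, so it has to be removed by hand.
Note that you first have to get a shell inside an etcd pod on one of the healthy masters.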
kubectl exec -it -n kube-system etcd-cn-hongkong.i-j6caps6av1mtyxyofmrx -- sh
export ETCDCTL_API=3
alias etcdctl='etcdctl --endpoints=https://172.31.182.153:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key'
/ # etcdctl member list
ceb6b1f4369e9ecc, started, cn-hongkong.i-j6caps6av1mtyxyofmrx, https://172.31.182.154:2380, https://172.31.182.154:2379
d4322ce19cc3f8da, started, cn-hongkong.i-j6caps6av1mtyxyofmrw, https://172.31.182.152:2380, https://172.31.182.152:2379
d598f7eabefcc101, started, cn-hongkong.i-j6caps6av1mtyxyofmry, https://172.31.182.153:2380, https://172.31.182.153:2379
# Remove the member that no longer exists
/ # etcdctl member remove d4322ce19cc3f8da
Member d4322ce19cc3f8da removed from cluster ed812b9f85d5bcd7
/ # etcdctl member list
ceb6b1f4369e9ecc, started, cn-hongkong.i-j6caps6av1mtyxyofmrx, https://172.31.182.154:2380, https://172.31.182.154:2379
d598f7eabefcc101, started, cn-hongkong.i-j6caps6av1mtyxyofmry, https://172.31.182.153:2380, https://172.31.182.153:2379
/ # etcdctl member list
cd4e1e075b1904b2, started, cn-hongkong.i-j6caps6av1mtyxyofmrw, https://172.31.182.152:2380, https://172.31.182.152:2379
ceb6b1f4369e9ecc, started, cn-hongkong.i-j6caps6av1mtyxyofmrx, https://172.31.182.154:2380, https://172.31.182.154:2379
d598f7eabefcc101, started, cn-hongkong.i-j6caps6av1mtyxyofmry, https://172.31.182.153:2380, https://172.31.182.153:2379
/ # exit
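If you would rather not copy the member ID by eye, the lookup can be scripted. The following is only a minimal sketch: the helper name member_id_by_name is made up for illustration, and it assumes the default comma-separated output of etcdctl member list seen in the session, where the third field is the member name.

```shell
#!/bin/sh
# Hypothetical helper (not part of etcdctl): read `etcdctl member list`
# output in its default format on stdin and print the ID of the member
# whose NAME (third comma-separated field) equals $1.
member_id_by_name() {
    awk -F', ' -v name="$1" '$3 == name { print $1 }'
}

# Demo on two lines copied from the member list output in this post.
member_id_by_name cn-hongkong.i-j6caps6av1mtyxyofmrw <<'EOF'
ceb6b1f4369e9ecc, started, cn-hongkong.i-j6caps6av1mtyxyofmrx, https://172.31.182.154:2380, https://172.31.182.154:2379
d4322ce19cc3f8da, started, cn-hongkong.i-j6caps6av1mtyxyofmrw, https://172.31.182.152:2380, https://172.31.182.152:2379
EOF
```

Inside the etcd pod you could pipe the real output through it, for example id=$(etcdctl member list | member_id_by_name cn-hongkong.i-j6caps6av1mtyxyofmrw), check that $id is not empty, and only then run etcdctl member remove "$id".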
Finally, every time kubeadm join fails you have to run kubeadm reset on the node before running kubeadm join again; only then will the join succeed.
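The reset-before-every-join rule can be wrapped in a small loop. This is a rough sketch under assumptions, not what I actually ran: reset_and_join and the KUBEADM override variable are names invented here, and the arguments to kubeadm join (endpoint, token, CA cert hash, control-plane flags) must be the ones for your own cluster.

```shell
#!/bin/sh
# KUBEADM is a hypothetical override so the loop can be dry-run with a stub;
# by default it is the real kubeadm binary.
KUBEADM="${KUBEADM:-kubeadm}"

# reset_and_join ATTEMPTS JOIN_ARGS...
# Runs `kubeadm reset -f` before each `kubeadm join` attempt, so a failed
# join is never retried on a half-initialised node. Gives up after ATTEMPTS.
reset_and_join() {
    attempts="$1"
    shift
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f (--force) skips the interactive confirmation prompt.
        "$KUBEADM" reset -f || return 1
        if "$KUBEADM" join "$@"; then
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}
```

Call it with the attempt count followed by exactly the arguments you would give kubeadm join. Note that the sketch also resets before the first attempt, which matches this post's situation (the node had already been reset) but wipes any existing kubeadm state on the node.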
Hope this helps!
Source: http://1t.click/aCXa