Rancher certificate expired

Last updated on September 15, 2024 pm

🧙 Questions

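The Rancher certificate has expired.

Kubernetes itself is working, but the Rancher UI is unreachable, even though Docker reports the rancher container as running normally.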

☄️ Ideas

Find the ID of the Rancher Docker container
docker ps | grep rancher/rancher
[root@ispong-demo ~]# docker ps  | grep rancher/rancher
e8d118c45513        rancher/rancher:v2.4.4                                                     "entrypoint.sh"          13 months ago       Up 24 minutes       0.0.0.0:8080->80/tcp, 0.0.0.0:4443->443/tcp   rancher
[root@ispong-demo ~]# 
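To avoid copying the ID by hand, the lookup above can be wrapped in a small helper. This is only a minimal sketch: the function name rancher_container_id and the RANCHER_ID variable are made up for illustration, and it assumes the default docker ps column layout (container ID first, image second).

```shell
# Hypothetical helper: read `docker ps` output on stdin and print the
# container ID (column 1) of rows whose image (column 2) is rancher/rancher.
rancher_container_id() {
  awk '$2 == "rancher/rancher" || index($2, "rancher/rancher:") == 1 { print $1 }'
}

# Usage on the Docker host (not executed here):
#   RANCHER_ID=$(docker ps | rancher_container_id)
#   docker logs --tail=100 "$RANCHER_ID"
```

Matching on the image column rather than on the whole line keeps other images such as rancher/pause out of the result.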

Check the container logs

docker logs --tail=100 e8d118c45513

The logs show tls: bad certificate errors, meaning a certificate has expired

2023-08-02 09:47:52.728750 I | http: TLS handshake error from 127.0.0.1:33646: remote error: tls: bad certificate
2023/08/02 09:47:52 [INFO] Waiting for server to become available: Get https://127.0.0.1:6443/version?timeout=30s: x509: certificate has expired or is not yet valid
2023-08-02 09:47:54.730086 I | http: TLS handshake error from 127.0.0.1:33672: remote error: tls: bad certificate
2023/08/02 09:47:54 [INFO] Waiting for server to become available: Get https://127.0.0.1:6443/version?timeout=30s: x509: certificate has expired or is not yet valid
2023-08-02 09:47:56.731469 I | http: TLS handshake error from 127.0.0.1:33732: remote error: tls: bad certificate
2023/08/02 09:47:56 [INFO] Waiting for server to become available: Get https://127.0.0.1:6443/version?timeout=30s: x509: certificate has expired or is not yet valid
2023-08-02 09:47:58.732709 I | http: TLS handshake error from 127.0.0.1:33790: remote error: tls: bad certificate
2023/08/02 09:47:58 [INFO] Waiting for server to become available: Get https://127.0.0.1:6443/version?timeout=30s: x509: certificate has expired or is not yet valid
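When the log is long, it can help to count the x509 messages instead of scrolling through them. The following is a rough sketch; count_expired_cert_errors is a hypothetical name, and the pattern is simply the message text seen in the log excerpt above.

```shell
# Hypothetical helper: count log lines on stdin that report an expired
# (or not yet valid) x509 certificate.
count_expired_cert_errors() {
  grep -c 'x509: certificate has expired or is not yet valid'
}

# Usage on the Docker host (not executed here). Container log lines may
# arrive on stderr, so both streams are merged before counting:
#   docker logs --tail=100 e8d118c45513 2>&1 | count_expired_cert_errors
```

A non-zero count points at an expired certificate rather than, say, a networking problem.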
Renew the certificate

Note: the steps below were carried out on Rancher 2.4.4, where they worked.

Check which Rancher certificates have expired
docker exec -it e8d118c45513 /bin/bash
cd /var/lib/rancher/k3s/server/tls
for i in *.crt; do openssl x509 -in "$i" -noout -dates; echo "$i"; done

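A certificate has expired if its notAfter is earlier than the current time. Sometimes none of the certificates turn out to be expired. In the output below, the loop prints each file name after that file's notBefore/notAfter lines.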

client-k3s-controller.crt
notBefore=Jun 29 04:32:21 2022 GMT
notAfter=Jun 28 04:32:35 2024 GMT
client-kube-apiserver.crt
notBefore=Jun 29 04:32:21 2022 GMT
notAfter=Jun 28 04:32:35 2024 GMT
client-kube-proxy.crt
notBefore=Jun 29 04:32:21 2022 GMT
notAfter=Jun 28 04:32:35 2024 GMT
client-scheduler.crt
notBefore=Jun 29 04:32:21 2022 GMT
notAfter=Jun 26 04:32:21 2032 GMT
request-header-ca.crt
notBefore=Jun 29 04:32:21 2022 GMT
notAfter=Jun 26 04:32:21 2032 GMT
server-ca.crt
notBefore=Jun 29 04:32:21 2022 GMT
notAfter=Jun 28 04:32:35 2024 GMT
serving-kube-apiserver.crt
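Comparing notAfter values by eye is error-prone, so the check can be scripted. This is a minimal sketch under two assumptions: GNU date is available (for date -d), and the helper name not_after_expired is invented for this example.

```shell
# Hypothetical helper: return 0 (true) if a "notAfter=..." line, as printed by
# `openssl x509 -noout -enddate`, is earlier than the given epoch time.
# The second argument defaults to the current time. Assumes GNU date.
not_after_expired() {
  now=${2:-$(date +%s)}
  # Strip the "notAfter=" prefix and convert the rest to epoch seconds.
  end=$(date -u -d "${1#notAfter=}" +%s) || return 2
  [ "$end" -lt "$now" ]
}

# Usage inside /var/lib/rancher/k3s/server/tls (not executed here):
#   for f in *.crt; do
#     if not_after_expired "$(openssl x509 -in "$f" -noout -enddate)"; then
#       echo "EXPIRED: $f"
#     fi
#   done
```

Printing only the expired files sidesteps the question of which dates belong to which file name in the raw listing.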
Attempt 1 (did not work)

Delete the certificates directly and let Rancher regenerate them. In testing, this had no effect.

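# Enter the container
docker exec -it e8d118c45513 /bin/bash
cd /var/lib/rancher/k3s/server
# Back up the existing certificates in case something goes wrong
cp -r tls tls_bak
rm -rf /var/lib/rancher/k3s/server/tls/*
# Leave the container, then restart it from the host
exit
docker restart e8d118c45513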
Attempt 2
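# Enter the container
docker exec -it e8d118c45513 /bin/bash
# Delete the serving-certificate secrets and the dynamic certificate file
kubectl --insecure-skip-tls-verify -n kube-system delete secrets k3s-serving
kubectl --insecure-skip-tls-verify delete secret serving-cert -n cattle-system
rm -f /var/lib/rancher/k3s/server/tls/dynamic-cert.json
# Leave the container, then restart it from the host
exit
docker restart e8d118c45513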
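The same Attempt 2 commands can also be run from the host without an interactive shell, which makes it clearer that the restart happens outside the container. The sketch below is not an official procedure: the run and renew_rancher_serving_cert functions and the DRY_RUN variable are hypothetical names, and the commands are the ones listed above passed through docker exec.

```shell
# Hypothetical wrapper: with DRY_RUN=1 a command is only printed,
# otherwise it is executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical function: repeat the Attempt 2 steps for the container ID in $1.
renew_rancher_serving_cert() {
  id=$1
  # Delete the serving-certificate secrets from inside the container.
  run docker exec "$id" kubectl --insecure-skip-tls-verify -n kube-system delete secrets k3s-serving
  run docker exec "$id" kubectl --insecure-skip-tls-verify -n cattle-system delete secret serving-cert
  # Remove dynamic-cert.json inside the container.
  run docker exec "$id" rm -f /var/lib/rancher/k3s/server/tls/dynamic-cert.json
  # Restart the container from the host.
  run docker restart "$id"
}

# Usage on the Docker host (not executed here):
#   DRY_RUN=1 renew_rancher_serving_cert e8d118c45513   # preview only
#   renew_rancher_serving_cert e8d118c45513             # run for real
```

Previewing with DRY_RUN=1 first is a cheap way to confirm the container ID before anything is deleted.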


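docker exec -it e8d118c45513 /bin/bash
# Request the Rancher API once; --insecure skips certificate verification
curl --insecure -sfL https://192.168.115.104:4443/v3
# Leave the container, restart it from the host, and follow the logs
exit
docker restart e8d118c45513
docker logs -f --tail=100 e8d118c45513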
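Rancher may need a while after a restart before it answers on HTTPS again, so instead of re-running curl by hand the request can be retried in a loop. This is a generic sketch; retry_until is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary example values.

```shell
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping DELAY
# seconds between failed attempts. Returns 0 on the first success, 1 otherwise.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    n=$((n + 1))
    # Skip the sleep after the final failed attempt.
    if [ "$n" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage on the Docker host (not executed here):
#   retry_until 30 2 curl --insecure -sfL -o /dev/null https://192.168.115.104:4443/v3
```

Because the command to retry is passed as arguments, the same helper works for any other readiness check.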
After the restart, Rancher may be unable to fetch cluster information
docker logs -f --tail=100 e8d118c45513
2023/08/02 10:18:44 [ERROR] ClusterController c-9lh2m [cluster-deploy] failed with : Get https://10.43.0.1:443/apis/apps/v1/namespaces/cattle-system/daemonsets/cattle-node-agent?timeout=30s: context deadline exceeded
2023/08/02 10:19:20 [INFO] Error on LIST namespaces: Get https://10.43.0.1:443/api/v1/namespaces?timeout=30s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers). Attempt: 1. Retrying
2023/08/02 10:19:29 [ERROR] ClusterController c-9lh2m [cluster-deploy] failed with : Get https://10.43.0.1:443/apis/apps/v1/namespaces/cattle-system/daemonsets/cattle-node-agent?timeout=30s: waiting for cluster [c-9lh2m] agent to connect
2023/08/02 10:19:40 [INFO] Stopping cluster agent for c-9lh2m
2023/08/02 10:19:48 [ERROR] Unknown error: Get https://10.43.0.1:443/api/v1/namespaces?timeout=30s: waiting for cluster [c-9lh2m] agent to connect
2023/08/02 10:20:12 [ERROR] ClusterController c-9lh2m [cluster-deploy] failed with : Get https://10.43.0.1:443/apis/apps/v1/namespaces/cattle-system/daemonsets/cattle-node-agent?timeout=30s: waiting for cluster [c-9lh2m] agent to connect

Restart the cluster-agent

docker ps | grep cluster-agent
6bb19950edd2        5cd32c74250a                                                               "run.sh"                 5 months ago        Up 5 months                                                       k8s_cluster-register_cattle-cluster-agent-7678f75644-gpnsx_cattle-system_e941e2ca-b148-4845-ab47-92c54a364420_1
baae5d2b608e        rancher/pause:3.1                                                          "/pause"                 5 months ago        Up 5 months                                                       k8s_POD_cattle-cluster-agent-7678f75644-gpnsx_cattle-system_e941e2ca-b148-4845-ab47-92c54a364420_1
docker restart 6bb19950edd2
docker restart baae5d2b608e
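Container IDs change whenever a pod is recreated, so reading them from docker ps right before restarting avoids using a stale ID. A minimal sketch, assuming the cluster-agent containers can be recognised by cattle-cluster-agent in their name as in the listing above; cluster_agent_container_ids is a hypothetical helper name.

```shell
# Hypothetical helper: read `docker ps` output on stdin and print the
# container ID (column 1) of every row that mentions cattle-cluster-agent.
cluster_agent_container_ids() {
  awk '/cattle-cluster-agent/ { print $1 }'
}

# Usage on the Docker host (not executed here):
#   for id in $(docker ps | cluster_agent_container_ids); do
#     docker restart "$id"
#   done
```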

If all else fails

sudo systemctl restart docker

Rancher certificate expired
https://ispong.isxcode.com/kubernetes/rancher/rancher 证书过期/
Author
ispong
Posted on
August 2, 2023
Licensed under