Preface
K3s is a lightweight Kubernetes distribution, highly optimized for edge computing, IoT, and similar scenarios. K3s offers the following enhancements:
- Packaged as a single binary.
- A lightweight sqlite3-based storage backend is the default storage mechanism; etcd3, MySQL, and PostgreSQL are also supported.
- Wrapped in a simple launcher that handles much of the TLS and option complexity.
- Secure by default, with sensible defaults for lightweight environments.
- Simple but powerful "batteries-included" features, such as a local storage provider, a service load balancer, a Helm controller, and the Traefik Ingress controller.
- All Kubernetes control-plane components run in a single binary and process, allowing K3s to automate and manage complex cluster operations, including certificate distribution.
- External dependencies are minimized; K3s only requires a kernel and cgroup mounts. The K3s package bundles the following dependencies:
- containerd
- Flannel
- CoreDNS
- CNI
- Ingress controller (Traefik)
- Embedded service load balancer
- Embedded network policy controller
Official documentation: https://docs.k3s.io/
Chinese documentation: https://docs.rancher.cn/docs/k3s/
Environment and Software Versions
1. Cluster node plan
Operating system: Rocky Linux release 9.2
K3s master node 1 IP: 172.30.87.184
K3s master node 2 IP: 172.30.87.185
K3s master node 3 IP: 172.30.87.186
K3s worker node 1 IP: 172.30.87.187
2. Network plan
Pod CIDR: 10.77.0.0/16
Service CIDR: 10.123.0.0/16
High-availability VIP: 172.30.87.188
3. Component versions/modes
K3s: v1.30.6+k3s1
CoreDNS: v1.11.3
Etcd: v3.5.13-k3s1 (K3s's embedded etcd, clustered, used as the datastore)
Flannel: v0.25.6 (K3s's embedded Flannel as the network plugin, default vxlan mode)
Containerd: v1.7.22-k3s1 (K3s's embedded containerd as the container runtime)
kube-proxy forwarding mode: ipvs
4. China mirrors
Image registry: registry.cn-hangzhou.aliyuncs.com
Binary download source: the China Rancher mirror rancher-mirror.rancher.cn
Environment Preparation
Run the following steps on every host.
1. Load kernel modules
cat << eof > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
eof
cat << eof > /etc/modules-load.d/k8s.conf
overlay
br_netfilter
nf_conntrack
eof
# Load the modules
cat /etc/modules-load.d/{ipvs,k8s}.conf | xargs -n 1 modprobe
# Verify
lsmod | grep -e ip_vs -e nf_conntrack -e br_netfilter -e overlay
2. Disable swap
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
3. Disable SELinux
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
4. Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
5. Tune kernel parameters
cat << eof > /etc/sysctl.conf
vm.swappiness=0
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
eof
sysctl -p
6. Download the K3s install script
mkdir -p /opt/scripts/k3s
curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh -o /opt/scripts/k3s/k3s-install.sh && chmod +x /opt/scripts/k3s/k3s-install.sh
7. Create the containerd registry mirror configuration for docker.io
mkdir -p /etc/rancher/k3s/
cat << 'eof' > /etc/rancher/k3s/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "https://hub.xzt.me"
      - "https://docker.ketches.cn"
eof
8. Create a shared cluster settings script
cat << 'eof' > /opt/scripts/k3s/k3s-setting.sh
pod_ip=10.77.0.0/16 # Pod CIDR
svc_ip=10.123.0.0/16 # Service CIDR
token=12345 # token used to join the cluster
# Use the China mirror for binary downloads
export INSTALL_K3S_MIRROR=cn
# Pin the K3s version; remove this line to install the latest release
export INSTALL_K3S_VERSION=v1.30.6+k3s1
eof
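Note that `pod_ip`, `svc_ip`, and `token` are deliberately not exported: the script is sourced, not executed, so these plain variables land in the calling shell and are expanded there when `INSTALL_K3S_EXEC` is assembled. A minimal self-contained sketch of that pattern, using a throwaway temp file instead of the real /opt/scripts/k3s path:

```shell
# Write a throwaway settings file with the same variables as k3s-setting.sh
tmp_setting=$(mktemp)
cat << 'eof' > "$tmp_setting"
pod_ip=10.77.0.0/16 # Pod CIDR
svc_ip=10.123.0.0/16 # Service CIDR
token=12345 # cluster join token
eof
# Sourcing pulls the variables into the current shell, so later expansions work
source "$tmp_setting"
echo "--cluster-cidr ${pod_ip} --service-cidr ${svc_ip} --token ${token}"
# → --cluster-cidr 10.77.0.0/16 --service-cidr 10.123.0.0/16 --token 12345
```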
Installing K3s
1. First master node (172.30.87.184)
1. Create a 100-year CA (optional)
If you skip this step, K3s creates its own CA certificates with a 10-year validity.
mkdir -p /var/lib/rancher/k3s/server/tls/etcd
k3s_server_tls_dir=/var/lib/rancher/k3s/server/tls
etcd_tls_dir=/var/lib/rancher/k3s/server/tls/etcd
openssl genrsa -out ${k3s_server_tls_dir}/client-ca.key 2048
openssl genrsa -out ${k3s_server_tls_dir}/server-ca.key 2048
openssl genrsa -out ${k3s_server_tls_dir}/request-header-ca.key 2048
openssl req -x509 -new -nodes -key ${k3s_server_tls_dir}/client-ca.key -sha256 -days 36500 -out ${k3s_server_tls_dir}/client-ca.crt -subj '/CN=k3s-client-ca'
openssl req -x509 -new -nodes -key ${k3s_server_tls_dir}/server-ca.key -sha256 -days 36500 -out ${k3s_server_tls_dir}/server-ca.crt -subj '/CN=k3s-server-ca'
openssl req -x509 -new -nodes -key ${k3s_server_tls_dir}/request-header-ca.key -sha256 -days 36500 -out ${k3s_server_tls_dir}/request-header-ca.crt -subj '/CN=k3s-request-header-ca'
openssl genrsa -out ${etcd_tls_dir}/peer-ca.key 2048
openssl genrsa -out ${etcd_tls_dir}/server-ca.key 2048
openssl req -x509 -new -nodes -key ${etcd_tls_dir}/peer-ca.key -sha256 -days 36500 -out ${etcd_tls_dir}/peer-ca.crt -subj '/CN=etcd-peer-ca'
openssl req -x509 -new -nodes -key ${etcd_tls_dir}/server-ca.key -sha256 -days 36500 -out ${etcd_tls_dir}/server-ca.crt -subj '/CN=etcd-server-ca'
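The 100-year validity can be sanity-checked before starting K3s. A minimal sketch using a throwaway CA in a temp directory (not the real /var/lib/rancher paths), relying on openssl's `-checkend` flag:

```shell
tmp=$(mktemp -d)
openssl genrsa -out "$tmp/ca.key" 2048
openssl req -x509 -new -nodes -key "$tmp/ca.key" -sha256 -days 36500 \
  -out "$tmp/ca.crt" -subj '/CN=test-ca'
# -checkend N succeeds if the certificate is still valid N seconds from now;
# 946080000 s is roughly 30 years, which only a long-lived CA passes
openssl x509 -checkend 946080000 -noout -in "$tmp/ca.crt" \
  && echo "CA valid for at least 30 more years"
```

Run the same `-checkend` check against the real files under /var/lib/rancher/k3s/server/tls/ after generating them.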
2. Run the install script
source /opt/scripts/k3s/k3s-setting.sh
# Set the kubeconfig path; enable ipvs; set the Service CIDR and Pod CIDR; set the cluster join token; add an extra allowed hostname/IP to the TLS certificate
export INSTALL_K3S_EXEC="--write-kubeconfig ~/.kube/config --kube-proxy-arg=proxy-mode=ipvs --cluster-cidr ${pod_ip} --service-cidr ${svc_ip} --token ${token} --tls-san 172.30.87.188 --system-default-registry registry.cn-hangzhou.aliyuncs.com"
# Use the China mirror for binary downloads
export INSTALL_K3S_MIRROR=cn
# Install K3s and disable unneeded bundled components; disabling servicelb and the cloud controller is recommended, disable the rest as needed
/opt/scripts/k3s/k3s-install.sh server --cluster-init --disable servicelb --disable traefik --disable local-storage --disable metrics-server --disable-cloud-controller
3. Post-install checks
[root@k3s-master-87-184 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
k3s-master-87-184 Ready control-plane,etcd,master 17m v1.30.6+k3s1
[root@k3s-master-87-184 ~]# kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-7b98449c4-vhk6m 1/1 Running 0 17m
kube-system kubeapi-haproxy 1/1 Running 0 16m
[root@k3s-master-87-184 ~]# systemctl status k3s -l
● k3s.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s.service; enabled; preset: disabled)
Active: active (running) since Wed 2024-12-11 14:34:17 CST; 18min ago
Docs: https://k3s.io
Process: 372307 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service 2>/dev/null (>
Process: 372309 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 372310 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 372311 (k3s-server)
Tasks: 49
Memory: 805.5M
CPU: 59.928s
CGroup: /system.slice/k3s.service
├─372311 "/usr/local/bin/k3s server"
├─372327 "containerd "
├─372814
The node should be Ready and all pods Running.
2. Join the remaining master nodes
1. Run the install script
source /opt/scripts/k3s/k3s-setting.sh
export INSTALL_K3S_VERSION=v1.30.6+k3s1
export INSTALL_K3S_EXEC="--write-kubeconfig ~/.kube/config --kube-proxy-arg=proxy-mode=ipvs --cluster-cidr ${pod_ip} --service-cidr ${svc_ip} --token ${token} --system-default-registry registry.cn-hangzhou.aliyuncs.com"
export INSTALL_K3S_MIRROR=cn
# --server points at the first master's API address
/opt/scripts/k3s/k3s-install.sh server --disable servicelb --disable traefik --disable local-storage --disable metrics-server --disable-cloud-controller --server https://172.30.87.184:6443
2. Post-install checks
[root@k3s-master-87-186 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
k3s-master-87-184 Ready control-plane,etcd,master 61m v1.30.6+k3s1
k3s-master-87-185 Ready control-plane,etcd,master 59m v1.30.6+k3s1
k3s-master-87-186 Ready control-plane,etcd,master 41m v1.30.6+k3s1
All nodes should be Ready.
3. Join the worker node
1. Run the install script
source /opt/scripts/k3s/k3s-setting.sh
export INSTALL_K3S_EXEC="agent --kube-proxy-arg=proxy-mode=ipvs --token ${token}"
/opt/scripts/k3s/k3s-install.sh --server https://172.30.87.184:6443
2. Post-install checks
[root@k3s-worker-87-187 ~]# systemctl status k3s-agent
● k3s-agent.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s-agent.service; enabled; preset: disabled)
Active: active (running) since Wed 2024-12-11 17:12:57 CST; 1min 4s ago
Docs: https://k3s.io
Process: 2141 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service 2>/dev/null (co>
Process: 2143 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 2144 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 2145 (k3s-agent)
Tasks: 36
Memory: 612.9M
CPU: 7.400s
# Run on any master node
[root@k3s-master-87-186 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
k3s-master-87-184 Ready control-plane,etcd,master 98m v1.30.6+k3s1
k3s-master-87-185 Ready control-plane,etcd,master 96m v1.30.6+k3s1
k3s-master-87-186 Ready control-plane,etcd,master 79m v1.30.6+k3s1
k3s-worker-87-187 Ready <none> 87s v1.30.6+k3s1
All nodes should be Ready.
Other Operations
1. Verify cluster networking
1. Create an nginx DaemonSet and Service
cat << eof | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nginx-daemonset
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: ClusterIP
eof
# Wait until all pods are created
[root@k3s-master-87-184 ~]# kubectl get po -l app=nginx -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-daemonset-9wz4v 1/1 Running 0 17h 10.77.0.3 k3s-master-87-184 <none> <none>
nginx-daemonset-rt4bj 1/1 Running 0 17h 10.77.2.2 k3s-master-87-186 <none> <none>
nginx-daemonset-rtr2z 1/1 Running 0 17h 10.77.1.2 k3s-master-87-185 <none> <none>
nginx-daemonset-zkvn4 1/1 Running 0 16h 10.77.4.2 k3s-worker-87-187 <none> <none>
2. Run the verification script
It verifies that:
- pods can reach pods on other nodes
- pods can reach the Service
- pods can resolve Kubernetes short DNS names
Run on any master node:
pods_name=`kubectl get po -l app=nginx -owide | awk 'NR!=1{print $1}'`
pods_ip=`kubectl get po -l app=nginx -owide | awk 'NR!=1{print $6}'`
echo "$pods_name" | while read pod
do
  echo "$pods_ip" | while read ip
  do
    echo "Checking pod $pod curl IP:$ip"
    kubectl exec $pod -- curl -s -o /dev/null -w "%{http_code}" $ip; echo
  done
  echo "Checking pod $pod curl Service:nginx-service"
  kubectl exec $pod -- curl -s -o /dev/null -w "%{http_code}" nginx-service; echo
done
The output should look like the following, with all status codes 200:
[root@k3s-master-87-184 ~]# pods_name=`kubectl get po -l app=nginx -owide | awk 'NR!=1{print $1}'`
pods_ip=`kubectl get po -l app=nginx -owide | awk 'NR!=1{print $6}'`
echo "$pods_name" | while read pod
do
echo "$pods_ip" | while read ip
do
echo "Checking pod $pod curl IP:$ip"
kubectl exec $pod -- curl -s -o /dev/null -w "%{http_code}" $ip; echo
done
echo "Checking pod $pod curl Service:nginx-service"
kubectl exec $pod -- curl -s -o /dev/null -w "%{http_code}" nginx-service; echo
done
Checking pod nginx-daemonset-9wz4v curl IP:10.77.0.3
200
Checking pod nginx-daemonset-9wz4v curl IP:10.77.2.2
200
Checking pod nginx-daemonset-9wz4v curl IP:10.77.1.2
200
Checking pod nginx-daemonset-9wz4v curl IP:10.77.4.2
200
Checking pod nginx-daemonset-9wz4v curl Service:nginx-service
200
Checking pod nginx-daemonset-rt4bj curl IP:10.77.0.3
200
Checking pod nginx-daemonset-rt4bj curl IP:10.77.2.2
200
Checking pod nginx-daemonset-rt4bj curl IP:10.77.1.2
200
Checking pod nginx-daemonset-rt4bj curl IP:10.77.4.2
200
Checking pod nginx-daemonset-rt4bj curl Service:nginx-service
200
Checking pod nginx-daemonset-rtr2z curl IP:10.77.0.3
200
Checking pod nginx-daemonset-rtr2z curl IP:10.77.2.2
200
Checking pod nginx-daemonset-rtr2z curl IP:10.77.1.2
200
Checking pod nginx-daemonset-rtr2z curl IP:10.77.4.2
200
Checking pod nginx-daemonset-rtr2z curl Service:nginx-service
200
Checking pod nginx-daemonset-zkvn4 curl IP:10.77.0.3
200
Checking pod nginx-daemonset-zkvn4 curl IP:10.77.2.2
200
Checking pod nginx-daemonset-zkvn4 curl IP:10.77.1.2
200
Checking pod nginx-daemonset-zkvn4 curl IP:10.77.4.2
200
Checking pod nginx-daemonset-zkvn4 curl Service:nginx-service
200
2. Certificate rotation
K3s client and server certificates are valid for 365 days from issuance. Any certificate that is expired, or within 90 days of expiring, is automatically renewed each time K3s starts.
1. Create a script that restarts K3s automatically, and schedule it daily
Run the following on every host:
cat << 'eof' > /opt/scripts/k3s/k3s-rotate-cert.sh
#!/bin/bash
if [ -e "/var/lib/rancher/k3s/server/tls/client-k3s-controller.crt" ];then
crt_file=/var/lib/rancher/k3s/agent/client-k3s-controller.crt
server_name=k3s
else
crt_file=/var/lib/rancher/k3s/agent/client-k3s-controller.crt
server_name=k3s-agent
fi
expire_time=`openssl x509 -enddate -noout -in ${crt_file} |grep -Po '(?<==).*$'`
expire_time_s=$(date -d "${expire_time}" +%s)
now_plus_30days=$(date -d '30 days' +'%s')
# If less than 30 days of validity remain, restart k3s/k3s-agent to rotate the certificates
[ "${now_plus_30days}" -gt "${expire_time_s}" ] && systemctl restart ${server_name}
eof
chmod +x /opt/scripts/k3s/k3s-rotate-cert.sh
# Run the script daily at 03:02
echo '2 3 * * * /opt/scripts/k3s/k3s-rotate-cert.sh' >> /var/spool/cron/root
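The 30-day comparison in the script above can be exercised in isolation with a synthetic expiry date instead of reading a real certificate (GNU date assumed, as on Rocky Linux):

```shell
# A stand-in notAfter value far in the future; the real script reads it
# from the certificate via openssl
expire_time="Jan  1 00:00:00 2100 GMT"
expire_time_s=$(date -d "${expire_time}" +%s)
now_plus_30days=$(date -d '30 days' +%s)
# Restart is only needed when "now + 30 days" passes the expiry time
if [ "${now_plus_30days}" -gt "${expire_time_s}" ]; then
  echo "restart needed"
else
  echo "certificate ok"
fi
# → certificate ok
```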
2. Rotate certificates manually on a node
# Stop K3s
systemctl stop k3s
# Rotate all certificates on this node
k3s certificate rotate
# Start K3s
systemctl start k3s
3. Check certificate expiry dates
On master nodes:
for i in `ls /var/lib/rancher/k3s/server/tls/{,etcd/}*.crt`; do echo $i; openssl x509 -enddate -noout -in $i; done
On worker nodes:
for i in `ls /var/lib/rancher/k3s/agent/*.crt`; do echo $i; openssl x509 -enddate -noout -in $i; done
3. Customize the embedded flannel configuration
# Start from the existing flannel config on each host
mkdir /etc/flannel/
cp /var/lib/rancher/k3s/agent/etc/flannel/net-conf.json /etc/flannel/
# Add the vxlan port 8442 setting
vim /etc/flannel/net-conf.json
{
  "Network": "10.77.0.0/16",
  "EnableIPv6": false,
  "EnableIPv4": true,
  "IPv6Network": "::/0",
  "Backend": {
    "Type": "vxlan",
    "Port": 8442
  }
}
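Before pointing K3s at the edited file, it is worth checking that it is still valid JSON. A small sketch (using python3 purely as a JSON validator, an assumption on my part; jq works just as well if installed):

```shell
# Validate a copy of the edited config; on a real node, point this at
# /etc/flannel/net-conf.json instead of the temp file
tmp_conf=$(mktemp)
cat << 'eof' > "$tmp_conf"
{
  "Network": "10.77.0.0/16",
  "EnableIPv6": false,
  "EnableIPv4": true,
  "IPv6Network": "::/0",
  "Backend": {
    "Type": "vxlan",
    "Port": 8442
  }
}
eof
python3 -m json.tool "$tmp_conf" > /dev/null && echo "net-conf.json: valid JSON"
```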
# On master nodes, load the config file
vim /etc/systemd/system/k3s.service
# On worker nodes, load the config file
vim /etc/systemd/system/k3s-agent.service
# Append these two lines at the end of the ExecStart block
'--flannel-conf' \
'/etc/flannel/net-conf.json' \
systemctl daemon-reload
systemctl restart k3s # on master nodes
systemctl restart k3s-agent # on worker nodes
4. Customize the embedded containerd configuration
Instead of copy-pasting the full configuration template from the K3s source code, you can extend K3s's built-in containerd configuration template as follows:
mkdir -p /var/lib/rancher/k3s/agent/etc/containerd/
cat << 'eof' > /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl
{{ template "base" . }}
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."custom"]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."custom".options]
BinaryName = "/usr/bin/custom-container-runtime"
eof
# Takes effect after a restart
systemctl restart k3s # on master nodes
systemctl restart k3s-agent # on worker nodes
5. HA load balancing for the K3s cluster
The --server option in the k3s-agent start command specifies the kube-apiserver address to connect to at startup. If the master node 172.30.87.184 fails, worker nodes lose their connection to the kube-apiserver.
[root@k3s-worker-87-187 ~]# systemctl cat k3s-agent
......
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service 2>/dev/null'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
agent \
'--kube-proxy-arg=proxy-mode=ipvs' \
'--token' \
'12345' \
'--server' \
'https://172.30.87.184:6443' \
1. Install kube-vip
kube-vip binds the virtual IP 172.30.87.188 to a healthy master node.
Run on all master nodes (adjust the commented values to your environment):
mkdir -p /var/lib/rancher/k3s/server/manifests
cat << 'eof' > /var/lib/rancher/k3s/server/manifests/kube-vip-ds.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-vip
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  name: system:kube-vip-role
rules:
  - apiGroups: [""]
    resources: ["services/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["list", "get", "watch", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["list", "get", "watch", "update", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["list", "get", "watch", "update", "create"]
  - apiGroups: ["discovery.k8s.io"]
    resources: ["endpointslices"]
    verbs: ["list", "get", "watch", "update"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:kube-vip-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-vip-role
subjects:
- kind: ServiceAccount
  name: kube-vip
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  creationTimestamp: null
  name: kube-vip-ds
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: kube-vip-ds
  template:
    metadata:
      labels:
        name: kube-vip-ds
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
            - matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists
      containers:
      - args:
        - manager
        env:
        - name: vip_arp
          value: "true"
        - name: port
          value: "6443"
        - name: vip_interface
          # network interface name
          value: ens192
        - name: vip_cidr
          value: "32"
        - name: cp_enable
          value: "true"
        - name: cp_namespace
          value: kube-system
        - name: vip_ddns
          value: "false"
        - name: svc_enable
          value: "true"
        - name: vip_leaderelection
          value: "true"
        - name: vip_leaseduration
          value: "5"
        - name: vip_renewdeadline
          value: "3"
        - name: vip_retryperiod
          value: "1"
        - name: address
          # the VIP address
          value: 172.30.87.188
        image: giantswarm/kube-vip:v0.8.7
        imagePullPolicy: Always
        name: kube-vip
        resources: {}
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
            - SYS_TIME
      hostNetwork: true
      serviceAccountName: kube-vip
      tolerations:
      - effect: NoSchedule
        operator: Exists
      - effect: NoExecute
        operator: Exists
eof
2. Install haproxy
haproxy load-balances the kube-apiserver across the three masters: it listens on port 8443 on every master node and forwards to port 6443 on all master nodes.
Run on all master nodes (adjust the commented values to your environment):
mkdir -p /var/lib/rancher/k3s/server/manifests/
cat << eof > /var/lib/rancher/k3s/server/manifests/kubeapi-haproxy.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kubeapi-haproxy
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: kubeapi-haproxy
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: kubeapi-haproxy
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
            - matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists
      containers:
      - args:
        # the three masters' IPs
        - "CP_HOSTS=172.30.87.184,172.30.87.185,172.30.87.186"
        image: juestnow/haproxy-proxy:2.5.4
        imagePullPolicy: IfNotPresent
        name: kubeapi-haproxy
        env:
        - name: CPU_NUM
          value: "4"
        - name: BACKEND_PORT
          value: "6443"
        - name: HOST_PORT
          # the listen port
          value: "8443"
        - name: CP_HOSTS
          # the three masters' IPs
          value: "172.30.87.184,172.30.87.185,172.30.87.186"
      hostNetwork: true
      priorityClassName: system-cluster-critical
eof
3. Check kube-vip and kubeapi-haproxy status
[root@k3s-master-87-184 ~]# kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-7b98449c4-26m8h 1/1 Running 2 (10m ago) 23h
kube-vip-ds-95klt 1/1 Running 1 (10m ago) 138m
kube-vip-ds-htbrj 1/1 Running 1 (10m ago) 138m
kube-vip-ds-mjw7g 1/1 Running 2 (10m ago) 138m
kubeapi-haproxy-5krgm 1/1 Running 1 (10m ago) 65m
kubeapi-haproxy-b6qxm 1/1 Running 1 (10m ago) 65m
kubeapi-haproxy-rthft 1/1 Running 1 (10m ago) 65m
All pods should be Running.
4. Add the VIP to the masters' tls-san
Run on all master nodes.
After the server subcommand, add the --tls-san option followed by the VIP (172.30.87.188 in this article).
[root@k3s-master-87-184 ~]# vim /etc/systemd/system/k3s.service
......
ExecStart=/usr/local/bin/k3s \
server \
'--tls-san' \
'172.30.87.188' \
'--write-kubeconfig' \
'/root/.kube/config' \
'--kube-proxy-arg=proxy-mode=ipvs' \
'--cluster-cidr' \
'10.77.0.0/16' \
'--service-cidr' \
'10.123.0.0/16' \
'--token' \
'12345' \
'--tls-san' \
'172.30.87.188' \
'--system-default-registry' \
'registry.cn-hangzhou.aliyuncs.com' \
'server' \
'--cluster-init' \
'--disable' \
'servicelb' \
'--disable' \
'traefik' \
'--disable' \
'local-storage' \
'--disable' \
'metrics-server' \
'--disable-cloud-controller' \
systemctl daemon-reload
systemctl restart k3s
5. Update the workers' K3s connection address
Change the address after the --server option to the VIP and port, https://172.30.87.188:8443 in this article.
Run on all worker nodes.
[root@k3s-worker-87-187 ~]# vim /etc/systemd/system/k3s-agent.service
......
ExecStart=/usr/local/bin/k3s \
agent \
'--kube-proxy-arg=proxy-mode=ipvs' \
'--token' \
'12345' \
'--server' \
'https://172.30.87.188:8443' \
systemctl daemon-reload
systemctl restart k3s-agent
6. Access the HA endpoint with kubectl from outside the cluster
Copy ~/.kube/config from a master node to ~/.kube/config on a host outside the cluster, then set the server field to https://172.30.87.188:8443.
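The server-field edit can be scripted with sed. A sketch against a stub kubeconfig in a temp file (on a real master the copied file points at https://127.0.0.1:6443):

```shell
tmp_kubeconfig=$(mktemp)
cat << 'eof' > "$tmp_kubeconfig"
clusters:
- cluster:
    server: https://127.0.0.1:6443
eof
# Point the server field at the VIP and haproxy listen port
sed -i 's#server: https://.*#server: https://172.30.87.188:8443#' "$tmp_kubeconfig"
grep 'server:' "$tmp_kubeconfig"
```

The same sed line run against the copied ~/.kube/config gives a kubeconfig that survives the loss of any single master.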
6. Backup and restore
1. Manual backup of the embedded etcd
Run on any master node:
k3s etcd-snapshot save
# List snapshots
[root@k3s-master-87-184 ~]# ls /var/lib/rancher/k3s/server/db/snapshots
on-demand-k3s-master-87-184-1733992524
2. Restore from an etcd backup
1. On the master node that holds the backup:
# Stop the K3s service first
systemctl stop k3s
# List existing snapshot files
[root@k3s-master-87-184 ~]# ls /var/lib/rancher/k3s/server/db/snapshots
on-demand-k3s-master-87-184-1733992524
# Restore from the specified snapshot path
k3s server \
--cluster-reset \
--cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/snapshots/on-demand-k3s-master-87-184-1733992524 \
--disable servicelb --disable traefik --disable local-storage --disable metrics-server --disable-cloud-controller
# Start the K3s service
systemctl start k3s
2. On the other master nodes:
# Remove the old etcd data
rm -rf /var/lib/rancher/k3s/server/db/
# Restart the K3s service
systemctl restart k3s
3. Customize the scheduled etcd snapshot policy
K3s enables scheduled snapshots by default, at 00:00 and 12:00 system time, retaining 5 snapshots.
To customize the schedule, add --etcd-snapshot-schedule-cron and --etcd-snapshot-retention on the master nodes to set the crontab expression and the number of snapshots to retain:
[root@k3s-master-87-185 ~]# vim /etc/systemd/system/k3s.service
......
ExecStart=/usr/local/bin/k3s \
server \
'--write-kubeconfig' \
'/root/.kube/config' \
'--kube-proxy-arg=proxy-mode=ipvs' \
'--cluster-cidr' \
'10.77.0.0/16' \
'--service-cidr' \
'10.123.0.0/16' \
'--token' \
'12345' \
'--system-default-registry' \
'registry.cn-hangzhou.aliyuncs.com' \
'server' \
'--disable' \
'servicelb' \
'--disable' \
'traefik' \
'--disable' \
'local-storage' \
'--disable' \
'metrics-server' \
'--disable-cloud-controller' \
'--server' \
'https://172.30.87.184:6443' \
'--etcd-snapshot-schedule-cron' \
'0 */5 * * *' \
'--etcd-snapshot-retention' \
'10'
The above takes a snapshot every 5 hours and retains 10 of them.
7. Uninstall K3s
On master nodes:
k3s-uninstall.sh
On worker nodes:
k3s-agent-uninstall.sh
If you used an external database or an external container runtime, clean those up yourself.
References
- https://docs.k3s.io/installation/configuration
- https://github.com/k3s-io/k3s/releases/tag/v1.30.6%2Bk3s1
- https://docs.rancher.cn/docs/k3s/installation/install-options/_index
- https://kube-vip.io/docs/installation/daemonset/#arp-example-for-daemonset