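A complete, practical build/rebuild workflow for Redis Cluster on Kubernetes
1. Overall Architecture
StatefulSet (6 replicas)
↓
Headless Service
↓
ConfigMap (scripts + redis.conf)
↓
PVC (one per Pod)
↓
Bitnami Redis Cluster automatic initialization
2. Full Deployment Steps (building from scratch)
🔥 Step 1: Create the Headless Service (required)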
apiVersion: v1
kind: Service
metadata:
  name: ape-redis-redis-cluster-headless
  namespace: <ns>
spec:
  clusterIP: None
  selector:
    app: redis-cluster
  ports:
    - name: redis
      port: 6379
    - name: cluster
      port: 16379
👉 Purpose:
- Gives every Pod a stable DNS name of the form <pod-name>.<service-name>, for example:
ape-redis-redis-cluster-0.ape-redis-redis-cluster-headless
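To make the naming rule concrete, here is a minimal sketch that prints the DNS name each Pod would get. The helper name pod_dns_names is hypothetical; the StatefulSet and Service names are the ones used in this article, and the names are relative to the Pod's own namespace.

```shell
#!/bin/bash
# Print the stable DNS name of each Pod behind the headless Service.
# Pattern: <statefulset-name>-<ordinal>.<service-name>
# pod_dns_names is a hypothetical helper, not part of kubectl or Bitnami.
pod_dns_names() {
  local sts="$1" svc="$2" replicas="$3"
  local i
  for ((i = 0; i < replicas; i++)); do
    echo "${sts}-${i}.${svc}"
  done
}

pod_dns_names ape-redis-redis-cluster ape-redis-redis-cluster-headless 6
```

Any of these names can be passed to redis-cli -h from another Pod in the same namespace.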
🔥 Step 2: Create the ConfigMaps (core)
2.1 scripts (the most critical part)
apiVersion: v1
kind: ConfigMap
metadata:
  name: ape-redis-redis-cluster-scripts
data:
  entrypoint.sh: |
    #!/bin/bash
    set -e
    pod_index=$(echo "$POD_NAME" | tr "-" "\n" | tail -1)
    if [[ "$pod_index" == "0" ]]; then
      export REDIS_CLUSTER_CREATOR="yes"
      export REDIS_CLUSTER_REPLICAS="1"
    fi
    exec /opt/bitnami/scripts/redis-cluster/run.sh
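The ordinal extraction in entrypoint.sh can be tried outside the cluster. The sketch below uses bash parameter expansion instead of the tr/tail pipeline, which should give the same result for StatefulSet Pod names; the function names pod_ordinal and is_cluster_creator are hypothetical.

```shell
#!/bin/bash
# Extract the StatefulSet ordinal: strip everything up to the last "-".
pod_ordinal() {
  echo "${1##*-}"
}

# Mirror the creator check from entrypoint.sh: only ordinal 0 creates the cluster.
is_cluster_creator() {
  [[ "$(pod_ordinal "$1")" == "0" ]]
}

for name in ape-redis-redis-cluster-0 ape-redis-redis-cluster-5; do
  if is_cluster_creator "$name"; then
    echo "$name: creator"
  else
    echo "$name: regular node"
  fi
done
```

The comparison is a string match on "0", so an ordinal such as 10 does not count as the creator.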
2.2 Redis configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: ape-redis-redis-cluster-default
data:
  redis-default.conf: |
    cluster-enabled yes
    cluster-node-timeout 5000
    appendonly yes
    protected-mode no
🔥 Step 3: Create the StatefulSet (the core controller)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ape-redis-redis-cluster
spec:
  serviceName: ape-redis-redis-cluster-headless
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
        - name: redis
          image: registry.tce.com/ape_team/redis-cluster:7.0.11
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: REDIS_CLUSTER_DYNAMIC_IPS
              value: "yes"
          command: ["/bin/bash", "-c"]
          args:
            - /scripts/entrypoint.sh
          ports:
            - containerPort: 6379
            - containerPort: 16379
          volumeMounts:
            - name: scripts
              mountPath: /scripts
            - name: redis-data
              mountPath: /bitnami/redis/data
            - name: default-config
              mountPath: /opt/bitnami/redis/etc/redis-default.conf
              subPath: redis-default.conf
      volumes:
        - name: scripts
          configMap:
            name: ape-redis-redis-cluster-scripts
            # ConfigMap files are mounted with mode 0644 by default;
            # entrypoint.sh must be executable to be run directly.
            defaultMode: 0755
        - name: default-config
          configMap:
            name: ape-redis-redis-cluster-default
  volumeClaimTemplates:
    - metadata:
        name: redis-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: tceinf-csi-loopdevice
        resources:
          requests:
            storage: 8Gi
🔥 Step 4: Deploy
kubectl apply -f redis-all.yaml
🔥 Step 5: Verify the cluster
kubectl get pod
kubectl logs ape-redis-redis-cluster-0
Open a shell in the Pod:
kubectl exec -it ape-redis-redis-cluster-0 -- redis-cli
cluster info
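For scripted checks, the same verification can be reduced to a single pass/fail test on the cluster_state field of the CLUSTER INFO output. This is a minimal sketch; cluster_is_ok is a hypothetical helper, and the sample input below is hand-written rather than captured from a real cluster.

```shell
#!/bin/bash
# Read CLUSTER INFO output on stdin and succeed only if cluster_state is ok.
# The reply uses \r\n line endings, so strip carriage returns before matching.
cluster_is_ok() {
  tr -d '\r' | grep -qx 'cluster_state:ok'
}

# Demo with hand-written sample input.
printf 'cluster_state:ok\r\ncluster_slots_assigned:16384\r\n' | cluster_is_ok \
  && echo "cluster healthy"

# Against a live cluster (not executed here):
#   kubectl exec ape-redis-redis-cluster-0 -- redis-cli cluster info | cluster_is_ok
```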
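3. Rebuilding the Cluster (the scenario you just went through)
✅ Scenario 1: Full rebuild (recommended)
# 1. Stop the service (scale to zero)
kubectl scale sts ape-redis-redis-cluster --replicas=0
# 2. Delete the PVCs (this discards all data and nodes.conf)
kubectl delete pvc -l app=redis-cluster
# 3. Start again
kubectl scale sts ape-redis-redis-cluster --replicas=6
✅ Scenario 2: Recreate only the Pods (no data loss)
kubectl delete pod -l app=redis-cluster
✅ Scenario 3: Configuration change
kubectl apply -f configmap.yaml
kubectl rollout restart sts ape-redis-redis-cluster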
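The full rebuild from Scenario 1 can be wrapped in a script that waits between steps, so the PVCs are not deleted while Pods are still terminating. This is a sketch under assumptions: the run and rebuild_cluster helpers and the timeout values are hypothetical, and DRY_RUN defaults to 1 so the script only prints the commands. Depending on the kubectl version, kubectl wait --for=delete may report an error when no Pods match.

```shell
#!/bin/bash
# Full rebuild with waits between steps.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
STS="ape-redis-redis-cluster"
SELECTOR="app=redis-cluster"
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [[ "$DRY_RUN" == "1" ]]; then
    echo "+ $*"
  else
    "$@"
  fi
}

rebuild_cluster() {
  run kubectl scale sts "$STS" --replicas=0
  # Wait until every Pod is gone so no PVC is still in use.
  run kubectl wait --for=delete pod -l "$SELECTOR" --timeout=120s
  # Destructive: removes all Redis data and the stored nodes.conf.
  run kubectl delete pvc -l "$SELECTOR"
  run kubectl scale sts "$STS" --replicas=6
  run kubectl rollout status sts "$STS" --timeout=300s
}

rebuild_cluster
```

Run it once in dry-run mode to review the commands, then set DRY_RUN=0 to execute them.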
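4. Key Principles (must understand)
✔ Why is pod-0 special?
if [[ "$pod_index" == "0" ]]
👉 pod-0:
creates the cluster
assigns the slots
joins the other nodes to the cluster
✔ Why is a StatefulSet required?
Because Redis Cluster depends on:
stable node names
ordered startup
dedicated storage per node
✔ Why can a PVC break the cluster?
The PVC contains:
nodes.conf
👉 nodes.conf records each node's ID and address. As soon as:
a Pod is recreated
and its IP changes
👉 the addresses stored in nodes.conf no longer match the running Pods, and the cluster fails to come up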
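One way to confirm this failure mode is to compare the addresses stored in nodes.conf with the current Pod IPs. The sketch below extracts the node ID and IP from nodes.conf-formatted lines; node_addresses is a hypothetical helper, the sample line uses a shortened placeholder ID and a made-up IP, and the file path in the usage comment assumes nodes.conf lives in the mounted data directory.

```shell
#!/bin/bash
# Print "<node-id> <ip>" for every node line read from stdin (nodes.conf format).
# Field 2 looks like ip:port@cport; the trailing "vars ..." line is skipped
# because it has fewer than 8 fields.
node_addresses() {
  awk 'NF >= 8 && $1 != "vars" { split($2, a, ":"); print $1, a[1] }'
}

# Demo with a made-up sample (real node IDs are 40 hex characters).
printf '%s\n' \
  'aaaa1111 10.0.0.11:6379@16379 myself,master - 0 0 1 connected 0-5460' \
  'vars currentEpoch 6 lastVoteEpoch 0' | node_addresses

# Against a live Pod (not executed here; the path is an assumption):
#   kubectl exec ape-redis-redis-cluster-0 -- cat /bitnami/redis/data/nodes.conf | node_addresses
#   kubectl get pod -l app=redis-cluster -o wide
```

If the IPs printed here differ from the ones kubectl get pod -o wide reports, the stored cluster topology is stale.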
5. Production Tuning Recommendations (you will need these)
🔥 Required setting
- name: REDIS_CLUSTER_DYNAMIC_IPS
  value: "yes"
🔥 Health checks (relax the thresholds)
timeoutSeconds: 5
periodSeconds: 10
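The timings above only matter once a probe has a command to run. A minimal sketch of a check an exec probe could call is shown below; redis_ping_ok and the REDIS_CLI override are hypothetical, and it assumes Redis listens on 127.0.0.1:6379 without a password.

```shell
#!/bin/bash
# Succeed only when the local Redis answers PING with PONG.
# REDIS_CLI can be overridden (for example in tests); it defaults to redis-cli.
REDIS_CLI="${REDIS_CLI:-redis-cli}"

redis_ping_ok() {
  local reply
  reply="$("$REDIS_CLI" -h 127.0.0.1 -p 6379 ping 2>/dev/null)" || return 1
  [[ "$reply" == "PONG" ]]
}

# In a probe script, the last line would simply be:
#   redis_ping_ok
```

The probe's timeoutSeconds then bounds how long a hung redis-cli call is allowed to take.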
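🔥 Avoid loopdevice storage (which is what you are using now)
👉 Switch to one of:
local disks (local PV)
cloud disks
🔥 Backup strategy
Take RDB snapshots with:
BGSAVE
or keep AOF persistence enabled:
appendonly yes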
This work is licensed under a CC license; reposts must credit the author and link to this article.