Building a Highly Available K8s Cluster with kube-vip

Author: 王炳明  Category: Kubernetes  Published: 2021-04-07 10:42

1. Preparation

Create three virtual machines and prepare one usable public IP address.

Hostname   IP                                      OS
master01   192.168.56.201 (public: xx.xx.xx.xx)    CentOS 7.2
master02   192.168.56.202                          CentOS 7.2
master03   192.168.56.203                          CentOS 7.2

2. Installing dependencies

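Create an SNAT rule so that traffic from the 192.168.56.0/24 internal network is masqueraded when it leaves through eth0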

iptables -t nat -A POSTROUTING -s 192.168.56.0/24 -o eth0 -j MASQUERADE
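
Appending the rule with -A each time this step is repeated stacks duplicate entries in the nat table. The following is a minimal sketch of an idempotent variant, assuming the same subnet and interface as above; ensure_snat is a hypothetical helper name, and -C is the iptables option that checks whether a rule already exists.

```shell
# ensure_snat SUBNET IFACE  (hypothetical helper, not part of the original post)
# Adds the MASQUERADE rule only when an identical rule is not already present.
ensure_snat() {
  snat_subnet="$1"
  snat_iface="$2"
  # "iptables -C" exits non-zero when no matching rule exists
  if ! iptables -t nat -C POSTROUTING -s "$snat_subnet" -o "$snat_iface" -j MASQUERADE 2>/dev/null; then
    iptables -t nat -A POSTROUTING -s "$snat_subnet" -o "$snat_iface" -j MASQUERADE
  fi
}

# Usage (as root):
#   ensure_snat 192.168.56.0/24 eth0
```

The rule lives only in kernel memory, so it has to be applied again after a reboot.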

First, install the yum repository file

curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo

Then install net-tools. If the following error is reported

repomd.xml signature could not be verified for Kubernetes

signature verification can be turned off in /etc/yum.repos.d/kubernetes.repo

gpgcheck=0
repo_gpgcheck=0
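
The same change can be scripted instead of edited by hand. This is a small sketch, assuming the repo file already contains both keys; disable_repo_gpgcheck is a hypothetical helper name, and it only rewrites existing lines rather than adding missing ones.

```shell
# disable_repo_gpgcheck REPO_FILE  (hypothetical helper)
# Sets gpgcheck and repo_gpgcheck to 0 wherever those keys already appear.
disable_repo_gpgcheck() {
  sed -i \
    -e 's/^gpgcheck=.*/gpgcheck=0/' \
    -e 's/^repo_gpgcheck=.*/repo_gpgcheck=0/' \
    "$1"
}

# Usage (as root):
#   disable_repo_gpgcheck /etc/yum.repos.d/kubernetes.repo
```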

Synchronize the system clock

ntpdate 0.asia.pool.ntp.org

Load the br_netfilter module and check that the following file contains 1

$ modprobe br_netfilter
$ cat /proc/sys/net/bridge/bridge-nf-call-iptables
1

Create /etc/sysctl.d/k8s.conf with the following content

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
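
Writing the file does not by itself change the running kernel, and a module loaded with modprobe is gone after a reboot. A rough sketch of making both settings stick is shown here; write_k8s_kernel_settings is a hypothetical helper, and the /etc/modules-load.d path in the usage comment is an assumption based on the systemd convention for loading modules at boot.

```shell
# write_k8s_kernel_settings SYSCTL_FILE MODULES_FILE  (hypothetical helper)
# Writes the two bridge sysctls and registers br_netfilter for loading at boot.
write_k8s_kernel_settings() {
  cat > "$1" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
  # systemd-modules-load reads one module name per line from this file
  echo br_netfilter > "$2"
}

# Usage (as root):
#   write_k8s_kernel_settings /etc/sysctl.d/k8s.conf /etc/modules-load.d/k8s.conf
#   sysctl --system   # reload all sysctl configuration files, including the new one
```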

Stop and disable the firewall, and switch SELinux to permissive mode for the current boot

systemctl stop firewalld
systemctl disable firewalld
setenforce 0

Disable SELinux permanently: edit /etc/selinux/config and set the following value

SELINUX=disabled
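
The SELINUX key accepts only enforcing, permissive, or disabled, so a misspelled value is easy to miss when editing by hand. Below is a minimal sketch that validates the value before writing it; set_selinux_mode is a hypothetical helper name.

```shell
# set_selinux_mode CONFIG_FILE MODE  (hypothetical helper)
# Rejects anything other than the three valid modes, then rewrites the SELINUX= line.
set_selinux_mode() {
  case "$2" in
    enforcing|permissive|disabled) ;;
    *) echo "invalid SELinux mode: $2" >&2; return 1 ;;
  esac
  # The pattern ends in "=" so the SELINUXTYPE= line is left untouched
  sed -i "s/^SELINUX=.*/SELINUX=$2/" "$1"
}

# Usage (as root; takes effect after the next reboot):
#   set_selinux_mode /etc/selinux/config disabled
```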

Turn off swap

swapoff -a
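
swapoff -a only affects the running system; any swap entry in /etc/fstab is activated again at the next boot. The sketch below comments such entries out, assuming they are ordinary fstab lines with a standalone swap field; comment_out_swap is a hypothetical helper name.

```shell
# comment_out_swap FSTAB_FILE  (hypothetical helper)
# Prefixes "#" to every uncommented line that has a whitespace-delimited "swap" field.
# Lines that are already commented are not matched, so running it twice is harmless.
comment_out_swap() {
  sed -i -E 's/^([^#].*[[:space:]]swap[[:space:]].*)$/#\1/' "$1"
}

# Usage (as root, after swapoff -a):
#   comment_out_swap /etc/fstab
```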

Create the Docker daemon configuration file

cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": ["https://registry.cn-hangzhou.aliyuncs.com"]
}
EOF

When that is done, restart Docker

systemctl daemon-reload
systemctl restart docker
systemctl enable docker

Install kubelet, kubeadm, and kubectl

yum install -y kubelet-1.20.4 kubeadm-1.20.4 kubectl-1.20.4

3. Building the cluster

On master01, run the following command to generate vip.yaml

docker run --network host --rm plndr/kube-vip:0.2.1 manifest pod --interface eth1 --vip 192.168.56.200 --arp --leaderElection --startAsLeader | sudo tee /etc/kubernetes/manifests/vip.yaml

Then, still on master01, run the following command to initialize the k8s cluster

kubeadm init --control-plane-endpoint "192.168.56.200:6443" \
  --apiserver-advertise-address 192.168.56.201 \
  --apiserver-bind-port 6443 \
  --upload-certs \
  --image-repository registry.aliyuncs.com/google_containers

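When it finishes, it prints two commands. The first joins a master node to the cluster (run it on master02 and master03, setting --apiserver-advertise-address to each node's own IP)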

kubeadm join 192.168.56.200:6443 --token 4unabw.07u5hlx6grqlxywx \
  --discovery-token-ca-cert-hash sha256:efbe93fde432735a5535bed765e7889af4ae535dc4db1f7b960cba56c61c07ac \
  --control-plane --certificate-key 22a7ceeb4b4dc621d3f1ec65634b8d7831f73efdaf81304a78f93b733979a671 \
  --apiserver-advertise-address 192.168.56.203
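
Because --apiserver-advertise-address has to be each node's own address, the command above (which uses 192.168.56.203) cannot be pasted unchanged on both nodes. Here is a small sketch that prints the command for a given node; print_control_plane_join and the TOKEN, CA_HASH, and CERT_KEY variables are hypothetical names, and their values must be the ones printed by kubeadm init.

```shell
# print_control_plane_join NODE_IP  (hypothetical helper)
# Reads TOKEN, CA_HASH and CERT_KEY from the environment and prints the
# control-plane join command for the node whose own address is NODE_IP.
print_control_plane_join() {
  echo "kubeadm join 192.168.56.200:6443 --token $TOKEN" \
       "--discovery-token-ca-cert-hash $CA_HASH" \
       "--control-plane --certificate-key $CERT_KEY" \
       "--apiserver-advertise-address $1"
}

# Usage (values copied from the kubeadm init output):
#   TOKEN=... CA_HASH=sha256:... CERT_KEY=...
#   print_control_plane_join 192.168.56.202   # command to run on master02
```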

The second joins a worker node to the cluster (run it on any worker node). It can be regenerated with kubeadm token create --print-join-command; the token is valid for 24 hours

kubeadm join 192.168.56.200:6443 --token 4unabw.07u5hlx6grqlxywx \
    --discovery-token-ca-cert-hash sha256:efbe93fde432735a5535bed765e7889af4ae535dc4db1f7b960cba56c61c07ac
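
A truncated token or hash is a common copy-and-paste mistake with these long commands. The sketch below checks the shape of both values before they are used; the helper names are hypothetical, and the patterns assume the bootstrap token format (6 characters, a dot, 16 characters) and a SHA-256 hash of 64 hexadecimal digits.

```shell
# is_valid_join_token TOKEN  (hypothetical helper)
# Succeeds when TOKEN looks like "<6 chars>.<16 chars>" of lowercase letters and digits.
is_valid_join_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# is_valid_ca_hash HASH  (hypothetical helper)
# Succeeds when HASH is "sha256:" followed by exactly 64 lowercase hex digits.
is_valid_ca_hash() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

# Usage:
#   is_valid_join_token "4unabw.07u5hlx6grqlxywx" && echo "token format ok"
```

These checks cover format only; whether a token is still valid can be seen with kubeadm token list on a control-plane node.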

After that, run these four commands (on every node)

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile

Start kubelet and enable it at boot

systemctl start kubelet.service
systemctl enable kubelet.service

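Running kubectl get node now shows that all three masters have joined the cluster but are still NotReady, because no network plugin has been deployed yet (covered in the next section)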

[root@master01 ~]# kubectl get nodes
NAME       STATUS     ROLES                  AGE     VERSION
master01   NotReady   control-plane,master   30m     v1.20.4
master02   NotReady   control-plane,master   3m27s   v1.20.4
master03   NotReady   control-plane,master   1m32s   v1.20.4

4. Setting up the network

Prepare flannel.yaml

---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
  - configMap
  - secret
  - emptyDir
  - hostPath
  allowedHostPaths:
  - pathPrefix: "/etc/cni/net.d"
  - pathPrefix: "/etc/kube-flannel"
  - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: registry.cn-beijing.aliyuncs.com/qingfeng666/flannel:v0.13.0
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: registry.cn-beijing.aliyuncs.com/qingfeng666/flannel:v0.13.0
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg

Run the following command to create the resources

kubectl apply -f flannel.yaml

Check the node status again; every node is now Ready

[root@master01 ~]# kubectl get nodes
NAME       STATUS   ROLES                  AGE    VERSION
master01   Ready    control-plane,master   47m    v1.20.4
master02   Ready    control-plane,master   20m    v1.20.4
master03   Ready    control-plane,master   12m    v1.20.4
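
The nodes usually need a short while after the flannel manifest is applied before they turn Ready. Rather than re-running the command by hand, the wait can be scripted. This is a minimal sketch, assuming the column layout of kubectl get nodes --no-headers (name first, status second); all_ready and wait_nodes_ready are hypothetical helper names.

```shell
# all_ready  (hypothetical helper)
# Reads "kubectl get nodes --no-headers" style lines on stdin. Succeeds only if
# there is at least one line and every STATUS column is exactly "Ready"
# (so "NotReady" and "Ready,SchedulingDisabled" both count as not ready).
all_ready() {
  awk '{ n++; if ($2 != "Ready") bad++ } END { if (n > 0 && bad == 0) exit 0; exit 1 }'
}

# wait_nodes_ready ATTEMPTS DELAY_SECONDS COMMAND...  (hypothetical helper)
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY_SECONDS after each failed
# attempt, until its output passes all_ready.
wait_nodes_ready() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" | all_ready; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: poll every 10 seconds, up to 30 times
#   wait_nodes_ready 30 10 kubectl get nodes --no-headers && echo "all nodes Ready"
```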

5. Verifying high availability

5.1 Network architecture

eth0 carries the public IP and eth1 carries the internal IP. The master node that is currently the leader additionally holds the VIP (192.168.56.200) on its interface
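
One way to see which master currently holds the VIP is to look for it in that node's address list. This is a small sketch, assuming the one-line-per-address output of ip -o -4 addr show; holds_vip is a hypothetical helper name.

```shell
# holds_vip VIP  (hypothetical helper)
# Reads "ip -o -4 addr show" output on stdin and succeeds if VIP is assigned.
# The trailing "/" keeps 192.168.56.200 from matching a longer address.
holds_vip() {
  grep -Fq "inet $1/"
}

# Usage (on each master):
#   ip -o -4 addr show dev eth1 | holds_vip 192.168.56.200 && echo "this node holds the VIP"
```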
