
Single-Master Cluster Deployment

Info
  • Minimum node requirement: 1 or 2 nodes.
  • If the Master node goes down, management tools such as kubectl can no longer manage the cluster, but running containers are not affected.
  • A streamlined cluster with two microservice nodes uses the single-Master cluster deployment.

This document deploys a Kubernetes cluster on CentOS 7.9 / Debian 12.

Server IP        Host Role
192.168.10.20    Kubernetes 01 (Master, Node)
192.168.10.21    Kubernetes 02 (Node)

Server Requirements

  • No network policy restrictions between the cluster servers
  • Hostnames must not be duplicated between cluster servers
  • The primary NIC MAC address must not be duplicated [check with ip link]
  • The product_uuid must not be duplicated [check with cat /sys/class/dmi/id/product_uuid]
  • Port 6443, required by the Kubernetes API server, must not already be in use [check with nc -vz 127.0.0.1 6443]
  • Swap must be disabled [run swapoff -a, and also disable the swap mount in /etc/fstab]; a combined pre-check sketch follows this list
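A minimal pre-flight sketch that bundles the checks above; run it as root on every node and compare the values across nodes. The exact commands are suggestions, not part of the required procedure.

    # Pre-flight check: identity values must be unique per node, port 6443 free, swap off
    hostname
    ip link                                      # primary NIC MAC address, must differ per node
    cat /sys/class/dmi/id/product_uuid           # must differ per node
    nc -vz 127.0.0.1 6443 || echo "port 6443 is free"
    swapoff -a                                   # also comment out the swap line in /etc/fstab
    free -m | awk '/Swap/{print "swap in use (MB):", $3}'   # should print 0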

Install the CRI Container Runtime

Perform on every node in the Kubernetes cluster.

  1. Download the Docker installation package

    wget https://pdpublic.mingdao.com/private-deployment/offline/common/docker-27.3.1.tgz
  2. Install Docker

    tar -zxvf docker-27.3.1.tgz
    mv -f docker/* /usr/local/bin/
  3. Create the Docker and containerd configuration directories

    mkdir /etc/docker
    mkdir /etc/containerd
  4. Create the Docker daemon.json file

    cat > /etc/docker/daemon.json <<\EOF
    {
    "registry-mirrors": ["https://uvlkeb6d.mirror.aliyuncs.com"],
    "data-root": "/data/docker",
    "max-concurrent-downloads": 10,
    "exec-opts": ["native.cgroupdriver=cgroupfs"],
    "storage-driver": "overlay2",
    "default-address-pools":[{"base":"172.80.0.0/16","size":24}],
    "insecure-registries": ["127.0.0.1:5000"]
    }
    EOF
  5. Generate the containerd config.toml file and adjust its settings

    containerd config default > /etc/containerd/config.toml
    sed -i 's/SystemdCgroup =.*/SystemdCgroup = true/g' /etc/containerd/config.toml
    sed -i 's#bin_dir =.*#bin_dir = "/usr/local/kubernetes/cni/bin"#' /etc/containerd/config.toml
    sed -i 's#sandbox_image =.*#sandbox_image = "127.0.0.1:5000/pause:3.8"#' /etc/containerd/config.toml
    sed -i 's#^root =.*#root = "/data/containerd"#' /etc/containerd/config.toml
  • Verify the containerd configuration

    grep "SystemdCgroup\|bin_dir\|sandbox_image\|^root =" /etc/containerd/config.toml
    Example output:
    root = "/data/containerd"
    sandbox_image = "127.0.0.1:5000/pause:3.8"
    bin_dir = "/usr/local/kubernetes/cni/bin"
    SystemdCgroup = true
  6. Create the systemd unit file for Docker

    cat > /etc/systemd/system/docker.service <<EOF
    [Unit]
    Description=Docker
    After=network-online.target
    Wants=network-online.target
    Requires=containerd.service
    [Service]
    Type=notify
    ExecStart=/usr/local/bin/dockerd --containerd /var/run/containerd/containerd.sock
    ExecReload=/bin/kill -s HUP \$MAINPID
    LimitNOFILE=1024000
    LimitNPROC=infinity
    LimitCORE=0
    TimeoutStartSec=0
    Delegate=yes
    KillMode=process
    Restart=on-failure
    StartLimitBurst=3
    StartLimitInterval=60s
    [Install]
    WantedBy=multi-user.target
    EOF
  7. Create the systemd unit file for containerd

    cat > /etc/systemd/system/containerd.service <<EOF
    [Unit]
    Description=containerd
    After=network-online.target
    Wants=network-online.target
    [Service]
    Type=notify
    ExecStart=/usr/local/bin/containerd --config /etc/containerd/config.toml
    LimitNOFILE=1024000
    LimitNPROC=infinity
    LimitCORE=0
    TimeoutStartSec=0
    Delegate=yes
    KillMode=process
    Restart=on-failure
    StartLimitBurst=3
    StartLimitInterval=60s
    [Install]
    WantedBy=multi-user.target
    EOF
  8. Start containerd and Docker and enable them at boot

    systemctl daemon-reload && systemctl restart containerd && systemctl enable containerd
    systemctl daemon-reload && systemctl restart docker && systemctl enable docker
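  • Optionally, confirm that both daemons are active and that Docker picked up the daemon.json settings (a quick sanity-check sketch):

    systemctl is-active containerd docker
    # both lines should print "active"
    docker info --format '{{.Driver}} {{.CgroupDriver}} {{.DockerRootDir}}'
    # expected: overlay2 cgroupfs /data/docker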

Install the CNI Plugins

Perform on every node in the Kubernetes cluster.

  1. Download the CNI plugin archive

    wget https://pdpublic.mingdao.com/private-deployment/offline/common/kubernetes-1.25.4/cni-plugins-linux-amd64-v1.1.1.tgz
  2. Create the CNI installation directory

    mkdir -p /usr/local/kubernetes/cni/bin
  3. Extract the CNI plugins into the installation directory

    tar -zxvf cni-plugins-linux-amd64-v1.1.1.tgz -C /usr/local/kubernetes/cni/bin
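  • Optionally, list the installation directory to confirm the extraction; the standard upstream plugin set includes binaries such as bridge, host-local, and portmap:

    ls /usr/local/kubernetes/cni/bin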

Install the Commands Required by the K8S Cluster

Install the crictl/kubeadm/kubelet/kubectl commands; perform on every node in the Kubernetes cluster.

  1. Create the command installation directory

    mkdir -p /usr/local/kubernetes/bin
  2. Download the command binaries into the installation directory

    wget https://pdpublic.mingdao.com/private-deployment/offline/common/kubernetes-1.25.4/crictl-v1.25.0-linux-amd64.tar.gz
    tar -zxvf crictl-v1.25.0-linux-amd64.tar.gz -C /usr/local/kubernetes/bin
    curl -o /usr/local/kubernetes/bin/kubeadm https://pdpublic.mingdao.com/private-deployment/offline/common/kubernetes-1.25.4/kubeadm
    curl -o /usr/local/kubernetes/bin/kubelet https://pdpublic.mingdao.com/private-deployment/offline/common/kubernetes-1.25.4/kubelet
    curl -o /usr/local/kubernetes/bin/kubectl https://pdpublic.mingdao.com/private-deployment/offline/common/kubernetes-1.25.4/kubectl
  3. Make the command binaries executable

    chmod +x /usr/local/kubernetes/bin/*
    chown $(id -un):$(id -gn) /usr/local/kubernetes/bin/*
  4. Configure systemd to manage kubelet

    cat > /etc/systemd/system/kubelet.service <<\EOF
    [Unit]
    Description=kubelet: The Kubernetes Node Agent
    Documentation=https://kubernetes.io/docs/home/
    Wants=network-online.target
    After=network-online.target

    [Service]
    ExecStart=/usr/local/kubernetes/bin/kubelet
    Restart=always
    StartLimitInterval=0
    RestartSec=10

    [Install]
    WantedBy=multi-user.target
    EOF
  5. Configure the kubeadm systemd drop-in for kubelet

    mkdir -p /etc/systemd/system/kubelet.service.d

    cat > /etc/systemd/system/kubelet.service.d/10-kubeadm.conf <<\EOF
    # Note: This dropin only works with kubeadm and kubelet v1.11+
    [Service]
    Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
    Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
    # This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
    EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
    # This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
    # the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
    EnvironmentFile=-/etc/default/kubelet
    ExecStart=
    ExecStart=/usr/local/kubernetes/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
    EOF
  6. Start kubelet and enable it at boot

    systemctl daemon-reload && systemctl restart kubelet && systemctl enable kubelet
    • There is no need to check the service status after this restart; kubelet will be brought up automatically by the later kubeadm init and kubeadm join steps
  7. Add the K8S command directory to the PATH environment variable

    export PATH=/usr/local/kubernetes/bin/:$PATH
    echo 'export PATH=/usr/local/kubernetes/bin/:$PATH' >> /etc/bashrc
  8. Configure crictl to avoid image-pull errors in later steps

    crictl config runtime-endpoint unix:///run/containerd/containerd.sock
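  • crictl config writes the endpoint to /etc/crictl.yaml; an optional way to confirm the setting:

    cat /etc/crictl.yaml        # runtime-endpoint: unix:///run/containerd/containerd.sock
    crictl version              # should report containerd as the runtime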

Install the nerdctl Tool

  1. Download nerdctl

    wget https://pdpublic.mingdao.com/private-deployment/private-deployment/offline/common/kubernetes-1.25.4/nerdctl-1.7.0-linux-amd64.tar.gz
    tar -zxvf nerdctl-1.7.0-linux-amd64.tar.gz
    rm -f containerd-rootless*.sh
    mv nerdctl /usr/local/kubernetes/bin/
  2. Add the alias to the shell environment

    echo 'alias nerdctl="nerdctl -n k8s.io"' >> ~/.bashrc

    source ~/.bashrc
    • If nerdctl -v prints nerdctl version 1.7.0, the installation is working
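  • The alias makes nerdctl operate in the k8s.io namespace, so once containerd is serving the cluster (later steps) its images can be inspected with, for example:

    nerdctl images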

Install Environment Dependencies

Perform on every node in the Kubernetes cluster.

  1. Install the socat and conntrack dependencies

    # CentOS / RedHat: install with yum
    yum install -y socat conntrack-tools

    # Debian / Ubuntu: install with apt
    apt install -y socat conntrack
  2. Check that no command is missing

    docker --version && dockerd --version && pgrep -f 'dockerd' && crictl --version && kubeadm version && kubelet --version && kubectl version --client=true && socat -V | grep 'socat version' && conntrack --version && echo ok || echo error
    • Output ok means everything is in place; output error means a command is missing and must be installed based on the error

Modify Kernel Configuration

Perform on every node in the Kubernetes cluster.

  1. Add kernel modules

    cat > /etc/modules-load.d/kubernetes.conf <<EOF
    overlay
    br_netfilter
    ip_vs
    ip_vs_rr
    ip_vs_wrr
    ip_vs_sh
    EOF
  2. Load the modules

    modprobe overlay
    modprobe br_netfilter
    modprobe ip_vs
    modprobe ip_vs_rr
    modprobe ip_vs_wrr
    modprobe ip_vs_sh
  3. Add kernel parameters

    cat >> /etc/sysctl.conf <<EOF
    net.bridge.bridge-nf-call-iptables = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.ipv4.ip_forward = 1
    vm.max_map_count = 262144

    # MD Config
    net.nf_conntrack_max = 524288
    net.ipv4.tcp_max_tw_buckets = 5000
    net.ipv4.tcp_window_scaling = 1
    net.ipv4.tcp_rmem = 8192 87380 16777216
    net.ipv4.tcp_wmem = 8192 65536 16777216
    net.ipv4.tcp_max_syn_backlog = 32768
    net.core.netdev_max_backlog = 32768
    net.core.netdev_budget = 600
    net.core.somaxconn = 32768
    net.core.wmem_default = 8388608
    net.core.rmem_default = 8388608
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_timestamps = 1
    net.ipv4.tcp_synack_retries = 2
    net.ipv4.tcp_syn_retries = 2
    net.ipv4.tcp_tw_recycle = 0
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_fin_timeout = 2
    net.ipv4.tcp_mem = 8388608 12582912 16777216
    net.ipv4.ip_local_port_range = 1024 65000
    net.ipv4.tcp_max_orphans = 16384
    net.ipv4.tcp_keepalive_intvl = 10
    net.ipv4.tcp_keepalive_probes = 3
    net.ipv4.tcp_keepalive_time = 600
    vm.max_map_count = 262144
    net.netfilter.nf_conntrack_tcp_be_liberal = 0
    net.netfilter.nf_conntrack_tcp_max_retrans = 3
    net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300
    net.netfilter.nf_conntrack_tcp_timeout_established = 86400
    fs.inotify.max_user_watches=10485760
    fs.inotify.max_user_instances=10240
    EOF

    sysctl --system
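  • An optional verification sketch to confirm the modules are loaded and the forwarding parameters took effect:

    lsmod | grep -E 'overlay|br_netfilter|ip_vs'
    sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
    # both sysctl values should be 1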

Prepare the K8S Environment Images

Perform on every node in the Kubernetes cluster.

  1. Load the offline images

    wget https://pdpublic.mingdao.com/private-deployment/offline/common/kubernetes-1.25.4/kubeadm-1.25.4-images.tar.gz
    docker load -i kubeadm-1.25.4-images.tar.gz
  2. Start the local registry, then tag and push the images

    docker run -d -p 5000:5000 --restart always --name registry registry:2
    for i in $(docker images | grep 'registry.k8s.io\|rancher' | awk 'NR!=0{print $1":"$2}');do docker tag $i $(echo $i | sed -e "s/registry.k8s.io/127.0.0.1:5000/" -e "s#coredns/##" -e "s/rancher/127.0.0.1:5000/");done
    for i in $(docker images | grep :5000 | awk 'NR!=0{print $1":"$2}');do docker push $i;done
    docker images | grep :5000
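  • Optionally, query the local registry's catalog to confirm the pushes succeeded (registry:2 exposes the /v2/_catalog API; the exact repository names depend on the loaded image set):

    curl -s http://127.0.0.1:5000/v2/_catalog
    # expected: a JSON repository list including pause, etcd, coredns and the kube-* images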

Master Node Configuration

Perform only on the Kubernetes 01 node.

  1. Initialize the master node

    kubeadm init --cri-socket unix:///var/run/containerd/containerd.sock -v 5 --kubernetes-version=1.25.4 --image-repository=127.0.0.1:5000 --pod-network-cidr=10.244.0.0/16
    • On success, the initialization prints a kubeadm join command; save this output, it is needed later
  2. Modify the usable nodePort port range

    sed -i '/- kube-apiserver/a\ \ \ \ - --service-node-port-range=1024-32767' /etc/kubernetes/manifests/kube-apiserver.yaml
  3. Set the kubeconfig path

    export KUBECONFIG=/etc/kubernetes/admin.conf
    echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> /etc/bashrc
  4. Raise the Pod limit on this node

    echo "maxPods: 300" >> /var/lib/kubelet/config.yaml
    systemctl restart kubelet
  5. Allow the master to schedule workloads

    • Wait roughly two minutes after initializing the master node before running the command below

    • Before running it, check the kubelet service status with systemctl status kubelet and make sure it is running

    kubectl taint node $(kubectl get node | grep control-plane | awk '{print $1}') node-role.kubernetes.io/control-plane:NoSchedule-
    • The expected output of this command is "xxxx untainted"; if the output differs, wait a little longer and run it again to confirm
  6. Install the network plugin

    cat > /usr/local/kubernetes/kube-flannel.yml <<EOF
    ---
    kind: Namespace
    apiVersion: v1
    metadata:
      name: kube-flannel
      labels:
        pod-security.kubernetes.io/enforce: privileged
    ---
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: flannel
    rules:
    - apiGroups:
      - ""
      resources:
      - pods
      verbs:
      - get
    - apiGroups:
      - ""
      resources:
      - nodes
      verbs:
      - list
      - watch
    - apiGroups:
      - ""
      resources:
      - nodes/status
      verbs:
      - patch
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: flannel
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: flannel
    subjects:
    - kind: ServiceAccount
      name: flannel
      namespace: kube-system
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: flannel
      namespace: kube-system
    ---
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: kube-flannel-cfg
      namespace: kube-system
      labels:
        tier: node
        app: flannel
    data:
      cni-conf.json: |
        {
          "name": "cbr0",
          "cniVersion": "0.3.1",
          "plugins": [
            {
              "type": "flannel",
              "delegate": {
                "hairpinMode": true,
                "isDefaultGateway": true
              }
            },
            {
              "type": "portmap",
              "capabilities": {
                "portMappings": true
              }
            }
          ]
        }
      net-conf.json: |
        {
          "Network": "10.244.0.0/16",
          "Backend": {
            "Type": "vxlan"
          }
        }
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: kube-flannel-ds
      namespace: kube-system
      labels:
        tier: node
        app: flannel
    spec:
      selector:
        matchLabels:
          app: flannel
      template:
        metadata:
          labels:
            tier: node
            app: flannel
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                    - linux
          hostNetwork: true
          priorityClassName: system-node-critical
          tolerations:
          - operator: Exists
            effect: NoSchedule
          serviceAccountName: flannel
          initContainers:
          - name: install-cni-plugin
            #image: flannelcni/flannel-cni-plugin:v1.1.0 for ppc64le and mips64le (dockerhub limitations may apply)
            image: 127.0.0.1:5000/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
            command:
            - cp
            args:
            - -f
            - /flannel
            - /opt/cni/bin/flannel
            volumeMounts:
            - name: cni-plugin
              mountPath: /opt/cni/bin
          - name: install-cni
            #image: flannelcni/flannel:v0.20.1 for ppc64le and mips64le (dockerhub limitations may apply)
            image: 127.0.0.1:5000/mirrored-flannelcni-flannel:v0.20.1
            command:
            - cp
            args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
            volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
          containers:
          - name: kube-flannel
            #image: flannelcni/flannel:v0.20.1 for ppc64le and mips64le (dockerhub limitations may apply)
            image: 127.0.0.1:5000/mirrored-flannelcni-flannel:v0.20.1
            command:
            - /opt/bin/flanneld
            args:
            - --ip-masq
            - --kube-subnet-mgr
            resources:
              requests:
                cpu: "100m"
                memory: "50Mi"
              limits:
                cpu: "100m"
                memory: "50Mi"
            securityContext:
              privileged: false
              capabilities:
                add: ["NET_ADMIN", "NET_RAW"]
            env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: EVENT_QUEUE_DEPTH
              value: "5000"
            volumeMounts:
            - name: run
              mountPath: /run/flannel
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
            - name: xtables-lock
              mountPath: /run/xtables.lock
          volumes:
          - name: run
            hostPath:
              path: /run/flannel
          - name: cni-plugin
            hostPath:
              path: /usr/local/kubernetes/cni/bin
          - name: cni
            hostPath:
              path: /etc/cni/net.d
          - name: flannel-cfg
            configMap:
              name: kube-flannel-cfg
          - name: xtables-lock
            hostPath:
              path: /run/xtables.lock
              type: FileOrCreate
    EOF

    kubectl apply -f /usr/local/kubernetes/kube-flannel.yml
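  • After applying the manifest, optionally confirm that the flannel DaemonSet pod is running and that the node turns Ready (the app=flannel label comes from the manifest above):

    kubectl -n kube-system get pod -l app=flannel -o wide
    kubectl get node        # STATUS should become "Ready" once flannel is up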

Worker Node Configuration

Perform on the Kubernetes 02 node.

  1. Join the Kubernetes cluster

    kubeadm join 192.168.10.20:6443 --token 3nwjzw.pdod3r27lnqqhi0x \
    --discovery-token-ca-cert-hash sha256:a84445303a0f8249e7eae3059cb99d46038dc275b2dc2043a022de187a1175a2
    • This command comes from the output of a successful kubeadm init on the master node; the one shown here is only an example and differs for every cluster
    • If it has been lost, run kubeadm token create --print-join-command on the master node to obtain it again
  2. Raise the Pod limit on this node

    echo "maxPods: 300" >> /var/lib/kubelet/config.yaml
    systemctl restart kubelet
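  • Optionally, verify from the master node that the worker joined successfully:

    kubectl get node -o wide    # both nodes should be listed and eventually report "Ready"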

Cluster Status Check

  1. Check node status

    kubectl get pod -n kube-system    # the READY column must show "1/1"
    kubectl get node                  # the STATUS column must show "Ready"
  2. Download the image (required on every microservice node)

    Download the centos:7.9.2009 image in advance and upload it to every server.

    Offline image download link: https://pdpublic.mingdao.com/private-deployment/offline/common/centos7.9.2009.tar.gz

    Load the offline image on each server:

    gunzip -d centos7.9.2009.tar.gz
    ctr -n k8s.io image import centos7.9.2009.tar
  3. On the microservice 01 node only, write the config and start the test containers

    cat > /usr/local/kubernetes/test.yaml <<\EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: test
      namespace: default
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: test
      template:
        metadata:
          labels:
            app: test
          annotations:
            md-update: '20200517104741'
        spec:
          containers:
          - name: test
            image: centos:7.9.2009
            command:
            - sh
            - -c
            - |
              echo $(hostname) > hostname.txt
              python -m SimpleHTTPServer
            resources:
              limits:
                memory: 512Mi
                cpu: 1
              requests:
                memory: 64Mi
                cpu: 0.01
            volumeMounts:
            - name: tz-config
              mountPath: /etc/localtime
          volumes:
          - name: tz-config
            hostPath:
              path: /usr/share/zoneinfo/Etc/GMT-8

    ---

    apiVersion: v1
    kind: Service
    metadata:
      name: test
      namespace: default
    spec:
      selector:
        app: test
      ports:
      - name: external-test
        port: 8000
        targetPort: 8000
        nodePort: 8000
      type: NodePort
    EOF

    kubectl apply -f /usr/local/kubernetes/test.yaml
  4. Check the Pod status

    kubectl get pod -o wide
  5. Test access

    curl 127.0.0.1:8000/hostname.txt
    • Repeated curl calls should normally return hostnames from different pods
  6. If curl requests that land on a container on another node take about one second to return, disable hardware offloading on the flannel.1 network interface (required on every node)

    cat > /etc/systemd/system/disable-offload.service <<\EOF
    [Unit]
    Description=Disable offload for flannel.1
    After=network-online.target flanneld.service

    [Service]
    Type=oneshot
    ExecStartPre=/bin/bash -c 'while [ ! -d /sys/class/net/flannel.1 ]; do sleep 1; done'
    ExecStart=/sbin/ethtool --offload flannel.1 rx off tx off

    [Install]
    WantedBy=multi-user.target
    EOF

    Reload the systemd configuration and start the service

    systemctl daemon-reload
    systemctl enable disable-offload
    systemctl start disable-offload
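  • Optionally, confirm on each node that the service ran and that the offload flags are off (a check sketch; flannel.1 must already exist):

    systemctl status disable-offload --no-pager
    ethtool --show-offload flannel.1 | grep -E 'rx-checksumming|tx-checksumming'
    # both should report "off"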