
Deploy Dedicated Computing Services

Configure Dedicated Computing Server

Joining the Kubernetes Cluster

  1. Deploy the Kubernetes node environment on the dedicated computing server in advance.

  2. Obtain the command to add a new node to the cluster from the Kubernetes master node:

    kubeadm token create --print-join-command
  3. Execute the join cluster command on the dedicated computing server:

    kubeadm join 192.168.1.100:6443 --token 3nwjzw.pdod3r27lnqqhi0x --discovery-token-ca-cert-hash sha256:a84445303xxxxxxxxxxxxxxxxxxx175a2
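
The sha256 value in the join command is the hash of the control plane CA's public key, so it can be cross-checked with openssl if needed. The sketch below shows the derivation; a throwaway self-signed certificate stands in for the cluster's real /etc/kubernetes/pki/ca.crt (an assumption for illustration only):

```shell
# Generate a stand-in CA certificate (in practice, use /etc/kubernetes/pki/ca.crt).
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/ca.key \
  -out /tmp/ca.crt -days 1 -subj "/CN=kubernetes" 2>/dev/null

# Hash the DER-encoded public key, the same derivation kubeadm uses
# for --discovery-token-ca-cert-hash.
hash=$(openssl x509 -pubkey -noout -in /tmp/ca.crt \
  | openssl pkey -pubin -outform der 2>/dev/null \
  | sha256sum | awk '{print $1}')
echo "sha256:$hash"
```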

Download Images

  1. Download the computing instance image (only needs to be done on the dedicated computing server).

    crictl pull registry.cn-hangzhou.aliyuncs.com/mdpublic/mingdaoyun-computinginstance:6.4.0

Add Taint and Label

  1. Obtain the node name of the dedicated computing node:

    kubectl get node
  2. Add taint and label to the dedicated computing node:

    kubectl taint node $your_node_name md=workflowcompute:NoSchedule
    kubectl label node $your_node_name md=workflowcompute
    • Note: replace $your_node_name with the actual node name obtained in the previous step

Deploy Dedicated Computing Services

note
  • Dedicated computing is supported starting from version v5.1.0. When manager version >=5.1.0 is installed for the first time, a default computinginstance.yaml file is generated, so configuring the dedicated computing service usually only requires modifying that existing file.

  • If a manager version <5.1.0 was used for the first deployment, a new computinginstance.yaml file needs to be added and the relevant scripts modified.

  1. Run cat $KUBECONFIG to print the kubeconfig content (this corresponds to the /etc/kubernetes/admin.conf file in Kubernetes):

    cat $KUBECONFIG

    Output sample:

    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0...
        server: https://192.168.0.150:6443
      name: kubernetes
    contexts:
    - context:
        cluster: kubernetes
        user: kubernetes-admin
      name: kubernetes-admin@kubernetes
    current-context: kubernetes-admin@kubernetes
    kind: Config
    preferences: {}
    users:
    - name: kubernetes-admin
      user:
        client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0...
    • In the next step, copy the output of cat $KUBECONFIG into the computinginstance.yaml file; when copying, make sure to add four spaces before each line
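
The "add four spaces before each line" step can also be done mechanically with sed rather than by hand. A minimal sketch, using a short stand-in file instead of the real kubeconfig:

```shell
# Stand-in for the kubeconfig (illustration only; in practice the input
# would be the file $KUBECONFIG points at, e.g. /etc/kubernetes/admin.conf).
printf 'apiVersion: v1\nkind: Config\n' > /tmp/kubeconfig-sample

# Prefix every line with four spaces, ready to paste under the
# kubeconfig.yaml block scalar in computinginstance.yaml.
indented=$(sed 's/^/    /' /tmp/kubeconfig-sample)
echo "$indented"
```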
  2. Modify the /data/mingdao/script/kubernetes/computinginstance.yaml file:

    For the first deployment, make sure to modify the following items in the computinginstance.yaml file:

    • kafka.brokers: Change to the kafka address connected by the microservice
    • syncStatus.mongoUri: Modify to the actual mongodb connection information
    • image: Change to the same computing instance version number as the microservice
    • kubeconfig.yaml: Change to the actual content of $KUBECONFIG, remember to add four spaces before each line of the original $KUBECONFIG content

    Sample Modification:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: computinginstance
      namespace: default
    data:
      config.yaml: |-
        server:
          listen:
            host: "0.0.0.0"
            port: "9157"
        common:
          kubernetes:
            configFile: "/usr/local/computinginstance/kubeconfig.yaml"
            namespace: "default"
            configmapTemplate: "/usr/local/computinginstance/configmap-workflowcompute.yaml"
            deploymentTemplate: "/usr/local/computinginstance/deployment-workflowcompute.yaml"
          kafka:
            brokers: "192.168.0.144:9092" # Modify to the kafka address connected by the microservice
            workflowTopicPrefix: "WorkFlow-"
            workSheetTopicPrefix: "WorkSheet-"
            workflowConsumerIdPrefix: "md-workflow-consumer-"
            replicationFactor: 1
            deleteTopic: true
          callback:
            url: "http://computingschedule:9158"
            createInterval: 120000 # ms
            deleteInterval: 120000 # ms
          syncStatus:
            mongoUri: "mongodb://mingdao:123456@192.168.0.144:27017/mdIdentification" # Modify to the actual mongodb connection information
            interval: 30000 # ms
          model:
            10:
              replicas: 1 # Number of instances
              thread: 2 # Number of threads per type, total = 5*thread*replicas = 10
              cpu: 2 # Maximum CPU cores per instance, total = cpu*replicas = 2
              memory: 4 # Maximum memory in GB per instance, total = memory*replicas = 4
            20:
              replicas: 1
              thread: 4
              cpu: 4
              memory: 8
            50:
              replicas: 1
              thread: 10
              cpu: 8
              memory: 16
            100:
              replicas: 1
              thread: 20
              cpu: 16
              memory: 32

      configmap-workflowcompute.yaml: |-
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: workflowcompute
        data:
          application-www-computing.properties: |-
            md.resource.consumer.config.maps={\
            'resourceId': 'CONFIGMAP_INSTANCEID', \
            'wfTopic': 'WorkFlow-CONFIGMAP_INSTANCEID', \
            'wsTopic': 'WorkSheet-CONFIGMAP_INSTANCEID', \
            'partition': 'CONFIGMAP_WORKFLOW_PARTITION' \
            }
            md.kafka.consumer.topic=WorkSheet-CONFIGMAP_INSTANCEID
            md.kafka.consumer.group.id=md-workflow-consumer-CONFIGMAP_INSTANCEID
            md.kafka.consumer.concurrency=CONFIGMAP_WORKFLOW_THREAD
            md.kafka.batch.topic=WorkSheet-CONFIGMAP_INSTANCEID
            md.kafka.batch.concurrency=CONFIGMAP_WORKFLOW_THREAD
            spring.kafka.batch.topic=WorkFlow-CONFIGMAP_INSTANCEID
            spring.kafka.batch.concurrency=CONFIGMAP_WORKFLOW_THREAD
            spring.kafka.button.topic=WorkFlow-CONFIGMAP_INSTANCEID
            spring.kafka.button.concurrency=CONFIGMAP_WORKFLOW_THREAD
            spring.kafka.process.topic=WorkFlow-CONFIGMAP_INSTANCEID
            spring.kafka.process.concurrency=CONFIGMAP_WORKFLOW_THREAD
            spring.kafka.consumer.topic=WorkFlow-CONFIGMAP_INSTANCEID
            spring.kafka.consumer.group.id=md-workflow-consumer-CONFIGMAP_INSTANCEID
            spring.kafka.consumer.concurrency=CONFIGMAP_WORKFLOW_THREAD
            spring.kafka.properties.partition=CONFIGMAP_WORKFLOW_PARTITION
            grpc.client.MDWorksheetService.address=static://127.0.0.1:9422
            spring.kafka.router.topic=WorkFlow-CONFIGMAP_INSTANCEID

      deployment-workflowcompute.yaml: |-
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: workflowcompute-DEPLOYMENT_INSTANCEID
          labels:
            app: workflowcompute-DEPLOYMENT_INSTANCEID
            md-service: workflowcompute
        spec:
          replicas: DEPLOYMENT_REPLICAS
          selector:
            matchLabels:
              app: workflowcompute-DEPLOYMENT_INSTANCEID
          template:
            metadata:
              labels:
                app: workflowcompute-DEPLOYMENT_INSTANCEID
                md-service: workflowcompute
              annotations:
                md-update: '20231228184263'
            spec:
              imagePullSecrets:
              - name: hub.mingdao.com
              tolerations:
              - key: "md"
                operator: "Equal"
                value: "workflowcompute"
                effect: "NoSchedule"
              nodeSelector:
                md: workflowcompute
              containers:
              - name: workflow-consumer
                image: registry.cn-hangzhou.aliyuncs.com/mdpublic/mingdaoyun-computinginstance:6.4.0 # computinginstance image version
                env:
                - name: ENV_SERVERID
                  value: "single:workflowconsumer"
                command:
                - sh
                - -c
                - |
                  cp /usr/local/MDPrivateDeployment/workflowconsumer/application-www-computing.properties.template /usr/local/MDPrivateDeployment/workflowconsumer/application-www-computing.properties
                  sed -i s/CONFIGMAP_INSTANCEID/DEPLOYMENT_INSTANCEID/g /usr/local/MDPrivateDeployment/workflowconsumer/application-www-computing.properties
                  sed -i s/CONFIGMAP_WORKFLOW_THREAD/DEPLOYMENT_WORKFLOW_THREAD/g /usr/local/MDPrivateDeployment/workflowconsumer/application-www-computing.properties
                  sed -i s/CONFIGMAP_WORKFLOW_PARTITION/DEPLOYMENT_WORKFLOW_PARTITION/g /usr/local/MDPrivateDeployment/workflowconsumer/application-www-computing.properties
                  cat /usr/local/MDPrivateDeployment/workflowconsumer/application-www-computing.properties
                  sleep 20
                  exec /Housekeeper/main -config /Housekeeper/config.yaml
                resources:
                  limits:
                    memory: DEPLOYMENT_MEMORYGi
                    cpu: DEPLOYMENT_CPU
                  requests:
                    memory: 1Gi
                    cpu: 0.25
                volumeMounts:
                - name: tz-config
                  mountPath: /etc/localtime
                - name: workflowcompute
                  mountPath: /usr/local/MDPrivateDeployment/workflowconsumer/application-www-computing.properties.template
                  subPath: application-www-computing.properties
              - name: worksheetservice
                image: registry.cn-hangzhou.aliyuncs.com/mdpublic/mingdaoyun-computinginstance:6.4.0 # computinginstance image version
                env:
                - name: ENV_SERVERID
                  value: "single:worksheet"
                command:
                - sh
                - -c
                - |
                  cat /usr/local/MDPrivateDeployment/worksheet/Config/appsettingsMain.json
                  exec /Housekeeper/main -config /Housekeeper/config.yaml
                resources:
                  limits:
                    memory: DEPLOYMENT_MEMORYGi
                    cpu: DEPLOYMENT_CPU
                  requests:
                    memory: 1Gi
                    cpu: 0.25
                readinessProbe:
                  tcpSocket:
                    port: 9422
                  initialDelaySeconds: 10
                  periodSeconds: 10
                livenessProbe:
                  tcpSocket:
                    port: 9422
                  initialDelaySeconds: 60
                  periodSeconds: 10
                volumeMounts:
                - name: tz-config
                  mountPath: /etc/localtime
              volumes:
              - name: tz-config
                hostPath:
                  path: /usr/share/zoneinfo/Etc/GMT-8
              - name: workflowcompute
                configMap:
                  name: workflowcompute
                  items:
                  - key: application-www-computing.properties
                    path: application-www-computing.properties

      kubeconfig.yaml: |- # Change to the actual content of $KUBECONFIG below; remember to add four spaces before each line of the original $KUBECONFIG content
        apiVersion: v1
        clusters:
        - cluster:
            certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0...
            server: https://192.168.0.150:6443
          name: kubernetes
        contexts:
        - context:
            cluster: kubernetes
            user: kubernetes-admin
          name: kubernetes-admin@kubernetes
        current-context: kubernetes-admin@kubernetes
        kind: Config
        preferences: {}
        users:
        - name: kubernetes-admin
          user:
            client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0...
            client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQ...
    ---

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: computinginstance
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: computinginstance
      template:
        metadata:
          labels:
            app: computinginstance
            dir: grpc
          annotations:
            md-update: "20231228184309"
        spec:
          imagePullSecrets:
          - name: hub.mingdao.com
          containers:
          - name: computinginstance
            image: registry.cn-hangzhou.aliyuncs.com/mdpublic/mingdaoyun-computinginstance:6.4.0 # computinginstance image version
            resources:
              limits:
                cpu: "2"
                memory: 2G
              requests:
                cpu: "0.01"
                memory: 128M
            readinessProbe:
              tcpSocket:
                port: 9157
              initialDelaySeconds: 3
              periodSeconds: 3
            livenessProbe:
              tcpSocket:
                port: 9157
              initialDelaySeconds: 20
              periodSeconds: 10
            volumeMounts:
            - name: configfile
              mountPath: /usr/local/computinginstance/config.yaml
              subPath: config.yaml
            - name: configfile
              mountPath: /usr/local/computinginstance/configmap-workflowcompute.yaml
              subPath: configmap-workflowcompute.yaml
            - name: configfile
              mountPath: /usr/local/computinginstance/deployment-workflowcompute.yaml
              subPath: deployment-workflowcompute.yaml
            - name: configfile
              mountPath: /usr/local/computinginstance/kubeconfig.yaml
              subPath: kubeconfig.yaml
            - name: tz-config
              mountPath: /etc/localtime
          volumes:
          - name: tz-config
            hostPath:
              path: /usr/share/zoneinfo/Asia/Shanghai
          - name: configfile
            configMap:
              name: computinginstance
              items:
              - key: config.yaml
                path: config.yaml
              - key: configmap-workflowcompute.yaml
                path: configmap-workflowcompute.yaml
              - key: deployment-workflowcompute.yaml
                path: deployment-workflowcompute.yaml
              - key: kubeconfig.yaml
                path: kubeconfig.yaml

    ---

    apiVersion: v1
    kind: Service
    metadata:
      name: computinginstance
      namespace: default
    spec:
      selector:
        app: computinginstance
      ports:
      - name: http-computinginstance
        port: 9157
        targetPort: 9157
        nodePort: 9157
      type: NodePort
  3. Restart the service:

    cd /data/mingdao/script/kubernetes/
    bash restart.sh
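
The capacity comments in the model section of config.yaml follow simple formulas: total threads = 5 × thread × replicas, total CPU = cpu × replicas, total memory = memory × replicas. A small sketch, using a hypothetical helper name, makes the arithmetic explicit:

```shell
# Hypothetical helper: print the total capacity of one model tier,
# per the formulas noted in the config.yaml comments.
model_totals() {
  replicas=$1; thread=$2; cpu=$3; memory=$4
  echo "threads=$((5 * thread * replicas)) cpu=$((cpu * replicas)) memory=$((memory * replicas))Gi"
}

model_totals 1 2 2 4    # model 10 -> threads=10 cpu=2 memory=4Gi
model_totals 1 4 4 8    # model 20 -> threads=20 cpu=4 memory=8Gi
```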

Verification

In the default namespace, list all Deployments whose names start with workflowcompute- or are named computinginstance, and print the image each one runs:

ns="default"

deployments=$(kubectl -n $ns get deployments -o jsonpath='{.items[*].metadata.name}' | tr ' ' '\n' | grep '^workflowcompute-\|computinginstance')

for deploy in $deployments
do
  echo "Namespace: $ns, Deployment: $deploy"

  # Retrieve and print the deployment images
  kubectl -n $ns get deployment "$deploy" -o jsonpath='{.spec.template.spec.containers[*].image}'
  echo
done

Output example:

Namespace: default, Deployment: computinginstance
registry.cn-hangzhou.aliyuncs.com/mdpublic/mingdaoyun-computinginstance:6.4.0
Namespace: default, Deployment: workflowcompute-7e4wf0fea4ho
registry.cn-hangzhou.aliyuncs.com/mdpublic/mingdaoyun-computinginstance:6.4.0
  • Deployments starting with workflowcompute- exist only after a dedicated compute resource has been created in the UI

Check whether taints have taken effect

kubectl get pod -o wide | grep workflowcompute-
  • Normally, the containers should be running on the dedicated compute nodes

Check Kafka

/usr/local/kafka/bin/kafka-consumer-groups.sh --bootstrap-server ${ENV_KAFKA_ENDPOINTS:=127.0.0.1:9092} --describe --group md-workflow-consumer-7e4wf0fea4ho
  • Here, resource ID 7e4wf0fea4ho is used as an example
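
The group name passed to kafka-consumer-groups.sh is simply the workflowConsumerIdPrefix from config.yaml followed by the resource ID, so it can be derived for any dedicated compute resource:

```shell
# Build the consumer group name for a given dedicated compute resource ID,
# using the md-workflow-consumer- prefix configured in config.yaml.
resource_id="7e4wf0fea4ho"                   # example resource ID from above
group="md-workflow-consumer-${resource_id}"
echo "$group"                                # md-workflow-consumer-7e4wf0fea4ho
```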