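In the previous article I covered the Pod lifecycle and the concept of probes. In this article I want to try out, through hands-on experiments, the k8s objects that manage the Pod lifecycle, because in a real production environment they are indispensable for keeping services stable.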

1 Overview

A k8s Pod supports four kinds of probes and hooks that can be attached to the lifecycle of a Pod (container). Of these, the liveness and readiness probes are the two that every Service should have.

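resource.spec.containers.lifecycle.postStart   # hook run right after the container is created

resource.spec.containers.lifecycle.preStop     # hook run right before the container is terminated

resource.spec.containers.readinessProbe      # checks whether the container is ready to serve requests

resource.spec.containers.livenessProbe        # checks whether the container is still healthy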
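
To make the four field paths above concrete, here is a minimal sketch of where they sit inside a container definition. The Pod name, container name, and the commands are placeholders chosen for illustration and do not come from the experiments below.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-skeleton-pod      # hypothetical name
  namespace: default
spec:
  containers:
    - name: app                     # hypothetical name
      image: busybox:latest
      command: ["/bin/sh", "-c", "touch /tmp/healthy; sleep 3600"]
      lifecycle:
        postStart:                  # spec.containers.lifecycle.postStart
          exec:
            command: ["/bin/sh", "-c", "echo started > /tmp/started"]
        preStop:                    # spec.containers.lifecycle.preStop
          exec:
            command: ["/bin/sh", "-c", "echo stopping > /tmp/stopping"]
      readinessProbe:               # spec.containers.readinessProbe
        exec:
          command: ["test", "-e", "/tmp/healthy"]
      livenessProbe:                # spec.containers.livenessProbe
        exec:
          command: ["test", "-e", "/tmp/healthy"]
```

All four fields are optional and independent of each other, so a container can carry any combination of them.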

2 Container Probes

A probe is a diagnostic that kubelet runs periodically against a container. To perform it, kubelet calls a handler defined for the container. Current k8s versions provide three handlers:

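  • ExecAction: runs a command inside the container. The check passes if the command exits with status code 0.
  • TCPSocketAction: runs a TCP check against the container's IP address on a given port. The check passes if the port is open.
  • HTTPGetAction: sends an HTTP GET request to the container's IP address on a given port and path. The check passes if the response status code is greater than or equal to 200 and less than 400.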

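To set up a probe on a container, use the corresponding fields of the Pod spec (kubectl explain lists them). For example, pod.spec.containers.readinessProbe and pod.spec.containers.livenessProbe each accept one of three fields, exec, httpGet and tcpSocket, which correspond one-to-one to ExecAction, HTTPGetAction and TCPSocketAction.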
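
As an illustration of the three fields side by side, the sketch below uses each handler once. It reuses the two images from the experiments in this article; the Pod and container names are made up, and the combination of probes is only meant to show the syntax, not a recommended setup.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-handlers-pod          # hypothetical name
  namespace: default
spec:
  containers:
    - name: web                     # hypothetical name
      image: ikubernetes/myapp:v1
      ports:
        - name: http
          containerPort: 80
      livenessProbe:
        tcpSocket:                  # TCPSocketAction: passes if the port accepts a connection
          port: http
      readinessProbe:
        httpGet:                    # HTTPGetAction: passes on a status code from 200 to 399
          port: http
          path: /index.html
    - name: sidecar                 # hypothetical name
      image: busybox:latest
      command: ["/bin/sh", "-c", "touch /tmp/healthy; sleep 3600"]
      livenessProbe:
        exec:                       # ExecAction: passes if the command exits with 0
          command: ["test", "-e", "/tmp/healthy"]
```

Each probe takes exactly one handler, so exec, httpGet and tcpSocket cannot be combined inside the same livenessProbe or readinessProbe.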

Each probe has one of three possible results:

  • Success: the container passed the diagnostic.
  • Failure: the container failed the diagnostic.
  • Unknown: the diagnostic itself failed, so no action is taken.

3 Implementing a liveness exec probe

Here is a simple configuration:

apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-pod
  namespace: default
spec:
  containers:
    - name: liveness-exec-container
      image: busybox:latest
      imagePullPolicy: IfNotPresent
      command: ["/bin/sh","-c","touch /tmp/healthy; sleep 20; rm -rf /tmp/healthy; sleep 500"]
      livenessProbe:
        exec:
          command: ["test","-e","/tmp/healthy"]
        initialDelaySeconds: 2
        periodSeconds: 3
  restartPolicy: Always

* initialDelaySeconds: 2 -> the livenessProbe is activated 2 seconds after the container has started

* periodSeconds: 3 -> the probe runs every 3 seconds
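
Besides these two settings, a probe has a few more tuning fields. The kubectl describe output in this section reports timeout=1s, #success=1 and #failure=3 for this Pod, which are the values k8s applied because the manifest did not set them. The fragment below is a sketch that spells those values out explicitly; it belongs under a container entry and is not a complete manifest.

```yaml
livenessProbe:
  exec:
    command: ["test", "-e", "/tmp/healthy"]
  initialDelaySeconds: 2    # wait 2 seconds after container start before the first probe
  periodSeconds: 3          # probe every 3 seconds
  timeoutSeconds: 1         # a probe taking longer than 1 second counts as failed
  successThreshold: 1       # one success marks the probe as passing again
  failureThreshold: 3       # three consecutive failures trigger a container restart
```

With these numbers, the restart happens roughly failureThreshold x periodSeconds (about 9 seconds) after the first failed probe.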

 

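First create the new Pod with kubectl create -f liveness-exec-pod, then use kubectl describe pod liveness-exec-pod to inspect it. The output shows that the liveness probe starts failing once /tmp/healthy has been removed (about 20 seconds after start), so kubelet kills the container and restarts it, and the cycle repeats.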

[root@k8smaster learning-kubernetes]# kubectl describe pod liveness-exec-pod
Name:               liveness-exec-pod
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               k8snode1/172.16.0.12
Start Time:         Fri, 04 Jan 2019 05:01:00 +0100
Labels:             <none>
Annotations:        <none>
Status:             Running
IP:                 10.244.1.25
Containers:
  liveness-exec-container:
    Container ID:  docker://c4d0b086b10426612603315f9dd5af739896d84f8f19a3a4715ed63059e39e3e
    Image:         busybox:latest
    Image ID:      docker-pullable://busybox@sha256:7964ad52e396a6e045c39b5a44438424ac52e12e4d5a25d94895f2058cb863a0
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
      touch /tmp/healthy; sleep 20; rm -rf /tmp/healthy; sleep 500
    State:          Running
      Started:      Fri, 04 Jan 2019 05:02:00 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Fri, 04 Jan 2019 05:01:01 +0100
      Finished:     Fri, 04 Jan 2019 05:01:59 +0100
    Ready:          True
    Restart Count:  1
    Liveness:       exec [test -e /tmp/healthy] delay=2s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rxs5t (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-rxs5t:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-rxs5t
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  100s                 default-scheduler  Successfully assigned default/liveness-exec-pod to k8snode1
  Normal   Pulled     72s (x2 over 2m10s)  kubelet, k8snode1  Container image "busybox:latest" already present on machine
  Normal   Created    72s (x2 over 2m10s)  kubelet, k8snode1  Created container
  Normal   Killing    72s                  kubelet, k8snode1  Killing container with id docker://liveness-exec-container:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Started    71s (x2 over 2m10s)  kubelet, k8snode1  Started container
  Warning  Unhealthy  45s (x6 over 108s)   kubelet, k8snode1  Liveness probe failed:

 

4 Implementing a liveness httpGet probe

Use the following configuration:

apiVersion: v1
kind: Pod
metadata:
  name: liveness-httpget-pod
  namespace: default
spec:
  containers:
    - name: liveness-httpget-container
      image: ikubernetes/myapp:v1
      imagePullPolicy: IfNotPresent
      ports:
        - name: http
          containerPort: 80
      livenessProbe:
        httpGet:
          port: http
          path: /index.html
        initialDelaySeconds: 1
        periodSeconds: 3

The commands to create the Pod and fetch its details are as follows:

[root@k8smaster probe_and_hook]# kubectl create -f liveness-httpget.yaml 
pod/liveness-httpget-pod created
[root@k8smaster probe_and_hook]# kubectl get pod
NAME                     READY   STATUS    RESTARTS   AGE
liveness-httpget-pod     1/1     Running   0          20s
nginx-79976cbb47-8dqnk   1/1     Running   0          14h
nginx-79976cbb47-p247g   1/1     Running   0          14h
nginx-79976cbb47-ppbqv   1/1     Running   0          14h
[root@k8smaster probe_and_hook]# kubectl describe pod liveness-httpget-pod
Name:               liveness-httpget-pod
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               k8snode1/172.16.0.12
Start Time:         Fri, 04 Jan 2019 05:17:59 +0100
Labels:             <none>
Annotations:        <none>
Status:             Running
IP:                 10.244.1.26
Containers:
  liveness-httpget-container:
    Container ID:   docker://2e211c471130073ae92d92ab9981857bc3e35d1d96009316a1d8660246c16dac
    Image:          ikubernetes/myapp:v1
    Image ID:       docker-pullable://ikubernetes/myapp@sha256:9c3dc30b5219788b2b8a4b065f548b922a34479577befb54b03330999d30d513
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Fri, 04 Jan 2019 05:18:00 +0100
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:80/index.html delay=1s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rxs5t (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-rxs5t:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-rxs5t
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Pulled     69s   kubelet, k8snode1  Container image "ikubernetes/myapp:v1" already present on machine
  Normal  Created    69s   kubelet, k8snode1  Created container
  Normal  Started    68s   kubelet, k8snode1  Started container
  Normal  Scheduled  39s   default-scheduler  Successfully assigned default/liveness-httpget-pod to k8snode1

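The result shows that the HTTP liveness probe succeeds: the container is Running with a restart count of 0, and there are no Unhealthy events.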
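
The overview also lists readinessProbe, which this article does not test separately. As a rough sketch, the same httpGet handler can be reused for it. The manifest below is an assumption modeled on the liveness example above, with a made-up Pod name and container name.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-httpget-pod       # hypothetical name
  namespace: default
spec:
  containers:
    - name: readiness-httpget-container   # hypothetical name
      image: ikubernetes/myapp:v1
      imagePullPolicy: IfNotPresent
      ports:
        - name: http
          containerPort: 80
      readinessProbe:
        httpGet:
          port: http
          path: /index.html
        initialDelaySeconds: 1
        periodSeconds: 3
```

The difference from a liveness probe lies in the reaction to failure: a failing readiness probe does not restart the container. The Pod is marked not ready (READY shows 0/1) and is taken out of the endpoints of any Service that selects it until the probe passes again.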

5 Setting up a postStart hook

The following example shows how to use a container lifecycle hook:

apiVersion: v1
kind: Pod
metadata:
  name: poststart-pod
  namespace: default
spec:
  containers:
    - name: busybox-httpd
      image: busybox:latest
      imagePullPolicy: IfNotPresent
      lifecycle:
        postStart:
          exec:
            command: ["/bin/sh", "-c", "echo Home-Page >> /tmp/index.html"]
      #command: ['/bin/sh','-c','sleep 3600']
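      command: ["/bin/httpd"]
      # "-h" and its directory are passed as separate arguments; the original
      # single string "-h /tmp" is the likely reason this manifest did not work.
      args: ["-f", "-h", "/tmp"]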
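
The fourth item from the overview, preStop, is configured in the same place as postStart. The manifest below is a minimal sketch and was not part of the experiments above: the Pod name, the 5 second sleep, and the grace period value are assumptions picked for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: prestop-pod                 # hypothetical name
  namespace: default
spec:
  terminationGracePeriodSeconds: 30 # upper bound for preStop plus shutdown
  containers:
    - name: busybox-httpd
      image: busybox:latest
      imagePullPolicy: IfNotPresent
      command: ["/bin/httpd"]
      args: ["-f", "-h", "/tmp"]
      lifecycle:
        preStop:
          exec:
            # Runs before the container receives the termination signal;
            # the short sleep gives in-flight requests time to finish.
            command: ["/bin/sh", "-c", "sleep 5"]
```

When the Pod is deleted, kubelet runs the preStop command first and only then sends the termination signal to the container. If the hook and the shutdown together exceed terminationGracePeriodSeconds, the container is killed.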