2017-06-21

Kubernetes: How to debug CrashLoopBackOff

I have the following setup:

A Docker image omg/telperion on Docker Hub. A Kubernetes cluster (4 nodes, each with ~50 GB RAM) and plenty of resources.

I followed tutorials to pull images from Docker Hub:

SERVICE_NAME=telperion 
DOCKER_SERVER="https://index.docker.io/v1/" 
DOCKER_USERNAME=username 
DOCKER_PASSWORD=password 
DOCKER_EMAIL="[email protected]" 

# Create secret 
kubectl create secret docker-registry dockerhub --docker-server=$DOCKER_SERVER --docker-username=$DOCKER_USERNAME --docker-password=$DOCKER_PASSWORD --docker-email=$DOCKER_EMAIL 

# Create service yaml 
echo "apiVersion: v1 \n\
kind: Pod \n\
metadata: \n\
  name: ${SERVICE_NAME} \n\
spec: \n\
  containers: \n\
  - name: ${SERVICE_NAME} \n\
    image: omg/${SERVICE_NAME} \n\
    imagePullPolicy: Always \n\
    command: [ \"echo\",\"done deploying $SERVICE_NAME\" ] \n\
  imagePullSecrets: \n\
  - name: dockerhub" > $SERVICE_NAME.yaml 

# Deploy to kubernetes 
kubectl create -f $SERVICE_NAME.yaml 

Deploying this results in the pod going into a CrashLoopBackOff in Kubernetes.
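As an aside, the echo-based YAML generation above is fragile: POSIX `echo` does not interpret `\n` escapes (and bash's builtin needs `-e`), so depending on the shell the file may end up with literal `\n` sequences. A heredoc sidesteps this; the sketch below reproduces the same spec (same fields as in the script above, including the `command:` override):

```shell
#!/bin/sh
SERVICE_NAME=telperion

# A heredoc writes real newlines and expands variables, so no escape
# handling is needed; the resulting YAML matches the intended spec.
cat > "$SERVICE_NAME.yaml" <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $SERVICE_NAME
spec:
  containers:
  - name: $SERVICE_NAME
    image: omg/$SERVICE_NAME
    imagePullPolicy: Always
    command: ["echo", "done deploying $SERVICE_NAME"]
  imagePullSecrets:
  - name: dockerhub
EOF

cat "$SERVICE_NAME.yaml"
```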

docker run -it -p8080:9546 omg/telperion works fine.

So my question is: is this debuggable? If so, how do I debug it?
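For reference, the usual first checks for a CrashLoopBackOff (standard kubectl commands; the pod name is the one from this setup, and these need a live cluster to run):

```
kubectl describe pod telperion     # events: pull errors, OOM kills, backoff reason
kubectl logs telperion -p          # stdout/stderr of the previous (crashed) container
kubectl get pod telperion -o yaml  # full status, incl. lastState.terminated.exitCode
```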

Some logs:

kubectl get nodes                       
NAME     STATUS      AGE  VERSION 
k8s-agent-adb12ed9-0 Ready      22h  v1.6.6 
k8s-agent-adb12ed9-1 Ready      22h  v1.6.6 
k8s-agent-adb12ed9-2 Ready      22h  v1.6.6 
k8s-master-adb12ed9-0 Ready,SchedulingDisabled 22h  v1.6.6 


kubectl get pods                        
NAME      READY  STATUS    RESTARTS AGE 
telperion     0/1  CrashLoopBackOff 10   28m 


kubectl describe pod telperion 
Name:   telperion 
Namespace:  default 
Node:   k8s-agent-adb12ed9-2/10.240.0.4 
Start Time:  Wed, 21 Jun 2017 10:18:23 +0000 
Labels:   <none> 
Annotations: <none> 
Status:   Running 
IP:    10.244.1.4 
Controllers: <none> 
Containers: 
    telperion: 
    Container ID:  docker://c2dd021b3d619d1d4e2afafd7a71070e1e43132563fdc370e75008c0b876d567 
    Image:    omg/telperion 
    Image ID:   docker-pullable://omg/[email protected]:c7e3beb0457b33cd2043c62ea7b11ae44a5629a5279a88c086ff4853828a6d96 
    Port: 
    Command: 
     echo 
     done deploying telperion 
    State:    Waiting 
     Reason:   CrashLoopBackOff 
    Last State:   Terminated 
     Reason:   Completed 
     Exit Code:  0 
     Started:   Wed, 21 Jun 2017 10:19:25 +0000 
     Finished:   Wed, 21 Jun 2017 10:19:25 +0000 
    Ready:    False 
    Restart Count:  3 
    Environment:  <none> 
    Mounts: 
     /var/run/secrets/kubernetes.io/serviceaccount from default-token-n7ll0 (ro) 
Conditions: 
    Type   Status 
    Initialized True 
    Ready   False 
    PodScheduled True 
Volumes: 
    default-token-n7ll0: 
    Type:  Secret (a volume populated by a Secret) 
    SecretName: default-token-n7ll0 
    Optional: false 
QoS Class:  BestEffort 
Node-Selectors: <none> 
Tolerations: <none> 
Events: 
    FirstSeen  LastSeen  Count From       SubObjectPath         Type   Reason   Message 
    ---------  --------  ----- ----       -------------         --------  ------   ------- 
    1m   1m    1  default-scheduler                Normal   Scheduled  Successfully assigned telperion to k8s-agent-adb12ed9-2 
    1m   1m    1  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Normal   Created   Created container with id d9aa21fd16b682698235e49adf80366f90d02628e7ed5d40a6e046aaaf7bf774 
    1m   1m    1  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Normal   Started   Started container with id d9aa21fd16b682698235e49adf80366f90d02628e7ed5d40a6e046aaaf7bf774 
    1m   1m    1  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Normal   Started   Started container with id c6c8f61016b06d0488e16bbac0c9285fed744b933112fd5d116e3e41c86db919 
    1m   1m    1  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Normal   Created   Created container with id c6c8f61016b06d0488e16bbac0c9285fed744b933112fd5d116e3e41c86db919 
    1m   1m    2  kubelet, k8s-agent-adb12ed9-2             Warning   FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "telperion" with CrashLoopBackOff: "Back-off 10s restarting failed container=telperion pod=telperion_default(f4e36a12-566a-11e7-99a6-000d3aa32f49)" 

    1m 1m  1  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Normal Started   Started container with id 3b911f1273518b380bfcbc71c9b7b770826c0ce884ac876fdb208e7c952a4631 
    1m 1m  1  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Normal Created   Created container with id 3b911f1273518b380bfcbc71c9b7b770826c0ce884ac876fdb208e7c952a4631 
    1m 1m  2  kubelet, k8s-agent-adb12ed9-2             Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "telperion" with CrashLoopBackOff: "Back-off 20s restarting failed container=telperion pod=telperion_default(f4e36a12-566a-11e7-99a6-000d3aa32f49)" 

    1m 50s  4  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Normal Pulling   pulling image "omg/telperion" 
    47s 47s  1  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Normal Started   Started container with id c2dd021b3d619d1d4e2afafd7a71070e1e43132563fdc370e75008c0b876d567 
    1m 47s  4  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Normal Pulled   Successfully pulled image "omg/telperion" 
    47s 47s  1  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Normal Created   Created container with id c2dd021b3d619d1d4e2afafd7a71070e1e43132563fdc370e75008c0b876d567 
    1m 9s  8  kubelet, k8s-agent-adb12ed9-2 spec.containers{telperion}  Warning BackOff   Back-off restarting failed container 
    46s 9s  4  kubelet, k8s-agent-adb12ed9-2             Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "telperion" with CrashLoopBackOff: "Back-off 40s restarting failed container=telperion pod=telperion_default(f4e36a12-566a-11e7-99a6-000d3aa32f49)" 

Edit 1: Errors reported by kubelet on the master:

journalctl -u kubelet 


Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: E0621 10:28:49.798140 1809 fsHandler.go:121] failed to collect filesystem stats - rootDiskErr: du command failed on /var/lib/docker/overlay/5cfff16d670f2df6520360595d7858fb5d16607b6999a88e5dcbc09e1e7ab9ce with output 
Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: , stderr: du: cannot access '/var/lib/docker/overlay/5cfff16d670f2df6520360595d7858fb5d16607b6999a88e5dcbc09e1e7ab9ce/merged/proc/13122/task/13122/fd/4': No such file or directory 
Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: du: cannot access '/var/lib/docker/overlay/5cfff16d670f2df6520360595d7858fb5d16607b6999a88e5dcbc09e1e7ab9ce/merged/proc/13122/task/13122/fdinfo/4': No such file or directory 
Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: du: cannot access '/var/lib/docker/overlay/5cfff16d670f2df6520360595d7858fb5d16607b6999a88e5dcbc09e1e7ab9ce/merged/proc/13122/fd/3': No such file or directory 
Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: du: cannot access '/var/lib/docker/overlay/5cfff16d670f2df6520360595d7858fb5d16607b6999a88e5dcbc09e1e7ab9ce/merged/proc/13122/fdinfo/3': No such file or directory 
Jun 21 10:28:49 k8s-master-ADB12ED9-0 docker[1622]: - exit status 1, rootInodeErr: <nil>, extraDiskErr: <nil> 

Edit 2: Logs:

kubectl logs $SERVICE_NAME -p                          
done deploying telperion 

Answer

You can access the logs of your pods with

kubectl logs [podname] -p 

The -p option reads the logs of the previous (crashed) instance.

If the crash comes from the application, you should find useful logs there.


Forgot to post these logs, but they don't help either, sorry :( – ixaxaar


Well, I'm not sure, but it seems your pod "crashed" because it simply finished. Your exit code is 0, which means no error. I see a 'Command' entry in your pod description that shows: echo done deploying telperion. Is it possible that only this command is being run? – Fabien
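Fabien's diagnosis matches the events above: the `command:` override replaces the image's entrypoint, the echo prints and exits with status 0, and since a Pod's default restartPolicy is Always, the kubelet restarts even a cleanly finished container, with doubling backoff (10s, 20s, 40s). A minimal local sketch of all the container actually does:

```shell
#!/bin/sh
# The container's only process is this echo; once it returns, the
# container's main process is gone and the kubelet sees it as terminated.
sh -c 'echo "done deploying telperion"'
echo "exit code: $?"   # 0: a clean exit, but still a terminated container
```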


Wow, so I removed that line and everything worked. I shouldn't have panicked and should have read more carefully! Too many things on the interwebz. Thank you! – ixaxaar
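For completeness, the fix the comments converge on: drop the `command:` override so the image's own ENTRYPOINT/CMD runs as the long-lived main process. A sketch of the corrected spec (fields as in the question):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: telperion
spec:
  containers:
  - name: telperion
    image: omg/telperion
    imagePullPolicy: Always
    # no command: override -> the image's ENTRYPOINT/CMD keeps running
  imagePullSecrets:
  - name: dockerhub
```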