How can I resolve "Liveness probe failed" errors in pod events?

Issue

A pod may enter a restart loop with an increasing restart count. kubectl get pods may show the pod as 0/1 Running.

Pod events often include repeated Liveness probe failed and Readiness probe failed messages, followed by Killing when the kubelet restarts the container:

Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Warning  Unhealthy  10h (x3 over 10h)    kubelet            Liveness probe failed: Get "http://**HIDDEN**:8080/healthcheck": dial tcp **HIDDEN**:8080: connect: connection refused
  Warning  Unhealthy  10h (x162 over 10h)  kubelet            Readiness probe failed: Get "http://**HIDDEN**:8080/healthcheck": dial tcp **HIDDEN**:8080: connect: connection refused
  Normal   Killing    10h (x9 over 10h)    kubelet            Container iax-app-capacity-daemon failed liveness probe, will be restarted
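
Events like these normally appear in the Events section of kubectl describe pod. As a minimal sketch, the helper below keeps only the probe-related lines from that output. The name probe_events is hypothetical (it is not part of kubectl), and <pod_name> and <namespace> are placeholders.

```shell
# probe_events: hypothetical helper that keeps only probe-related event lines
# from the text it reads on standard input.
probe_events() {
  grep -E 'Liveness probe failed|Readiness probe failed|failed liveness probe'
}

# Example usage against a live cluster (replace the placeholders first):
#   kubectl describe pod <pod_name> -n <namespace> | probe_events
```

If the filter prints nothing, the restarts are probably not probe-related and the full event list is worth reading instead.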

Container logs may not show a hard error. Instead, they stop at startup lines such as Starting metric indexer for config … or HikariPool-7 - Starting…. This indicates that the service is still initializing and /healthcheck is not yet responding when the liveness probe runs.

The pod's startup time exceeds the default liveness probe window. While the service is still initializing, the probe calls /healthcheck and receives connection refused. Kubernetes then marks the container unhealthy and restarts it, which creates a loop. The default settings (initialDelaySeconds: 120, timeoutSeconds: 5, failureThreshold: 3) may be too aggressive for busy environments.

Resolution

Increase the liveness probe window so the service can finish initialization before the kubelet marks it unhealthy. Apply settings similar to:

livenessProbe:
  httpGet:
    path: /healthcheck
    port: 8080
  initialDelaySeconds: 300   # increased from 120
  timeoutSeconds: 10         # increased from 5
  failureThreshold: 5        # increased from 3
  periodSeconds: 10
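
To get a rough sense of how much startup time these settings allow, the sketch below estimates the seconds from container start until the kubelet restarts a container whose probe never succeeds. The function name liveness_window is hypothetical. The estimate assumes that periodSeconds was already 10 before the change, that every failed probe returns immediately (as with connection refused), and it ignores kubelet scheduling jitter.

```shell
# liveness_window: approximate seconds from container start until restart
# when every liveness probe fails.
# Arguments: initialDelaySeconds periodSeconds failureThreshold
# Simplification: the first probe runs at initialDelaySeconds and each later
# probe runs periodSeconds after the previous one; jitter is ignored.
liveness_window() {
  echo $(( $1 + ($3 - 1) * $2 ))
}

liveness_window 120 10 3   # previous values: prints 140
liveness_window 300 10 5   # increased values: prints 340
```

Under these assumptions the service gets roughly 340 seconds to start instead of 140. If startup regularly takes longer than that, the values may need to be raised further.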

To apply the change:

  1. Identify the pod’s parent deployment:

    kubectl get deployment -n <namespace>

  2. Edit the deployment and update the livenessProbe settings:

    kubectl edit deployment <deployment_name> -n <namespace>

After applying these changes, the pod should reach a healthy state, the /healthcheck endpoint should respond, and restarts should stop.
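
As an alternative to the interactive edit, the same values could be applied with kubectl patch. This is a sketch under stated assumptions rather than a documented procedure: the helper name probe_patch is hypothetical, and the patch assumes the affected container is the first one in the pod template (index 0) and already defines a livenessProbe. Check both in the deployment before using it.

```shell
# probe_patch: hypothetical helper that prints a JSON Patch document setting
# the liveness probe timings to 300 / 10 / 5.
# Assumption: the target container is at index 0 in the pod template and
# already has a livenessProbe block.
probe_patch() {
  base=/spec/template/spec/containers/0/livenessProbe
  printf '[{"op":"add","path":"%s/initialDelaySeconds","value":300},' "$base"
  printf '{"op":"add","path":"%s/timeoutSeconds","value":10},' "$base"
  printf '{"op":"add","path":"%s/failureThreshold","value":5}]\n' "$base"
}

# Example usage against a live cluster (replace the placeholders first):
#   kubectl patch deployment <deployment_name> -n <namespace> --type=json -p "$(probe_patch)"
#   kubectl rollout status deployment/<deployment_name> -n <namespace>
#   kubectl get pods -n <namespace>
```

The add operation is used because in JSON Patch it sets a field whether or not the field is already present. The rollout status and get pods commands are there to confirm that the new pod becomes ready and that the restart count stops increasing.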
    
