Pods are stuck in "ContainerCreating" or "Terminating" status in OpenShift 3

Solution Unverified

Issue

  • Pods on a specific node are stuck in ContainerCreating or Terminating status.

  • In the openshift-sdn project, the sdn and ovs pods are in CrashLoopBackOff status, and their events show:

    3:13:18 PM  Warning     Unhealthy   Liveness probe errored: rpc error: code = DeadlineExceeded desc = context deadline exceeded
    
  • Creating or deleting pods fails with FailedCreatePodSandBox or FailedStopPodSandbox, and the following errors appear in the events or service logs:

    Creating:

    Warning   FailedCreatePodSandBox   kubelet, node.example.com   Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "77e9d45d52520f555a01684dea35d81fa617aa76b0b98647a4b8eb942c792a7a" network for pod "logging-curator-1574911800-nxptr": NetworkPlugin cni failed to set up pod "logging-curator-1574911800-nxptr_openshift-logging" network: failed to send CNI request: Post http://dummy/: dial unix /var/run/openshift-sdn/cni-server.sock: connect: connection refused, failed to clean up sandbox container "77e9d45d52520f555a01684dea35d81fa617aa76b0b98647a4b8eb942c792a7a" network for pod "logging-curator-1574911800-nxptr": NetworkPlugin cni failed to teardown pod "logging-curator-1574911800-nxptr_openshift-logging" network: failed to send CNI request: Post http://dummy/: dial unix /var/run/openshift-sdn/cni-server.sock: connect: connection refused]
    

    Deleting:

    StopPodSandbox "7d66044f8a993444729820d5c5b74c1fcfe67479b103974d137f044525d0fd3f" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "pod-example-1-3cuh84" network: failed to send CNI request: Post http://dummy/: dial unix /var/run/openshift-sdn/cni-server.sock: connect: connection refused
    
  • The docker service logs show these errors repeatedly:

    Nov 27 12:42:12 dockerd-current[1649]: time="2019-11-27T12:42:11.706413993Z" level=info msg="killing and restarting containerd"
    Nov 27 12:42:12 dockerd-current[1649]: time="2019-11-27T12:42:12.333064712Z" level=error msg="libcontainerd: failed to receive event from containerd: rpc error: code = 13 desc = transport is closing"
    Nov 27 12:42:12 dockerd-current[1649]: time="2019-11-27T12:42:12.772160707Z" level=info msg="libcontainerd: new containerd process, pid: 111814"
    Nov 27 12:42:13 dockerd-current[1649]: time="2019-11-27T12:42:13.348255275Z" level=info msg="killing and restarting containerd"
    Nov 27 12:42:13 dockerd-current[1649]: time="2019-11-27T12:42:13.449019674Z" level=info msg="libcontainerd: new containerd process, pid: 111841"
    Nov 27 12:42:14 dockerd-current[1649]: time="2019-11-27T12:42:14.047946349Z" level=error msg="Error running exec in container: rpc error: code = 14 desc = grpc: the connection is unavailable"
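
To check whether a node shows the same symptoms, saved event and journal output can be scanned for the three messages quoted above. The following is a minimal sketch under stated assumptions, not a supported tool: the label names and the function name `count_symptoms` are hypothetical, and the patterns are copied from the sample messages in this article, so they may need adjusting for other versions.

```python
import re
from collections import Counter

# Patterns taken from the messages quoted in this article.
# The dictionary keys are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "sdn_liveness_timeout": re.compile(
        r"Liveness probe errored: .*context deadline exceeded"
    ),
    "cni_socket_refused": re.compile(
        r"dial unix /var/run/openshift-sdn/cni-server\.sock: "
        r"connect: connection refused"
    ),
    "containerd_restart": re.compile(r"killing and restarting containerd"),
}


def count_symptoms(lines):
    """Return a Counter of how many lines match each symptom signature.

    A line is counted at most once per signature, even if the message
    appears several times within that line.
    """
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

Calling `count_symptoms(open(path))` on a text file that holds the collected events and docker service logs (the path is whatever file you saved them to) gives a per-symptom count. A high `containerd_restart` count alongside `cni_socket_refused` entries would match the pattern described in this issue.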
    

Environment

  • Red Hat OpenShift Container Platform (OCP)
    • 3.x
