Docker suddenly fails with `aufs: device or resource busy` and cannot start containers

  docker, question

Background

I deployed a binux/pyspider cluster with docker-compose, using dockercloud/haproxy:latest. The containers are linked to each other.

Discovering the problem

After a few days of stable operation, all crawler tasks suddenly failed. My first guess was that communication between the containers had broken down.

First attempt

Attempt to remove all containers:

$ docker-compose down
 Removing pyspider_phantomjs_3 ... error
 ...
 ERROR: for pyspider_processor_2  b'driver "aufs" failed to remove root filesystem for 50e4c88bf6f91c2697f302cc4d124114bfc50d74dc0ee246fdb86bf2aa158f6e:   aufs: unmount error after retries: /var/lib/docker/aufs/mnt/288328abe7a32dcc04f8702fa5296d5417f412e14969b0c674760def626105a6: device or resource busy'
 ...
 
 $ docker rm pyspider_phantomjs_2
 Error response from daemon: driver "aufs" failed to remove root filesystem for 8613dcee7d20f927120a441102c10743615e7fcc0a35a42a2b4474cc70b7ec77: aufs:   unmount error after retries: /var/lib/docker/aufs/mnt/adf6c5a7e9416f0b866daa8ea5bcf4ac15d6b8b0bf0181fd9d9c773441c5e343: device or resource busy

This failed. My guess was that some resource was still in use.

Attempt to release the busy mount:

$ sudo umount /var/lib/docker/aufs/mnt/adf6c5a7e9416f0b866daa8ea5bcf4ac15d6b8b0bf0181fd9d9c773441c5e343
 umount: /var/lib/docker/aufs/mnt/adf6c5a7e9416f0b866daa8ea5bcf4ac15d6b8b0bf0181fd9d9c773441c5e343: target is busy
 
 # Failed.
 # A lazy unmount (umount -l) released the mount;
 # after that, docker-compose down removed the dead containers.
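
Before falling back to a lazy unmount, it may help to see which processes still reference the busy mount. The following is only a sketch under assumptions: the helper name `find_mount_holders` and its optional second argument (an alternative proc root, there only so the function can be tried against a fake directory tree) are hypothetical and not part of Docker. It searches each process's `mountinfo` file for the mount ID from the error message. Reading other processes' files under /proc usually requires root.

```shell
#!/bin/sh
# Hypothetical helper: print the PIDs whose mount table still mentions
# the given aufs mount ID. The second argument defaults to /proc.
find_mount_holders() {
    mount_id="$1"
    proc_root="${2:-/proc}"
    # /proc/<pid>/mountinfo lists every mount visible to that process.
    grep -l -- "$mount_id" "$proc_root"/[0-9]*/mountinfo 2>/dev/null |
        while IFS= read -r path; do
            # Drop the trailing /mountinfo, then keep only the last
            # path component, which is the PID.
            pid_dir="${path%/mountinfo}"
            echo "${pid_dir##*/}"
        done
}

# Example, using the mount ID from the error above:
# find_mount_holders adf6c5a7e9416f0b866daa8ea5bcf4ac15d6b8b0bf0181fd9d9c773441c5e343
```

If this prints a PID that does not belong to Docker, that process is a likely candidate for what is keeping the mount busy.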

Another problem has arisen.

Start a new container using docker-compose

$ docker-compose -f mysql.yml start
 ...
 compose.cli.verbose_proxy.proxy_callable: docker start <- ('3c1a225d10d5385f024649d8252fd94ff74941b0ba9c504b65aab37de8afe248')
 compose.parallel.feed_queue: Pending: set()
 compose.parallel.feed_queue: Pending: set()
 compose.parallel.feed_queue: Pending: set()
 ...
 
 # The last line keeps repeating, then this error appears:
 
 ERROR: for mysql  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

Restart Docker

$ service docker restart

The problem remained.

Other attempts

  • View the Docker log
    Start the daemon in debug mode with dockerd -D, then start a container with docker run -it --rm mysql /bin/bash.

 **dockerd output**
 DEBU[0033] Assigning addresses for endpoint upbeat_kirch's interface on network bridge
 DEBU[0034] Programming external connectivity on endpoint upbeat_kirch (7224793b5e81325574724c072f6c8dc42f71b5cae09b74a013f57f5f27c03b70)
 DEBU[0034] EnableService d2f705c568792c10ab06f3a054b9b94088a457f19ff5857234649bd01cbb6560 START
 DEBU[0034] EnableService d2f705c568792c10ab06f3a054b9b94088a457f19ff5857234649bd01cbb6560 DONE
 ERRO[0154] containerd: start container                   error="containerd: container did not start before the specified timeout" id=d2f705c568792c10ab06f3a054b9b94088a457f19ff5857234649bd01cbb6560
 ERRO[0154] Create container failed with error: containerd: container did not start before the specified timeout
 DEBU[0154] attach: stdout: end
 DEBU[0154] attach: stderr: end
 DEBU[0154] attach: stdin: end
 DEBU[0154] Closing buffered stdin pipe
 DEBU[0154] Revoking external connectivity on endpoint upbeat_kirch (7224793b5e81325574724c072f6c8dc42f71b5cae09b74a013f57f5f27c03b70)
 DEBU[0154] DeleteConntrackEntries purged ipv4:0, ipv6:0
 DEBU[0154] Releasing addresses for endpoint upbeat_kirch's interface on network bridge
 DEBU[0154] ReleaseAddress(LocalDefault/172.17.0.0/16, 172.17.0.2)
 DEBU[0154] Failed to unmount e6c100ba212bed7b2a01db39699242905e2a96833776dfcf33ded530e24caf99 aufs: device or resource busy
 ERRO[0154] Error unmounting container d2f705c568792c10ab06f3a054b9b94088a457f19ff5857234649bd01cbb6560: device or resource busy
 DEBU[0154] GetMountID id: d2f705c568792c10ab06f3a054b9b94088a457f19ff5857234649bd01cbb6560 -> mountID: e6c100ba212bed7b2a01db39699242905e2a96833776dfcf33ded530e24caf99
 DEBU[0154] Cleaning up old mountid e6c100ba212bed7b2a01db39699242905e2a96833776dfcf33ded530e24caf99: start.
 DEBU[0155] Cleaning up old mountid e6c100ba212bed7b2a01db39699242905e2a96833776dfcf33ded530e24caf99: done.
 DEBU[0155] Removing volume reference: driver local, name d96daf284f326a2ca342f519dc3b4a5571a2620c8720ff4fba55d5aac8b8ad63
 ERRO[0155] Handler for POST /v1.30/containers/d2f705c568792c10ab06f3a054b9b94088a457f19ff5857234649bd01cbb6560/start returned error: containerd: container did not start before the specified timeout
 
 **docker run output**
 $ docker run -it --rm mysql /bin/bash
 docker: Error response from daemon: containerd: container did not start before the specified timeout.
  • Posts online attribute this error to excessive system load, but that explanation did not solve the problem I encountered.

Finally

I ran into the same problem two days ago, and that time rebooting the machine fixed it. This time it is on the main server, which cannot be rebooted. Does anyone know what causes this?

After I stopped Docker, I found that a phantomjs process was still running. I suspect that phantomjs had escaped Docker's control (a memory overflow?). In the end, a reboot solved the problem.
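
The check for a leftover process can be sketched as follows. This is a minimal illustration, not the exact commands used above: the helper name `filter_by_comm` is hypothetical, and it only filters `ps` output by command name and prints the PID and parent PID of each match.

```shell
#!/bin/sh
# Hypothetical helper: read "pid ppid comm" lines on stdin and print
# "pid ppid" for every line whose command name equals the argument.
filter_by_comm() {
    awk -v name="$1" '$3 == name { print $1, $2 }'
}

# After stopping the Docker daemon, list any phantomjs processes left behind.
# The empty "=" headers make ps omit its header line.
# ps -e -o pid= -o ppid= -o comm= | filter_by_comm phantomjs
```

Any PID printed while Docker is stopped belongs to a process that outlived its container and may still be holding the container's filesystem.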