Pivotal Knowledge Base

How to Recover Monit Processes from a `running - unmonitor pending` or Hung Status

Environment

Product: Pivotal Cloud Foundry® (PCF)

Version: 1.6.x, 1.7.x, 1.8.x

Symptoms

After applying changes in your Pivotal Cloud Foundry installation, the deployment fails with an error similar to the one below:

Error 450001: Action Failed get_task: Task 3b027294-f309-43e4-4910-2632bf732cb3 result: Unmonitoring services: Unmonitoring service <job_name>: Sending unmonitor request to monit: Post http://127.0.0.1:2822/<job_name>: net/http: request canceled

 

After seeing this error, run `bosh instances --ps` (PCF 1.7.x or higher), or `bosh ssh` into any of the failing VMs and run `sudo monit summary`. The Monit processes appear in a state similar to this:

root@7518265d-d108-4c53-9150-28e901e0cab2:/var/vcap/bosh_ssh/bosh_bcb81anuv# monit summary

The Monit daemon 5.2.5 uptime: 4d 20h 48m

Process 'consul_agent' running - unmonitor pending
Process 'cloud_controller_ng' running - unmonitor pending
Process 'cloud_controller_worker_local_1' running - unmonitor pending
Process 'cloud_controller_worker_local_2' running - unmonitor pending
Process 'nginx_cc' running - unmonitor pending
Process 'cloud_controller_migration' running - unmonitor pending
Process 'metron_agent' running - unmonitor pending
Process 'route_registrar' running - unmonitor pending
File 'nfs_mounter' accessible - unmonitor pending
Process 'statsd-injector' running - unmonitor pending
System 'system_localhost' running 
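
When a VM runs many jobs, it can help to filter the summary for the stuck entries. The following is a minimal sketch, not part of any Pivotal tooling, that assumes the `monit summary` output has the shape shown above. The function name `pending_entries` and the regular expression are illustrative assumptions.

```python
import re

# Assumed line shape, taken from the sample output above, e.g.:
#   Process 'consul_agent' running - unmonitor pending
ENTRY_RE = re.compile(r"^(Process|File|System)\s+'([^']+)'\s+(.*?)\s*$")


def pending_entries(summary_text):
    """Return (kind, name, status) tuples whose status mentions 'pending'."""
    found = []
    for line in summary_text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match and "pending" in match.group(3):
            found.append(match.groups())
    return found


# Shortened copy of the sample output above.
sample = """\
The Monit daemon 5.2.5 uptime: 4d 20h 48m

Process 'consul_agent' running - unmonitor pending
File 'nfs_mounter' accessible - unmonitor pending
System 'system_localhost' running
"""

for kind, name, status in pending_entries(sample):
    print(f"{kind} {name}: {status}")
```

An empty result means that no entry in the pasted summary is waiting on a pending action.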

Cause

Monit is single-threaded, so it can hang while it waits for an action to complete. If that action never responds, Monit stays hung until the action completes or Monit is restarted.

Resolution

The general way to recover Monit is to restart the VM, so that all of the Monit processes are restarted with it. To restart a VM, `bosh ssh` into the affected VM and run `sudo reboot`. Once the VM restarts, Monit should return to a normal running state.
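
After the reboot, you may want to confirm that no entry is still pending before retrying the deployment. The sketch below is one hedged way to do that from a shell on the VM. It assumes `sudo monit summary` prints lines like those in the Symptoms section, and it only looks for the word `pending`, so other unhealthy states are not detected. The names `monit_summary` and `wait_until_settled`, and the default `attempts` and `delay` values, are illustrative assumptions.

```python
import subprocess
import time


def monit_summary():
    """Run `sudo monit summary` on the VM and return its text output."""
    result = subprocess.run(
        ["sudo", "monit", "summary"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def wait_until_settled(fetch=monit_summary, attempts=30, delay=10.0):
    """Poll until the summary lists entries and none of them is 'pending'.

    Returns True once settled, or False if the attempts run out.
    `fetch` is any callable returning summary text, which makes the
    loop easy to exercise without a real Monit daemon.
    """
    for attempt in range(attempts):
        lines = [
            line for line in fetch().splitlines()
            if line.startswith(("Process", "File", "System"))
        ]
        if lines and not any("pending" in line for line in lines):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # give Monit time before asking again
    return False
```

On the VM, calling `wait_until_settled()` with the defaults polls for roughly five minutes. A `False` result suggests the reboot did not clear the hung state.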

Notes

If rebooting the VM does not resolve this issue, contact Pivotal Support.
