Pivotal Knowledge Base


Failed updating instance diego_cell: Updating certificates with retries

Environment

Pivotal Cloud Foundry 1.8

OS: Ubuntu

Symptom

When attempting an upgrade of Elastic Runtime, Diego cells do not roll properly:

Error Message:

Failed updating instance diego_cell > diego_cell/80bf75c3-caf9-4087-b0dd-f3b5571324fd (27): Timed out sending 'get_task' to 0f410cca-a50a-42d9-aa4a-a2599ad329e9 after 45 seconds (00:13:33) 
Failed updating instance diego_cell > diego_cell/8c595349-5de1-4c29-8c7f-769c2c8aed65 (63): Action Failed get_task: Task b697cd2d-64c7-41d3-498a-e78584ba38b4 result: Stopping Monitored Services: Stopping services '[consul_agent]' errored (00:11:04) 
Failed updating instance diego_cell > diego_cell/931389ef-3591-46ea-aad9-0a3c2139a6ea (23): Action Failed get_task: Task 533905df-5a62-48b4-586a-da092595ee34 result: Updating certificates with retries (00:13:30)

Cause 

"Updating certificates with retries" errors can result from poor disk performance. Restarting many instances simultaneously puts too much load on the IaaS.

Resolution

Reduce the max-in-flight value in Ops Manager to prevent too many Diego cells from being updated simultaneously.

The exact API call for changing max-in-flight is documented in the Ops Manager API docs here:

https://[FQDN Ops Manager]/docs#configuring-the-max_in_flight-settings-for-a-product-39-s-jobs

Follow the steps:

Authenticate & Get Token: https://[FQDN Ops Manager]/docs#authentication

1. Target your Ops Manager IP:

uaac target https://YOUR_OPSMAN_IP/uaa

2. Log in to your Ops Manager with the Client name “opsman”:

uaac token owner get

Client name: opsman
Client secret:
User name: YOUR_USERNAME_HERE
Password: YOUR_PASSWORD_HERE

3. Retrieve your Ops Manager access token:

uaac context
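
To reuse the token in the curl calls that follow, it can be captured into a shell variable. The following is a minimal sketch, assuming that `uaac context` prints a line of the form `access_token: <value>`. The helper name `extract_token` and the variable `UAA_TOKEN` are illustrative and are not part of uaac or Ops Manager.

```shell
# Hypothetical helper: reads `uaac context` output on stdin and prints
# the value of the first line whose first field is "access_token:".
extract_token() {
  awk '$1 == "access_token:" { print $2; exit }'
}

# Usage (assumes you are already logged in via uaac, as in steps 1-2):
#   UAA_TOKEN=$(uaac context | extract_token)
```

If the output format of `uaac context` differs in your version, copy the token by hand instead.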

4. Get Products

curl "https://localhost/api/v0/deployed/products" -k -X GET -H "Authorization: Bearer [UAA Token]"

Result 
{
"installation_name": "cf-6595dd22a5007e3f6f93",
"guid": "cf-6595dd22a5007e3f6f93",
"type": "cf",
"product_version": "1.10.8-build.7"
}

5. Get Jobs

curl "https://localhost/api/v0/staged/products/cf-6595dd22a5007e3f6f93/jobs" -k -X GET -H "Authorization: Bearer [UAA Token]"

Result
{
"name": "diego_cell",
"guid": "diego_cell-81b4916ae28d873c1988"
}

6. Get Max in Flight

curl "https://localhost/api/v0/staged/products/cf-6595dd22a5007e3f6f93/max_in_flight" -k -X GET -H "Authorization: Bearer [UAA Token]"

Result
{
  "max_in_flight": {
    ...
    "diego_cell-81b4916ae28d873c1988": 10,
    ...
  }
}
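
To read the current value for a single job out of that response, for example to confirm a change afterwards, a small filter can help. This is a rough sketch using only standard text tools, assuming the response has the flat `"job-guid": value` shape shown above. The helper name `job_max_in_flight` is hypothetical, and a real JSON parser would be more robust.

```shell
# Hypothetical helper: reads the max_in_flight JSON on stdin and prints
# the value stored for the job GUID given as $1.
job_max_in_flight() {
  tr -d ' \n' | grep -o "\"$1\":[^,}]*" | cut -d: -f2
}

# Usage (the response of step 6 piped in):
#   curl ... | job_max_in_flight diego_cell-81b4916ae28d873c1988
```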

7. Set Max in Flight

curl "https://localhost/api/v0/staged/products/cf-6595dd22a5007e3f6f93/max_in_flight" -k -X PUT -H "Authorization: Bearer [UAA Token]" -H "Content-Type: application/json" \
-d '{
  "max_in_flight": {
    "diego_cell-81b4916ae28d873c1988": 4
  }
}'
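
When the job GUID and the new value change between environments, the request body can be generated rather than typed by hand. The following is a minimal sketch that handles only the integer form used in step 7. The helper name `max_in_flight_payload` and the variable `UAA_TOKEN` (assumed to hold the access token from step 3) are illustrative and not part of the Ops Manager API.

```shell
# Hypothetical helper: prints the JSON body for the max_in_flight PUT.
# $1 = job GUID (from step 5), $2 = new max-in-flight value (integer).
max_in_flight_payload() {
  case "$2" in
    ''|*[!0-9]*)
      echo "value must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '{"max_in_flight": {"%s": %s}}\n' "$1" "$2"
}

# Usage, with the example GUIDs from the steps above:
#   curl "https://localhost/api/v0/staged/products/cf-6595dd22a5007e3f6f93/max_in_flight" \
#     -k -X PUT -H "Authorization: Bearer $UAA_TOKEN" \
#     -H "Content-Type: application/json" \
#     -d "$(max_in_flight_payload diego_cell-81b4916ae28d873c1988 4)"
```

Rejecting non-numeric input before the request is sent avoids submitting a malformed body to Ops Manager.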

 
