Pivotal Knowledge Base

Smoke Tests Errand Error: Expected to Hit all 2 App Instances in 30 Attempts, But Didn't

Environment

 Product                        Version
 Pivotal Cloud Foundry® (PCF)   All

Symptom

The smoke-tests errand fails for Elastic Runtime.

The errand pushes a test app and then scales it to 2 instances. Requests sent through the Load Balancer are expected to reach both instances in turn. If only one instance ever answers, the errand fails with: Expected to hit all 2 app instances in 30 attempts, but didn't.

The final error for the smoke test is below:

Runtime: linux apps [It] can be pushed, scaled and deleted
/var/vcap/packages/smoke-tests/src/github.com/cloudfoundry/cf-smoke-tests/smoke/runtime/runtime_test.go:174

The above error is a result of only one of the two instances responding to the Load Balancer requests.
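
The pass condition described in this section can be pictured with a small shell sketch. It only illustrates the logic and is not the smoke-tests source; the function name all_instances_hit and its arguments are hypothetical. It reads one observed instance index per line and succeeds only if the expected number of distinct indices appears within the allowed number of attempts.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from cf-smoke-tests.
# Reads one observed instance index per line on stdin and returns 0 if
# `expected` distinct indices are seen within the first `max_attempts` lines.
all_instances_hit() {
  local expected="$1" max_attempts="$2"
  local attempts=0 seen="" idx count=0
  while IFS= read -r idx && [ "$attempts" -lt "$max_attempts" ]; do
    attempts=$((attempts + 1))
    case " $seen " in
      *" $idx "*) ;;                                 # index already counted
      *) seen="$seen $idx"; count=$((count + 1)) ;;  # first time this index answers
    esac
    [ "$count" -ge "$expected" ] && return 0
  done
  return 1                                           # ran out of attempts or input
}
```

For example, printf '0\n1\n' | all_instances_hit 2 30 succeeds, while thirty lines that all contain 0 fail, which corresponds to the symptom seen when only one instance ever answers.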

Cause

The Load Balancer has sticky sessions enabled. Sticky sessions always direct a client's traffic to the same node unless that node is unavailable, which improves performance in some cases. However, in Cloud Foundry, the Load Balancer must distribute consecutive requests across the instances in turn to balance the load.

For example, if you are using an external Load Balancer such as F5, the Web Acceleration feature can cause sticky sessions.

See AskF5 Support Manual for details.
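
The difference between the two behaviors can be shown with a toy simulation. This is a sketch under simplifying assumptions, not a model of F5 or of Cloud Foundry routing; the function name simulate_routing and the policy names are hypothetical. It prints the instance index that each consecutive request would reach.

```shell
#!/usr/bin/env bash
# Toy simulation only: prints which instance index each request reaches.
# Usage: simulate_routing <round-robin|sticky> <instances> <requests>
simulate_routing() {
  local policy="$1" instances="$2" requests="$3" i=0
  while [ "$i" -lt "$requests" ]; do
    case "$policy" in
      round-robin) echo $((i % instances)) ;;  # rotate through every instance
      sticky)      echo 0 ;;                   # client stays pinned to the first instance
      *) echo "unknown policy: $policy" >&2; return 2 ;;
    esac
    i=$((i + 1))
  done
}
```

Running simulate_routing round-robin 2 4 prints 0, 1, 0, 1 on separate lines, whereas simulate_routing sticky 2 30 prints 0 thirty times, so the second instance is never reached within 30 attempts.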

Resolution

To troubleshoot and find the root cause, run the smoke tests errand with the --keep-alive flag. This flag keeps the smoke-tests VM active while you troubleshoot the issue.

Follow these steps for a complete analysis:

  1. Log in to Ops Manager as user ubuntu
  2. Target and log in to the Director using the 'bosh target' and 'bosh login' commands
  3. Make sure the CF deployment cf-<deployment-ID> is selected. To show the current selection, run: bosh deployment
  4. If it is not selected, set it: bosh deployment /var/tempest/workspaces/default/deployments/cf-<deployment-ID>.yml
  5. Run smoke tests errand: bosh run errand smoke-tests --keep-alive
  6. Once the errand completes, run: bosh ssh smoke-tests
  7. Run: sudo su -
  8. To understand how the Load Balancer communicates with individual instances, deploy a test app using cf. Make the cf CLI available in one of two ways:
    • Call it by its full path: /var/vcap/packages/cli/bin/cf
    • OR add this path to .bashrc:
      • vi /root/.bashrc
      • Append this line to the end of the file: export PATH=/var/vcap/packages/cli/bin:$PATH
      • Run exit to log out
      • Run sudo su - again so that the new PATH takes effect
  9. Log in to the CF deployment:
    1. cf login -a <API URI>
    2. Use the UAA admin credentials from the ERT tile in Ops Manager.
    3. Create and target the smoke-tests org and space:
      • cf create-org CF_SMOKE_TEST_ORG
      • cf target -o CF_SMOKE_TEST_ORG
      • cf create-space CF_SMOKE_TEST_SPACE
      • cf target -s CF_SMOKE_TEST_SPACE
  10. Deploy the test app:
    • cd /var/vcap/packages/smoke-tests/src/github.com/cloudfoundry/cf-smoke-tests/assets/ruby_simple
    • Push the test app: cf push smokex
    • Scale the app instances: cf scale smokex -i 2
  11. Try to get a response from both instances: curl smokex.<full-url-of-app>
    • Run the above curl command multiple times and check whether "instance_index" alternates between 0 and 1, as in Phase 1 and Phase 2 below.
  • Phase 1:
$ curl smokex.xxxx-04.xxxx.pivotal.io
Healthy
It just needed to be restarted!
My application metadata: {"application_id":"99a8cdad-xxxx-xxxx-xxxx-e10e15a3a919",
"application_name":"smokex","application_uris":["smokex.xxxx-04.xxxx.pivotal.io"],
"application_version":"baaa0f8b-xxxx-xxxx-xxxx-a9044c7ad3c6","host":"0.0.0.0",
"instance_id":"99095672-xxxx-xxxx-xxxx-155f1284e593","instance_index":0,"limits":{"disk":1024,"fds":16384,"mem":1024},
"name":"smokex","port":8080,"space_id":"c0f3a403-xxxx-xxxx-xxxx-54cc30493f32",
"space_name":"CF_SMOKE_TEST_SPACE","uris":["smokex.xxxx-04.xxxx.pivotal.io"],"version":"baaa0f8b-xxxx-xxxx-xxxx-a9044c7ad3c6"}
My port: 8080
My custom env variable:
  • Phase 2:
$ curl smokex.xxxx-04.xxxx.pivotal.io
Healthy
It just needed to be restarted!
My application metadata: {"application_id":"99a8cdad-xxxx-xxxx-xxxx-e10e15a3a919",
"application_name":"smokex","application_uris":["smokex.xxxx-04.xxxx.pivotal.io"],
"application_version":"baaa0f8b-xxxx-xxxx-xxxx-a9044c7ad3c6","host":"0.0.0.0",
"instance_id":"99095672-xxxx-xxxx-xxxx-155f1284e593","instance_index":1,"limits":{"disk":1024,"fds":16384,"mem":1024},
"name":"smokex","port":8080,"space_id":"c0f3a403-xxxx-xxxx-xxxx-54cc30493f32",
"space_name":"CF_SMOKE_TEST_SPACE","uris":["smokex.xxxx-04.xxxx.pivotal.io"],"version":"baaa0f8b-xxxx-xxxx-xxxx-a9044c7ad3c6"}
My port: 8080
My custom env variable:

If the responses do not alternate between the two instances, the Load Balancer is NOT set to Round Robin. For example, with sticky sessions in effect, every consecutive curl returns "instance_index":0.
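
Repeating the curl command by hand can be replaced with a short loop. The following is a minimal sketch, assuming the response contains an "instance_index":N field as in the sample output above; the function names extract_instance_index and tally_instances are hypothetical, and the app URL is a placeholder to be replaced with the route of the pushed test app.

```shell
#!/usr/bin/env bash
# Sketch only; function names are illustrative.

# Print the instance_index value found in an app response read from stdin.
extract_instance_index() {
  sed -n 's/.*"instance_index":\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Curl the app `attempts` times (default 30) and print how many
# responses came from each instance index.
tally_instances() {
  local url="$1" attempts="${2:-30}" i=0
  while [ "$i" -lt "$attempts" ]; do
    curl -s "$url" | extract_instance_index
    i=$((i + 1))
  done | sort -n | uniq -c
}

# Example (placeholder URL): tally_instances "smokex.<full-url-of-app>" 30
```

With Round Robin in effect, the tally should list both index 0 and index 1 with similar counts. A tally that lists only one index suggests that requests are being pinned to a single instance.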

Once the analysis is complete, run the smoke tests errand again without the --keep-alive flag to clean up the instance it created: bosh run errand smoke-tests

Additional Information

Make sure the external F5 Load Balancing Method is set to Round Robin.

See Configuring Load Balancer Pool Settings for details.
