Pivotal Knowledge Base

PCF services do not start due to x509: certificate has expired or is not yet valid

Environment 

Pivotal Cloud Foundry, version 1.9 and above

Symptom

On a foundation that has been running for at least 2 years, PCF services suddenly fail to start with "certificate has expired" errors.

Error Message:

consul_agent.stderr.log (any failed service can hit this error if its certificates have expired)

consul_server/55e8ad09-5a22-4987-8ad6-93a97e43a2e2:/var/vcap/sys/log/consul_agent$ cat consul_agent.stderr.log
error during start: timeout exceeded: "Unexpected response code: 500 (rpc error: failed to get conn: x509: certificate has expired or is not yet valid)"
error during start: timeout exceeded: "Unexpected response code: 500 (rpc error: failed to get conn: x509: certificate has expired or is not yet valid)"
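Because any failed service can log this error, it may help to search all job logs on a VM at once rather than opening them one by one. The following is a minimal sketch, assuming the logs live under /var/vcap/sys/log as in the path above; the function name find_expired_cert_logs is hypothetical.

```shell
# Sketch: list log files under a directory that contain the x509 expiry error.
# The default directory is an assumption based on the log path shown above.
find_expired_cert_logs() {
  log_root="${1:-/var/vcap/sys/log}"
  # -r: recurse into job directories, -l: print matching file names only,
  # -F: treat the pattern as a fixed string rather than a regular expression
  grep -rlF "x509: certificate has expired or is not yet valid" "$log_root" 2>/dev/null
}
```

Running find_expired_cert_logs with no argument on an affected VM prints one log file path per line. It prints nothing if no log contains the error.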

Cause 

Certificates in Cloud Foundry expire after 2 years, so they must be regenerated within 2 years of installation. In this case, it is the non-configurable internal certificates that have expired and require regeneration.
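If you have a PEM copy of a certificate and want to confirm how close it is to expiring, a check along these lines may help. It is a sketch only: it assumes the openssl command-line tool is installed, and the function name and file argument are hypothetical, since this article does not say where the certificate files are stored.

```shell
# Sketch: report whether a PEM certificate expires within the next N days.
# Prints "ok", "expiring", or "unreadable". The file path is supplied by the caller.
cert_expires_within_days() {
  cert_file="$1"
  days="${2:-30}"
  # First make sure openssl can parse the file at all.
  if ! openssl x509 -noout -in "$cert_file" >/dev/null 2>&1; then
    echo "unreadable"
    return 2
  fi
  # -checkend exits non-zero if the certificate expires within the given seconds.
  if openssl x509 -checkend "$((days * 86400))" -noout -in "$cert_file" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "expiring"
  fi
}
```

For example, cert_expires_within_days ./some-cert.pem 60 prints "expiring" if the certificate has already expired or will expire within 60 days.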

Resolution

The internal non-configurable certificates on Cloud Foundry require regeneration. 

Follow these steps to resolve this issue:

  1. Before starting this procedure, temporarily scale etcd_server and consul down to 1 instance each to avoid certificate mismatches. Navigate to Ops Manager > Elastic Runtime > Resource Config and set etcd_server and consul to 1 VM.
  2. From your local machine, target your Ops Manager UAA server:

    $ uaac target https://OPS-MAN-FQDN/uaa
  3. Retrieve your token to authenticate. If you are prompted for a passcode instead of a password, retrieve it from https://OPS-MAN-FQDN/uaa/passcode.

    $ uaac token owner get
    Client ID: opsman
    Client secret: [Leave Blank]
    User name: OPS-MAN-USERNAME (by default 'admin')
    Password: OPS-MAN-PASSWORD
    

    Replace OPS-MAN-USERNAME and OPS-MAN-PASSWORD with the credentials that you use to log in to the Ops Manager web interface.

  4. List your tokens:
    $ uaac contexts
    
    Locate the entry for your Ops Manager FQDN. Under client_id: opsman, record the value for access_token.
  5. Use curl to make an API call to regenerate all non-configurable certificates and apply the new CA to your existing Ops Manager Director:

    $ curl "https://OPS-MAN-FQDN/api/v0/certificate_authorities/active/regenerate" \
    -X POST \
    -H "Authorization: Bearer YOUR-UAA-ACCESS-TOKEN" \
    -H "Content-Type: application/json" \
    -d '{}'
  6. Click "Apply Changes" in the Ops Manager UI.

  7. If you scaled down consul or etcd_server in step 1, scale them back up to their original values once Apply Changes has completed successfully.

The certificates are then regenerated and re-applied to the system, which should resolve the "certificate has expired" errors.
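Steps 4 and 5 can also be scripted instead of copying the token by hand. The following is a rough sketch, not an official procedure. It assumes that `uaac contexts` prints a "client_id:" line before the matching "access_token:" line within each context, and both function names are hypothetical.

```shell
# Sketch: read `uaac contexts` output on stdin and print the access token
# of the first context whose client_id is "opsman".
# Assumes "client_id:" appears before "access_token:" within each context.
extract_opsman_token() {
  awk '
    $1 == "client_id:"                 { is_opsman = ($2 == "opsman") }
    $1 == "access_token:" && is_opsman { print $2; exit }
  '
}

# Sketch: issue the regenerate call from step 5.
# Arguments: the Ops Manager FQDN and the UAA access token.
regenerate_certificates() {
  opsman_fqdn="$1"
  token="$2"
  curl "https://${opsman_fqdn}/api/v0/certificate_authorities/active/regenerate" \
    -X POST \
    -H "Authorization: Bearer ${token}" \
    -H "Content-Type: application/json" \
    -d '{}'
}
```

With these defined, TOKEN=$(uaac contexts | extract_opsman_token) captures the token, and regenerate_certificates OPS-MAN-FQDN "$TOKEN" makes the API call. You still need to click Apply Changes afterwards, as in step 6.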

Additional Information 

Reference on using the Ops Manager API: https://docs.pivotal.io/pivotalcf/1-10/customizing/ops-man-api.html

Reference guide on rotating and regenerating certificates in PCF: https://docs.pivotal.io/pivotalcf/1-12/security/pcf-infrastructure/api-cert-rotation.html

 
