Pivotal Knowledge Base

IPsec Deployed Instances Suddenly Stop Communicating as the Certificates Expire

Environment

All versions of IPsec

Symptom

IPsec is enabled and suddenly all of the VMs in the deployment stop communicating.

Cause 

This can happen when the SSL certificates used in the IPsec deployment expire. As discussed in the IPsec documentation, the operator is responsible for rotating these certificates before they expire.
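
To confirm that expiry is the cause, or to catch it ahead of time, a certificate's validity window can be checked with openssl. The following is a minimal sketch and not part of the IPsec release: the function name cert_status and the 30-day default warning window are assumptions made for illustration, and it expects a local copy of the certificate in PEM format.

```shell
#!/bin/sh
# Hypothetical helper (not part of the IPsec release): classify a PEM
# certificate as "expired", "expiring" (within N days), or "ok".
cert_status() {
  cert_file="$1"
  days="${2:-30}"   # warning window in days; 30 is an arbitrary default

  # Bail out early if the file is missing or is not a parsable certificate.
  if [ ! -r "$cert_file" ] || ! openssl x509 -noout -in "$cert_file" >/dev/null 2>&1; then
    echo "unreadable"
    return 2
  fi

  # "-checkend N" exits non-zero if the certificate expires within N seconds.
  if ! openssl x509 -checkend 0 -noout -in "$cert_file" >/dev/null 2>&1; then
    echo "expired"
  elif ! openssl x509 -checkend "$((days * 86400))" -noout -in "$cert_file" >/dev/null 2>&1; then
    echo "expiring"
  else
    echo "ok"
  fi
}
```

For example, "cert_status ./instance-cert.pem 60" (the file name is a placeholder) prints "expiring" if the certificate is still valid now but expires within the next 60 days.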

Resolution

Because the IPsec-enabled virtual machines can no longer communicate with each other over the network, the procedures described in the IPsec documentation will not work. Network connectivity to the IPsec-enabled virtual machines must be restored manually, as follows.

  1. SSH into each VM and disable IPsec using monit. In most cases, "bosh ssh" can be run from the Operations Manager VM to gain access to the virtual machines. This works only when the Operations Manager IP subnet is not within the "ipsec_subnets" range. If it is, SSH instead from a virtual machine that is in the "no_ipsec_subnets" range as defined in the manifest. For an example, see the Installing IPsec documentation. The command to stop IPsec on a virtual machine is:
    monit stop ipsec
  2. With IPsec disabled on all VMs, update the runtime-config for IPsec and set the optional flag to "true". See the IPsec documentation, which explains how to download and edit the runtime-config:
    optional: true
  3. Rotate the IPsec certificates following the documented procedures and return the platform to a healthy state.
  4. Set the optional flag back to "false" by updating the runtime-config.
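
The manual parts of steps 1, 2, and 4 can be scripted when there are many VMs. The sketch below is illustrative only and rests on several assumptions that are not in the procedure above: the VM addresses are listed one per line in a plain text file, each VM is reachable with plain ssh from a host in the "no_ipsec_subnets" range, monit must be run through sudo, and the downloaded runtime-config contains a single line of the form "optional: <value>". The function names and the IPSEC_RUNNER variable are hypothetical. Uploading the edited runtime-config is still done as described in the IPsec documentation.

```shell
#!/bin/sh
# Step 1 (sketch): run "monit stop ipsec" on every host listed in a file.
# IPSEC_RUNNER is a hypothetical override for the remote command (default: ssh).
stop_ipsec_on_hosts() {
  hosts_file="$1"
  runner="${IPSEC_RUNNER:-ssh}"
  failed=0
  while IFS= read -r host; do
    # Skip blank lines and comment lines.
    case "$host" in ''|\#*) continue ;; esac
    # Redirect stdin so the remote command cannot consume the host list.
    if "$runner" "$host" 'sudo monit stop ipsec' </dev/null; then
      echo "stopped: $host"
    else
      echo "FAILED: $host" >&2
      failed=$((failed + 1))
    fi
  done < "$hosts_file"
  [ "$failed" -eq 0 ]
}

# Steps 2 and 4 (sketch): set "optional:" to true or false in a local copy
# of the runtime-config, preserving the line's indentation.
set_ipsec_optional() {
  file="$1"
  value="$2"
  case "$value" in
    true|false) ;;
    *) echo "value must be true or false" >&2; return 1 ;;
  esac
  if ! grep -q '^[[:space:]]*optional:' "$file"; then
    echo "no 'optional:' line found in $file" >&2
    return 1
  fi
  sed "s/^\([[:space:]]*optional:[[:space:]]*\).*/\1$value/" "$file" > "$file.tmp" &&
    mv "$file.tmp" "$file"
}
```

A typical sequence would be "stop_ipsec_on_hosts hosts.txt", then "set_ipsec_optional runtime-config.yml true" before the certificate rotation and "set_ipsec_optional runtime-config.yml false" after it. Both file names are placeholders.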
