Pivotal Knowledge Base


PCF Windows - DNS Resolvers Issues on Windows Cells


  • Pivotal Cloud Foundry® 1.10 and above
  • Pivotal Cloud Foundry Runtime for Windows 1.12


Windows Diego cells can intermittently become unresponsive or enter a failing state, so the state of Windows Diego cells across a deployment is inconsistent.

This issue can appear intermittently and on only a subset of Windows cells in Runtime for Windows tile deployments, but it has the potential to disrupt all the Windows cells in a deployment, so that applications can no longer serve traffic during and after a PCF upgrade (or other triggering event).


Applications hosted on Windows cells become unresponsive and do not recover during PCF upgrades or other loss-of-network-connectivity events because the Consul DNS address is dropped from the DNS resolvers list on the Windows hosts. When Cloud Foundry jobs cannot contact the Consul DNS, they cannot resolve cf.internal.* hostnames.
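One way to check for this symptom from an affected cell is to test whether an internal hostname still resolves. The following is a minimal sketch, not part of the original article: the function name `can_resolve` is hypothetical, and the hostname in the usage comment is only an example that should be replaced with a cf.internal name from your own deployment.

```python
import socket


def can_resolve(hostname, resolver=socket.gethostbyname):
    """Return True if the hostname resolves, False if the lookup fails.

    The resolver argument is injectable so the check can be exercised
    without touching the network; by default it uses the OS resolver list.
    """
    try:
        resolver(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


# Example usage (hostname is an assumption, substitute your own):
# can_resolve("bbs.service.cf.internal")
```

If this returns False on a Windows cell while public hostnames still resolve, the resolver list on that cell is a likely place to look.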

When the BOSH Director becomes unavailable or loses its connection with Windows VMs (stemcell 1200.6 or earlier), such as during a PCF upgrade, the BOSH Agent on BOSH-deployed Windows VMs (including Windows Diego cells) restarts. This is expected behavior, and it continues for as long as the agent cannot contact a Director. The BOSH Agent also backs off its restart timing exponentially, up to a maximum interval of 5 minutes between restarts, to minimize CPU load on the cell.
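To illustrate what a capped exponential backoff looks like, here is a small sketch. Only the 5-minute (300-second) cap comes from this article; the 1-second starting delay and the doubling factor are assumptions chosen for illustration and may not match the BOSH Agent's actual values.

```python
from itertools import islice


def restart_delays(initial=1.0, factor=2.0, cap=300.0):
    """Yield successive restart delays in seconds, growing by `factor`
    each time and never exceeding `cap` (5 minutes, per the article)."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First eleven delays under the assumed parameters:
# list(islice(restart_delays(), 11))
# -> [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0, 300.0, 300.0]
```

Under these assumed parameters the agent would reach the 5-minute ceiling after nine restarts, so a long Director outage produces many restarts in total.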

During these repeated restarts, the BOSH Agent erroneously overwrote all DNS resolver entries in the OS with the list of cloud config resolvers, thus removing the necessary value inserted by Consul during the Consul job’s pre-start process. The BOSH Agent does not execute this pre-start process again when it restarts, but the core issue lies in how the BOSH Agent overwrites the DNS resolver entries.
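The difference between overwriting the resolver list and preserving the Consul entry can be sketched as follows. This is an illustration only: the function names are hypothetical, the addresses are placeholders from the documentation ranges (not the real Consul or cloud config values), and `preserve_consul` shows one possible alternative, not necessarily how stemcell 1200.7 addresses the problem.

```python
def overwrite_resolvers(current, cloud_config):
    """Behavior described above: the OS resolver list is replaced outright
    with the cloud config resolvers, so `current` is ignored entirely."""
    return list(cloud_config)


def preserve_consul(current, cloud_config, consul_ip):
    """Hypothetical alternative: apply the cloud config resolvers but keep
    the Consul entry in first position if it was already present."""
    merged = [ip for ip in cloud_config if ip != consul_ip]
    if consul_ip in current:
        merged.insert(0, consul_ip)
    return merged


# Placeholder values for illustration only.
CONSUL_IP = "192.0.2.53"
CLOUD_CONFIG = ["198.51.100.1", "198.51.100.2"]
current = [CONSUL_IP] + CLOUD_CONFIG

after_overwrite = overwrite_resolvers(current, CLOUD_CONFIG)  # Consul entry lost
after_preserve = preserve_consul(current, CLOUD_CONFIG, CONSUL_IP)  # Consul entry kept
```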

Because any loss of connectivity can trigger this issue, it can be caused not only by PCF upgrades but also by network events (such as router replacements), Director failures, increased ESX load, and possibly other events.


The permanent and suggested fix is to upgrade to Runtime for Windows stemcell 1200.7 or above. For Azure/GCP/AWS, see Stemcell 1200.7 for PCF (Windows).

If you're unable to do that at this time, you can perform the following steps as a temporary workaround.

For stemcell versions 1200.6 and below, the Consul DNS IP address can be added manually to the DNS resolvers to fix the issue. However, a BOSH Agent restart will remove this IP address again, causing the same issue described above.

  • Connect to each Windows cell either via your IaaS virtual console or via RDP.
  • Edit the DNS configuration by navigating to Control Panel > Network and Internet > Network and Sharing Center.
  • Under network connections, choose Ethernet.
  • Choose Properties > TCP/IPv4 > Properties.



Note that the Consul DNS IP address does not appear in the DNS list under the section “Use the following DNS server addresses”. This is the cause of the issue.

  • Click on Advanced.
  • In the Advanced TCP/IP Settings pane, click the DNS tab, then click Add.
  • Type the Consul DNS IP address in the dialog box and click Add.
  • Use the up/down arrows to move the new entry into the first position.
  • Click OK to close Advanced TCP/IP Settings.
  • Click OK to close TCP/IPv4 Properties.
  • Close Ethernet Properties to persist changes.

The fix has been applied. The BOSH jobs should eventually become healthy, after which apps can serve traffic and new apps can be pushed to the cells with cf push.
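For deployments with many cells, the manual steps above amount to placing the Consul DNS address first in the adapter's resolver list. The sketch below only computes that ordering and builds a PowerShell command string; it does not run anything. It is not part of the original procedure: the helper names are hypothetical, the addresses are placeholders, and the use of the `Set-DnsClientServerAddress` cmdlet with the "Ethernet" interface alias is an assumption that should be verified against your Windows cells before use. As with the manual steps, a BOSH Agent restart on stemcell 1200.6 or earlier will undo the change.

```python
def with_consul_first(resolvers, consul_ip):
    """Return the resolver list with consul_ip in first position,
    removing any duplicate of it further down the list."""
    rest = [ip for ip in resolvers if ip != consul_ip]
    return [consul_ip] + rest


def powershell_command(interface_alias, resolvers):
    """Build (but do not execute) a Set-DnsClientServerAddress invocation
    that sets the given resolvers, in order, on the named interface."""
    addresses = ",".join('"{}"'.format(ip) for ip in resolvers)
    return 'Set-DnsClientServerAddress -InterfaceAlias "{}" -ServerAddresses {}'.format(
        interface_alias, addresses
    )


# Placeholder addresses for illustration only.
ordered = with_consul_first(["198.51.100.1", "198.51.100.2"], "192.0.2.53")
command = powershell_command("Ethernet", ordered)
```

Keeping the ordering logic separate from the command string makes it easy to review the resulting list before applying it to a cell.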

Additional Information

For more information on how to connect to a Windows cell via RDP, please see the article: How to generate a randomized Administrator Password.


