Pivotal Cloud Foundry® (PCF) 1.10, 1.11
When running cf logs <app-name>, the following error is returned:
>cf logs admin-portal --recent
Retrieving logs for app admin-portal in org <org-name> / space <space-name> as admin...
Error dialing trafficcontroller server: Get https://doppler.<system-domain>:443/apps/<app guid>/recentlogs: net/http: request canceled (Client.Timeout exceeded while awaiting headers).
Please ask your Cloud Foundry Operator to check the platform configuration (trafficcontroller endpoint is wss://doppler.<system-domain>:443).
In the affected versions, Traffic Controllers request recent logs and container metrics from Dopplers serially, asking only one Doppler at a time for its results. On a foundation with many Dopplers, this often takes too long and the CLI times out.
To resolve this issue, upgrade Pivotal Cloud Foundry Elastic Runtime to version 1.10.33 for PCF 1.10 or 1.11.19 for PCF 1.11. These releases make those requests concurrently and add a 5 second timeout to each request.
Other temporary workarounds
If you can't upgrade, you can mitigate the issue by scaling up the Dopplers and Traffic Controllers (increasing CPU and memory). The problem still exists, but it occurs less frequently because the VMs are not under stress and can respond more quickly.
1. What created the high memory usage on the Dopplers and Traffic Controllers that we didn't have on PCF 1.9?
The high memory usage on the PCF 1.10 Dopplers and Traffic Controllers comes from the switch from UDP to gRPC for communication from Metron to Doppler. gRPC adds the reliability of TCP, which increases load. With UDP, the kernel would drop messages and we would never know about it, as is normal for UDP.