Pivotal Cloud Foundry (PCF) Metrics 1.3
After upgrading the PCF Metrics tile from 1.2 to 1.3.x, the smoke tests fail.
`curl localhost:9200/_cluster/health?pretty` reports a cluster health status of red.
Ingests logs from firehose into elasticsearch [It]
Never received app logs - something in the firehose -> elasticsearch flow is broken
Summarizing 1 Failure:
[Fail] Elasticsearch flow [It] Ingests logs from firehose into elasticsearch
The Elasticsearch indices were created, and the upgrade happened while they were still replicating, which left them in a corrupt state.
The solution in this KB is a last resort. Use it only if you are unable to clear the Elasticsearch master "red" status by restarting the app and following the other steps outlined in https://docs.pivotal.io/pcf-metrics/1-3/troubleshooting.html#smoke-test
DO NOT perform this procedure if many or all of the indices are in "red" status. This procedure is meant to address the condition where only a few indices are corrupt and stuck in "red" status.
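To judge whether only a few indices are affected, it can help to count indices by health before going further. The following is a minimal sketch, not part of the official procedure: the function name `count_index_health` is hypothetical, and it assumes the default `_cat/indices` column layout in which the first column is the health value.

```shell
# Summarize index health from `_cat/indices` output read on stdin.
# Assumption: the first column is the health value (green, yellow, red).
# The header row (first column "health") is ignored.
count_index_health() {
  awk '
    $1 == "green" || $1 == "yellow" || $1 == "red" { n[$1]++; total++ }
    END {
      printf "red=%d yellow=%d green=%d total=%d\n",
             n["red"], n["yellow"], n["green"], total
    }'
}
```

On the elasticsearch_master node it could be used as `curl -s localhost:9200/_cat/indices?v | count_index_health`. If the red count is a large share of the total, this procedure is not appropriate.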
Perform the following steps:
1.) SSH to the elasticsearch_master node:
$ bosh ssh elasticsearch_master/0
2.) Identify the indices with status red (because the output is piped through sort, the header row appears between the green and red rows):
$ curl localhost:9200/_cat/indices?v | sort
green open app_logs_1504677600 1 1 209948 0 35.8mb 17.9mb
green open app_logs_1504699200 1 1 0 0 318b 159b
green open app_logs_1504785600 1 1 0 0 318b 159b
green open app_logs_1504807200 1 1 0 0 318b 159b
health status index pri rep docs.count docs.deleted store.size pri.store.size
red open app_logs_1504720800 1 1
red open app_logs_1504742400 1 1
red open app_logs_1504764000 1
3.) Delete the indices with status red. Run the command once for each red index, for example:
$ curl -XDELETE http://localhost:9200/app_logs_1504720800
Note: This has the potential to delete application log data. Do not execute it if this logging data is critical.
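Steps 2 and 3 can be combined into a small helper when several indices need to be removed. This is only a sketch under stated assumptions: the function names `list_red_indices` and `delete_indices` and the `DRY_RUN` variable are hypothetical, and the code assumes the default `_cat/indices` column order (health, status, index). By default it only prints the delete commands so they can be reviewed first.

```shell
# Print the names of indices whose health column is "red".
# Reads `_cat/indices` output on stdin; assumes the index name is column 3.
list_red_indices() {
  awk '$1 == "red" { print $3 }'
}

# Read index names from stdin, one per line.
# With DRY_RUN=1 (the default) the curl commands are only printed.
# With DRY_RUN=0 each index is actually deleted.
delete_indices() {
  while read -r index; do
    [ -n "$index" ] || continue
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "curl -XDELETE http://localhost:9200/$index"
    else
      # Redirect stdin so curl cannot consume the list of index names.
      curl -XDELETE "http://localhost:9200/$index" </dev/null
    fi
  done
}
```

A possible invocation on the elasticsearch_master node is `curl -s localhost:9200/_cat/indices?v | list_red_indices | delete_indices`. After reviewing the printed commands, the same pipeline can be rerun with `DRY_RUN=0` set for `delete_indices`. The warning above about losing application log data applies equally here.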