Pivotal Knowledge Base

Push Pivotal Account errand error: "FAILED Start app timeout" in Elastic Runtime 1.10

Environment

 Product                  Version
 Pivotal Elastic Runtime  1.10.x

Symptom

During an Elastic Runtime deployment, the "Push Pivotal Account" errand fails to start the "pivotal-account-green" app:

0 of 2 instances running, 2 starting
FAILED
Start app timeout

TIP: Application must be listening on the right port. Instead of hard coding the port, use the $PORT environment variable.

Use 'cf logs pivotal-account-green --recent' for more information

The BOSH task debug output shows:

+ set -e
+ CF_TRACE=true
+ cf restart pivotal-account-green
","exit_code":1}}

One of the Elastic Runtime MySQL servers reports an MDL (metadata lock) conflict, suggesting that two processes were attempting to modify the primary key "schema_version_pk" of the "schema_version" table at the same time:

2017-04-19 21:32:23 139765381577472 [Note] WSREP: MDL conflict db=account table=schema_version ticket=8 solved by abort
2017-04-19 21:32:23 139765484690176 [Note] WSREP: cluster conflict due to high priority abort for threads:
2017-04-19 21:32:23 139765484690176 [Note] WSREP: Winning thread:
   THD: 98, mode: total order, state: executing, conflict: no conflict, seqno: 3614
   SQL: ALTER TABLE `account`.`schema_version` ADD CONSTRAINT `schema_version_pk` PRIMARY KEY (`version`)
2017-04-19 21:32:23 139765484690176 [Note] WSREP: Victim thread:
   THD: 99, mode: local, state: idle, conflict: no conflict, seqno: -1
   SQL: (null)

After that, every subsequent execution of the "Push Pivotal Account" errand reports the following error in the "pivotal-account-green" application logs during startup:

org.flywaydb.core.api.FlywayException: Schema `account` contains a failed migration to version 1 !

Cause

The "pivotal-account-green" app is created with an instance count of 2. During startup, each instance performs a database migration if it detects that the app version is newer than the existing database schema. On the first start, both instances come up at the same time and compete to apply the database schema. One of them can be aborted partway through, which leaves the migration to version 1 recorded as failed.
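
The race can be illustrated with a small, deterministic simulation. This is only a sketch of the check-then-migrate pattern, not the app's or Flyway's actual implementation: `FakeDatabase` and `run_instances` are hypothetical names, and the interleaving is fixed by hand so that every instance checks the schema before any of them finishes migrating.

```python
class FakeDatabase:
    """Hypothetical in-memory stand-in for the "account" database."""

    def __init__(self):
        self.tables = set()
        self.applied_versions = set()

    def needs_migration(self):
        # True while version 1 has not been applied successfully.
        return "1" not in self.applied_versions

    def create_table(self, name):
        # Like CREATE TABLE without IF NOT EXISTS: a second attempt fails.
        if name in self.tables:
            raise RuntimeError(f"table {name} already exists")
        self.tables.add(name)


def run_instances(db, instance_count):
    """Simulate instances that all check the schema before any migrates."""
    # Step 1: every instance checks first and sees an outdated schema.
    pending = [i for i in range(instance_count) if db.needs_migration()]
    results = {}
    # Step 2: each of those instances then tries to apply version 1.
    for i in pending:
        try:
            db.create_table("zone_client")
            db.applied_versions.add("1")
            results[i] = "migrated"
        except RuntimeError:
            results[i] = "failed"
    return results


print(run_instances(FakeDatabase(), 2))  # {0: 'migrated', 1: 'failed'}
print(run_instances(FakeDatabase(), 1))  # {0: 'migrated'}
```

With a single instance there is nothing to compete with, which is why the workaround scales the app down to one instance before restarting it.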

Resolution

This will be fixed in a future release of Elastic Runtime. In the meantime, use the workaround below.

Workaround

CAUTION: This workaround applies only to this specific case. If the symptoms you see differ in any way from those described here, engage Pivotal Support before proceeding.

  • Connect to the Elastic Runtime database and determine how far the schema migration got by running this query in the "account" database:
    MariaDB [account]> select * from schema_version where version = 1;
    +--------------+----------------+---------+--------------------------+------+----------------------------------+-------------+----------------------------------+---------------------+----------------+---------+
    | version_rank | installed_rank | version | description              | type | script                           | checksum    | installed_by                     | installed_on        | execution_time | success |
    +--------------+----------------+---------+--------------------------+------+----------------------------------+-------------+----------------------------------+---------------------+----------------+---------+
    |            1 |              1 | 1       | create zone client table | SQL  | V1__create_zone_client_table.sql | -1189286464 | tmFLeKoGv-isOZWinJhT9hIeLoDTmV1E | 2017-04-05 17:28:31 |             67 |       0 |
    +--------------+----------------+---------+--------------------------+------+----------------------------------+-------------+----------------------------------+---------------------+----------------+---------+
    1 row in set (0.00 sec)
  • The query output shows that the SQL script V1__create_zone_client_table.sql failed to run, because the "success" column contains 0 (false) instead of 1 (true). The contents of this script are below and show that it only needs to create the zone_client table:
    CREATE TABLE zone_client (
      client_id varchar(255) NOT NULL,
      client_secret varchar(255) NOT NULL,
      subdomain varchar(255) NOT NULL,
      PRIMARY KEY (subdomain)
    );
  • Confirm that the zone_client table is empty. Given that it is empty, and that the migration failed because two app instances tried to run it at the same time, we can proceed to revert and re-run the migration:
    MariaDB [account]> select * from zone_client;
    Empty set (0.00 sec)
  • Drop the zone_client table:
    DROP TABLE zone_client;
  • Truncate the schema_version table. This is acceptable in this case because the table contains only one row, and that row references the failed script "V1__create_zone_client_table.sql":
    TRUNCATE TABLE schema_version;
    Note: If the schema_version table contains more than one row, this article most likely does not apply and you should contact Pivotal Support instead.
  • Target the "system" org and "system" space using the cf CLI:
    cf target -o system -s system
  • Scale down the number of "pivotal-account-green" instances to 1:
    cf scale pivotal-account-green -i 1
  • Restart the "pivotal-account-green" app so that the migration re-runs with only one app instance:
    cf restart pivotal-account-green
  • The "pivotal-account-green" app should now start quickly. If it starts, you can proceed with running the errand manually or re-running Apply Changes in Operations Manager. If the app still fails to come up, review the app logs to determine the new error.
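
Before dropping or truncating anything, it can help to write down the preconditions from the steps above as an explicit check. The following is a minimal sketch, not an official Pivotal tool: the function name `workaround_applies` and the dictionary layout of the rows are assumptions made for illustration, and the values would have to be copied in from the output of the two SELECT queries shown above.

```python
# Script name taken from the schema_version row shown in this article.
FAILED_SCRIPT = "V1__create_zone_client_table.sql"


def workaround_applies(schema_version_rows, zone_client_row_count):
    """Return True only when the state matches the case in this article.

    schema_version_rows: list of dicts with the keys "version", "script"
        and "success", one dict per row of the schema_version table
        (hypothetical layout, filled in by hand from the query output).
    zone_client_row_count: number of rows in the zone_client table.
    """
    # More than one row (or none) means this article most likely does not apply.
    if len(schema_version_rows) != 1:
        return False
    row = schema_version_rows[0]
    return (
        str(row["version"]) == "1"
        and row["script"] == FAILED_SCRIPT
        and int(row["success"]) == 0      # the migration is recorded as failed
        and zone_client_row_count == 0    # nothing would be lost by the drop
    )


# The single row from the query output above, reduced to the relevant columns.
rows = [{"version": "1", "script": FAILED_SCRIPT, "success": 0}]
print(workaround_applies(rows, 0))  # True
```

If the check returns False, the situation differs from the one described here and the CAUTION above applies: contact Pivotal Support rather than modifying the database.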
