Pivotal Knowledge Base

HBase regionserver goes down automatically after startup

Environment

HDP 2.x

Symptom

As shown on the Ambari web console, each time the HBase regionserver starts up, it goes down on its own shortly afterward.

Cause

The clock skew between the HBase master and regionserver nodes is too large.

RCA

Whenever a regionserver connects to the HBase master, the difference between the system times of the two servers is checked against a tolerable range. If the difference exceeds that range, the master rejects the regionserver, which then shuts itself down. Messages similar to the following can be found in the regionserver log file.

2017-08-01 14:51:15,984 FATAL [regionserver/hdw3.example.com/192.0.2.1:16020] regionserver.HRegionServer: ABORTING region server hdw3.example.com,16020,1501570274049: Unhandled: org.apache.hadoop.hbase.ClockOutOfSyncException: Server hdw3.example.com,16020,1501570274049 has been rejected; Reported time is too far out of sync with master. Time difference of 311432ms > max allowed of 30000ms
    at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:388)
    at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:262)
    at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:348)
    ......
org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server hdw3.example.com,16020,1501570274049 has been rejected; Reported time is too far out of sync with master. Time difference of 311432ms > max allowed of 30000ms
    at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:388)
    at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:262)
    at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:348)
    at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
    ......
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.ClockOutOfSyncException): org.apache.hadoop.hbase.ClockOutOfSyncException: Server hdw3.example.com,16020,1501570274049 has been rejected; Reported time is too far out of sync with master. Time difference of 311432ms > max allowed of 30000ms
    at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:388)
    at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:262)
    at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:348)
2017-08-01 14:51:16,013 INFO [regionserver/hdw3.example.com/192.0.2.1:16020] regionserver.HRegionServer: STOPPED: Unhandled: org.apache.hadoop.hbase.ClockOutOfSyncException: Server hdw3.example.com,16020,1501570274049 has been rejected; Reported time is too far out of sync with master. Time difference of 311432ms > max allowed of 30000ms
    at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:388)
    at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:262)
    at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:348)
    at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
    at java.lang.Thread.run(Thread.java:745)
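
To confirm that a regionserver log contains this rejection and to see how large the reported skew is, the message can be extracted with a small script. The following is a minimal sketch, not an official tool: the function name find_clock_skew_rejections is hypothetical, and the pattern assumes the message appears on a single line in the log file, with the wording shown in the excerpt above.

```python
import re

# Matches the rejection message as worded in the log excerpt above.
# Assumption: the message is on one line in the real log file.
SKEW_PATTERN = re.compile(
    r"Server (?P<server>\S+) has been rejected; "
    r"Reported time is too far out of sync with master\.\s+"
    r"Time difference of (?P<diff>\d+)ms > max allowed of (?P<limit>\d+)ms"
)


def find_clock_skew_rejections(log_text):
    """Return unique (server, difference_ms, limit_ms) tuples found in log_text."""
    found = []
    for match in SKEW_PATTERN.finditer(log_text):
        entry = (
            match.group("server"),
            int(match.group("diff")),
            int(match.group("limit")),
        )
        # The same rejection is logged several times; report it once.
        if entry not in found:
            found.append(entry)
    return found


if __name__ == "__main__":
    sample = (
        "regionserver.HRegionServer: ABORTING region server "
        "hdw3.example.com,16020,1501570274049: Unhandled: "
        "org.apache.hadoop.hbase.ClockOutOfSyncException: Server "
        "hdw3.example.com,16020,1501570274049 has been rejected; "
        "Reported time is too far out of sync with master. "
        "Time difference of 311432ms > max allowed of 30000ms\n"
    )
    for server, diff, limit in find_clock_skew_rejections(sample):
        print(f"{server}: skew {diff} ms exceeds limit {limit} ms")
```

In practice the text passed to the function would be the contents of the regionserver log file; its location depends on the installation.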

Resolution

The maximum tolerable clock skew is set by the HBase parameter hbase.master.maxclockskew, which defaults to 30000 ms. There are two ways to resolve this issue:

1. Synchronize the time on all nodes in the Hadoop cluster. NTP is the recommended way to achieve this.

2. Increase hbase.master.maxclockskew. This option is not recommended; consider it only if synchronizing the time on all nodes is not possible.
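
Before choosing between the two options, it can help to see which nodes actually exceed the limit. The sketch below is a simplified model of the comparison, not the HBase implementation: it assumes the check is an absolute difference between the master time and the reported time, compared with a strict "greater than" as the log message suggests. The function name find_skewed_nodes and the host names are hypothetical, and how the timestamps are collected from each node is left out.

```python
# Default of hbase.master.maxclockskew as stated in this article.
DEFAULT_MAX_CLOCK_SKEW_MS = 30000


def find_skewed_nodes(master_time_ms, node_times_ms,
                      max_skew_ms=DEFAULT_MAX_CLOCK_SKEW_MS):
    """Return {host: skew_ms} for nodes whose clock is too far from the master.

    master_time_ms: master clock reading in epoch milliseconds.
    node_times_ms:  mapping of host name to that node's clock reading,
                    taken at (approximately) the same moment.
    """
    skewed = {}
    for host, node_time_ms in node_times_ms.items():
        skew_ms = abs(master_time_ms - node_time_ms)
        # Assumed to mirror the log message: "difference > max allowed".
        if skew_ms > max_skew_ms:
            skewed[host] = skew_ms
    return skewed


if __name__ == "__main__":
    master = 1501570275984  # illustrative value only
    nodes = {
        "hdw1.example.com": master + 1200,    # within the limit
        "hdw3.example.com": master - 311432,  # same skew as in the log above
    }
    for host, skew in find_skewed_nodes(master, nodes).items():
        print(f"{host}: {skew} ms off, fix NTP on this node")
```

Any node reported here would be rejected under the default limit and is a candidate for time synchronization.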

How to change hbase.master.maxclockskew

From Dashboard on Ambari web console:

  1. Choose HBase -> Configs -> Advanced -> "Custom hbase-site"
  2. Click "Add Property" if hbase.master.maxclockskew is not listed, then enter the key/value pair
  3. Save the change and restart the HBase service
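
After the restart, the new value should appear in the hbase-site.xml that Ambari pushes to the nodes. The following is a minimal sketch for checking it, assuming the standard Hadoop property layout (property elements with name and value children). The function name read_max_clock_skew is hypothetical, and the value 60000 in the sample is only an illustration, not a recommended setting.

```python
import xml.etree.ElementTree as ET

# Default of hbase.master.maxclockskew as stated in this article.
DEFAULT_MAX_CLOCK_SKEW_MS = 30000


def read_max_clock_skew(hbase_site_xml):
    """Return hbase.master.maxclockskew from hbase-site.xml text.

    Falls back to the default when the property is not set.
    """
    root = ET.fromstring(hbase_site_xml)
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        if name == "hbase.master.maxclockskew":
            return int(prop.findtext("value", default="").strip())
    return DEFAULT_MAX_CLOCK_SKEW_MS


if __name__ == "__main__":
    # Illustrative file content; 60000 is a made-up example value.
    sample = """
    <configuration>
      <property>
        <name>hbase.master.maxclockskew</name>
        <value>60000</value>
      </property>
    </configuration>
    """
    print(read_max_clock_skew(sample))
```

The text passed to the function would be read from the hbase-site.xml on a cluster node; the file location depends on the installation.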

 

 
