Pivotal Knowledge Base

Spark history server stopping randomly

Environment

Product Version
Pivotal HD  3.0.x
Pivotal HDP  2.3, 2.4
Spark   1.6

Symptom

The Spark History Server stops unexpectedly at random intervals.

The Spark logs in /var/log/spark/ show that the History Server ran out of memory:

16/09/16 12:07:46 INFO FsHistoryProvider: Replaying log path: hdfs://namenode.local:8020/spark-history/application_1472823615920_0112
16/09/16 12:07:56 ERROR FsHistoryProvider: Exception while merging application listings
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:188)
        at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$checkForLogs$4.apply(FsHistoryProvider.scala:313)
        at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$checkForLogs$4.apply(FsHistoryProvider.scala:308)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at org.apache.spark.deploy.history.FsHistoryProvider.checkForLogs(FsHistoryProvider.scala:308)
        at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$org$apache$spark$deploy$history$FsHistoryProvider$$startPolling$1.apply$mcV$sp(FsHistoryProvider.scala:205)
        at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1163)
        at org.apache.spark.deploy.history.FsHistoryProvider$$anon$1.run(FsHistoryProvider.scala:124)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:2694)
        at java.lang.String.<init>(String.java:203)
        at java.lang.StringBuilder.toString(StringBuilder.java:405)
        at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:349)
        at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.getText(ReaderBasedJsonParser.java:235)
        at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:20)
        at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:42)
        at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:35)
        at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:42)
        at org.json4s.jackson.JValueDeserializer.deserialize(JValueDeserializer.scala:35)
        at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3066)
        at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2161)
        at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:19)
        at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:44)
        at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
        at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$replay(FsHistoryProvider.scala:579)
        at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$12.apply(FsHistoryProvider.scala:406)
        at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$12.apply(FsHistoryProvider.scala:403)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
        at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
        at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$mergeApplicationListing(FsHistoryProvider.scala:403)
        at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$checkForLogs$3$$anon$4.run(FsHistoryProvider.scala:305)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        ... 3 more

Cause

When the History Server replays an application's event log, the entire log must be loaded into memory. If the heap available to the History Server is smaller than the size of the log being replayed, the History Server crashes with the OutOfMemoryError shown above.
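To gauge how large a heap may be needed, the sizes of the event logs in the history directory can be inspected. This is a diagnostic sketch: the /spark-history path matches the log excerpt above; adjust it to your cluster's spark.history.fs.logDirectory setting.

```shell
# List the size of each application event log in the Spark history
# directory on HDFS (human-readable sizes). The largest file gives a
# lower bound for the History Server heap.
hdfs dfs -du -h /spark-history
```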

Resolution

Increase the amount of memory available to the Spark History Server and restart it:

1. Open Ambari and go to Spark configuration settings.

2. Under "Advanced spark-env", in the "spark-env template" field, modify the SPARK_DAEMON_MEMORY setting (or add it if it does not exist).

3. The default value of SPARK_DAEMON_MEMORY is 1 GB. Increase it to 2 GB or more, depending on the amount of memory available on the host and the size of the largest event log the History Server will read.

4. Restart services as requested by Ambari.
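For step 2 above, a minimal sketch of the line to add or change in the spark-env template; the 2g value is illustrative and should be chosen per step 3:

```shell
# Heap size for Spark daemon processes (History Server, Master, Worker).
# 2g is an example value; size it to exceed the largest event log.
export SPARK_DAEMON_MEMORY=2g
```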
