
Can't run Spark jobs locally with sbt, but it works in IntelliJ

I have written a few simple Spark jobs and some tests for them. I did everything in IntelliJ and it works great. Now I want to make sure my code also builds with sbt. Compiling is fine, but I get strange errors when running and testing.

I am using Scala version 2.11.8 and sbt version 0.13.8.

My build.sbt file looks like this:

name := "test" 

version := "1.0" 

scalaVersion := "2.11.7" 

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0" 
libraryDependencies += "javax.mail" % "javax.mail-api" % "1.5.6" 
libraryDependencies += "com.sun.mail" % "javax.mail" % "1.5.6" 
libraryDependencies += "commons-cli" % "commons-cli" % "1.3.1" 
libraryDependencies += "org.scalatest" % "scalatest_2.11" % "3.0.0" % "test" 
libraryDependencies += "com.holdenkarau" % "spark-testing-base_2.11" % "2.0.0_0.4.4" % "test" intransitive() 

I am trying to run my code with sbt "run-main com.test.email.processor.bin.Runner". Here is the output:

[info] Loading project definition from /Users/max/workplace/test/project 
[info] Set current project to test (in build file:/Users/max/workplace/test/) 
[info] Running com.test.email.processor.bin.Runner -j recipientCount -e /Users/max/workplace/data/test/enron_with_categories/*/*.txt 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
16/08/23 18:46:55 INFO SparkContext: Running Spark version 2.0.0 
16/08/23 18:46:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
16/08/23 18:46:55 INFO SecurityManager: Changing view acls to: max 
16/08/23 18:46:55 INFO SecurityManager: Changing modify acls to: max 
16/08/23 18:46:55 INFO SecurityManager: Changing view acls groups to: 
16/08/23 18:46:55 INFO SecurityManager: Changing modify acls groups to: 
16/08/23 18:46:55 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(max); groups with view permissions: Set(); users with modify permissions: Set(max); groups with modify permissions: Set() 
16/08/23 18:46:56 INFO Utils: Successfully started service 'sparkDriver' on port 61759. 
16/08/23 18:46:56 INFO SparkEnv: Registering MapOutputTracker 
16/08/23 18:46:56 INFO SparkEnv: Registering BlockManagerMaster 
16/08/23 18:46:56 INFO DiskBlockManager: Created local directory at /private/var/folders/75/4dydy_6110v0gjv7bg265_g40000gn/T/blockmgr-9eb526c0-b7e5-444a-b186-d7f248c5dc62 
16/08/23 18:46:56 INFO MemoryStore: MemoryStore started with capacity 408.9 MB 
16/08/23 18:46:56 INFO SparkEnv: Registering OutputCommitCoordinator 
16/08/23 18:46:56 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
16/08/23 18:46:56 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.1.11:4040 
16/08/23 18:46:56 INFO Executor: Starting executor ID driver on host localhost 
16/08/23 18:46:57 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 61760. 
16/08/23 18:46:57 INFO NettyBlockTransferService: Server created on 192.168.1.11:61760 
16/08/23 18:46:57 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.1.11, 61760) 
16/08/23 18:46:57 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.11:61760 with 408.9 MB RAM, BlockManagerId(driver, 192.168.1.11, 61760) 
16/08/23 18:46:57 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.1.11, 61760) 
16/08/23 18:46:57 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 128.0 KB, free 408.8 MB) 
16/08/23 18:46:57 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 14.6 KB, free 408.8 MB) 
16/08/23 18:46:57 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.11:61760 (size: 14.6 KB, free: 408.9 MB) 
16/08/23 18:46:57 INFO SparkContext: Created broadcast 0 from wholeTextFiles at RecipientCountJob.scala:22 
16/08/23 18:46:58 WARN ClosureCleaner: Expected a closure; got com.test.email.processor.util.cleanEmail$ 
16/08/23 18:46:58 INFO FileInputFormat: Total input paths to process : 1702 
16/08/23 18:46:58 INFO FileInputFormat: Total input paths to process : 1702 
16/08/23 18:46:58 INFO CombineFileInputFormat: DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 0 
16/08/23 18:46:58 INFO SparkContext: Starting job: take at RecipientCountJob.scala:35 
16/08/23 18:46:58 WARN DAGScheduler: Creating new stage failed due to exception - job: 0 
java.lang.ClassNotFoundException: scala.Function0 
    at sbt.classpath.ClasspathFilter.loadClass(ClassLoaders.scala:63) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357) 
    at java.lang.Class.forName0(Native Method) 
    at java.lang.Class.forName(Class.java:348) 
    at com.twitter.chill.KryoBase$$anonfun$1.apply(KryoBase.scala:41) 
    at com.twitter.chill.KryoBase$$anonfun$1.apply(KryoBase.scala:41) 
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) 
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) 
    at scala.collection.immutable.Range.foreach(Range.scala:166) 
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:245) 
    at scala.collection.AbstractTraversable.map(Traversable.scala:104) 
    at com.twitter.chill.KryoBase.<init>(KryoBase.scala:41) 
    at com.twitter.chill.EmptyScalaKryoInstantiator.newKryo(ScalaKryoInstantiator.scala:57) 
    at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:86) 
    at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:274) 
    at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:259) 
    at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:175) 
    at org.apache.spark.serializer.KryoSerializer.supportsRelocationOfSerializedObjects$lzycompute(KryoSerializer.scala:182) 
    at org.apache.spark.serializer.KryoSerializer.supportsRelocationOfSerializedObjects(KryoSerializer.scala:178) 
    at org.apache.spark.shuffle.sort.SortShuffleManager$.canUseSerializedShuffle(SortShuffleManager.scala:187) 
    at org.apache.spark.shuffle.sort.SortShuffleManager.registerShuffle(SortShuffleManager.scala:99) 
    at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:90) 
    at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:91) 
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:235) 
    at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:233) 
    at scala.Option.getOrElse(Option.scala:121) 
    at org.apache.spark.rdd.RDD.dependencies(RDD.scala:233) 
    at org.apache.spark.scheduler.DAGScheduler.visit$2(DAGScheduler.scala:418) 
    at org.apache.spark.scheduler.DAGScheduler.getAncestorShuffleDependencies(DAGScheduler.scala:433) 
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getShuffleMapStage(DAGScheduler.scala:288) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:394) 
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:391) 
    at scala.collection.immutable.List.foreach(List.scala:381) 
    at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:391) 
    at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:403) 
    at org.apache.spark.scheduler.DAGScheduler.getParentStagesAndId(DAGScheduler.scala:304) 
    at org.apache.spark.scheduler.DAGScheduler.newResultStage(DAGScheduler.scala:339) 
    at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:849) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1626) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618) 
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607) 
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) 
16/08/23 18:46:58 INFO DAGScheduler: Job 0 failed: take at RecipientCountJob.scala:35, took 0.076653 s 
[error] (run-main-0) java.lang.ClassNotFoundException: scala.Function0 
java.lang.ClassNotFoundException: scala.Function0 
[trace] Stack trace suppressed: run last compile:runMain for the full output. 
16/08/23 18:46:58 ERROR ContextCleaner: Error in cleaning thread 
java.lang.InterruptedException 
    at java.lang.Object.wait(Native Method) 
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143) 
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:175) 
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1229) 
    at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:172) 
    at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:67) 
16/08/23 18:46:58 ERROR Utils: uncaught error in thread SparkListenerBus, stopping SparkContext 
java.lang.InterruptedException 
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998) 
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) 
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:67) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:66) 
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:65) 
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1229) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:64) 
java.lang.RuntimeException: Nonzero exit code: 1 

Do you have Scala 2.11 installed? –


I have it installed, but how can I let sbt know where it is? – Max


As long as SCALA_HOME is set, you are fine. –

Answers


Apparently Spark cannot be run via sbt this way. I packaged the entire job into a jar with the assembly plugin and ran it with java.
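For reference, a minimal sketch of that setup, assuming the sbt-assembly plugin; the plugin version and the jar path below are illustrative assumptions, not taken from the project above.

// project/plugins.sbt 
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3") 

Then build the fat jar and run the main class directly on the JVM (the jar name follows sbt-assembly's default <name>-assembly-<version>.jar convention):

sbt assembly 
java -cp target/scala-2.11/test-assembly-1.0.jar com.test.email.processor.bin.Runner -j recipientCount -e /Users/max/workplace/data/test/enron_with_categories/*/*.txt 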


It seems that you are missing your Scala library, since scala.Function0 comes from the standard Scala library.

You could try adding the scala-library dependency explicitly:

libraryDependencies += "org.scala-lang" % "scala-library" % scalaVersion.value

But it seems as if the Scala library is not being added to the classpath of the run.

You may also want to add the following, so that the same classpath is used to run the code in sbt:

fullClasspath in run := (fullClasspath in Compile).value
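After adding those lines, reload sbt and retry the same invocation shown in the question, e.g.:

sbt "run-main com.test.email.processor.bin.Runner -j recipientCount -e /Users/max/workplace/data/test/enron_with_categories/*/*.txt" 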


It seems unlikely that the Scala library is not on the classpath. Also, adding both lines did not change anything. Is it possible that the Spark tasks are not getting the right classpath? – Max
