2016-07-20

Apache Spark - Error initializing SparkContext. java.io.FileNotFoundException

I am able to run a simple Hello World program with Spark on a standalone machine. But when I run a word-count program that uses SparkContext and launch it with pyspark, I get the following error: ERROR SparkContext: Error initializing SparkContext. java.io.FileNotFoundException: Added file file:/Users/tanyagupta/Documents/Internship/Zyudly%20Labs/Tanya-Programs/word_count.py does not exist. I am on Mac OS X and installed Spark via Homebrew with `brew install apache-spark`. Any ideas what is going wrong?
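One detail worth noticing in the error (my reading of the log, not something the poster confirmed): the path in the exception contains `%20` where the on-disk directory name `Zyudly Labs` contains a literal space. A short sketch of how URL-encoding produces exactly the string seen in the error, and why it no longer matches the filesystem path:

```python
from urllib.parse import quote, unquote

# The on-disk path from the traceback contains a literal space in "Zyudly Labs"
raw = "/Users/tanyagupta/Documents/Internship/Zyudly Labs/Tanya-Programs/word_count.py"

# URL-encoding turns the space into %20 -- the form that appears in the error message
encoded = quote(raw)
print(encoded)

# The %20 form is a URL, not a filesystem path; only decoding it back yields the real path
assert "%20" in encoded
assert unquote(encoded) == raw
```

So if Spark (or the shell invocation) ends up treating the encoded form as a plain filesystem path, the lookup fails even though the file exists.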

Using Spark's default log4j profile:

org/apache/spark/log4j-defaults.properties 
16/07/19 23:18:45 INFO SparkContext: Running Spark version 1.6.2 
16/07/19 23:18:45 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
16/07/19 23:18:45 INFO SecurityManager: Changing view acls to: tanyagupta 
16/07/19 23:18:45 INFO SecurityManager: Changing modify acls to: tanyagupta 
16/07/19 23:18:45 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(tanyagupta); users with modify permissions: Set(tanyagupta) 
16/07/19 23:18:46 INFO Utils: Successfully started service 'sparkDriver' on port 59226. 
16/07/19 23:18:46 INFO Slf4jLogger: Slf4jLogger started 
16/07/19 23:18:46 INFO Remoting: Starting remoting 
16/07/19 23:18:46 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:59227] 
16/07/19 23:18:46 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 59227. 
16/07/19 23:18:46 INFO SparkEnv: Registering MapOutputTracker 
16/07/19 23:18:46 INFO SparkEnv: Registering BlockManagerMaster 
16/07/19 23:18:46 INFO DiskBlockManager: Created local directory at /private/var/folders/2f/fltslxd54f5961xsc2wg1w680000gn/T/blockmgr-812de6f9-3e3d-4885-a7de-fc9c2e181c64 
16/07/19 23:18:46 INFO MemoryStore: MemoryStore started with capacity 511.1 MB 
16/07/19 23:18:46 INFO SparkEnv: Registering OutputCommitCoordinator 
16/07/19 23:18:46 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
16/07/19 23:18:46 INFO SparkUI: Started SparkUI at http://192.168.0.5:4040 
16/07/19 23:18:46 ERROR SparkContext: Error initializing SparkContext. 
java.io.FileNotFoundException: Added file file:/Users/tanyagupta/Documents/Internship/Zyudly%20Labs/Tanya-Programs/word_count.py does not exist. 
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1364) 
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1340) 
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491) 
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491) 
at scala.collection.immutable.List.foreach(List.scala:318) 
at org.apache.spark.SparkContext.<init>(SparkContext.scala:491) 
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59) 
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234) 
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381) 
at py4j.Gateway.invoke(Gateway.java:214) 
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79) 
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68) 
at py4j.GatewayConnection.run(GatewayConnection.java:209) 
at java.lang.Thread.run(Thread.java:745) 
16/07/19 23:18:47 INFO SparkUI: Stopped Spark web UI at http://192.168.0.5:4040 
16/07/19 23:18:47 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 
16/07/19 23:18:47 INFO MemoryStore: MemoryStore cleared 
16/07/19 23:18:47 INFO BlockManager: BlockManager stopped 
16/07/19 23:18:47 INFO BlockManagerMaster: BlockManagerMaster stopped 
16/07/19 23:18:47 WARN MetricsSystem: Stopping a MetricsSystem that is not running 
16/07/19 23:18:47 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 
16/07/19 23:18:47 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon. 
16/07/19 23:18:47 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports. 
16/07/19 23:18:47 INFO SparkContext: Successfully stopped SparkContext 

Traceback (most recent call last): 
File "/Users/tanyagupta/Documents/Internship/Zyudly Labs/Tanya-Programs/word_count.py", line 7, in <module> 
sc=SparkContext(appName="WordCount_Tanya") 
File "/usr/local/Cellar/apache-spark/1.6.2/libexec/python/lib/pyspark.zip/pyspark/context.py", line 115, in __init__ 
File "/usr/local/Cellar/apache-spark/1.6.2/libexec/python/lib/pyspark.zip/pyspark/context.py", line 172, in _do_init 
File "/usr/local/Cellar/apache-spark/1.6.2/libexec/python/lib/pyspark.zip/pyspark/context.py", line 235, in _initialize_context 
File "/usr/local/Cellar/apache-spark/1.6.2/libexec/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__ 
File "/usr/local/Cellar/apache-spark/1.6.2/libexec/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value 

py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. 
: java.io.FileNotFoundException: Added file  file:/Users/tanyagupta/Documents/Internship/Zyudly%20Labs/Tanya-Programs/word_count.py does not exist. 
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1364) 
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1340) 
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491) 
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491) 
at scala.collection.immutable.List.foreach(List.scala:318) 
at org.apache.spark.SparkContext.<init>(SparkContext.scala:491) 
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59) 
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234) 
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381) 
at py4j.Gateway.invoke(Gateway.java:214) 
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79) 
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68) 
at py4j.GatewayConnection.run(GatewayConnection.java:209) 
at java.lang.Thread.run(Thread.java:745) 

16/07/19 23:18:47 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down. 
16/07/19 23:18:47 INFO ShutdownHookManager: Shutdown hook called 
16/07/19 23:18:47 INFO ShutdownHookManager: Deleting directory /private/var/folders/2f/fltslxd54f5961xsc2wg1w680000gn/T/spark-f69e5dfc-6561-4677-9ec0-03594eabc991 

See [this question](http://stackoverflow.com/questions/32402094/spark-submit-fails-to-import-sparkcontext), which has exactly the same error; per that answer, you need to add the **Tanya_programs** directory to your **PYTHONPATH** variable. – ashwinids
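The commenter's suggestion might look like this in the shell. This is only a sketch under the assumption that the paths from the traceback are correct; it has not been verified against Spark 1.6.2:

```shell
# Add the directory containing word_count.py to PYTHONPATH (path taken from the traceback)
export PYTHONPATH="/Users/tanyagupta/Documents/Internship/Zyudly Labs/Tanya-Programs:$PYTHONPATH"

# Quoting the script path keeps the space in "Zyudly Labs" from being split or mangled
spark-submit "/Users/tanyagupta/Documents/Internship/Zyudly Labs/Tanya-Programs/word_count.py"
```

Quoting matters here for the same reason the error shows `%20`: an unquoted space in the path breaks the argument at the shell level.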

Answer


Adding an __init__.py file in my folder worked for me!
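For anyone wanting to reproduce this fix, a minimal sketch; the directory name is taken from the traceback and is only illustrative:

```python
from pathlib import Path

# Hypothetical folder standing in for the directory that holds word_count.py
pkg_dir = Path("Tanya-Programs")
pkg_dir.mkdir(exist_ok=True)

# An empty __init__.py marks the folder as a Python package, so modules inside it
# become importable once the parent directory is on PYTHONPATH
(pkg_dir / "__init__.py").touch()

print((pkg_dir / "__init__.py").exists())
```

The file can stay empty; its presence alone is what makes Python treat the folder as a package.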

Thanks!
