Spark toDebugString not nice in Python

This is what I get when I use toDebugString in Scala:

scala> val a = sc.parallelize(Array(1,2,3)).distinct 
a: org.apache.spark.rdd.RDD[Int] = MappedRDD[3] at distinct at <console>:12 

scala> a.toDebugString 
res0: String = 
(4) MappedRDD[3] at distinct at <console>:12 
| ShuffledRDD[2] at distinct at <console>:12 
+-(4) MappedRDD[1] at distinct at <console>:12 
    | ParallelCollectionRDD[0] at parallelize at <console>:12

This is the equivalent in Python:

>>> a = sc.parallelize([1,2,3]).distinct() 
>>> a.toDebugString() 
'(4) PythonRDD[6] at RDD at PythonRDD.scala:43\n | MappedRDD[5] at values at NativeMethodAccessorImpl.java:-2\n | ShuffledRDD[4] at partitionBy at NativeMethodAccessorImpl.java:-2\n +-(4) PairwiseRDD[3] at RDD at PythonRDD.scala:261\n | PythonRDD[2] at RDD at PythonRDD.scala:43\n | ParallelCollectionRDD[0] at parallelize at PythonRDD.scala:315'

As you can see, the output is not as nice in Python as in Scala. Is there a trick to get nicer output from this function?

I am using Spark 1.1.0.

2014-10-13 poiuytrez

Try adding a print statement so that the debug string is actually printed, rather than the shell displaying its __repr__:

>>> a = sc.parallelize([1,2,3]).distinct() 
>>> print a.toDebugString() 
(8) PythonRDD[27] at RDD at PythonRDD.scala:44 [Serialized 1x Replicated] 
| MappedRDD[26] at values at NativeMethodAccessorImpl.java:-2 [Serialized 1x Replicated] 
| ShuffledRDD[25] at partitionBy at NativeMethodAccessorImpl.java:-2 [Serialized 1x Replicated] 
+-(8) PairwiseRDD[24] at distinct at <stdin>:1 [Serialized 1x Replicated] 
    | PythonRDD[23] at distinct at <stdin>:1 [Serialized 1x Replicated] 
    | ParallelCollectionRDD[21] at parallelize at PythonRDD.scala:358 [Serialized 1x Replicated]
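
To illustrate why print helps, here is a minimal sketch that runs without Spark. The sample string is shortened from the output in the question, and as_text is a hypothetical helper name, not part of PySpark. The bytes branch is an assumption: on newer PySpark releases under Python 3, toDebugString() may return bytes instead of str, in which case it would need decoding before printing.

```python
def as_text(debug_string):
    """Return the lineage as str, decoding it first if it arrived as bytes."""
    # Assumption: some PySpark versions on Python 3 return bytes here.
    if isinstance(debug_string, bytes):
        return debug_string.decode("utf-8")
    return debug_string


# Sample shortened from the output shown in the question.
sample = (
    "(4) PythonRDD[6] at RDD at PythonRDD.scala:43\n"
    " | MappedRDD[5] at values at NativeMethodAccessorImpl.java:-2\n"
    " | ShuffledRDD[4] at partitionBy at NativeMethodAccessorImpl.java:-2"
)

# The interactive shell echoes repr(value): one line with literal "\n" escapes.
shown_by_shell = repr(sample)

# print writes the string itself, so each "\n" becomes a real line break.
printed = as_text(sample)
print(printed)
```

With a live SparkContext, the same idea would be print(as_text(a.toDebugString())), written with parentheses on Python 3.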

2014-10-13 14:55:36

It has not been executed yet, only cached. You should use:

>>> a = sc.parallelize([1,2,3]).distinct()
>>> a.collect()
[1, 2, 3]

2015-12-07 14:08:04 user3409371
