Error when using the "newAPIHadoopFile" API

I wrote the following code to load a file in Spark with the newAPIHadoopFile API.

val lines = sc.newAPIHadoopFile("new_actress.list",classOf[TextInputFormat],classOf[Text],classOf[Text])

But I get the following error:

scala> val lines = sc.newAPIHadoopFile("new_actress.list",classOf[TextInputFormat],classOf[Text],classOf[Text]) 
<console>:34: error: inferred type arguments [org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.mapred.TextInputFormat] do not conform to method newAPIHadoopFile's type parameter bounds [K,V,F <: org.apache.hadoop.mapreduce.InputFormat[K,V]] 
val lines = sc.newAPIHadoopFile("new_actress.list",classOf[TextInputFormat],classOf[Text],classOf[Text]) 
       ^
<console>:34: error: type mismatch; 
found : Class[org.apache.hadoop.mapred.TextInputFormat](classOf[org.apache.hadoop.mapred.TextInputFormat]) 
required: Class[F] 
val lines = sc.newAPIHadoopFile("new_actress.list",classOf[TextInputFormat],classOf[Text],classOf[Text]) 
                 ^
<console>:34: error: type mismatch; 
found : Class[org.apache.hadoop.io.Text](classOf[org.apache.hadoop.io.Text]) 
required: Class[K] 
val lines = sc.newAPIHadoopFile("new_actress.list",classOf[TextInputFormat],classOf[Text],classOf[Text]) 
                       ^
<console>:34: error: type mismatch; 
found : Class[org.apache.hadoop.io.Text](classOf[org.apache.hadoop.io.Text]) 
required: Class[V] 
val lines = sc.newAPIHadoopFile("new_actress.list",classOf[TextInputFormat],classOf[Text],classOf[Text]) 
                           ^

What am I doing wrong in the code?


2016-10-17 sarthak

TextInputFormat takes <LongWritable,Text> as its key and value types.

Note: focus on the extends part in both *InputFormat declarations:

@InterfaceAudience.Public
@InterfaceStability.Stable
public class TextInputFormat
extends FileInputFormat<LongWritable,Text>

This means that you cannot set both types to Text for TextInputFormat. If you want to use TextInputFormat, the key type must be LongWritable and the value type must be Text.

You can try:

import org.apache.hadoop.mapreduce.lib.input.TextInputFormat 
import org.apache.hadoop.io.Text 
import org.apache.hadoop.io.LongWritable 
val lines = sc.newAPIHadoopFile("test.csv", classOf[TextInputFormat], classOf[LongWritable], classOf[Text])
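As a follow-up to the snippet above, here is a minimal sketch of how the resulting (LongWritable, Text) pairs could be turned into plain Scala values. It is meant to be pasted into spark-shell or run as a script with Spark on the classpath. The temporary sample file and the names `path` and `plain` are made up for illustration and are not part of the original question. With TextInputFormat the key is the byte offset of each line and the value is the line itself.

```scala
import java.nio.file.Files
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the running context in spark-shell, otherwise starts a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("text-input-format-sketch").setMaster("local[*]"))

// Hypothetical sample file, created only so that the sketch runs on its own.
val path = Files.createTempFile("sample", ".txt")
Files.write(path, "first line\nsecond line\n".getBytes("UTF-8"))

val lines = sc.newAPIHadoopFile(
  path.toUri.toString,          // file:// URI, so the local file system is used
  classOf[TextInputFormat],     // new API: org.apache.hadoop.mapreduce.lib.input
  classOf[LongWritable],        // key: byte offset of the line
  classOf[Text])                // value: content of the line

// Hadoop record readers reuse Writable objects, so convert them to
// immutable values before collecting.
val plain = lines.map { case (offset, text) => (offset.get, text.toString) }.collect()
```

If only the line contents are needed, `lines.map(_._2.toString)` is enough, and in that case `sc.textFile` would probably be the simpler choice.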

But if you still want both types to be Text, you can use KeyValueTextInputFormat, which is defined as follows:

@InterfaceAudience.Public
@InterfaceStability.Stable
public class KeyValueTextInputFormat
extends FileInputFormat<Text,Text>

You can try:

import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat 
import org.apache.hadoop.io.Text 
val lines = sc.newAPIHadoopFile("test.csv", classOf[KeyValueTextInputFormat], classOf[Text], classOf[Text])
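One thing to keep in mind with the snippet above: KeyValueTextInputFormat splits each line into key and value at the first separator character, which is a tab by default, so for a comma-separated file the whole line would end up in the key. The following is only a sketch of how the separator could be changed. The property name is the one used by the new-API record reader in Hadoop 2.x as far as I know, so please check it against your Hadoop version. The sample file and the names `conf`, `pairs` and `plainPairs` are hypothetical.

```scala
import java.nio.file.Files
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.Text
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the running context in spark-shell, otherwise starts a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("key-value-input-format-sketch").setMaster("local[*]"))

// Hypothetical sample file, created only so that the sketch runs on its own.
val path = Files.createTempFile("pairs", ".csv")
Files.write(path, "a,1\nb,2\n".getBytes("UTF-8"))

// Copy the context's Hadoop configuration so the setting applies to this read only.
val conf = new Configuration(sc.hadoopConfiguration)
// Assumed property name (Hadoop 2.x new API); the default separator is a tab.
conf.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator", ",")

val pairs = sc.newAPIHadoopFile(
  path.toUri.toString,
  classOf[KeyValueTextInputFormat],
  classOf[Text],   // key: text before the first separator
  classOf[Text],   // value: rest of the line
  conf)

// Convert the reused Text objects to Strings before collecting.
val plainPairs = pairs.map { case (k, v) => (k.toString, v.toString) }.collect()
```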


2016-10-17 19:49:28 VladoDemcak

Thanks ... one more thing, is there a difference between 'org.apache.hadoop.mapreduce.lib.input.TextInputFormat' and 'org.apache.hadoop.mapred.TextInputFormat'? Which one should be chosen? – sarthak

Check http://stackoverflow.com/questions/16269922/hadoop-mapred-vs-hadoop-mapreduce – VladoDemcak
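To make the difference concrete for Spark: the error message in the question names org.apache.hadoop.mapred.TextInputFormat, which is the old-API class, while newAPIHadoopFile requires a subclass of org.apache.hadoop.mapreduce.InputFormat. As a rough sketch, the old-API class goes with sc.hadoopFile and the new-API class goes with sc.newAPIHadoopFile. The import aliases, the sample file and the variable names below are made up for illustration.

```scala
import java.nio.file.Files
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.{TextInputFormat => OldTextInputFormat}
import org.apache.hadoop.mapreduce.lib.input.{TextInputFormat => NewTextInputFormat}
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the running context in spark-shell, otherwise starts a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("mapred-vs-mapreduce-sketch").setMaster("local[*]"))

// Hypothetical sample file, created only so that the sketch runs on its own.
val path = Files.createTempFile("sample", ".txt")
Files.write(path, "first line\nsecond line\n".getBytes("UTF-8"))
val uri = path.toUri.toString

// Old API (org.apache.hadoop.mapred) pairs with hadoopFile.
val oldApi = sc.hadoopFile(
  uri, classOf[OldTextInputFormat], classOf[LongWritable], classOf[Text])

// New API (org.apache.hadoop.mapreduce) pairs with newAPIHadoopFile.
val newApi = sc.newAPIHadoopFile(
  uri, classOf[NewTextInputFormat], classOf[LongWritable], classOf[Text])

// Both variants yield the same lines; convert Text to String before collecting.
val oldLines = oldApi.map(_._2.toString).collect()
val newLines = newApi.map(_._2.toString).collect()
```

Mixing the two, as in the question, fails at compile time because the old class does not satisfy the type bound F <: org.apache.hadoop.mapreduce.InputFormat[K,V].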
