Different behavior of Spark Dataset for java.util.UUID

I am using Spark 2.0.0 and am creating a Dataset with SparkSession. When I use java.util.UUID directly in the createDataFrame method, it works fine. But when I have java.util.UUID as a field in a JavaBean and use that JavaBean to create the Dataset, it gives me a scala.MatchError. Please see the code and console log below. Can someone tell me what is going on here and how I can create a Dataset with a UUID field in a JavaBean class? Thank you.
UUIDTest.java
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class UUIDTest {
    public static void main(String[] args) {
        SparkSession spark = SparkSession
            .builder()
            .appName("UUIDTest")
            .config("spark.sql.warehouse.dir", "/file:C:/temp")
            .master("local[2]")
            .getOrCreate();

        System.out.println("====> Create Dataset using UUID");
        // Working
        List<UUID> uuids = Arrays.asList(UUID.randomUUID(), UUID.randomUUID());
        Dataset<Row> uuidSet = spark.createDataFrame(uuids, UUID.class);
        uuidSet.show();

        System.out.println("====> Create Dataset using UserUUID");
        // Not working
        List<UserUUID> userUuids = Arrays.asList(new UserUUID(UUID.randomUUID()), new UserUUID(UUID.randomUUID()));
        Dataset<Row> userUuidSet = spark.createDataFrame(userUuids, UserUUID.class); // Exception at this line
        userUuidSet.show();

        spark.stop();
    }
}
UserUUID.java
import java.io.Serializable;
import java.util.UUID;

public class UserUUID implements Serializable {
    private UUID uuid;

    public UserUUID() {
    }

    public UserUUID(UUID uuid) {
        this.uuid = uuid;
    }

    public UUID getUuid() {
        return uuid;
    }

    public void setUuid(UUID uuid) {
        this.uuid = uuid;
    }
}
Console output
16/08/26 22:49:23 INFO SharedState: Warehouse path is '/file:C:/temp'.
====> Create Dataset using UUID
16/08/26 22:49:26 INFO CodeGenerator: Code generated in 248.230818 ms
16/08/26 22:49:26 INFO CodeGenerator: Code generated in 10.550477 ms
+--------------------+-------------------+
|leastSignificantBits|mostSignificantBits|
+--------------------+-------------------+
|-6786538026241948655|5045373365275148508|
|-9161219066266259673|6040751881536491488|
+--------------------+-------------------+
====> Create Dataset using UserUUID
Exception in thread "main" scala.MatchError: 4fa3941c-f312-4031-a61b-01f2acef751b (of class java.util.UUID)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:256)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:251)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:103)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$2.apply(CatalystTypeConverters.scala:403)
at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1106)
at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1106)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1.apply(SQLContext.scala:1106)
at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1.apply(SQLContext.scala:1104)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$class.toStream(Iterator.scala:1322)
at scala.collection.AbstractIterator.toStream(Iterator.scala:1336)
at scala.collection.TraversableOnce$class.toSeq(TraversableOnce.scala:298)
at scala.collection.AbstractIterator.toSeq(Iterator.scala:1336)
at org.apache.spark.sql.SparkSession.createDataFrame(SparkSession.scala:373)
at com.UUIDTest.main(UUIDTest.java:30)
16/08/26 22:49:26 INFO SparkContext: Invoking stop() from shutdown hook
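A note on what is likely happening, plus one possible workaround (this is not from the original post, only a hedged sketch): Spark's JavaBean-to-DataFrame path only knows how to map a fixed set of field types (primitives, String, java.sql dates, BigDecimal, nested beans, and so on), and java.util.UUID is not one of them, which is presumably why the converter fails with a scala.MatchError on the UUID value. A common workaround is to carry the UUID through Spark as a String inside the bean and convert at the edges. The class name UserUuidString below is made up for this example:

```java
import java.io.Serializable;
import java.util.UUID;

// Hypothetical replacement bean: stores the UUID in its canonical string
// form so that Spark's bean encoder sees a supported type (String).
public class UserUuidString implements Serializable {
    private String uuid;

    public UserUuidString() {
    }

    public UserUuidString(UUID uuid) {
        // Convert the UUID to its canonical 36-character string form
        this.uuid = uuid.toString();
    }

    public String getUuid() {
        return uuid;
    }

    public void setUuid(String uuid) {
        this.uuid = uuid;
    }

    // Convenience accessor for code that needs the real UUID back
    public UUID toUuid() {
        return UUID.fromString(uuid);
    }
}
```

With a bean like this, spark.createDataFrame(list, UserUuidString.class) should produce a single string-typed uuid column instead of failing, at the cost of converting back with UUID.fromString on the way out.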
I am having the same problem. Did you ever find a solution? – ammills01