Ich arbeite in Apache Spark 1.6.0. Ich habe einen Datenrahmen von 280 Spalten, in denen einige der Spalten vom Typ Zeitstempel sind. Einige Werte des Zeitstempelfelds sind null. Wenn ich versuche, den gleichen Datenrahmen in Cassandra zu schreiben, bekomme ich eine IllegalArgumentException.Datum Typ Null Wert im Datenframe nicht in Cassandra speichern
Die Säule sieht aus wie -
+------------------------+
| LoginDate|
+-------------------------+
| null|
| 2014-06-25T12:27:...|
| 2014-06-25T12:27:...|
| null|
| 2014-06-25T12:27:...|
| 2014-06-25T12:27:...|
| null|
| null|
| 2014-06-25T12:27:...|
| 2014-06-25T12:27:...|
+-------------------------+
Wenn ich versuche, den ganzen Datenrahmen zu cassandra speichern, um den Fehler aufkommt -
05:39:22 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 106.0 (TID 5136,): java.lang.IllegalArgumentException: Invalid date:
at com.datastax.spark.connector.types.TimestampParser$.parse(TimestampParser.scala:50)
at com.datastax.spark.connector.types.TypeConverter$DateConverter$$anonfun$convertPF$13.applyOrElse(TypeConverter.scala:323)
at com.datastax.spark.connector.types.TypeConverter$class.convert(TypeConverter.scala:43)
at com.datastax.spark.connector.types.TypeConverter$DateConverter$.com$datastax$spark$connector$types$NullableTypeConverter$$super$convert(TypeConverter.scala:313)
at com.datastax.spark.connector.types.NullableTypeConverter$class.convert(TypeConverter.scala:56)
at com.datastax.spark.connector.types.TypeConverter$DateConverter$.convert(TypeConverter.scala:313)
at com.datastax.spark.connector.types.TypeConverter$OptionToNullConverter$$anonfun$convertPF$31.applyOrElse(TypeConverter.scala:812)
at com.datastax.spark.connector.types.TypeConverter$class.convert(TypeConverter.scala:43)
at com.datastax.spark.connector.types.TypeConverter$OptionToNullConverter.com$datastax$spark$connector$types$NullableTypeConverter$$super$convert(TypeConverter.scala:795)
at com.datastax.spark.connector.types.NullableTypeConverter$class.convert(TypeConverter.scala:56)
at com.datastax.spark.connector.types.TypeConverter$OptionToNullConverter.convert(TypeConverter.scala:795)
at com.datastax.spark.connector.writer.SqlRowWriter$$anonfun$readColumnValues$1.apply$mcVI$sp(SqlRowWriter.scala:26)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at com.datastax.spark.connector.writer.SqlRowWriter.readColumnValues(SqlRowWriter.scala:24)
at com.datastax.spark.connector.writer.SqlRowWriter.readColumnValues(SqlRowWriter.scala:12)
at com.datastax.spark.connector.writer.BoundStatementBuilder.bind(BoundStatementBuilder.scala:100)
at com.datastax.spark.connector.writer.GroupingBatchBuilder.next(GroupingBatchBuilder.scala:106)
at com.datastax.spark.connector.writer.GroupingBatchBuilder.next(GroupingBatchBuilder.scala:31)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at com.datastax.spark.connector.writer.GroupingBatchBuilder.foreach(GroupingBatchBuilder.scala:31)
at com.datastax.spark.connector.writer.TableWriter$$anonfun$write$1.apply(TableWriter.scala:157)
at com.datastax.spark.connector.writer.TableWriter$$anonfun$write$1.apply(TableWriter.scala:134)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:110)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:109)
at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:139)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:109)
at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:134)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:37)
at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:37)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Die Art des jeweiligen Feldes in cassandra ist vom Timestamp-Typ.
Jeder kann helfen, das Problem zu lösen?
Nephilim: Ist in der Antwort noch eine Klärung erforderlich? Wenn die Antwort Ihr Problem behebt, akzeptieren Sie bitte die Antwort (das Häkchen) – dilsingi
dilsingi: Es funktioniert. Vielen Dank. – Nephilim