Ich habe eine Spark (1.4.1) -Anwendung, die auf einem nicht kerberisierten Cluster ausgeführt wird, und ich habe es in eine andere Instanz kopiert, auf der Kerberos ausgeführt wird. Die Anwendung nimmt Daten von HDFS und legt sie in Phoenix.Spark/Phoenix mit Kerberos auf YARN
Allerdings funktioniert es nicht:
ERROR ipc.AbstractRpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:611)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:156)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:737)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:734)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:734)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:887)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:856)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1200)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:50918)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1564)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1502)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1524)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1553)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1704)
at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:124)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3917)
at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:441)
at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:463)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:815)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1215)
at org.apache.phoenix.query.DelegateConnectionQueryServices.createTable(DelegateConnectionQueryServices.java:112)
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:1902)
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:744)
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:186)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:304)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:296)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:294)
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1243)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1893)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1862)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1862)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:180)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:132)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:151)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:99)
at org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(ConnectionUtil.java:57)
at org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(ConnectionUtil.java:45)
at org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil.getSelectColumnMetadataList(PhoenixConfigurationUtil.java:263)
at org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:109)
at org.apache.phoenix.spark.SparkSqlContextFunctions.phoenixTableAsDataFrame(SparkSqlContextFunctions.scala:37)
at com.bosch.asc.utils.HBaseUtils$.scanPhoenix(HBaseUtils.scala:123)
at com.bosch.asc.SMTProcess.addLookup(SMTProcess.scala:1125)
at com.bosch.asc.SMTProcess.saveMountTraceLogToPhoenix(SMTProcess.scala:1039)
at com.bosch.asc.SMTProcess.runETL(SMTProcess.scala:87)
at com.bosch.asc.SMTProcessMonitor$delayedInit$body.apply(SMTProcessMonitor.scala:20)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.App$$anonfun$main$1.apply(App.scala:71)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
at scala.App$class.main(App.scala:71)
at com.bosch.asc.SMTProcessMonitor$.main(SMTProcessMonitor.scala:5)
at com.bosch.asc.SMTProcessMonitor.main(SMTProcessMonitor.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:486)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 70 more
Ich habe
hinzugefügtexport _JAVA_OPTIONS="-Djava.security.krb5.conf=/etc/hadoop/krb5.conf"
in meiner Spark-Vorlage Skript, aber ohne Erfolg. Muss ich den Code selbst ändern, um eine Authentifizierung zu ermöglichen? Ich hatte vorher angenommen, dass das Ticket nur zwischen Anwendungen geteilt wird, und der Code selbst ändert sich nicht.
Falls es hilft: in der Schale ich keine spark.authenticate
Option gesetzt sehen, wenn ich ausführen:
sc.getConf.getAll.foreach(println)
See: http://spark.apache.org/docs/latest/security.html
ich mit Kerberos sehr wenig Erfahrung haben, so dass jede Hilfe ist sehr geschätzt.
die Kerberos-Debug-Informationen zu aktivieren: 'Export HADOOP_JAAS_DEBUG = true 'plus' -Dsun.security.krb5.debug = true ' –
https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/secrets.html –
Führen Sie Spark im" lokalen "Modus aus? Andernfalls haben die Executoren möglicherweise kein gültiges Kerberos-Ticket auf dem Host, auf dem sie ausgeführt werden, und Sie müssen die Hadoop-Authentifizierung selbst verwalten, vgl. http://stackoverflow.com/questions/35332026/issue-scala-code-in-spark-shell-to-retrieve-data-from-hbase/35473941 –