2017-11-15

Hello. Reading from HDFS and writing to Oracle 12

I am trying to read from HDFS and write to Oracle with PySpark, but I get an error. Here is the code I am using and the error I receive:

pyspark --driver-class-path "/opt/oracle/app/oracle/product/12.1.0.2/dbhome_1/jdbc/lib/ojdbc7.jar" 
from pyspark import SparkContext, SparkConf 
from pyspark.sql import SQLContext, Row 
conf = SparkConf().setAppName("myFirstApp").setMaster("local") 
sc = SparkContext(conf=conf) 
sqlContext = SQLContext(sc) 
lines = sc.textFile("hdfs://bigdatalite.localdomain:8020/user/oracle/ACTIVITY/part-m-00000") 
parts = lines.map(lambda l: l.split(",")) 
people = parts.map(lambda p: Row(name=p[0], age=p[1])) 
schemaPeople = sqlContext.createDataFrame(people) 
url = "jdbc:oracle:[email protected]:1521/orcl" 
properties = { 
    "user": "MOVIEDEMO", 
    "password": "welcome1", 
    "driver": "oracle.jdbc.driver.OracleDriver" 
} 
schemaPeople.write.jdbc(url=url, table="ACTIVITY", mode="append", properties=properties) 

...and the error it shows is:

Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/usr/lib/spark/python/pyspark/sql/readwriter.py", line 530, in jdbc 
    self._jwrite.mode(mode).jdbc(url, table, jprop) 
    File "/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__ 
    File "/usr/lib/spark/python/pyspark/sql/utils.py", line 45, in deco 
    return f(*a, **kw) 
    File "/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value 
py4j.protocol.Py4JJavaError: An error occurred while calling o66.jdbc. 
: java.sql.SQLException: Invalid Oracle URL specified 
    at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:453) 
    at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:45) 
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:61) 
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:52) 
    at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:278) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) 
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381) 
    at py4j.Gateway.invoke(Gateway.java:259) 
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133) 
    at py4j.commands.CallCommand.execute(CallCommand.java:79) 
    at py4j.GatewayConnection.run(GatewayConnection.java:209) 
    at java.lang.Thread.run(Thread.java:748) 

PS: I am using Spark 1.6.0

Answer


The URL should be specified in the "service" format, i.e.

jdbc:oracle:thin:@//myhost:1521/orcl
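
Applied to the code in the question, the only change needed is the URL string: the question's `jdbc:oracle:[email protected]:1521/orcl` is missing the `thin:` driver type and the `@//` separator, which is why the driver reports "Invalid Oracle URL specified". A minimal sketch, reusing the host, credentials, and table name from the question (the actual `write.jdbc` call is left commented out, since it requires a reachable Oracle instance):

```python
# Oracle thin-driver URL in service format:
#   jdbc:oracle:thin:@//<host>:<port>/<service_name>
url = "jdbc:oracle:thin:@//129.152.32.65:1521/orcl"

properties = {
    "user": "MOVIEDEMO",
    "password": "welcome1",
    "driver": "oracle.jdbc.driver.OracleDriver",
}

# With a live database, the write from the question then works unchanged:
# schemaPeople.write.jdbc(url=url, table="ACTIVITY", mode="append",
#                         properties=properties)
```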