
Prediction.io - pio train fails

I am using the Elasticsearch + HBase version of Prediction.IO from the Docker image sphereio/docker-predictionio together with the universal recommendation template template-scala-parallel-universal-recommendation.

pio-start-all and pio status work fine, and the event server is fully functional. I have created an app and imported a few hundred events.
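For completeness, the events were imported roughly along these lines via the Event Server's REST API (the access key, entity IDs, and timestamp are placeholders; only the event name START and the default port 7070 match the setup described here):

# send one START event for user u1 on item i1 to the local Event Server
curl -i -X POST "http://localhost:7070/events.json?accessKey=YOUR_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event": "START",
    "entityType": "user",
    "entityId": "u1",
    "targetEntityType": "item",
    "targetEntityId": "i1",
    "eventTime": "2016-01-01T12:00:00.000Z"
  }'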

After running pio build on the template, however, pio train fails, emitting a couple of javax.naming.NameNotFoundException warnings. pio.log contains nothing beyond that either.

Here is my engine.json:

{ 
    "comment": " This config file uses default settings for all but the required values see README.md for docs", 
    "id": "default", 
    "description": "Default settings", 
    "engineFactory": "com.test.RecommendationEngine", 
    "datasource": { 
     "params": { 
      "name": "sample-handmade-data.txt", 
      "appName": "testapp", 
      "eventNames": ["START"] 
     } 
    }, 
    "sparkConf": { 
     "spark.serializer": "org.apache.spark.serializer.KryoSerializer", 
     "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator", 
     "spark.kryo.referenceTracking": "false", 
     "spark.kryoserializer.buffer": "300m", 
     "spark.executor.memory": "4g", 
     "es.index.auto.create": "true" 
    }, 
    "algorithms": [{ 
     "comment": "simplest setup where all values are default, popularity based backfill, must add eventsNames", 
     "name": "ur", 
     "params": { 
      "appName": "testapp", 
      "indexName": "urindex", 
      "typeName": "items", 
      "comment": "must have data for the first event or the model will not build, other events are optional", 
      "eventNames": ["START"] 
     } 
    }] 
} 

And the pio train output:

[INFO] [Console$] Using existing engine manifest JSON at /PredictionIO-0.9.6/engines/universal-recommendation/manifest.json 
[INFO] [Runner$] Submission command: /PredictionIO-0.9.6/vendors/spark-1.5.1-bin-hadoop2.6/bin/spark-submit --class io.prediction.workflow.CreateWorkflow --jars file:/PredictionIO-0.9.6/engines/universal-recommendation/target/scala-2.10/template-scala-parallel-universal-recommendation-assembly-0.2.3-deps.jar,file:/PredictionIO-0.9.6/engines/universal-recommendation/target/scala-2.10/template-scala-parallel-universal-recommendation_2.10-0.2.3.jar --files file:/PredictionIO-0.9.6/conf/log4j.properties,file:/PredictionIO-0.9.6/vendors/hbase-1.0.0/conf/hbase-site.xml --driver-class-path /PredictionIO-0.9.6/conf:/PredictionIO-0.9.6/vendors/hbase-1.0.0/conf file:/PredictionIO-0.9.6/lib/pio-assembly-0.9.6.jar --engine-id FYOHZGlAmUH2xAYWNmQFIf9Jls201WVr --engine-version a892fe59be15dcf27a17f07fb76135a967309fda --engine-variant file:/PredictionIO-0.9.6/engines/universal-recommendation/engine.json --verbosity 0 --json-extractor Both --env PIO_STORAGE_SOURCES_HBASE_TYPE=hbase,PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_VERSION=0.9.6,PIO_FS_BASEDIR=/root/.pio_store,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost,PIO_STORAGE_SOURCES_HBASE_HOME=/PredictionIO-0.9.6/vendors/hbase-1.0.0,PIO_HOME=/PredictionIO-0.9.6,PIO_FS_ENGINESDIR=/root/.pio_store/engines,PIO_STORAGE_SOURCES_LOCALFS_PATH=/root/.pio_store/models,PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=predictionio,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/PredictionIO-0.9.6/vendors/elasticsearch-1.4.4,PIO_FS_TMPDIR=/root/.pio_store/tmp,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE,PIO_CONF_DIR=/PredictionIO-0.9.6/conf,PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300,PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs 
[INFO] [Engine] Extracting datasource params... 
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used. 
[INFO] [Engine] Datasource params: (,DataSourceParams(testapp,List(START))) 
[INFO] [Engine] Extracting preparator params... 
[INFO] [Engine] Preparator params: (,Empty) 
[INFO] [Engine] Extracting serving params... 
[INFO] [Engine] Serving params: (,Empty) 
[INFO] [Remoting] Starting remoting 
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://[email protected]:42582] 
[WARN] [MetricsSystem] Using default name DAGScheduler for source because spark.app.id is not set. 
[INFO] [Engine$] EngineWorkflow.train 
[INFO] [Engine$] DataSource: [email protected] 
[INFO] [Engine$] Preparator: [email protected] 
[INFO] [Engine$] AlgorithmList: List([email protected]) 
[INFO] [Engine$] Data sanity check is on. 
[WARN] [TableInputFormatBase] Cannot resolve the host name for 9a94fb2890b3/172.17.0.2 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '2.0.17.172.in-addr.arpa' 
[INFO] [Engine$] com.test.TrainingData does not support data sanity check. Skipping check. 
[WARN] [TableInputFormatBase] Cannot resolve the host name for 9a94fb2890b3/172.17.0.2 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '2.0.17.172.in-addr.arpa' 

It is only a warning, not an error, right? Does pio train actually fail, or do you still get results? – Val


Have you tried this? http://StackOverflow.com/a/12087073/689625 – jay


@Val When I try pio deploy, it says the engine has to be trained before it can be deployed. So I concluded that training had failed. –

Answer


There is a way to short-circuit this problem: use Google's DNS when starting your Docker container.

--dns=8.8.8.8
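
For example, something like this when launching the container (the image name comes from the question; the container name and published ports are just illustrative):

# run PredictionIO with Google's public DNS as the container's resolver,
# so host-name lookups inside the container get a proper response
docker run -d --name predictionio \
  --dns=8.8.8.8 \
  -p 7070:7070 -p 8000:8000 \
  sphereio/docker-predictionio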