2017-07-04

I am trying to get started with Stanford CoreNLP and cannot even get past the very first simple example from here: Stanford CoreNLP BasicPipelineExample does not work

https://stanfordnlp.github.io/CoreNLP/api.html

Here is my code:

package stanford.corenlp; 

import java.io.File; 
import java.io.IOException; 
import java.nio.charset.Charset; 
import java.util.List; 
import java.util.Map; 
import java.util.Properties; 

import com.google.common.io.Files; 

import edu.stanford.nlp.dcoref.CorefChain; 
import edu.stanford.nlp.dcoref.CorefCoreAnnotations.CorefChainAnnotation; 
import edu.stanford.nlp.ling.CoreAnnotations.NamedEntityTagAnnotation; 
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation; 
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation; 
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation; 
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation; 
import edu.stanford.nlp.ling.CoreLabel; 
import edu.stanford.nlp.pipeline.Annotation; 
import edu.stanford.nlp.pipeline.StanfordCoreNLP; 
import edu.stanford.nlp.semgraph.SemanticGraph; 
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation; 
import edu.stanford.nlp.trees.Tree; 
import edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation; 
import edu.stanford.nlp.util.CoreMap; 
import java.util.logging.Level; 
import java.util.logging.Logger; 

public class StanfordNLP { 

    private void test2() { 
     // creates a StanfordCoreNLP object, with POS tagging, lemmatization, NER, parsing, and coreference resolution 
     Properties props = new Properties(); 
     props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref"); 
     StanfordCoreNLP pipeline = new StanfordCoreNLP(props); 

     // read some text in the text variable 
     String text = "Now is the time for all good men to come to the aid of their country."; 

     // create an empty Annotation just with the given text 
     Annotation document = new Annotation(text); 

     // run all Annotators on this text 
     pipeline.annotate(document); 

    } 

    public static void main(String[] args) throws IOException { 
     StanfordNLP nlp = new StanfordNLP(); 
     nlp.test2(); 
    } 

} 

Here is the stack trace:

Adding annotator tokenize 
No tokenizer type provided. Defaulting to PTBTokenizer. 
Adding annotator ssplit 
Adding annotator pos 
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: Error while loading a tagger model (probably missing model file) 
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:791) 
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:312) 
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:265) 
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:85) 
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:73) 
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.posTagger(AnnotatorImplementations.java:55) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$42(StanfordCoreNLP.java:496) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getDefaultAnnotatorPool$65(StanfordCoreNLP.java:533) 
    at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:118) 
    at edu.stanford.nlp.util.Lazy.get(Lazy.java:31) 
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:146) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:447) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:150) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:146) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:133) 
    at stanford.corenlp.StanfordNLP.test2(StanfordNLP.java:93) 
    at stanford.corenlp.StanfordNLP.main(StanfordNLP.java:108) 
Caused by: java.io.IOException: Unable to open "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as class path, filename or URL 
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:480) 
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:789) 
    ... 16 more 
C:\Users\Greg\AppData\Local\NetBeans\Cache\8.2\executor-snippets\run.xml:53: Java returned: 1 
BUILD FAILED (total time: 0 seconds) 

What am I doing wrong?

Answer

First, you need to add stanford-corenlp-3.8.0.jar to the classpath. That makes the red error markers in NetBeans disappear. But you also need to add stanford-corenlp-3.8.0-models.jar to the classpath to avoid the error documented above. Adding the folder that contains it to the classpath does not work. Details like this should never be missing from beginner documentation!
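Before constructing the pipeline, you can verify that the models jar really made it onto the classpath with a plain resource lookup. This is a minimal standalone sketch, not part of CoreNLP; the model path is the one from the stack trace above:

```java
// Hypothetical helper: check whether a resource is visible on the classpath.
public class ClasspathCheck {

    static boolean onClasspath(String resource) {
        // getResource returns null when the resource cannot be found
        return ClasspathCheck.class.getClassLoader().getResource(resource) != null;
    }

    public static void main(String[] args) {
        // Always present: a JDK class file
        System.out.println("String.class: " + onClasspath("java/lang/String.class"));
        // Present only if stanford-corenlp-3.8.0-models.jar is on the classpath
        System.out.println("POS model:    " + onClasspath(
            "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger"));
    }
}
```

If the second lookup prints false, the pipeline constructor will fail with exactly the "missing model file" exception shown above.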

Now, if you continue with the example and add the new material, you run into more errors. For example, the code then looks like this:

package stanford.corenlp; 

    import java.io.File; 
    import java.io.IOException; 
    import java.nio.charset.Charset; 
    import java.util.List; 
    import java.util.Map; 
    import java.util.Properties; 

    import com.google.common.io.Files; 

    import edu.stanford.nlp.dcoref.CorefChain; 
    import edu.stanford.nlp.dcoref.CorefCoreAnnotations.CorefChainAnnotation; 
    import edu.stanford.nlp.ling.CoreAnnotations.NamedEntityTagAnnotation; 
    import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation; 
    import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation; 
    import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation; 
    import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation; 
    import edu.stanford.nlp.ling.CoreLabel; 
    import edu.stanford.nlp.pipeline.Annotation; 
    import edu.stanford.nlp.pipeline.StanfordCoreNLP; 
    import edu.stanford.nlp.semgraph.SemanticGraph; 
    import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation; 
    import edu.stanford.nlp.trees.Tree; 
    import edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation; 
    import edu.stanford.nlp.util.CoreMap; 
    import edu.stanford.nlp.util.PropertiesUtils; 
    import java.util.logging.Level; 
    import java.util.logging.Logger; 

    public class StanfordNLP { 

     private void test2() { 
      // creates a StanfordCoreNLP object, with POS tagging, lemmatization, NER, parsing, and coreference resolution 
      Properties props = new Properties(); 
      props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref"); 
      StanfordCoreNLP pipeline = new StanfordCoreNLP(
      PropertiesUtils.asProperties(
       "annotators", "tokenize,ssplit,pos,lemma,parse,natlog", 
       "ssplit.isOneSentence", "true", 
       "parse.model", "edu/stanford/nlp/models/srparser/englishSR.ser.gz", 
       "tokenize.language", "en")); 

      // read some text in the text variable 
      String text = "Now is the time for all good men to come to the aid of their country."; 

      // create an empty Annotation just with the given text 
      Annotation document = new Annotation(text); 

      // run all Annotators on this text 
      pipeline.annotate(document); 

      // these are all the sentences in this document 
      // a CoreMap is essentially a Map that uses class objects as keys and has values with custom types 
      List<CoreMap> sentences = document.get(SentencesAnnotation.class); 

      for (CoreMap sentence: sentences) { 
       // traversing the words in the current sentence 
       // a CoreLabel is a CoreMap with additional token-specific methods 
       for (CoreLabel token: sentence.get(TokensAnnotation.class)) { 
        // this is the text of the token 
        String word = token.get(TextAnnotation.class); 
        // this is the POS tag of the token 
        String pos = token.get(PartOfSpeechAnnotation.class); 
        // this is the NER label of the token 
        String ne = token.get(NamedEntityTagAnnotation.class); 

        System.out.println("word="+word +", pos="+pos +", ne="+ne); 
       } 

       // this is the parse tree of the current sentence 
       Tree tree = sentence.get(TreeAnnotation.class); 

       // this is the Stanford dependency graph of the current sentence 
       SemanticGraph dependencies = sentence.get(CollapsedCCProcessedDependenciesAnnotation.class); 
      } 

      // This is the coreference link graph 
      // Each chain stores a set of mentions that link to each other, 
      // along with a method for getting the most representative mention 
      // Both sentence and token offsets start at 1! 
      Map<Integer, CorefChain> graph = 
       document.get(CorefChainAnnotation.class); 
     } 

     public static void main(String[] args) throws IOException { 
      StanfordNLP nlp = new StanfordNLP(); 
      nlp.test2(); 
     } 

    } 

And the stack trace becomes:

run: 
Adding annotator tokenize 
Adding annotator ssplit 
Adding annotator pos 
Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0.6 sec]. 
Adding annotator lemma 
Adding annotator parse 
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: java.io.IOException: Unable to open "edu/stanford/nlp/models/srparser/englishSR.ser.gz" as class path, filename or URL 
    at edu.stanford.nlp.parser.common.ParserGrammar.loadModel(ParserGrammar.java:187) 
    at edu.stanford.nlp.pipeline.ParserAnnotator.loadModel(ParserAnnotator.java:219) 
    at edu.stanford.nlp.pipeline.ParserAnnotator.<init>(ParserAnnotator.java:121) 
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.parse(AnnotatorImplementations.java:115) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$50(StanfordCoreNLP.java:504) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getDefaultAnnotatorPool$65(StanfordCoreNLP.java:533) 
    at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:118) 
    at edu.stanford.nlp.util.Lazy.get(Lazy.java:31) 
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:146) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:447) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:150) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:146) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:133) 
    at stanford.corenlp.StanfordNLP.test2(StanfordNLP.java:95) 
    at stanford.corenlp.StanfordNLP.main(StanfordNLP.java:145) 
Caused by: java.io.IOException: Unable to open "edu/stanford/nlp/models/srparser/englishSR.ser.gz" as class path, filename or URL 
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:480) 
    at edu.stanford.nlp.io.IOUtils.readObjectFromURLOrClasspathOrFileSystem(IOUtils.java:309) 
    at edu.stanford.nlp.parser.common.ParserGrammar.loadModel(ParserGrammar.java:184) 
    ... 14 more 
C:\Users\Greg\AppData\Local\NetBeans\Cache\8.2\executor-snippets\run.xml:53: Java returned: 1 
BUILD FAILED (total time: 1 second) 

At this point I finally got past it by downloading stanford-english-corenlp-2017-06-09-models.jar and adding it to the classpath; you get it from the "English" download link here:

https://stanfordnlp.github.io/CoreNLP/download.html

You have to do this despite the message on the download page saying that everything needed for English is already included in the CoreNLP download!
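To confirm which of the downloaded jars actually contains a given model, you can scan its entries with the standard java.util.jar API. This is a standalone sketch; the jar and model paths below are examples and may need adjusting for your setup:

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper: check whether a jar file ships a given entry.
public class JarScan {

    static boolean jarContains(String jarPath, String entryName) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // e.g. confirm the SR parser model lives in the English models jar
        String jar = "stanford-english-corenlp-2017-06-09-models.jar"; // adjust path
        if (new File(jar).isFile()) {
            System.out.println(jarContains(jar,
                "edu/stanford/nlp/models/srparser/englishSR.ser.gz"));
        } else {
            System.out.println("jar not found: " + jar);
        }
    }
}
```

Running this against each jar shows quickly that englishSR.ser.gz is not in the base models jar, which explains why the second stack trace appears even with stanford-corenlp-3.8.0-models.jar on the classpath.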
