2017-02-17

Answer

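You can use the tm package to get the word counts and then compute the Euclidean distance. In the code below, each term is represented by its pair of counts in the two sentences, so the resulting matrix holds term-to-term distances: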

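> library(tm)
> # two example sentences
> s1 <- "The quick brown fox jumps over the lazy dog"
> s2 <- "A quick brown dog outpaces a quick fox"
>
> # build a corpus with one document per sentence
> VS <- VectorSource(c(s1, s2))
> corp <- Corpus(VS)
> # rows = documents, columns = terms, cells = word counts
> dtm <- DocumentTermMatrix(corp)
> # transpose so rows are terms, then take pairwise Euclidean distances
> d <- dist(t(dtm), method = 'euclidean')
> d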



            brown      dog      fox    jumps     lazy outpaces     over    quick
dog      0.000000
fox      0.000000 0.000000
jumps    1.000000 1.000000 1.000000
lazy     1.000000 1.000000 1.000000 0.000000
outpaces 1.000000 1.000000 1.000000 1.414214 1.414214
over     1.000000 1.000000 1.000000 0.000000 0.000000 1.414214
quick    1.000000 1.000000 1.000000 2.000000 2.000000 1.414214 2.000000
the      1.414214 1.414214 1.414214 1.000000 1.000000 2.236068 1.000000 2.236068
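
If what you want is a single distance between the two sentences rather than between terms, the same word counts can be compared row-wise instead. The following is only a minimal sketch in base R, not the tm workflow itself: `tokenize` is a hypothetical helper, and dropping words shorter than three characters is an assumption made to mimic tm's default minimum word length (which is why "a" does not appear in the matrix above).

```r
s1 <- "The quick brown fox jumps over the lazy dog"
s2 <- "A quick brown dog outpaces a quick fox"

# Hypothetical helper: lower-case, split on non-letters,
# keep words of length >= 3 (assumed to match tm's default)
tokenize <- function(s) {
  words <- strsplit(tolower(s), "[^a-z]+")[[1]]
  words[nchar(words) >= 3]
}

tokens <- lapply(list(s1, s2), tokenize)
vocab  <- sort(unique(unlist(tokens)))

# One row per sentence, one column per term, cells = word counts
counts <- t(sapply(tokens, function(w) table(factor(w, levels = vocab))))
rownames(counts) <- c("s1", "s2")

# Pairwise Euclidean distance between the rows (the sentences)
d_docs <- dist(counts, method = "euclidean")
d_docs
```

The count vectors differ by 1 in "jumps", "lazy", "outpaces", "over" and "quick" and by 2 in "the", so the distance is sqrt(9) = 3. With the tm objects from the answer, `dist(as.matrix(dtm))` (without `t()`) should give the same document-level result.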