2017-12-16 4 views

Split list into multiple columns in R

The WEBSITE column of my dataset contains lists (parsed from a JSON file). Here is an example of the WEBSITE column:

> dataset$WEBSITE[[1]]
[1] "list(Headers = list(MaxTopicsRootDomain = 30, MaxTopicsSubDomain = 20, MaxTopicsURL = 10, TopicsCount = 3), Data = list(ItemNum = 0, Item = \"https://mywebsite.com/\", ResultCode = \"OK\", Status = \"Found\", ExtBackLinks = 1398, RefDomains = 452, AnalysisResUnitsCost = 1398, ACRank = 4, ItemType = 3, IndexedURLs = 1, GetTopBackLinksAnalysisResUnitsCost = 5000, DownloadBacklinksAnalysisResUnitsCost = 25000, DownloadRefDomainBacklinksAnalysisResUnitsCost = 25000, RefIPs = 323, \n RefSubNets = 273, RefDomainsEDU = 0, ExtBackLinksEDU = 0, RefDomainsGOV = 0, ExtBackLinksGOV = 0, RefDomainsEDU_Exact = 0, ExtBackLinksEDU_Exact = 0, RefDomainsGOV_Exact = 0, ExtBackLinksGOV_Exact = 0, CrawledFlag = \"True\", LastCrawlDate = \"2017-10-05\", LastCrawlResult = \"HTTP_404_NotFound\", RedirectFlag = \"False\", FinalRedirectResult = \"\", OutDomainsExternal = \"5\", OutLinksExternal = \"11\", OutLinksInternal = \"162\", OutLinksPages = \"1\", LastSeen = \"\"... <truncated> 

> dataset$WEBSITE[[2]]
[2] "list(Headers = list(MaxTopicsRootDomain = 30, MaxTopicsSubDomain = 20, MaxTopicsURL = 10, TopicsCount = 3), Data = list(ItemNum = 0, Item = \"http://www.website.uk\", ResultCode = \"OK\", Status = \"Found\", ExtBackLinks = 254, RefDomains = 76, AnalysisResUnitsCost = 254, ACRank = 9, ItemType = 3, IndexedURLs = 1, GetTopBackLinksAnalysisResUnitsCost = 5000, DownloadBacklinksAnalysisResUnitsCost = 25000, DownloadRefDomainBacklinksAnalysisResUnitsCost = 25000, RefIPs = 75, RefSubNets = 56, \n RefDomainsEDU = 0, ExtBackLinksEDU = 0, RefDomainsGOV = 0, ExtBackLinksGOV = 0, RefDomainsEDU_Exact = 0, ExtBackLinksEDU_Exact = 0, RefDomainsGOV_Exact = 0, ExtBackLinksGOV_Exact = 0, CrawledFlag = \"True\", LastCrawlDate = \"2017-12-14\", LastCrawlResult = \"DownloadedSuccessfully\", RedirectFlag = \"False\", FinalRedirectResult = \"\", OutDomainsExternal = \"2\", OutLinksExternal = \"2\", OutLinksInternal = \"19\", OutLinksPages = \"1\", LastSeen = \"\", Title = \"Dedic... <truncated> 

> dataset$WEBSITE[[3]]
[3] "list(Headers = list(MaxTopicsRootDomain = 30, MaxTopicsSubDomain = 20, MaxTopicsURL = 10, TopicsCount = 3), Data = list(ItemNum = 0, Item = \"http://www.website.uk\", ResultCode = \"OK\", Status = \"Found\", ExtBackLinks = 254, RefDomains = 76, AnalysisResUnitsCost = 254, ACRank = 9, ItemType = 3, IndexedURLs = 1, GetTopBackLinksAnalysisResUnitsCost = 5000, DownloadBacklinksAnalysisResUnitsCost = 25000, DownloadRefDomainBacklinksAnalysisResUnitsCost = 25000, RefIPs = 75, RefSubNets = 56, \n RefDomainsEDU = 0, ExtBackLinksEDU = 0, RefDomainsGOV = 0, ExtBackLinksGOV = 0, RefDomainsEDU_Exact = 0, ExtBackLinksEDU_Exact = 0, RefDomainsGOV_Exact = 0, ExtBackLinksGOV_Exact = 0, CrawledFlag = \"True\", LastCrawlDate = \"2017-12-14\", LastCrawlResult = \"DownloadedSuccessfully\", RedirectFlag = \"False\", FinalRedirectResult = \"\", OutDomainsExternal = \"2\", OutLinksExternal = \"2\", OutLinksInternal = \"19\", OutLinksPages = \"1\", LastSeen = \"\", Title = \"Dedic... <truncated> 

My dataset looks like this:

COLOR  | SIZE  | WEBSITE 
Blue  | 13456  | list(Headers = list(MaxTopicsRootDomain = 30, MaxTopicsSubDomain = 20, MaxTopicsURL = 10 
Green  | 17487  | list(Headers = list(MaxTopicsRootDomain = 30, MaxTopicsSubDomain = 20, MaxTopicsURL = 10, 
Red   | 65438  | list(Headers = list(MaxTopicsRootDomain = 30, MaxTopicsSubDomain = 20, MaxTopicsURL = 10, To 

My goal is to turn each JSON node into its own column, so that my dataset looks like this:

COLOR  | SIZE  | MaxTopicsRootDomain | MaxTopicsSubDomain | MaxTopicsURL 
Blue  | 13456  | 30     | 20     | 10 
Green  | 17487  | 30     | 20     | 10 
Red   | 65438  | 30     | 20     | 10 

I tried one approach, but I am not sure whether I am on the right track ...

dataset$WEBSITE <- as.character(dataset$WEBSITE) # strsplit() needs a character vector
hello <- strsplit(dataset$WEBSITE, split = ",") # split each string at every comma
# repeat each COLOR once per piece so both columns have the same length
hello <- data.frame(COLOR = rep(dataset$COLOR,
          sapply(hello, length)),
          WEBSITE = unlist(hello))

Any help would be much appreciated!


You are more likely to get help if you post a reproducible example. You might want to take a look at the function 'jsonlite::flatten' .... – A5C1D2H2I1M1N2O1R2T1
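
As a rough illustration of that suggestion, here is a minimal sketch of jsonlite::flatten. It assumes the raw JSON is still available and can be re-read with fromJSON; the three-record JSON string below is made up for illustration and only mirrors the Headers fields visible in the question.

```r
library(jsonlite)

# Hypothetical JSON mirroring the question's COLOR, SIZE and WEBSITE$Headers fields
json <- '[
  {"COLOR": "Blue",  "SIZE": 13456, "WEBSITE": {"Headers": {"MaxTopicsRootDomain": 30, "MaxTopicsSubDomain": 20, "MaxTopicsURL": 10}}},
  {"COLOR": "Green", "SIZE": 17487, "WEBSITE": {"Headers": {"MaxTopicsRootDomain": 30, "MaxTopicsSubDomain": 20, "MaxTopicsURL": 10}}},
  {"COLOR": "Red",   "SIZE": 65438, "WEBSITE": {"Headers": {"MaxTopicsRootDomain": 30, "MaxTopicsSubDomain": 20, "MaxTopicsURL": 10}}}
]'

# fromJSON() turns nested JSON objects into nested data frame columns
dataset <- fromJSON(json)

# flatten() expands the nested data frame columns into ordinary columns
flat <- jsonlite::flatten(dataset)
str(flat)
```

The flattened column names carry the path of the nested node as a prefix, so they may need renaming afterwards to match the desired layout.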

Answer

0

I finally found the answer.

It is probably not perfect, but it works!

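# stack the per-row WEBSITE lists into one table (one row per dataset row)
dataset_2 <- do.call(rbind, dataset$WEBSITE)
# prepend COLOR so the new table can be joined back to the original
dataset_2 <- cbind(dataset[c("COLOR")], dataset_2)
dataset <- merge(dataset, dataset_2, by = "COLOR")
# drop any duplicate rows left after the merge
dataset <- unique(dataset)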
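
A possibly simpler variant of the same idea, sketched under the assumption that each element of dataset$WEBSITE is a nested list with a Headers element, as in the printed example. The toy data below is invented for illustration. Because the extracted rows come out in the same order as the rows of dataset, a plain cbind should be enough here, without the merge and unique steps.

```r
# Toy stand-in for the question's data (values taken from the example table)
dataset <- data.frame(COLOR = c("Blue", "Green", "Red"),
                      SIZE = c(13456, 17487, 65438),
                      stringsAsFactors = FALSE)
headers <- list(MaxTopicsRootDomain = 30, MaxTopicsSubDomain = 20, MaxTopicsURL = 10)
dataset$WEBSITE <- rep(list(list(Headers = headers)), 3)

# Build one single-row data frame per element from its Headers node
rows <- lapply(dataset$WEBSITE, function(w) as.data.frame(w$Headers))

# Stack the rows and bind them next to the original columns
result <- cbind(dataset[c("COLOR", "SIZE")], do.call(rbind, rows))
print(result)
```

The same pattern could be applied to the Data node by replacing w$Headers with w$Data, provided every element has the same fields.
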
-1

It should work with purrr and map_df. But I am not at my laptop right now.
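
A minimal sketch of what the purrr route might look like, assuming each element of dataset$WEBSITE is a nested list with a Headers element. The toy data is made up to match the question's example table. map_df relies on the dplyr package being installed, and in purrr 1.0 and later it is marked as superseded in favour of map() followed by list_rbind().

```r
library(purrr)  # map_df() also needs the dplyr package installed

# Toy stand-in for the question's data (values taken from the example table)
dataset <- data.frame(COLOR = c("Blue", "Green", "Red"),
                      SIZE = c(13456, 17487, 65438),
                      stringsAsFactors = FALSE)
headers <- list(MaxTopicsRootDomain = 30, MaxTopicsSubDomain = 20, MaxTopicsURL = 10)
dataset$WEBSITE <- rep(list(list(Headers = headers)), 3)

# map_df() applies the function to every element and row-binds the results
headers_df <- map_df(dataset$WEBSITE, function(w) as.data.frame(w$Headers))

# Attach the new columns to the columns that should be kept
result <- cbind(dataset[c("COLOR", "SIZE")], headers_df)
print(result)
```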