2017-04-18 7 views
3

Replace all non-alphanumeric characters with a period

I am trying to rename all of these awful column names in a data frame that I received from a government agency.

> colnames(thedata) 
[1] "Region"          "Resource Assessment Site ID"     
[3] "Site Name/Facility"       "Design Head (feet)"       
[5] "Design Flow (cfs)"       "Installed Capacity (kW)"      
[7] "Annual Production (MWh)"      "Plant Factor"        
[9] "Total Construction Cost (1,000 $)"   "Annual O&M Cost (1,000 $)"     
[11] "Cost per Installed Capacity ($/kW)"   "Benefit Cost Ratio with Green Incentives" 
[13] "IRR with Green Incentives"     "Benefit Cost Ratio without Green Incentives" 
[15] "IRR without Green Incentives" 

The column headers contain special non-alphanumeric characters and spaces, which makes referring to them impossible, so I want to rename them. I would like to replace every non-alphanumeric character with a period. But I tried:

old.col.names <- colnames(thedata) 
new.col.names <- gsub("^a-z0-9", ".", old.col.names) 

The ^ is a "not" operator, so I thought this would replace everything in old.col.names that is not alphanumeric with a period.

Can anyone help?

+2

You need '[^[:alnum:]]+' – akrun

+2

You can try 'make.names(colnames(thedata), allow_ = FALSE)' – Jimbou

+1

Try this: 'pattern = "[^a-zA-Z0-9]"'. It is important to tell the regex that you mean a set of characters by putting them between '[' and ']'. The '^' has to go inside the brackets together with that set. Good luck! –

Answer

0

Here are three options to consider:

make.names(x) 
gsub("[^A-Za-z0-9]", ".", x) 
names(janitor::clean_names(setNames(data.frame(matrix(NA, ncol = length(x))), x))) 

Here is what each one produces:

make.names(x) 
## [1] "Region"          "Resource.Assessment.Site.ID"     
## [3] "Site.Name.Facility"       "Design.Head..feet."       
## [5] "Design.Flow..cfs."       "Installed.Capacity..kW."      
## [7] "Annual.Production..MWh."      "Plant.Factor"        
## [9] "Total.Construction.Cost..1.000..."   "Annual.O.M.Cost..1.000..."     
## [11] "Cost.per.Installed.Capacity....kW."   "Benefit.Cost.Ratio.with.Green.Incentives" 
## [13] "IRR.with.Green.Incentives"     "Benefit.Cost.Ratio.without.Green.Incentives" 
## [15] "IRR.without.Green.Incentives"    
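
For these particular names make.names() and the gsub() call give identical results, but make.names() also enforces R's rules for syntactic names. The following is a small sketch of those extra behaviours; the input strings are made up for illustration and do not come from the question's data.

```r
# Illustrative inputs only; none of these names appear in the original data.

# A name that starts with a digit gets an "X" prefix.
leading_digit <- make.names("1st value")

# With unique = TRUE, names that collide after cleaning get a numeric suffix.
deduplicated <- make.names(c("a b", "a.b"), unique = TRUE)

# With allow_ = FALSE, underscores are replaced by periods as well.
no_underscore <- make.names("cost_per_kw", allow_ = FALSE)

print(leading_digit)   # "X1st.value"
print(deduplicated)    # "a.b" "a.b.1"
print(no_underscore)   # "cost.per.kw"
```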

gsub("[^A-Za-z0-9]", ".", x) 
## [1] "Region"          "Resource.Assessment.Site.ID"     
## [3] "Site.Name.Facility"       "Design.Head..feet."       
## [5] "Design.Flow..cfs."       "Installed.Capacity..kW."      
## [7] "Annual.Production..MWh."      "Plant.Factor"        
## [9] "Total.Construction.Cost..1.000..."   "Annual.O.M.Cost..1.000..."     
## [11] "Cost.per.Installed.Capacity....kW."   "Benefit.Cost.Ratio.with.Green.Incentives" 
## [13] "IRR.with.Green.Incentives"     "Benefit.Cost.Ratio.without.Green.Incentives" 
## [15] "IRR.without.Green.Incentives"    
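
To see why the pattern from the question left the names untouched, it may help to compare it with the bracketed version side by side. This is only a sketch on two of the column names; the variable names (nms, unchanged, dotted, collapsed, trimmed) are chosen here for illustration, and the last two steps are an optional variation based on the '[^[:alnum:]]+' pattern suggested in the comments.

```r
nms <- c("Design Head (feet)", "Cost per Installed Capacity ($/kW)")

# Outside brackets, "^" anchors the match to the start of the string, so this
# pattern looks for the literal text "a-z0-9" at the beginning and finds nothing.
unchanged <- gsub("^a-z0-9", ".", nms)

# Inside brackets, a leading "^" negates the character class, so every single
# character that is not a letter or digit is replaced.
dotted <- gsub("[^A-Za-z0-9]", ".", nms)

# Adding "+" replaces each run of non-alphanumerics with one period;
# a second pass strips the trailing period left by the closing parenthesis.
collapsed <- gsub("[^[:alnum:]]+", ".", nms)
trimmed <- sub("\\.$", "", collapsed)

print(unchanged)  # identical to nms
print(dotted)     # "Design.Head..feet." "Cost.per.Installed.Capacity....kW."
print(trimmed)    # "Design.Head.feet"   "Cost.per.Installed.Capacity.kW"
```

Note that [:alnum:] depends on the locale and can match accented letters, whereas A-Za-z0-9 is limited to ASCII; for the names in the question both behave the same.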

library(janitor) 
names(clean_names(setNames(data.frame(matrix(NA, ncol = length(x))), x))) 
## [1] "region"          "resource_assessment_site_id"     
## [3] "site_name_facility"       "design_head_feet"       
## [5] "design_flow_cfs"        "installed_capacity_kw"      
## [7] "annual_production_mwh"      "plant_factor"        
## [9] "total_construction_cost_1_000"    "annual_o_m_cost_1_000"      
## [11] "cost_per_installed_capacity_kw"    "benefit_cost_ratio_with_green_incentives" 
## [13] "irr_with_green_incentives"     "benefit_cost_ratio_without_green_incentives" 
## [15] "irr_without_green_incentives"    
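
Whichever option you pick, the cleaned names still have to be assigned back to the data frame. The sketch below does that with the gsub() variant; the small data frame is a made-up stand-in for thedata (three of the original headers, invented values), since the real data is not shown in the question.

```r
# Hypothetical stand-in for `thedata`: the values are invented, and
# check.names = FALSE keeps the raw headers exactly as typed.
thedata <- data.frame(
  "Region" = c("GP", "PN"),
  "Design Head (feet)" = c(12.5, 40),
  "Annual O&M Cost (1,000 $)" = c(18, 25),
  check.names = FALSE
)

# Replace every non-alphanumeric character in the headers with a period.
colnames(thedata) <- gsub("[^A-Za-z0-9]", ".", colnames(thedata))

print(colnames(thedata))
# "Region" "Design.Head..feet." "Annual.O.M.Cost..1.000..."

# The columns can now be used with `$` without backticks.
design_head <- thedata$Design.Head..feet.
```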

Sample data:

x <- c("Region", "Resource Assessment Site ID", "Site Name/Facility", 
    "Design Head (feet)", "Design Flow (cfs)", "Installed Capacity (kW)", 
    "Annual Production (MWh)", "Plant Factor", "Total Construction Cost (1,000 $)", 
    "Annual O&M Cost (1,000 $)", "Cost per Installed Capacity ($/kW)", 
    "Benefit Cost Ratio with Green Incentives", "IRR with Green Incentives", 
    "Benefit Cost Ratio without Green Incentives", "IRR without Green Incentives") 