keys, *data = <<_.split(/\n/).map { |line| line.split(/,\s+/) }
First Name, Last Name, Age, Income, Household Size, Gender, Education
Jon, Smith, 25, 50000, 1, Male, College
Jane, Davies, 30, 60000, 3, Female, High School
Sam, Farelly, 32, 80000, 2, Unspecified, College
Joan, Favreau, 35, 65000, 4, Female, College
Sam, McNulty, 38, 63000, 3, Male, College
Mark, Minahan, 48, 78000, 5, Male, High School
Susan, Umani, 45, 75000, 2, Female, College
Bill, Perault, 24, 45000, 1, Male, Did Not Complete High School
Doug, Stamper, 45, 75000, 1, Male, College
Francis, Underwood, 52, 100000, 2, Male, College
_
We now have the following values for keys and data.
keys
#=> ["First Name", "Last Name", "Age", "Income", "Household Size",
# "Gender", "Education"]
data
#=> [["Jon", "Smith", "25", "50000", "1", "Male", "College"],
# ["Jane", "Davies", "30", "60000", "3", "Female", "High School"],
# ["Sam", "Farelly", "32", "80000", "2", "Unspecified", "College"],
# ["Joan", "Favreau", "35", "65000", "4", "Female", "College"],
# ["Sam", "McNulty", "38", "63000", "3", "Male", "College"],
# ["Mark", "Minahan", "48", "78000", "5", "Male", "High School"],
# ["Susan", "Umani", "45", "75000", "2", "Female", "College"],
# ["Bill", "Perault", "24", "45000", "1", "Male", "Did Not Complete High School"],
# ["Doug", "Stamper", "45", "75000", "1", "Male", "College"],
# ["Francis", "Underwood", "52", "100000", "2", "Male", "College"]]
Next, construct the following hash.
h = keys.zip(data.transpose).to_h
#=> {"First Name" =>["Jon", "Jane", "Sam", "Joan", "Sam", "Mark", "Susan",
# "Bill", "Doug", "Francis"],
# "Last Name" =>["Smith", "Davies", "Farelly", "Favreau", "McNulty", "Minahan",
# "Umani", "Perault", "Stamper", "Underwood"],
# "Age" =>["25", "30", "32", "35", "38", "48", "45", "24", "45", "52"],
# "Income" =>["50000", "60000", "80000", "65000", "63000", "78000",
# "75000", "45000", "75000", "100000"],
# "Household Size"=>["1", "3", "2", "4", "3", "5", "2", "1", "1", "2"],
# "Gender" =>["Male", "Female", "Unspecified", "Female", "Male", "Male",
# "Female", "Male", "Male", "Male"],
# "Education" =>["College", "High School", "College", "College", "College",
# "High School", "College", "Did Not Complete High School",
# "College", "College"]}
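To see why zip combined with transpose yields one array per column, it can help to run the same step on a much smaller table. The following is only a minimal sketch; the names sample_keys, sample_rows, sample_columns and sample_hash are made up for this illustration and are not part of the code above.

```ruby
# Hypothetical miniature of the keys/data pair above: two columns, two rows.
sample_keys = ["Name", "Age"]
sample_rows = [["Jon", "25"], ["Jane", "30"]]

# transpose turns the row-oriented array into a column-oriented one.
sample_columns = sample_rows.transpose

# zip pairs each key with its column; to_h turns those pairs into a hash.
sample_hash = sample_keys.zip(sample_columns).to_h

p sample_columns
p sample_hash
```

Note that transpose raises an IndexError when the rows do not all have the same length, so this approach assumes every line of the input has a value for every key.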
It is now easy to compute the various statistics.
n = data.size.to_f
#=> 10.0
avg_age = h["Age"].map(&:to_i).reduce(:+)/n.to_f
#=> 37.4
avg_income = h["Income"].map(&:to_i).reduce(:+)/n.to_f
#=> 69100.0
avg_hsize = h["Household Size"].map(&:to_i).reduce(:+)/n.to_f
#=> 2.4
pct_female = 100*h["Gender"].count("Female")/n.to_f
#=> 30.0
and so on.
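Since the averages above all follow the same pattern, they could be folded into a small helper. This is just one possible sketch: column_mean is a hypothetical name, sample is a three-row stand-in for the hash h, and Array#sum assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper: mean of a numeric column whose values are stored as strings.
# Returns 0.0 for an empty column to avoid dividing by zero.
def column_mean(columns, key)
  values = columns.fetch(key).map(&:to_f)
  return 0.0 if values.empty?
  values.sum / values.size
end

# A small stand-in for the hash h built above (first three rows only).
sample = {
  "Age"    => ["25", "30", "32"],
  "Income" => ["50000", "60000", "80000"]
}

puts column_mean(sample, "Age")     # mean of 25, 30 and 32
puts column_mean(sample, "Income")  # mean of the three incomes
```

Using fetch rather than [] means a misspelled key raises a KeyError instead of silently failing on nil.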
Computing other statistics
Suppose you wanted to compute statistics that involve more than one key, such as the average age of women. The easiest way to do that (and to compute the simple averages and percentages) is to put the data in a database and use SQL queries. We can do it here as well, however, by first creating an array of hashes.
arr = data.map { |row| keys.zip(row).to_h }
#=> [{"First Name"=>"Jon", "Last Name"=>"Smith", "Age"=>"25", "Income"=>"50000",
# "Household Size"=>"1", "Gender"=>"Male", "Education"=>"College"},
# {"First Name"=>"Jane", "Last Name"=>"Davies", "Age"=>"30", "Income"=>"60000",
# "Household Size"=>"3", "Gender"=>"Female", "Education"=>"High School"},
# {"First Name"=>"Sam", "Last Name"=>"Farelly", "Age"=>"32", "Income"=>"80000",
# "Household Size"=>"2", "Gender"=>"Unspecified", "Education"=>"College"},
# {"First Name"=>"Joan", "Last Name"=>"Favreau", "Age"=>"35", "Income"=>"65000",
# "Household Size"=>"4", "Gender"=>"Female", "Education"=>"College"},
# {"First Name"=>"Sam", "Last Name"=>"McNulty", "Age"=>"38", "Income"=>"63000",
# "Household Size"=>"3", "Gender"=>"Male", "Education"=>"College"},
# {"First Name"=>"Mark", "Last Name"=>"Minahan", "Age"=>"48", "Income"=>"78000",
# "Household Size"=>"5", "Gender"=>"Male", "Education"=>"High School"},
# {"First Name"=>"Susan", "Last Name"=>"Umani", "Age"=>"45", "Income"=>"75000",
# "Household Size"=>"2", "Gender"=>"Female", "Education"=>"College"},
# {"First Name"=>"Bill", "Last Name"=>"Perault", "Age"=>"24", "Income"=>"45000",
# "Household Size"=>"1", "Gender"=>"Male",
# "Education"=>"Did Not Complete High School"},
# {"First Name"=>"Doug", "Last Name"=>"Stamper", "Age"=>"45", "Income"=>"75000",
# "Household Size"=>"1", "Gender"=>"Male", "Education"=>"College"},
# {"First Name"=>"Francis", "Last Name"=>"Underwood", "Age"=>"52",
# "Income"=>"100000", "Household Size"=>"2", "Gender"=>"Male",
# "Education"=>"College"}]
Then, to compute the average age of women, create an array of the women's ages, sum its elements, and divide that sum by the size of the array.
a = arr.each_with_object([]) { |h,a| a << h["Age"].to_i if h["Gender"]=="Female" }
#=> [30, 35, 45]
a.empty? ? 0.0 : a.reduce(:+)/a.size.to_f
#=> 36.666666666666664
Other calculations are similar.
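As an illustration of how similar such calculations are, the filter-then-average step can be written once and reused. The sketch below is an assumption-laden example rather than part of the answer's code: mean_where and rows are hypothetical names, rows holds only three records taken from the data above, and Array#sum assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper: mean of value_key over the rows whose filter_key equals wanted.
# Returns 0.0 when no row matches, mirroring the a.empty? guard above.
def mean_where(rows, value_key, filter_key, wanted)
  values = rows.select { |row| row[filter_key] == wanted }
               .map { |row| row[value_key].to_f }
  values.empty? ? 0.0 : values.sum / values.size
end

# Three records in the same shape as the elements of arr.
rows = [
  { "Age" => "30", "Income" => "60000", "Gender" => "Female", "Education" => "High School" },
  { "Age" => "35", "Income" => "65000", "Gender" => "Female", "Education" => "College" },
  { "Age" => "38", "Income" => "63000", "Gender" => "Male",   "Education" => "College" }
]

puts mean_where(rows, "Age", "Gender", "Female")         # average age of the two women
puts mean_where(rows, "Income", "Education", "College")  # average income of the college group
puts mean_where(rows, "Age", "Gender", "Unspecified")    # no matching rows
```

With the full arr from above, the same call with "Age", "Gender" and "Female" would reproduce the average age of women computed earlier.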
`voter_demographics.first[:age].instance_eval { inject(:+) / size }` –
Welcome to Stack Overflow. While we can sympathize with the task of learning, it's _REALLY_ important that you try, then try again, and continue until you can't try any more when you're dealing with homework or self-assigned learning. "[ask]", "[mcve]", "[How much research effort is expected of Stack Overflow users?](http://meta.stackoverflow.com/q/261592)". "Questions asking for homework help must include a summary of the work you've done so far to solve the problem, and a description of the difficulty you are having solving it." and http://meta.stackoverflow.com/questions/334822 –