2017-03-29 11 views
Pandas: groupby some data

I have a DataFrame:

datetime city state country shape duration (seconds) duration (hours/min) comments date posted latitude longitude 
10/10/1949 20:30 san marcos tx us cylinder 2700 45 minutes This event took place in early fall around 1949-50. It occurred after a Boy Scout meeting in the Baptist Church. The Baptist Church sit 4/27/2004 29.8830556 -97.9411111 
10/10/1949 21:00 lackland afb tx  light 7200 1-2 hrs 1949 Lackland AFB&#44 TX. Lights racing across the sky & making 90 degree turns on a dime. 12/16/2005 29.38421 -98.581082 
10/10/1955 17:00 chester (uk/england)  gb circle 20 20 seconds Green/Orange circular disc over Chester&#44 England 1/21/2008 53.2 -2.916667 
10/10/1956 21:00 edna tx us circle 20 1/2 hour My older brother and twin sister were leaving the only Edna theater at about 9 PM&#44...we had our bikes and I took a different route home 1/17/2004 28.9783333 -96.6458333 
10/10/1960 20:00 kaneohe hi us light 900 15 minutes AS a Marine 1st Lt. flying an FJ4B fighter/attack aircraft on a solo night exercise&#44 I was at 50&#44000&#39 in a "clean" aircraft (no ordinan 1/22/2004 21.4180556 -157.8036111 

I am trying to group it by state. I use:

result = (
    df.groupby("state")
    .agg({"state": pd.Series.nunique, "duration (seconds)": np.sum})
    .rename(columns={"state": "frequency", "duration (seconds)": "whole time"})
    .reset_index()
)

But it raises TypeError: must be str, not float. I tried to convert duration (seconds), but it returns duration (seconds). How can I resolve this problem?

What is actually throwing the error? Does 'result = df.groupby("state").agg({"state": pd.Series.nunique})' work? (i.e., half of your groupby) – Stael

We have no idea where the error comes from. Post the whole error, bro –

Answer

Something like:

# Group df by state, then apply a sum lambda function to the "duration (seconds)" column
df.groupby('state')['duration (seconds)'].apply(lambda x: x.sum())

Or if you want a rolling sum:

df.groupby('state')['duration (seconds)'].apply(lambda x: x.rolling(window=2, center=False).sum())
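
The TypeError in the question (must be str, not float) usually means the column being summed holds strings mixed with floats such as NaN, so a sum ends up adding a str to a float. The following is only a sketch under that assumption: the sample frame is hypothetical, and the stray "2700`" value and the missing entries are invented to reproduce a mixed column. It also uses size rather than nunique for "frequency", on the assumption that a row count per state is what was wanted, since the number of unique states within a single state group is always 1.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the question's data; the malformed
# "2700`" and the missing values are assumptions, not the real file.
df = pd.DataFrame({
    "state": ["tx", "tx", "hi", "tx", None],
    "duration (seconds)": ["2700`", "7200", "900", np.nan, "20"],
})

# A non-numeric dtype (such as object) here means the values are not
# plain numbers, so summing them can fail.
print(df["duration (seconds)"].dtype)

# Convert to numbers; anything that cannot be parsed becomes NaN.
df["duration (seconds)"] = pd.to_numeric(df["duration (seconds)"], errors="coerce")

# Count rows and sum durations per state (rows with a missing state are
# dropped by groupby by default).
result = (
    df.groupby("state")["duration (seconds)"]
    .agg(["size", "sum"])
    .rename(columns={"size": "frequency", "sum": "whole time"})
    .reset_index()
)
print(result)
```

If the real column turns out to be numeric already, the to_numeric step changes nothing and the error is likely coming from somewhere else, which is why posting the full traceback helps.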