2017-05-30

Row by row (pandas) - If column A = 'Something' and column B > 25, then column C = "Category"

I am trying to write a Python script that uses pandas to go through a file row by row. If column A = "Something" and column B > 25, it should write "Category" into column C.

I have a month column and a day column. For example:

If Month = August and Day >= 25, then Week = August 25

I have tried a couple of things; neither works ...

First I tried:

import os    ### OS library is imported. 
import pandas as pd  ### Pandas library is imported as 'pd'. 

counter = 1    ### Counter starts at the first iteration. 

while os.path.exists("CSV-Iteration-"'{0}'"/".format(counter)):  ### Runs the loop until all iteration's folders have been processed. 

    a = pd.read_csv("output-"'{0}'".csv".format(counter))   ### Sets 'a' dataframe as holding data from a CSV file. 
    a['Week'] = "" 

    a[(a['Month'] is 'June') & (a['Day'] < 25)]['Week'] = 'June 18' 
    a[(a['Month'] is 'June') & (a['Day'] >= 25)]['Week'] = 'June 25' 
    a[(a['Month'] is 'July') & (a['Day'] < 2)]['Week'] = 'June 25' 
    a[(a['Month'] is 'July') & (a['Day'] >= 2) & (a['Day'] < 9)]['Week'] = 'July 2' 
    a[(a['Month'] is 'July') & (a['Day'] >= 9) & (a['Day'] < 16)]['Week'] = 'July 9' 
    a[(a['Month'] is 'July') & (a['Day'] >= 16) & (a['Day'] < 23)]['Week'] = 'July 16' 
    a[(a['Month'] is 'July') & (a['Day'] >= 23) & (a['Day'] < 30)]['Week'] = 'July 23' 
    a[(a['Month'] is 'July') & (a['Day'] >= 31) & (a['Day'] < 16)]['Week'] = 'July 30' 
    a[(a['Month'] is 'August') & (a['Day'] < 6)]['Week'] = 'July 30' 

    a[(a['Month'] is 'August') & (a['Day'] >= 6) & (a['Day'] < 13)]['Week'] = 'August 6' 
    a[(a['Month'] is 'August') & (a['Day'] >= 13) & (a['Day'] < 20)]['Week'] = 'August 13' 
    a[(a['Month'] is 'August') & (a['Day'] >= 20) & (a['Day'] < 27)]['Week'] = 'August 20' 
    a[(a['Month'] is 'August') & (a['Day'] >= 27)]['Week'] = 'August 27' 
    a[(a['Month'] is 'September') & (a['Day'] < 3)]['Week'] = 'August 27' 

    a[(a['Month'] is 'September') & (a['Day'] >= 3) & (a['Day'] < 10)]['Week'] = 'September 3' 
    a[(a['Month'] is 'September') & (a['Day'] >= 10) & (a['Day'] < 17)]['Week'] = 'September 10' 
    a[(a['Month'] is 'September') & (a['Day'] >= 17) & (a['Day'] < 24)]['Week'] = 'September 17' 
    a[(a['Month'] is 'September') & (a['Day'] >= 24)] = 'September 24' 

    a[(a['Month'] is 'October') & (a['Day'] >= 1) & (a['Day'] < 8)]['Week'] = 'October 1' 
    a[(a['Month'] is 'October') & (a['Day'] >= 8) & (a['Day'] < 15)]['Week'] = 'October 8' 
    a[(a['Month'] is 'October') & (a['Day'] >= 15) & (a['Day'] < 22)]['Week'] = 'October 15' 
    a[(a['Month'] is 'October') & (a['Day'] >= 22) & (a['Day'] < 29)]['Week'] = 'October 22' 
    a[(a['Month'] is 'October') & (a['Day'] >= 29)]['Week'] = 'October 29' 
    a[(a['Month'] is 'November') & (a['Day'] < 5)]['Week'] = 'October 29' 

    a[(a['Month'] is 'November') & (a['Day'] >= 5) & (a['Day'] < 12)]['Week'] = 'November 5' 
    a[(a['Month'] is 'November') & (a['Day'] >= 12) & (a['Day'] < 19)]['Week'] = 'November 12' 
    a[(a['Month'] is 'November') & (a['Day'] >= 19) & (a['Day'] < 26)]['Week'] = 'November 19' 
    a[(a['Month'] is 'November') & (a['Day'] >= 26)]['Week'] = 'November 26' 
    a[(a['Month'] is 'December') & (a['Day'] < 3)]['Week'] = 'November 26' 

    a[(a['Month'] is 'December') & (a['Day'] >= 3) & (a['Day'] < 10)]['Week'] = 'December 3' 
    a[(a['Month'] is 'December') & (a['Day'] >= 10) & (a['Day'] < 17)]['Week'] = 'December 10' 
    a[(a['Month'] is 'December') & (a['Day'] >= 17) & (a['Day'] < 24)]['Week'] = 'December 17' 
    a[(a['Month'] is 'December') & (a['Day'] >= 24) & (a['Day'] < 31)]['Week'] = 'December 24' 
    a[(a['Month'] is 'December') & (a['Day'] >= 31)]['Week'] = 'December 31' 
    a[(a['Month'] is 'January') & (a['Day'] < 7)]['Week'] = 'December 31' 

    a[(a['Month'] is 'January') & (a['Day'] >= 7) & (a['Day'] < 14)]['Week'] = 'January 7' 
    a[(a['Month'] is 'January') & (a['Day'] >= 14) & (a['Day'] < 21)]['Week'] = 'January 14' 
    a[(a['Month'] is 'January') & (a['Day'] >= 21) & (a['Day'] < 28)]['Week'] = 'January 21' 
    a[(a['Month'] is 'January') & (a['Day'] >= 28)]['Week'] = 'January 28' 

    a.to_csv("TESToutput-"'{0}'".csv".format(counter), index=False)   ### 'a' dataframe becomes 'TESToutput-#.csv' and does not print fields for indexing (index=False). 

    counter += 1  ### Adds 1 to the counter. 

print 'Date Corrections - All Done!' 
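
A likely reason this first attempt leaves Week empty: `a['Month'] is 'June'` tests object identity and evaluates to a single False rather than a per-row mask, and `a[mask]['Week'] = ...` assigns into a temporary copy instead of the original frame. Below is a minimal sketch of the same kind of rule using `==` and `.loc`. It is written for Python 3, and the small sample frame is made up to stand in for one of the CSV files; only the first three rules are shown.

```python
import pandas as pd

# Hypothetical sample data standing in for one output-#.csv file.
a = pd.DataFrame({"Month": ["June", "June", "July"], "Day": [20, 26, 1]})
a["Week"] = ""

# `==` compares element by element and yields a boolean Series (one flag per row).
is_june = a["Month"] == "June"
is_july = a["Month"] == "July"

# A single .loc call selects rows and column together, so the assignment
# lands in `a` itself rather than in a temporary copy.
a.loc[is_june & (a["Day"] < 25), "Week"] = "June 18"
a.loc[is_june & (a["Day"] >= 25), "Week"] = "June 25"
a.loc[is_july & (a["Day"] < 2), "Week"] = "June 25"

print(a)
```

The remaining month and day ranges would follow the same pattern, one `.loc` line per rule.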

Then I tried:

import os    ### OS library is imported. 
import pandas as pd  ### Pandas library is imported as 'pd'. 

counter = 1    ### Counter starts at the first iteration. 

while os.path.exists("CSV-Iteration-"'{0}'"/".format(counter)):  ### Runs the loop until all iteration's folders have been processed. 

    a = pd.read_csv("output-"'{0}'".csv".format(counter))   ### Sets 'a' dataframe as holding data from a CSV file. 
    a['Week'] = "" 

    def this_week (row): 
     if row[(a['Month'] is 'June') + (a['Day'] < 25)]: 
      return 'June 18' 
     if row[(a['Month'] is 'June') + (a['Day'] >= 25)]: 
      return 'June 25' 
     if row[(a['Month'] is 'July') + (a['Day'] < 2)]: 
      return 'June 25' 
     if row[(a['Month'] is 'July') + (a['Day'] >= 2) + (a['Day'] < 9)]: 
      return 'July 2' 
     if row[(a['Month'] is 'July') + (a['Day'] >= 9) + (a['Day'] < 16)]: 
      return 'July 9' 
     if row[(a['Month'] is 'July') + (a['Day'] >= 16) + (a['Day'] < 23)]: 
      return 'July 16' 
     if row[(a['Month'] is 'July') + (a['Day'] >= 23) + (a['Day'] < 30)]: 
      return 'July 23' 
     if row[(a['Month'] is 'July') + (a['Day'] >= 31) + (a['Day'] < 16)]: 
      return 'July 30' 
     if row[(a['Month'] is 'August') + (a['Day'] < 6)]: 
      return 'July 30' 
     if row[(a['Month'] is 'August') + (a['Day'] >= 6) + (a['Day'] < 13)]: 
      return 'August 6' 
     if row[(a['Month'] is 'August') + (a['Day'] >= 13) + (a['Day'] < 20)]: 
      return 'August 13' 
     if row[(a['Month'] is 'August') + (a['Day'] >= 20) + (a['Day'] < 27)]: 
      return 'August 20' 
     if row[(a['Month'] is 'August') + (a['Day'] >= 27)]: 
      return 'August 27' 
     if row[(a['Month'] is 'September') + (a['Day'] < 3)]: 
      return 'August 27' 
     if row[(a['Month'] is 'September') + (a['Day'] >= 3) + (a['Day'] < 10)]: 
      return 'September 3' 
     if row[(a['Month'] is 'September') + (a['Day'] >= 10) + (a['Day'] < 17)]: 
      return 'September 10' 
     if row[(a['Month'] is 'September') + (a['Day'] >= 17) + (a['Day'] < 24)]: 
      return 'September 17' 
     if row[(a['Month'] is 'September') + (a['Day'] >= 24)]: 
      return 'September 24' 
     if row[(a['Month'] is 'October') + (a['Day'] >= 1) + (a['Day'] < 8)]: 
      return 'October 1' 
     if row[(a['Month'] is 'October') + (a['Day'] >= 8) + (a['Day'] < 15)]: 
      return 'October 8' 
     if row[(a['Month'] is 'October') + (a['Day'] >= 15) + (a['Day'] < 22)]: 
      return 'October 15' 
     if row[(a['Month'] is 'October') + (a['Day'] >= 22) + (a['Day'] < 29)]: 
      return 'October 22' 
     if row[(a['Month'] is 'October') + (a['Day'] >= 29)]: 
      return 'October 29' 
     if row[(a['Month'] is 'November') + (a['Day'] < 5)]: 
      return 'October 29' 
     if row[(a['Month'] is 'November') + (a['Day'] >= 5) + (a['Day'] < 12)]: 
      return 'November 5' 
     if row[(a['Month'] is 'November') + (a['Day'] >= 12) + (a['Day'] < 19)]: 
      return 'November 12' 
     if row[(a['Month'] is 'November') + (a['Day'] >= 19) + (a['Day'] < 26)]: 
      return 'November 19' 
     if row[(a['Month'] is 'November') + (a['Day'] >= 26)]: 
      return 'November 26' 
     if row[(a['Month'] is 'December') + (a['Day'] < 3)]: 
      return 'November 26' 
     if row[(a['Month'] is 'December') + (a['Day'] >= 3) + (a['Day'] < 10)]: 
      return 'December 3' 
     if row[(a['Month'] is 'December') + (a['Day'] >= 10) + (a['Day'] < 17)]: 
      return 'December 10' 
     if row[(a['Month'] is 'December') + (a['Day'] >= 17) + (a['Day'] < 24)]: 
      return 'December 17' 
     if row[(a['Month'] is 'December') + (a['Day'] >= 24) + (a['Day'] < 31)]: 
      return 'December 24' 
     if row[(a['Month'] is 'December') + (a['Day'] >= 31)]: 
      return 'December 31' 
     if row[(a['Month'] is 'January') + (a['Day'] < 7)]: 
      return 'December 31' 
     if row[(a['Month'] is 'January') + (a['Day'] >= 7) + (a['Day'] < 14)]: 
      return 'January 7' 
     if row[(a['Month'] is 'January') + (a['Day'] >= 14) + (a['Day'] < 21)]: 
      return 'January 14' 
     if row[(a['Month'] is 'January') + (a['Day'] >= 21) + (a['Day'] < 28)]: 
      return 'January 21' 
     if row[(a['Month'] is 'January') + (a['Day'] >= 28)]: 
      return 'January 28' 

    a['Week'] = a.apply (lambda row: this_week (row), axis=1) 

    a.to_csv("TESToutput-"'{0}'".csv".format(counter), index=False)   ### 'a' dataframe becomes 'TESToutput-#.csv' and does not print fields for indexing (index=False). 

    counter += 1  ### Adds 1 to the counter. 

print 'Date Corrections - All Done!' 

The second one gives me this error: "IndexingError: ('Unalignable boolean Series key provided', u'occurred at index 0')"

I am very new to Python, so I pieced these together from what I read in forums. Please let me know if there is a simpler way, or a fix or addition that would make either of these two scripts work.

Thanks!
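
The IndexingError in the second attempt probably comes from `row[...]` being indexed with a boolean Series built from the whole frame `a`, whose index (the row numbers) cannot be aligned with the index of a single row (the column names). Below is a sketch of a row-wise function that reads plain scalar values from `row` instead. It is written for Python 3; the sample frame and the empty-string fallback for uncovered rows are assumptions, and only the first few rules are shown.

```python
import pandas as pd

def this_week(row):
    """Return the week label for one row, using scalar comparisons."""
    month, day = row["Month"], row["Day"]
    if month == "June":
        return "June 18" if day < 25 else "June 25"
    if month == "July" and day < 2:
        return "June 25"
    if month == "July" and 2 <= day < 9:
        return "July 2"
    return ""  # assumed fallback for rows no rule covers

# Hypothetical sample data standing in for one output-#.csv file.
a = pd.DataFrame({"Month": ["June", "June", "July", "July"],
                  "Day": [20, 26, 1, 5]})

# axis=1 passes each row to this_week as a Series keyed by column name.
a["Week"] = a.apply(this_week, axis=1)

print(a)
```

Row-wise `apply` is easy to read but runs Python code once per row, so it tends to be slower than mask-based assignment on large files.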

====================================================

BREAK - NEW INFO BELOW

This is what a data file looks like (minus the extra columns).

I want to add a column that checks the Month and Day columns in order to assign a week, i.e. "Week". See below:

Month  Day C.Sym F.Sym D.Sym Week 
September 3 1    1  Sept 3 
September 27 1      Sept 24 
October  14   1    Oct 8 
October  15   1    Oct 15 
October  17   1    Oct 15 
October  21   1    Oct 15 
October  29   1    Oct 29 
November 30 1      Oct 29 
December 16   1  1  Dec 10 
December 17   1    Dec 10 
December 27   1    Dec 24 
January  6   1    Dec 31 
January  8 1      Jan 7 
January  20   1    Jan 14 
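
The week labels used in the scripts above are seven days apart and match the Sundays of 2017 (June 18, 2017 was a Sunday), so the column could also be computed from a real date rather than listed rule by rule. Below is a sketch for Python 3. The year is not stated anywhere in the post, so the code assumes that January rows belong to 2018 and every other month to 2017. It also writes full month names, as the scripts do, rather than the abbreviations shown in the table.

```python
import pandas as pd

MONTH_NUMBER = {
    "January": 1, "February": 2, "March": 3, "April": 4,
    "May": 5, "June": 6, "July": 7, "August": 8,
    "September": 9, "October": 10, "November": 11, "December": 12,
}

# Hypothetical sample rows in the shape of the table above.
a = pd.DataFrame({
    "Month": ["September", "September", "October", "January", "January"],
    "Day": [3, 27, 14, 6, 8],
})

month = a["Month"].map(MONTH_NUMBER)
# ASSUMPTION (not in the post): January is 2018, all other months are 2017.
year = month.map(lambda m: 2018 if m == 1 else 2017)

# pd.to_datetime accepts a frame with year/month/day columns.
date = pd.to_datetime(pd.DataFrame({"year": year, "month": month, "day": a["Day"]}))

# weekday: Monday=0 ... Sunday=6, so (weekday + 1) % 7 is the number of
# days elapsed since the most recent Sunday (0 if the date is a Sunday).
days_since_sunday = (date.dt.weekday + 1) % 7
week_start = date - pd.to_timedelta(days_since_sunday, unit="D")

a["Week"] = week_start.dt.month_name() + " " + week_start.dt.day.astype(str)

print(a)
```

If the files carried an actual year column, that column would replace the hard-coded year assumption.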

An example of an elif that I am now trying to adopt:

elif a[(a['Month'] is 'January') & (a['Day'] >= 14) & (a['Day'] < 21)]: 
     ['Week'] = 'January 14' 

I hope this is more precise and helps ...

Answer

>>> import pandas as pd
>>> df = pd.DataFrame({'column_A': ['something', 'day', 'something'], 'column_B': [30, 40, 10]})
>>> df
    column_A  column_B
0  something        30
1        day        40
2  something        10


>>> df = df.assign(column_C=((df.column_A == 'something') & (df.column_B > 25)))
>>> df.column_C.replace(True, 'something', inplace=True)
>>> df
    column_A  column_B   column_C
0  something        30  something
1        day        40      False
2  something        10      False
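
When more than one label is needed, the same boolean-mask idea can be extended with `numpy.select`, which takes a list of conditions and a matching list of values. This is a sketch only, written for Python 3. The Month/Day sample frame, the handful of rules and the empty-string default are assumptions chosen to mirror the question, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the question's Month/Day columns.
df = pd.DataFrame({
    "Month": ["June", "June", "July", "July", "March"],
    "Day": [20, 26, 1, 5, 1],
})

# One boolean Series per rule; the first condition that is True wins.
conditions = [
    (df["Month"] == "June") & (df["Day"] < 25),
    (df["Month"] == "June") & (df["Day"] >= 25),
    (df["Month"] == "July") & (df["Day"] < 2),
    (df["Month"] == "July") & (df["Day"] >= 2) & (df["Day"] < 9),
]
labels = ["June 18", "June 25", "June 25", "July 2"]

# Rows matching no condition receive the default (assumed here to be "").
df["Week"] = np.select(conditions, labels, default="")

print(df)
```

Keeping the conditions and labels in two parallel lists makes it harder to mistype a single rule than repeating a full assignment per line.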

The problem I've run into is that I need more than True/False for my answers. I need to be able to set "June 18", "June 25", "July 2", etc., depending on the Column_Month and Column_Day values. Right now I think I would have to chain several if/else checks to get there. Or am I missing something in this answer?

@DavidMills You need to be specific. You are much more likely to get an answer that solves your problem if you state what your data currently looks like and what the desired output is. – spies006

I have added notes to the information above showing what the data looks like and what my desired output is. Thank you very much!
