Pandas raises a strange exception when trying to apply a function to duplicate columns

Why do I get the following error? I'm trying to apply a function to a duplicated column. Please don't tell me the solution is something like df["a"] = 2 * df["a"]; this is a stripped-down example of something more complicated that I'm working on.

>>> import pandas as pd
>>> df = pd.DataFrame({"a" : [0,1,2], "b" : [1,2,3]})
>>> df[["a", "a"]].apply(lambda x: x[0] + x[1], axis = 1) 
Traceback (most recent call last): 
    File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\indexes\base.py", line 1980, in get_value 
    tz=getattr(series.dtype, 'tz', None)) 
    File "pandas\index.pyx", line 103, in pandas.index.IndexEngine.get_value (pandas\index.c:3332) 
    File "pandas\index.pyx", line 111, in pandas.index.IndexEngine.get_value (pandas\index.c:3035) 
    File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:3955) 
    File "pandas\index.pyx", line 169, in pandas.index.IndexEngine._get_loc_duplicates (pandas\index.c:4236) 
TypeError: unorderable types: str() > int() 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4061, in apply 
    return self._apply_standard(f, axis, reduce=reduce) 
    File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4157, in _apply_standard 
    results[i] = func(v) 
    File "<stdin>", line 1, in <lambda> 
    File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\core\series.py", line 583, in __getitem__ 
    result = self.index.get_value(self, key) 
    File "C:\Users\Alexander\Anaconda3\lib\site-packages\pandas\indexes\base.py", line 2000, in get_value 
    raise IndexError(key) 
IndexError: (0, 'occurred at index 0')


2016-06-11 Alex

I tried it and got no error; it works for me, so it may be a version problem. – shivsn

Which version of Python/pandas are you using? I'm on Python 3.5.1, pandas 0.18.1. – Alex

Hmm, it seems to work under Python 2.7.11 and pandas 0.18.0. – Alex

IIUC you need to change x[0] and x[1] to x.a, because there are no columns named 0 and 1:

a = df[["a", "a"]].apply(lambda x: x.a + x.a, axis=1)
print(a)
   a  a
0  0  0
1  2  2
2  4  4
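
To see why x[0] has nothing to find, it may help to look at a single row the way apply(axis=1) passes it to the function. This is only a minimal sketch; the names dup and first_row are made up for illustration and are not from the question.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 2], "b": [1, 2, 3]})
dup = df[["a", "a"]]  # hypothetical name for the frame with duplicated columns

# With axis=1, each row reaches the function as a Series whose index
# holds the column labels, so both labels here are "a".
first_row = dup.iloc[0]
print(list(first_row.index))

# Selecting by the label "a" matches both entries and returns a
# two-element Series rather than a single value.
print(len(first_row["a"]))
```

Since the labels are strings, x[0] is treated as a label lookup, and there is no label 0 to match.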

But if the duplicate columns hold different values, use iloc:

import pandas as pd

df = pd.DataFrame({"a" : [0,1,2], "b" : [1,2,3]})
df.columns = ['a','a']
print(df)
   a  a
0  0  1
1  1  2
2  2  3

df['sum'] = df.iloc[:,0] + df.iloc[:,1]
print(df)
   a  a  sum
0  0  1    1
1  1  2    3
2  2  3    5

Which is the same as:

df['sum'] = df.a.apply(lambda x: x.iloc[0] + x.iloc[1], axis=1)
print(df)
   a  a  sum
0  0  1    1
1  1  2    3
2  2  3    5
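
As a quick sanity check, the two approaches can be compared side by side. This is a sketch under the same setup as above; the names vectorized and row_wise are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 2], "b": [1, 2, 3]})
df.columns = ["a", "a"]  # both columns now share the label "a"

# Column-wise: pick each column by position and add them.
vectorized = df.iloc[:, 0] + df.iloc[:, 1]

# Row-wise: pick each value by position inside the row.
row_wise = df.apply(lambda x: x.iloc[0] + x.iloc[1], axis=1)

print(vectorized.tolist())  # [1, 3, 5]
print(row_wise.tolist())    # [1, 3, 5]
```

The column-wise form avoids calling a Python function per row, so it is usually the better choice when the real computation can be written that way.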


2016-06-11 19:32:47 jezrael

Another workaround, which the author showed on the GitHub page, was df[['a', 'a']].apply(lambda x: x.iloc[0] + x.iloc[1], axis=1) – Alex
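
Since the question says the real function is more complicated than a lambda, here is a rough sketch of the same positional workaround with a named function. The function combine is a hypothetical stand-in, not the asker's actual code.

```python
import pandas as pd

def combine(row):
    # Hypothetical stand-in for a more complicated row function.
    # Values are read by position, so duplicate labels do not matter.
    left, right = row.iloc[0], row.iloc[1]
    return left + right

df = pd.DataFrame({"a": [0, 1, 2], "b": [1, 2, 3]})
result = df[["a", "a"]].apply(combine, axis=1)
print(result.tolist())  # [0, 2, 4]
```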
