2017-07-05 4 views
1

TypeError: an integer is required in pd.merge (Python)

I am trying to combine two pandas DataFrames, shown below.

df_aviris

  0 1 2   3   4 
0   0.0 0.0 0.0 482636.5 4155009.5 
1   0.0 0.0 0.0 482637.5 4155009.5 
2   0.0 0.0 0.0 482638.5 4155009.5 
3   0.0 0.0 0.0 482639.5 4155009.5 
4   0.0 0.0 0.0 482640.5 4155009.5 
5   0.0 0.0 0.0 482641.5 4155009.5 
6   0.0 0.0 0.0 482642.5 4155009.5 
7   0.0 0.0 0.0 482643.5 4155009.5 
8   0.0 0.0 0.0 482644.5 4155009.5 
     ... ... ...  ...  ... 

16730996 0.0 0.0 0.0 485932.5 4149940.5 
16730997 0.0 0.0 0.0 485933.5 4149940.5 
16730998 0.0 0.0 0.0 485934.5 4149940.5 
16730999 0.0 0.0 0.0 485935.5 4149940.5 
[16731000 rows x 5 columns] 

df_geomap

   0  1  2   x   y 
0   255.0 255.0 255.0 477642.5 4158373.5 
1   255.0 255.0 255.0 477643.5 4158373.5 
2   255.0 255.0 255.0 477644.5 4158373.5 
3   255.0 255.0 255.0 477645.5 4158373.5 
4   255.0 255.0 255.0 477646.5 4158373.5 
5   255.0 255.0 255.0 477647.5 4158373.5 
6   255.0 255.0 255.0 477648.5 4158373.5 
     ... ... ...  ...  ... 

79026747 255.0 255.0 255.0 487218.5 4150124.5 
79026748 255.0 255.0 255.0 487219.5 4150124.5 
79026749 255.0 255.0 255.0 487220.5 4150124.5 
[79026750 rows x 5 columns] 

I tried to merge the two on x and y:

DFinal = pd.merge(df_aviris,df_geomap,how='outer',on=['x','y'],left_index=False,right_index=False,copy=False) 

and also with concat:

DFinal = pd.concat([df_aviris.set_index(['x','y']),df_geomap.set_index(['x','y'])],join='inner',axis=1) 

but I always get the error below:

Traceback (most recent call last): 
    File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:4279) 
    File "pandas/src/hashtable_class_helper.pxi", line 404, in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:8543) 
TypeError: an integer is required 
During handling of the above exception, another exception occurred: 
Traceback (most recent call last): 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/indexes/base.py", line 2134, in get_loc 
    return self._engine.get_loc(key) 
    File "pandas/index.pyx", line 132, in pandas.index.IndexEngine.get_loc (pandas/index.c:4433) 
    File "pandas/index.pyx", line 156, in pandas.index.IndexEngine.get_loc (pandas/index.c:4363) 
KeyError: 'x' 
During handling of the above exception, another exception occurred: 
Traceback (most recent call last): 
    File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:4279) 
    File "pandas/src/hashtable_class_helper.pxi", line 404, in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:8543) 
TypeError: an integer is required 
During handling of the above exception, another exception occurred: 
Traceback (most recent call last): 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code 
    exec(code_obj, self.user_global_ns, self.user_ns) 
    File "<ipython-input-38-0a4bfba1b1f4>", line 1, in <module> 
    DFinal = pd.concat([df_aviris.set_index(['x','y']),df_geomap.set_index(['x','y'])],join='inner',axis=1) 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py", line 2917, in set_index 
    level = frame[col]._values 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py", line 2059, in __getitem__ 
    return self._getitem_column(key) 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py", line 2066, in _getitem_column 
    return self._get_item_cache(key) 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/generic.py", line 1386, in _get_item_cache 
    values = self._data.get(item) 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/internals.py", line 3543, in get 
    loc = self.items.get_loc(item) 
    File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/indexes/base.py", line 2136, in get_loc 
    return self._engine.get_loc(self._maybe_cast_indexer(key)) 
    File "pandas/index.pyx", line 132, in pandas.index.IndexEngine.get_loc (pandas/index.c:4433) 
    File "pandas/index.pyx", line 156, in pandas.index.IndexEngine.get_loc (pandas/index.c:4363) 
KeyError: 'x' 

I am using Python 3.6.1.

+1

Your df_aviris DataFrame does not appear to have a column named 'x' or 'y', so you need to use left_on and right_on to join on differently named columns. Alternatively, rename the columns in df_aviris to match what you expect. –
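The renaming route mentioned in the comment could look roughly like the sketch below. The two small frames are toy stand-ins built from a few rows of the question's data, and `df_aviris_xy` and `merged` are names introduced here for illustration only.

```python
import pandas as pd

# Toy stand-ins for the frames in the question (a handful of illustrative rows).
df_aviris = pd.DataFrame(
    [[0.0, 0.0, 0.0, 482636.5, 4155009.5],
     [0.0, 0.0, 0.0, 482637.5, 4155009.5]]
)  # no column names given, so the labels are the integers 0..4
df_geomap = pd.DataFrame(
    [[255.0, 255.0, 255.0, 482636.5, 4155009.5],
     [255.0, 255.0, 255.0, 477643.5, 4158373.5]],
    columns=[0, 1, 2, "x", "y"],
)

# Check which labels actually exist before merging.
print(df_aviris.columns.tolist())
print(df_geomap.columns.tolist())

# Rename the coordinate columns so both frames share the key names.
df_aviris_xy = df_aviris.rename(columns={3: "x", 4: "y"})

# With matching names, on=['x', 'y'] no longer raises KeyError: 'x'.
merged = pd.merge(df_aviris_xy, df_geomap, on=["x", "y"], how="outer")
print(merged)
```

Printing the column labels first makes the cause of `KeyError: 'x'` visible before any merge is attempted.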

Answer

1

The problem is that df_aviris has no x and y columns.

So for an outer join you need:

DFinal = pd.merge(df_aviris,df_geomap,how='outer',left_on=[3,4], right_on=['x','y']) 

# outer join is the default for concat, so join='outer' can be omitted 
DFinal = pd.concat([df_aviris.set_index([3,4]), 
        df_geomap.set_index(['x','y'])],axis=1).reset_index() 

and for an inner join:

# inner join is the default for merge, so how='inner' can be omitted 
DFinal = pd.merge(df_aviris,df_geomap,left_on=[3,4], right_on=['x','y']) 

DFinal = pd.concat([df_aviris.set_index([3,4]), 
        df_geomap.set_index(['x','y'])],join='inner',axis=1).reset_index() 
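To see how the outer and inner variants differ, here is a minimal sketch on toy data. The frames are small stand-ins for the ones in the question, and `indicator=True` is added here only to show where each row came from; it is not part of the original calls.

```python
import pandas as pd

# Toy stand-ins: one coordinate pair is shared, the others are unique to each frame.
df_aviris = pd.DataFrame(
    [[0.0, 0.0, 0.0, 482636.5, 4155009.5],
     [0.0, 0.0, 0.0, 482637.5, 4155009.5]]
)  # integer column labels 0..4
df_geomap = pd.DataFrame(
    [[255.0, 255.0, 255.0, 482636.5, 4155009.5],
     [255.0, 255.0, 255.0, 477643.5, 4158373.5]],
    columns=[0, 1, 2, "x", "y"],
)

# Outer join: keeps every coordinate pair from both frames.
outer = pd.merge(df_aviris, df_geomap, how="outer",
                 left_on=[3, 4], right_on=["x", "y"], indicator=True)

# Inner join (the default): keeps only coordinate pairs present in both frames.
inner = pd.merge(df_aviris, df_geomap,
                 left_on=[3, 4], right_on=["x", "y"])

print(outer["_merge"].value_counts())
print(len(outer), len(inner))
```

On frames as large as those in the question, an outer join can produce a very large result, so the inner variant may be the more practical choice when only overlapping pixels are needed.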

EDIT:

I cannot reproduce:

TypeError: an integer is required

Maybe upgrading pandas helps.

Or, if the floats have only one digit after the decimal point, a small hack is possible: multiply by 10, convert to int, and divide by 10 again after the merge:

df_aviris1 = df_aviris.mul(10).astype(int) 
df_geomap1 = df_geomap.mul(10).astype(int) 

# choose whichever merge method you need 
DFinal = pd.merge(df_aviris1,df_geomap1,how='outer',left_on=[3,4], right_on=['x','y']) 

DFinal = DFinal.div(10) 
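A variation on this hack, sketched under the assumption that only the coordinate columns need integer keys: the scaled values go into temporary key columns (`kx` and `ky`, hypothetical names introduced here), so the other columns are left untouched and no division is needed afterwards. The `round()` call is an extra guard against floating-point representation error and is not part of the original answer.

```python
import pandas as pd

# Toy stand-ins for the frames in the question.
df_aviris = pd.DataFrame(
    [[0.0, 0.0, 0.0, 482636.5, 4155009.5],
     [0.0, 0.0, 0.0, 482637.5, 4155009.5]]
)  # integer column labels 0..4
df_geomap = pd.DataFrame(
    [[255.0, 255.0, 255.0, 482636.5, 4155009.5],
     [255.0, 255.0, 255.0, 477643.5, 4158373.5]],
    columns=[0, 1, 2, "x", "y"],
)

# Build integer join keys: scale by 10, round, cast to int64.
left = df_aviris.assign(
    kx=(df_aviris[3] * 10).round().astype("int64"),
    ky=(df_aviris[4] * 10).round().astype("int64"),
)
right = df_geomap.assign(
    kx=(df_geomap["x"] * 10).round().astype("int64"),
    ky=(df_geomap["y"] * 10).round().astype("int64"),
)

# Merge on the integer keys, then drop them; the original float
# coordinates (columns 3, 4, 'x', 'y') are still present and unchanged.
merged = pd.merge(left, right, on=["kx", "ky"], how="inner").drop(columns=["kx", "ky"])
print(merged)
```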
+0

Thanks @jezrael – bibinwilson

+0

Glad I could help! I'm a bit curious: was it necessary to multiply by 10 and convert to int? – jezrael

+1

No, DFinal = pd.merge(df_aviris, df_geomap, left_on=[3,4], right_on=['x','y']) worked by itself – bibinwilson

1

Convert the types to integer with astype(int) as follows:

DFinal = pd.merge(df_aviris.astype(int),df_geomap.astype(int),how='outer',on=['x','y'],left_index=False,right_index=False,copy=False) 