2017-09-01 2 views
0

Python pandas: turning a multi-level column index into column values

I am new to Python and am experimenting with pandas at the moment. I know there are plenty of answers about transforming DataFrames with indexes and columns. However, I have not yet managed to find one that helps me with the following problem. I have a DataFrame that looks like this:

style     VALUE  GROWTH  QUALITY 
factor   EarningsYield OPER_MARGIN RETURN_COM_EQY GEARING 
VEDG LX Equity    NaN 18.604873   NaN 1.04020 
DPW DU Equity   0.057845 36.001430  10.957723 0.438649e 

What I would like to achieve is something like this:

ticker   style   factor  value 
VEDG LX Equity VALUE EarningsYield  NaN 
VEDG LX Equity GROWTH OPER_MARGIN 18.604873 
VEDG LX Equity QUALITY RETURN_COM_EQY  NaN 
VEDG LX Equity QUALITY  GEARING 1.04020 
DPW DU Equity VALUE EarningsYield 0.057845 
DPW DU Equity GROWTH OPER_MARGIN 36.001430 
DPW DU Equity QUALITY RETURN_COM_EQY 10.957723 
DPW DU Equity QUALITY  GEARING 0.438649e 

Any help would be more than appreciated. Note that the ticker column is currently the index.

Thanks a lot in advance, guys.

Cheers

Thanks for your answers. What I have tried so far is the following:

I imported an Excel file and then did this:

import numpy as np

# Load the xls file's Sheet1 as a DataFrame
# (xls_file is defined earlier, presumably a pandas ExcelFile)
df = xls_file.parse('Sheet1')
# Two-level column labels: style on top, factor below
style_names = np.array(['VALUE', 'GROWTH', 'QUALITY', 'QUALITY'])
factor_names = np.array(['EarningsYield', 'OPER_MARGIN', 'RETURN_COM_EQY', 'GEARING'])
df.columns = [style_names, factor_names]
df.columns.names = ['style', 'factor']
print(df)

style     VALUE  GROWTH  QUALITY   
factor   EarningsYield OPER_MARGIN RETURN_COM_EQY GEARING 
VEDG LX Equity   NaN 18.604873   NaN 1.040200 
DPW DU Equity  0.057946 36.001430  10.957723 0.438649 
SVST LI Equity   NaN 25.405680  41.356272 0.306917 
STM IM Equity  0.016426  3.068980  7.371885 0.227296 
NYR BB Equity  -0.334866 -5.-32.771536 0.509514 
MDC LN Equity  0.000400 13.168425   NaN 0.324293 
TIT IM Equity  0.110168 19.563732  6.842755 0.574045 
OCI NA Equity  -0.002449 15.971676  12.469365 0.751047 
BESI NA Equity  0.031403 20.024775  33.685981 0.263089 
IMPN SW Equity  0.041195  2.808368  6.870435 0.390823 
MHG NO Equity  0.009682 26.454333  29.083558 0.324450 
IAG LN Equity  0.001430 11.450348  42.105263 0.586250 
DG FP Equity   0.057341 10.504060  16.673043 0.496945 

z = zscore(df) 
z[z>3]=3 
z[z<-3]=-3 

print(z.unstack()) 

style factor    
index     0  VEDG LX Equity 
         1  DPW DU Equity 
         2  SVST LI Equity 
         3  STM IM Equity 
         4  NYR BB Equity 
         5  MDC LN Equity 
         6  TIT IM Equity 
         7  OCI NA Equity 
         8  BESI NA Equity 
         9  IMPN SW Equity 
         10  MHG NO Equity 
         11  IAG LN Equity 
         12  DG FP Equity 
VALUE EarningsYield 0    NaN 
         1   0.509544 
         2    NaN 
         3   0.150815 
         4   -2.88433 
         5   0.
         6   0.960737 
         7   -0.0122652 
         8   0.280218 
         9   0.364819 
         10   0.0925452 
         11   0.0212482 
         12   0.504316 
GROWTH OPER_MARGIN  0   0.305012 
         1   1.87814 
         2   0.919993 
         3   -1.09986 
            ...  
         9   -1.12343 
         10   1.01482 
         11   -0.341955 
         12   -0.427525 
QUALITY RETURN_COM_EQY 0    NaN 
         1   -0.232745 
         2   1.20556 
         3   -0.402409 
         4   -2.30179 
         5    NaN 
         6   -0.427444 
         7   -0.161222 
         8   0.842639 
         9   -0.426135 
         10   0.624876 
         11    1.241 
         12   0.0376744 
     GEARING   0   2.49189 
         1   -0.18156 
         2   -0.76701 
         3   -1.12087 
         4   0.133384 
         5   -0.689786 
         6   0.420179 
         7   1.20682 
         8   -0.961795 
         9   -0.39411 
         10   -0.68909 
         11   0.474417 
         12   0.0775232 

So while this looks closer to what I need, it does not quite look like what Luce got when he did it.

Any suggestions as to why that is?

Thanks, Gerrit

UPDATE:

I think I have got it:

df = xls_file.parse('Sheet1')
style_names = np.array(['VALUE', 'GROWTH', 'QUALITY', 'QUALITY'])
factor_names = np.array(['EarningsYield', 'OPER_MARGIN', 'RETURN_COM_EQY', 'GEARING'])

df.columns = [style_names, factor_names]
df.columns.names = ['style', 'factor']

print(df)

z = zscore(df) 
z[z>3]=3 
z[z<-3]=-3 

print(z) 
print(z.unstack()) 

zu = z.unstack() 
zur = zu.reset_index() 

print(zur) 

This now gives:

style   factor   level_2   0 
0  VALUE EarningsYield VEDG LX Equity  NaN 
1  VALUE EarningsYield DPW DU Equity 0.509544 
2  VALUE EarningsYield SVST LI Equity  NaN 
3  VALUE EarningsYield STM IM Equity 0.150815 
4  VALUE EarningsYield NYR BB Equity -2.884326 
5  VALUE EarningsYield MDC LN Equity 0.
6  VALUE EarningsYield TIT IM Equity 0.960737 
7  VALUE EarningsYield OCI NA Equity -0.012265 
8  VALUE EarningsYield BESI NA Equity 0.280218 
9  VALUE EarningsYield IMPN SW Equity 0.364819 
10 VALUE EarningsYield MHG NO Equity 0.092545 
11 VALUE EarningsYield IAG LN Equity 0.021248 
12 VALUE EarningsYield  DG FP Equity 0.504316 
13 GROWTH OPER_MARGIN VEDG LX Equity 0.305012 
14 GROWTH OPER_MARGIN DPW DU Equity 1.878142 
15 GROWTH OPER_MARGIN SVST LI Equity 0.919993 
16 GROWTH OPER_MARGIN STM IM Equity -1.099862 
17 GROWTH OPER_MARGIN NYR BB Equity -1.830634 
18 GROWTH OPER_MARGIN MDC LN Equity -0.186593 
19 GROWTH OPER_MARGIN TIT IM Equity 0.391720 
20 GROWTH OPER_MARGIN OCI NA Equity 0.066898 
21 GROWTH OPER_MARGIN BESI NA Equity 0.433411 
22 GROWTH OPER_MARGIN IMPN SW Equity -1.123429 
23 GROWTH OPER_MARGIN MHG NO Equity 1.014821 
24 GROWTH OPER_MARGIN IAG LN Equity -0.341955 
25 GROWTH OPER_MARGIN  DG FP Equity -0.427525 
26 QUALITY RETURN_COM_EQY VEDG LX Equity  NaN 
27 QUALITY RETURN_COM_EQY DPW DU Equity -0.232745 
28 QUALITY RETURN_COM_EQY SVST LI Equity 1.205558 
29 QUALITY RETURN_COM_EQY STM IM Equity -0.402409 
30 QUALITY RETURN_COM_EQY NYR BB Equity -2.301789 
31 QUALITY RETURN_COM_EQY MDC LN Equity  NaN 
32 QUALITY RETURN_COM_EQY TIT IM Equity -0.427444 
33 QUALITY RETURN_COM_EQY OCI NA Equity -0.161222 
34 QUALITY RETURN_COM_EQY BESI NA Equity 0.842639 
35 QUALITY RETURN_COM_EQY IMPN SW Equity -0.426135 
36 QUALITY RETURN_COM_EQY MHG NO Equity 0.624876 
37 QUALITY RETURN_COM_EQY IAG LN Equity 1.240996 
38 QUALITY RETURN_COM_EQY DG FP Equity 0.037674 
39 QUALITY   GEARING VEDG LX Equity 2.491893 
40 QUALITY   GEARING DPW DU Equity -0.181560 
41 QUALITY   GEARING SVST LI Equity -0.767010 
42 QUALITY   GEARING STM IM Equity -1.120867 
43 QUALITY   GEARING NYR BB Equity 0.133384 
44 QUALITY   GEARING MDC LN Equity -0.689786 
45 QUALITY   GEARING TIT IM Equity 0.420179 
46 QUALITY   GEARING OCI NA Equity 1.206820 
47 QUALITY   GEARING BESI NA Equity -0.961795 
48 QUALITY   GEARING IMPN SW Equity -0.394110 
49 QUALITY   GEARING MHG NO Equity -0.689090 
50 QUALITY   GEARING IAG LN Equity 0.474417 
51 QUALITY   GEARING DG FP Equity 0.077523 
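
The generic `level_2` and `0` column names in the output above come from the unnamed ticker index and the unnamed Series. A minimal sketch of one way to give them descriptive names, assuming a small hypothetical stand-in for `z` built from the first two tickers (values copied from the output above; the names `ticker` and `value` are illustrative choices, not something the original code defines):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the z-scored frame `z`, limited to two tickers.
columns = pd.MultiIndex.from_arrays(
    [
        ["VALUE", "GROWTH", "QUALITY", "QUALITY"],
        ["EarningsYield", "OPER_MARGIN", "RETURN_COM_EQY", "GEARING"],
    ],
    names=["style", "factor"],
)
z = pd.DataFrame(
    [
        [np.nan, 0.305012, np.nan, 2.491893],
        [0.509544, 1.878142, -0.232745, -0.181560],
    ],
    index=["VEDG LX Equity", "DPW DU Equity"],
    columns=columns,
)

zur = (
    z.unstack()                               # Series indexed by (style, factor, <unnamed ticker level>)
     .reset_index(name="value")               # name the value column instead of the default 0
     .rename(columns={"level_2": "ticker"})   # the unnamed index level shows up as "level_2"
)
print(zur)
```

The rows stay in the same column-by-column order as before; only the two column labels change.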

Thanks for pointing me in the right direction!

Much appreciated.

What have you tried so far? Please show some effort/code.

Answer

0

It looks like you are trying to transpose a matrix. Here is one way to do that with pandas:

import pandas as pd 
from io import StringIO 
csv_data = '''A,B,C,D 
1.0, 2.0, 3.0, 4.0 
5.0, 6.0, 7.0, 8.0 
9.0, 10.0, 11.0, 12.0''' 
df = pd.read_csv(StringIO(csv_data)) 
df.transpose() 
0

unstack can be used to turn wide format into long format (the opposite function is pivot).

>> import pandas as pd 
>> from io import StringIO 

>> csv = StringIO(u'''style VALUE GROWTH QUALITY QUALITY 
factor EarningsYield OPER_MARGIN RETURN_COM_EQY GEARING 
VEDG LX Equity NaN 18.604873 NaN 1.04020 
DPW DU Equity 0.057845 36.001430 10.957723 0.438649e''') 

>> df = pd.read_csv(csv, sep='\t', header=[0, 1], index_col=0) 
>> df 
style     VALUE  GROWTH  QUALITY   
factor   EarningsYield OPER_MARGIN RETURN_COM_EQY GEARING 
VEDG LX Equity   NaN 18.604873   NaN 1.04020 
DPW DU Equity  0.057845 36.001430  10.957723 0.438649e 

>> df.unstack() 
style factor       
VALUE EarningsYield VEDG LX Equity   NaN 
         DPW DU Equity  0.057845 
GROWTH OPER_MARGIN  VEDG LX Equity  18.6049 
         DPW DU Equity  36.0014 
QUALITY RETURN_COM_EQY VEDG LX Equity   NaN 
         DPW DU Equity  10.9577 
     GEARING   VEDG LX Equity  1.04020 
         DPW DU Equity  0.438649e 

To get your exact format, you will need to play around a bit.
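
As a rough sketch of what that playing around could look like: the snippet below names the index `ticker`, unstacks, and then reorders the rows so that each ticker's factors appear together, as in the layout requested in the question. It assumes the frame is built directly rather than read from text, treats the trailing `e` in `0.438649e` as a stray character, and relies on `sort_values(key=...)`, which needs pandas 1.1 or later. The variable names `long` and `order` are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the two-row example frame with two column levels.
columns = pd.MultiIndex.from_arrays(
    [
        ["VALUE", "GROWTH", "QUALITY", "QUALITY"],
        ["EarningsYield", "OPER_MARGIN", "RETURN_COM_EQY", "GEARING"],
    ],
    names=["style", "factor"],
)
df = pd.DataFrame(
    [
        [np.nan, 18.604873, np.nan, 1.04020],
        [0.057845, 36.001430, 10.957723, 0.438649],  # assumption: "0.438649e" is a typo
    ],
    index=["VEDG LX Equity", "DPW DU Equity"],
    columns=columns,
)

# Wide to long: one row per (ticker, style, factor) combination, NaNs kept.
long = (
    df.rename_axis("ticker")   # name the index so reset_index yields a "ticker" column
      .unstack()               # Series indexed by (style, factor, ticker)
      .rename("value")
      .reset_index()
      [["ticker", "style", "factor", "value"]]
)

# unstack groups rows by column; a stable sort on the original ticker position
# groups them by ticker instead while keeping the factor order within each ticker.
order = {ticker: pos for pos, ticker in enumerate(df.index)}
long = (
    long.sort_values("ticker", key=lambda s: s.map(order), kind="mergesort")
        .reset_index(drop=True)
)
print(long)
```

The stable sort is only there to reproduce the row order from the question; if the order does not matter, everything after the column selection can be dropped.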
