Try this:
In [71]: import pandas_datareader.data as web
In [110]: df = web.DataReader('SBIN.NS', 'yahoo', '2014-10-21', '2014-11-25')
In [111]: df
Out[111]:
Open High Low Close Volume Adj Close
Date
2014-10-21 2580.0000 2607.0001 2569.5999 2584.1501 15022300 251.8850
2014-10-22 2608.9999 2613.5999 2565.1001 2575.2499 14511100 251.0175
2014-10-23 2591.4001 2593.7000 2573.9999 2578.5501 2376200 251.3392
2014-10-24 2578.5501 2578.5501 2578.5501 2578.5501 0 251.3392
2014-10-27 2592.0001 2619.8999 2581.0001 2597.8000 13429500 253.2155
2014-10-28 2607.9999 2664.2999 2606.0001 2656.7499 22963400 258.9616
2014-10-29 2677.0001 2678.9999 2631.0001 2643.7500 17372900 257.6944
2014-10-30 2649.8999 2653.0499 2622.0001 2637.9999 15544200 257.1339
2014-10-31 265.2000 270.9800 264.6000 270.2800 20770200 26.3450 # <bad_data>
2014-11-03 270.6000 274.3500 269.4250 272.3450 17780600 26.5463
2014-11-04 272.3450 272.3450 272.3450 272.3450 0 26.5463
2014-11-05 273.3000 279.9800 272.4050 278.1900 26605100 27.1160
2014-11-06 278.1900 278.1900 278.1900 278.1900 0 27.1160
2014-11-07 277.5000 278.1000 273.0000 274.2500 18163000 26.7320
2014-11-10 275.9000 276.9000 273.3000 273.9500 12068800 26.7027
2014-11-11 274.7900 276.2500 270.5000 274.0350 17405900 26.7110
2014-11-12 275.3000 277.1500 273.5550 274.6050 16233200 26.7666
2014-11-13 275.6100 276.2250 269.5000 271.9300 16859000 26.5059
2014-11-14 273.0000 280.6900 272.0000 278.7850 50846600 27.1740
2014-11-17 279.4000 295.1300 279.2200 294.0600 49164100 28.6629
2014-11-18 295.6950 297.9000 292.4100 294.5750 32898300 28.7131
2014-11-19 294.9000 296.8000 290.3550 291.0500 20735900 28.3695 # </bad_data>
2014-11-20 294.7500 298.7500 291.2500 297.1000 18099500 289.5925
2014-11-21 299.9000 307.0000 297.2500 305.5000 21009200 297.7802
2014-11-24 307.8000 309.8500 306.0500 308.8500 18631400 301.0456
2014-11-25 309.9000 309.9500 301.0000 304.4500 26776600 296.7568
NOTE: the Adj Close column recovers starting from 2014-11-20, while the other columns do not, so I'll concentrate only on Adj Close.
First, let's find the outliers (I'm checking for values that changed by 50% or more from the previous day; you can change this threshold):
In [112]: bad_idx = df.index[df['Adj Close'].pct_change().abs().ge(0.5)]
In [113]: bad_idx
Out[113]: DatetimeIndex(['2014-10-31', '2014-11-20'], dtype='datetime64[ns]', name='Date', freq=None)
In [114]: df.loc[(df.index >= bad_idx.min()) & (df.index < bad_idx.max()), 'Adj Close'] *= 10
In [115]: df
Out[115]:
Open High Low Close Volume Adj Close
Date
2014-10-21 2580.0000 2607.0001 2569.5999 2584.1501 15022300 251.8850
2014-10-22 2608.9999 2613.5999 2565.1001 2575.2499 14511100 251.0175
2014-10-23 2591.4001 2593.7000 2573.9999 2578.5501 2376200 251.3392
2014-10-24 2578.5501 2578.5501 2578.5501 2578.5501 0 251.3392
2014-10-27 2592.0001 2619.8999 2581.0001 2597.8000 13429500 253.2155
2014-10-28 2607.9999 2664.2999 2606.0001 2656.7499 22963400 258.9616
2014-10-29 2677.0001 2678.9999 2631.0001 2643.7500 17372900 257.6944
2014-10-30 2649.8999 2653.0499 2622.0001 2637.9999 15544200 257.1339
2014-10-31 265.2000 270.9800 264.6000 270.2800 20770200 263.4500
2014-11-03 270.6000 274.3500 269.4250 272.3450 17780600 265.4630
2014-11-04 272.3450 272.3450 272.3450 272.3450 0 265.4630
2014-11-05 273.3000 279.9800 272.4050 278.1900 26605100 271.1600
2014-11-06 278.1900 278.1900 278.1900 278.1900 0 271.1600
2014-11-07 277.5000 278.1000 273.0000 274.2500 18163000 267.3200
2014-11-10 275.9000 276.9000 273.3000 273.9500 12068800 267.0270
2014-11-11 274.7900 276.2500 270.5000 274.0350 17405900 267.1100
2014-11-12 275.3000 277.1500 273.5550 274.6050 16233200 267.6660
2014-11-13 275.6100 276.2250 269.5000 271.9300 16859000 265.0590
2014-11-14 273.0000 280.6900 272.0000 278.7850 50846600 271.7400
2014-11-17 279.4000 295.1300 279.2200 294.0600 49164100 286.6290
2014-11-18 295.6950 297.9000 292.4100 294.5750 32898300 287.1310
2014-11-19 294.9000 296.8000 290.3550 291.0500 20735900 283.6950
2014-11-20 294.7500 298.7500 291.2500 297.1000 18099500 289.5925
2014-11-21 299.9000 307.0000 297.2500 305.5000 21009200 297.7802
2014-11-24 307.8000 309.8500 306.0500 308.8500 18631400 301.0456
2014-11-25 309.9000 309.9500 301.0000 304.4500 26776600 296.7568
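The detect-and-rescale steps above can be wrapped in a small helper. This is only a minimal sketch, not part of the original answer: the function name `rescale_bad_span`, its parameters, and the shortened sample series (a few Adj Close values taken from the output above) are assumptions for illustration, and the factor of 10 is simply what this particular data set seems to need.

```python
import pandas as pd


def rescale_bad_span(s, threshold=0.5, factor=10.0):
    """Hypothetical helper: rescale the span between the first and the last
    large day-over-day jump (first jump included, last jump excluded)."""
    # Dates whose value changed by at least `threshold` versus the previous row
    jumps = s.index[s.pct_change().abs().ge(threshold)]
    if len(jumps) < 2:
        # Nothing to repair unless there is both a drop and a recovery
        return s.copy()
    mask = (s.index >= jumps.min()) & (s.index < jumps.max())
    out = s.astype(float).copy()
    out.loc[mask] *= factor
    return out


# A shortened sample built from Adj Close values shown above
idx = pd.to_datetime(['2014-10-29', '2014-10-30', '2014-10-31', '2014-11-03',
                      '2014-11-19', '2014-11-20', '2014-11-21'])
adj = pd.Series([257.6944, 257.1339, 26.3450, 26.5463,
                 28.3695, 289.5925, 297.7802], index=idx, name='Adj Close')

fixed = rescale_bad_span(adj)
print(fixed)
```

Rerunning the `pct_change()` check on the result is a cheap way to confirm that no jump above the threshold is left.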
Here is another solution, which uses interpolation:
In [118]: import numpy as np
In [119]: df.loc[(df.index >= bad_idx.min()) & (df.index < bad_idx.max()), 'Adj Close'] = np.nan
In [120]: df
Out[120]:
Open High Low Close Volume Adj Close
Date
2014-10-21 2580.0000 2607.0001 2569.5999 2584.1501 15022300 251.8850
2014-10-22 2608.9999 2613.5999 2565.1001 2575.2499 14511100 251.0175
2014-10-23 2591.4001 2593.7000 2573.9999 2578.5501 2376200 251.3392
2014-10-24 2578.5501 2578.5501 2578.5501 2578.5501 0 251.3392
2014-10-27 2592.0001 2619.8999 2581.0001 2597.8000 13429500 253.2155
2014-10-28 2607.9999 2664.2999 2606.0001 2656.7499 22963400 258.9616
2014-10-29 2677.0001 2678.9999 2631.0001 2643.7500 17372900 257.6944
2014-10-30 2649.8999 2653.0499 2622.0001 2637.9999 15544200 257.1339
2014-10-31 265.2000 270.9800 264.6000 270.2800 20770200 NaN
2014-11-03 270.6000 274.3500 269.4250 272.3450 17780600 NaN
2014-11-04 272.3450 272.3450 272.3450 272.3450 0 NaN
2014-11-05 273.3000 279.9800 272.4050 278.1900 26605100 NaN
2014-11-06 278.1900 278.1900 278.1900 278.1900 0 NaN
2014-11-07 277.5000 278.1000 273.0000 274.2500 18163000 NaN
2014-11-10 275.9000 276.9000 273.3000 273.9500 12068800 NaN
2014-11-11 274.7900 276.2500 270.5000 274.0350 17405900 NaN
2014-11-12 275.3000 277.1500 273.5550 274.6050 16233200 NaN
2014-11-13 275.6100 276.2250 269.5000 271.9300 16859000 NaN
2014-11-14 273.0000 280.6900 272.0000 278.7850 50846600 NaN
2014-11-17 279.4000 295.1300 279.2200 294.0600 49164100 NaN
2014-11-18 295.6950 297.9000 292.4100 294.5750 32898300 NaN
2014-11-19 294.9000 296.8000 290.3550 291.0500 20735900 NaN
2014-11-20 294.7500 298.7500 291.2500 297.1000 18099500 289.5925
2014-11-21 299.9000 307.0000 297.2500 305.5000 21009200 297.7802
2014-11-24 307.8000 309.8500 306.0500 308.8500 18631400 301.0456
2014-11-25 309.9000 309.9500 301.0000 304.4500 26776600 296.7568
In [122]: df['Adj Close'] = df['Adj Close'].interpolate()
In [123]: df
Out[123]:
Open High Low Close Volume Adj Close
Date
2014-10-21 2580.0000 2607.0001 2569.5999 2584.1501 15022300 251.885000
2014-10-22 2608.9999 2613.5999 2565.1001 2575.2499 14511100 251.017500
2014-10-23 2591.4001 2593.7000 2573.9999 2578.5501 2376200 251.339200
2014-10-24 2578.5501 2578.5501 2578.5501 2578.5501 0 251.339200
2014-10-27 2592.0001 2619.8999 2581.0001 2597.8000 13429500 253.215500
2014-10-28 2607.9999 2664.2999 2606.0001 2656.7499 22963400 258.961600
2014-10-29 2677.0001 2678.9999 2631.0001 2643.7500 17372900 257.694400
2014-10-30 2649.8999 2653.0499 2622.0001 2637.9999 15544200 257.133900
2014-10-31 265.2000 270.9800 264.6000 270.2800 20770200 259.297807
2014-11-03 270.6000 274.3500 269.4250 272.3450 17780600 261.461713
2014-11-04 272.3450 272.3450 272.3450 272.3450 0 263.625620
2014-11-05 273.3000 279.9800 272.4050 278.1900 26605100 265.789527
2014-11-06 278.1900 278.1900 278.1900 278.1900 0 267.953433
2014-11-07 277.5000 278.1000 273.0000 274.2500 18163000 270.117340
2014-11-10 275.9000 276.9000 273.3000 273.9500 12068800 272.281247
2014-11-11 274.7900 276.2500 270.5000 274.0350 17405900 274.445153
2014-11-12 275.3000 277.1500 273.5550 274.6050 16233200 276.609060
2014-11-13 275.6100 276.2250 269.5000 271.9300 16859000 278.772967
2014-11-14 273.0000 280.6900 272.0000 278.7850 50846600 280.936873
2014-11-17 279.4000 295.1300 279.2200 294.0600 49164100 283.100780
2014-11-18 295.6950 297.9000 292.4100 294.5750 32898300 285.264687
2014-11-19 294.9000 296.8000 290.3550 291.0500 20735900 287.428593
2014-11-20 294.7500 298.7500 291.2500 297.1000 18099500 289.592500
2014-11-21 299.9000 307.0000 297.2500 305.5000 21009200 297.780200
2014-11-24 307.8000 309.8500 306.0500 308.8500 18631400 301.045600
2014-11-25 309.9000 309.9500 301.0000 304.4500 26776600 296.756800
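One detail of the interpolation approach is worth spelling out: `interpolate()` defaults to `method='linear'`, which treats the rows as equally spaced and ignores the dates, so weekends are not given extra weight (this matches the evenly stepped values in the output above). With a DatetimeIndex, `method='time'` weights by the actual gap between dates instead. The following is a minimal sketch on made-up values, not the real SBIN.NS data:

```python
import numpy as np
import pandas as pd

# Synthetic series (values invented for illustration) with a bad span
# that straddles a weekend: 2014-10-31 is a Friday, 2014-11-03 a Monday
idx = pd.to_datetime(['2014-10-30', '2014-10-31', '2014-11-03',
                      '2014-11-04', '2014-11-05'])
adj = pd.Series([257.0, 26.0, 26.5, 26.8, 261.0], index=idx)

# Same outlier test as above: 50% or more change versus the previous row
bad = adj.index[adj.pct_change().abs().ge(0.5)]

# Blank out the bad span, then fill it in two different ways
masked = adj.copy()
masked[(masked.index >= bad.min()) & (masked.index < bad.max())] = np.nan

linear = masked.interpolate()                # rows treated as equally spaced
by_time = masked.interpolate(method='time')  # weighted by calendar distance

print(pd.DataFrame({'linear': linear, 'time': by_time}))
```

Which variant is more appropriate depends on whether non-trading days should count as elapsed time. Either way the filled values are estimates, not real prices.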
How did you get this data? Have you tried using the 'pandas_datareader' module? – MaxU
What is that for? –
Is this the entire data set? It seems that the ratio is constant. – Itay