2017-01-31 1 views

How to split each row of a DataFrame that contains many batches and output a separate row for each batch

I am converting a CSV file into a DataFrame. I need to split each row of the DataFrame into several rows, depending on the number of batches present in a single row.

Here is the input; I am showing only 2 rows:

A002,R051,02-00-00,05-21-11,00:00:00,REGULAR,003169391,001097585,05-21-11,04:00:00,REGULAR,003169415,001097588,05-21-11,08:00:00,REGULAR,003169431,001097607,05-21-11,12:00:00,REGULAR,003169506,001097686,05-21-11,16:00:00,REGULAR,003169693,001097734,05-21-11,20:00:00,REGULAR,003169998,001097769,05-22-11,00:00:00,REGULAR,003170119,001097792,05-22-11,04:00:00,REGULAR,003170146,001097801    
A002,R051,02-00-00,05-22-11,08:00:00,REGULAR,003170164,001097820,05-22-11,12:00:00,REGULAR,003170240,001097867,05-22-11,16:00:00,REGULAR,003170388,001097912,05-22-11,20:00:00,REGULAR,003170611,001097941,05-23-11,00:00:00,REGULAR,003170695,001097964,05-23-11,04:00:00,REGULAR,003170701,001097964,05-23-11,08:00:00,REGULAR,003170746,001098069,05-23-11,12:00:00,REGULAR,003170897,001098378 

Does the CSV have a header row? – jezrael


What exactly are you trying to do? Can you provide the expected output? – AndreyF

Answer


Setup

import pandas as pd 
from io import StringIO 

txt = """A002,R051,02-00-00,05-21-11,00:00:00,REGULAR,003169391,001097585,05-21-11,04:00:00,REGULAR,003169415,001097588,05-21-11,08:00:00,REGULAR,003169431,001097607,05-21-11,12:00:00,REGULAR,003169506,001097686,05-21-11,16:00:00,REGULAR,003169693,001097734,05-21-11,20:00:00,REGULAR,003169998,001097769,05-22-11,00:00:00,REGULAR,003170119,001097792,05-22-11,04:00:00,REGULAR,003170146,001097801    
A002,R051,02-00-00,05-22-11,08:00:00,REGULAR,003170164,001097820,05-22-11,12:00:00,REGULAR,003170240,001097867,05-22-11,16:00:00,REGULAR,003170388,001097912,05-22-11,20:00:00,REGULAR,003170611,001097941,05-23-11,00:00:00,REGULAR,003170695,001097964,05-23-11,04:00:00,REGULAR,003170701,001097964,05-23-11,08:00:00,REGULAR,003170746,001098069,05-23-11,12:00:00,REGULAR,003170897,001098378 
""" 

df = pd.read_csv(StringIO(txt), header=None, index_col=[0, 1, 2]) 
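One side effect of the setup is worth noting: read_csv infers integer types for the two counter fields, so leading zeros are dropped (003169391 is read as 3169391). If the zeros matter, a minimal sketch, assuming it is acceptable to parse every field as a string via dtype=str, could look like this. The names sample, as_int and as_str are hypothetical and only used for illustration.

```python
import pandas as pd
from io import StringIO

# Hypothetical one-batch sample taken from the first input row.
sample = "A002,R051,02-00-00,05-21-11,00:00:00,REGULAR,003169391,001097585\n"

# Default type inference: the counters become integers, leading zeros are lost.
as_int = pd.read_csv(StringIO(sample), header=None, index_col=[0, 1, 2])

# dtype=str keeps every field exactly as written in the file.
as_str = pd.read_csv(StringIO(sample), header=None, index_col=[0, 1, 2], dtype=str)

print(as_int.iloc[0, 3])  # counter parsed as a number
print(as_str.iloc[0, 3])  # counter kept as text
```

With dtype=str the counters would need an explicit conversion before any arithmetic.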

pandas

Using % + // + stack

idx = pd.RangeIndex(len(df.columns)) 
df.columns = [idx % 5, idx // 5] 
df.stack().rename_axis([None] * 4) 

                             0         1        2        3        4
A002 R051 02-00-00 0  05-21-11  00:00:00  REGULAR  3169391  1097585
                   1  05-21-11  04:00:00  REGULAR  3169415  1097588
                   2  05-21-11  08:00:00  REGULAR  3169431  1097607
                   3  05-21-11  12:00:00  REGULAR  3169506  1097686
                   4  05-21-11  16:00:00  REGULAR  3169693  1097734
                   5  05-21-11  20:00:00  REGULAR  3169998  1097769
                   6  05-22-11  00:00:00  REGULAR  3170119  1097792
                   7  05-22-11  04:00:00  REGULAR  3170146  1097801
                   0  05-22-11  08:00:00  REGULAR  3170164  1097820
                   1  05-22-11  12:00:00  REGULAR  3170240  1097867
                   2  05-22-11  16:00:00  REGULAR  3170388  1097912
                   3  05-22-11  20:00:00  REGULAR  3170611  1097941
                   4  05-23-11  00:00:00  REGULAR  3170695  1097964
                   5  05-23-11  04:00:00  REGULAR  3170701  1097964
                   6  05-23-11  08:00:00  REGULAR  3170746  1098069
                   7  05-23-11  12:00:00  REGULAR  3170897  1098378
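To see why this works, it may help to look at the labels on a much smaller frame. The sketch below uses a hypothetical toy frame with one row and two batches of five fields; toy, field, batch and long_form are illustrative names, and the column MultiIndex is built explicitly with pd.MultiIndex.from_arrays rather than by assigning a list. idx % 5 gives each column its position inside a batch, idx // 5 gives the batch number, and stack() moves the batch level from the columns into the row index.

```python
import pandas as pd

# Hypothetical toy frame: 1 row holding 2 batches of 5 fields each.
toy = pd.DataFrame([["d1", "t1", "k1", 1, 2, "d2", "t2", "k2", 3, 4]])

idx = pd.RangeIndex(len(toy.columns))
field = idx % 5   # position inside a batch: 0, 1, 2, 3, 4, 0, 1, 2, 3, 4
batch = idx // 5  # batch number:            0, 0, 0, 0, 0, 1, 1, 1, 1, 1

# Two-level column labels: (field, batch).
toy.columns = pd.MultiIndex.from_arrays([field, batch])

# stack() pivots the last column level (batch) into the row index,
# leaving one row per batch and one column per field.
long_form = toy.stack()
print(long_form)
```

Each batch ends up on its own row, with the five field positions as columns.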

@jezrael done, thx – piRSquared
