Python json.loads to Pandas DataFrame

I have a URL that returns JSON data like the following:

{
    u'fields': [{
        u'keyField': False,
        u'name': u'_blockid',
        u'fieldType': u'long'
    }, {
        u'keyField': False,
        u'name': u'_collector',
        u'fieldType': u'string'
    }, {
        u'keyField': False,
        u'name': u'_collectorid',
        u'fieldType': u'long'
    }, {
        u'keyField': False,
        u'name': u'_messageid',
        u'fieldType': u'long'
    }],
    u'messages': [{
        u'map': {
            u'_messageid': u'-9223368783568280026',
            u'_collectorid': u'135927517',
            u'_blockid': u'-9223372036519990555',
            u'_collector': u'collector1',
        }
    }, {
        u'map': {
            u'_messageid': u'-92233645345280026',
            u'_collectorid': u'13545342517',
            u'_blockid': u'-92234254242343219990555',
            u'_collector': u'collector2',
        }
    }]
}

That is just a snippet. The real JSON contains thousands of values under ['messages']['map'].

I have a script that looks like this:

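import json
import numpy as np
import pandas as pd
import requests

# JsonURL, username and password are defined elsewhere
rJSON = requests.get(JsonURL, auth=(username, password))
DATA = json.loads(rJSON.text)
for x in DATA[u'messages']:
    print type(x[u'map'])
    for i in x[u'map']:
        print np.isscalar(x[u'map'][i])

    # This is the call that raises the ValueError shown below
    df = pd.DataFrame.from_dict(x[u'map'])
    break ### TESTING ###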

Running it produces the following output:

<type 'dict'> 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 
True 

--------------------------------------------------------------------------- 
ValueError        Traceback (most recent call last) 
<ipython-input-151-1b71c28d4d83> in <module>() 
    11  for i in x[u'map']: 
    12   print np.isscalar(q[i]) 
---> 13  df = pd.DataFrame.from_dict(x[u'map']) 
    14 
    15  #if isinstance(msgData, pd.DataFrame): # If the variable is a dataframe, append to it... 

C:\Users\USERID\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.pyc in from_dict(cls, data, orient, dtype) 
    849    raise ValueError('only recognize index or columns for orient') 
    850 
--> 851   return cls(data, index=index, columns=columns, dtype=dtype) 
    852 
    853  def to_dict(self, orient='dict'): 

C:\Users\USERID\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.pyc in __init__(self, data, index, columns, dtype, copy) 
    273         dtype=dtype, copy=copy) 
    274   elif isinstance(data, dict): 
--> 275    mgr = self._init_dict(data, index, columns, dtype=dtype) 
    276   elif isinstance(data, ma.MaskedArray): 
    277    import numpy.ma.mrecords as mrecords 

C:\Users\USERID\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.pyc in _init_dict(self, data, index, columns, dtype) 
    409    arrays = [data[k] for k in keys] 
    410 
--> 411   return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype) 
    412 
    413  def _init_ndarray(self, values, index, columns, dtype=None, copy=False): 

C:\Users\USERID\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.pyc in _arrays_to_mgr(arrays, arr_names, index, columns, dtype) 
    5494  # figure out the index, if necessary 
    5495  if index is None: 
-> 5496   index = extract_index(arrays) 
    5497  else: 
    5498   index = _ensure_index(index) 

C:\Users\USERID\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.pyc in extract_index(data) 
    5533 
    5534   if not indexes and not raw_lengths: 
-> 5535    raise ValueError('If using all scalar values, you must pass' 
    5536        ' an index') 
    5537 

ValueError: If using all scalar values, you must pass an index

I understand that it is complaining because the dictionary contains scalar values, but I can't figure out why json.loads() loads them into the dictionary as scalars, or how to convert them from scalars to strings.

My end goal is to take all of the ['messages']['map'] data and pd.concat it in the loop into one large dataframe that I can analyze.

Is it possible to prevent json.loads from loading them as scalars? Or is there a way to convert them from scalars into something else that can be loaded into a dataframe?

2017-09-25 user3246693

Did you try the orient='index' parameter? – ako
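For context, the strings returned by json.loads already count as scalars as far as pandas is concerned, so there is nothing to convert. The error only says that a dict of single values gives pandas no row labels to work with. The following is a minimal sketch of that behaviour and of two ways around it for a single record; the variable names (record, one_row, by_key) are made up for illustration, and the record is copied from the snippet in the question.

```python
import pandas as pd

# One record shaped like a single x['map'] entry from the question.
record = {
    "_messageid": "-9223368783568280026",
    "_collectorid": "135927517",
    "_blockid": "-9223372036519990555",
    "_collector": "collector1",
}

# A dict of scalars carries no row labels, so pandas raises ValueError here.
try:
    pd.DataFrame.from_dict(record)
    raised = False
except ValueError:
    raised = True

# Option 1: pass an explicit one-element index to get a single row.
one_row = pd.DataFrame(record, index=[0])

# Option 2: orient='index' turns each key into a row label (one column).
by_key = pd.DataFrame.from_dict(record, orient="index")
```

With orient='index' the keys end up as rows rather than columns, so by_key.T would be needed to get one row per message.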

The messages entry in the data is a list of dictionaries. You can load it with DataFrame.from_records and then use apply(pd.Series) to convert the inner dictionaries into rows of the final data frame:

# ['map'] is used instead of .map because newer pandas versions define a DataFrame.map method
pd.DataFrame.from_records(data['messages'])['map'].apply(pd.Series)

#                   _blockid  _collector  _collectorid            _messageid
#0      -9223372036519990555  collector1     135927517  -9223368783568280026
#1  -92234254242343219990555  collector2   13545342517    -92233645345280026
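As a possible alternative for the stated end goal (one large data frame, without calling pd.concat inside a loop), the inner dictionaries can be collected into a list and handed to the DataFrame constructor in one step. This is only a sketch under the assumption that every message has a 'map' key; the sample data below is copied from the question and stands in for the result of json.loads.

```python
import pandas as pd

# Sample payload shaped like the question's JSON after json.loads.
data = {
    "messages": [
        {"map": {
            "_messageid": "-9223368783568280026",
            "_collectorid": "135927517",
            "_blockid": "-9223372036519990555",
            "_collector": "collector1",
        }},
        {"map": {
            "_messageid": "-92233645345280026",
            "_collectorid": "13545342517",
            "_blockid": "-92234254242343219990555",
            "_collector": "collector2",
        }},
    ]
}

# A list of dicts is read as a list of rows, so no index has to be passed.
df = pd.DataFrame([message["map"] for message in data["messages"]])
```

All values are still strings at this point, as in the output above. pd.json_normalize(data["messages"]) (pandas 1.0 and later) is another option; it flattens the nesting itself but prefixes the column names with 'map.'.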

2017-09-26 00:47:12 Psidom

Thanks!!!! That did it! – user3246693
