How can I overcome the following problem when parsing a JSON file?
Hello, I have the following JSON:
j = """[
[
{
"created": "2017-02-02T11:57:41+0000",
"from": "Bank",
"message": "Hi Alex, if you have not perform the modification to the data, please verify your DNI, celphone and the operator to verify it. Thanks."
},
{
"created": "2017-02-01T22:19:58+0000" ,
"from": "Alex ",
"message": "Could someone please help me?, I am callig to CC and they don't answer"
},
{
"created": "2017-02-01T22:19:42+0000",
"from": "Alex ",
"message": "the sms with the corresponding key and token has not arrived"
},
{
"created": "2017-02-01T22:19:28+0000",
"from": "Alex ",
"message": "I have issues to make payments from the app"
},
{
"created": "2017-02-01T22:19:18+0000",
"from": "Alex ",
"message": "Good afternoon"
}
],
[
{
"created": "2017-02-01T22:19:12+0000",
"from": "Bank",
"message": " Hello Alexander, the money is available to be withdrawn, you could go to any store the number is 70307002459"
},
{
"created": "2017-02-01T16:22:30+0000",
"from": "Alex",
"message": "hello they have deposited the money into my account, I don't have account from this bank, Could I know if I can withdraw the money? DNI 427 thanks a lot"
}
]
]"""
Since I need a specific structure, I tried to parse it as follows:
import json
import numpy as np
import pandas as pd
js = json.loads(j)
df = pd.concat({i: pd.DataFrame(j) for i, j in enumerate(js)})
df.created = pd.to_datetime(df.created)
df.assign(qna=np.where(df['from'] == 'Bank', 'Answer', 'Question')).set_index(['created', 'qna']).message.unstack(fill_value='')
Everything works up to this point, but when I add another entry with a repeated date, I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-5652e92adbdc> in <module>()
69 df['from'] = df['from'].str.strip()
70 df = df.drop_duplicates()
---> 71 df.assign(qna=np.where(df['from'] == 'Bank', 'Answer', 'Question')) .set_index(['created', 'qna']) .unstack()
72
73
/usr/local/lib/python3.5/dist-packages/pandas/core/frame.py in unstack(self, level, fill_value)
4034 """
4035 from pandas.core.reshape import unstack
-> 4036 return unstack(self, level, fill_value)
4037
4038 # ----------------------------------------------------------------------
/usr/local/lib/python3.5/dist-packages/pandas/core/reshape.py in unstack(obj, level, fill_value)
406 if isinstance(obj, DataFrame):
407 if isinstance(obj.index, MultiIndex):
--> 408 return _unstack_frame(obj, level, fill_value=fill_value)
409 else:
410 return obj.T.stack(dropna=False)
/usr/local/lib/python3.5/dist-packages/pandas/core/reshape.py in _unstack_frame(obj, level, fill_value)
449 unstacker = _Unstacker(obj.values, obj.index, level=level,
450 value_columns=obj.columns,
--> 451 fill_value=fill_value)
452 return unstacker.get_result()
453
/usr/local/lib/python3.5/dist-packages/pandas/core/reshape.py in __init__(self, values, index, level, value_columns, fill_value)
101
102 self._make_sorted_values_labels()
--> 103 self._make_selectors()
104
105 def _make_sorted_values_labels(self):
/usr/local/lib/python3.5/dist-packages/pandas/core/reshape.py in _make_selectors(self)
139
140 if mask.sum() < len(self.index):
--> 141 raise ValueError('Index contains duplicate entries, '
142 'cannot reshape')
143
ValueError: Index contains duplicate entries, cannot reshape
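For reference, the message in the traceback can be reproduced with a much smaller frame. The sketch below is only an illustration: the two rows and their message texts are hypothetical and not taken from the real data. It assumes that two rows end up with the same (created, qna) pair, which is the situation the error message describes, and it uses index.duplicated to list the offending rows before unstack is called.

```python
import pandas as pd

# Hypothetical sample: both rows share the same timestamp and the same role,
# so the (created, qna) pair is not unique.
df = pd.DataFrame({
    "created": pd.to_datetime(["2017-02-01T22:19:12+0000",
                               "2017-02-01T22:19:12+0000"]),
    "qna": ["Answer", "Answer"],
    "message": ["first reply", "second reply"],
})

indexed = df.set_index(["created", "qna"])

# Rows whose (created, qna) pair occurs more than once.
dupes = indexed[indexed.index.duplicated(keep=False)]

# unstack needs one value per (row, column) cell, so it refuses to reshape.
try:
    indexed["message"].unstack()
    error_text = ""
except ValueError as exc:
    error_text = str(exc)

print(len(dupes))
print(error_text)
```

Running the same duplicated check on the real frame, right after set_index, would show which rows collide.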
I tried it with this new JSON, but it fails from that point on, so I would appreciate some help getting past this problem.
This is the JSON that fails:
j = """[
[
{
"created": "2017-02-02T11:57:41+0000",
"from": "Bank",
"message": "Hi Alex, if you have not perform the modification to the data, please verify your DNI, celphone and the operator to verify it. Thanks."
},
{
"created": "2017-02-01T22:19:58+0000" ,
"from": "Alex ",
"message": "Could someone please help me?, I am callig to CC and they don't answer"
},
{
"created": "2017-02-01T22:19:42+0000",
"from": "Alex ",
"message": "the sms with the corresponding key and token has not arrived"
},
{
"created": "2017-02-01T22:19:28+0000",
"from": "Alex ",
"message": "I have issues to make payments from the app"
},
{
"created": "2017-02-01T22:19:18+0000",
"from": "Alex ",
"message": "Good afternoon"
}
],
[
{
"created": "2017-02-01T22:19:12+0000",
"from": "Bank",
"message": " Hello Alexander, the money is available to be withdrawn, you could go to any store the number is 70307002459"
},
{
"created": "2017-02-01T16:22:30+0000",
"from": "Alex",
"message": "hello they have deposited the money into my account, I don't have account from this bank, Could I know if I can withdraw the money? DNI 427 thanks a lot"
}
],
[
{
"created": "2017-02-01T22:19:13+0000",
"from": "Bank",
"message": " Hello Adolfo, the money is available."
},
{
"created": "2017-02-01T16:22:33+0000",
"from": "Omar",
"message": "hello they have deposited the money into my account."
}
]
]"""
I just edited my answer. It is not necessary to have "append = True". The problem was that the assign statement needed its own line.
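The edited answer itself is not shown here, so the following is only one possible reading of the comment, not the answerer's actual code. It runs assign as a separate statement and only then calls set_index and unstack on the message column. The sample is a small subset built from entries in the JSON above, chosen for brevity.

```python
import json

import numpy as np
import pandas as pd

# Small sample built from entries in the JSON above (an assumption for brevity).
j = """[
  [
    {"created": "2017-02-01T22:19:18+0000", "from": "Alex ", "message": "Good afternoon"}
  ],
  [
    {"created": "2017-02-01T22:19:13+0000", "from": "Bank", "message": " Hello Adolfo, the money is available."},
    {"created": "2017-02-01T16:22:33+0000", "from": "Omar", "message": "hello they have deposited the money into my account."}
  ]
]"""

js = json.loads(j)
# One frame per conversation, keyed by its position in the outer list.
df = pd.concat({i: pd.DataFrame(conv) for i, conv in enumerate(js)})
df["created"] = pd.to_datetime(df["created"])
df["from"] = df["from"].str.strip()  # "Alex " -> "Alex"
df = df.drop_duplicates()

# assign runs as its own statement, so the qna column exists before reshaping.
df = df.assign(qna=np.where(df["from"] == "Bank", "Answer", "Question"))

# Select only the message column before unstacking on qna.
wide = df.set_index(["created", "qna"])["message"].unstack(fill_value="")
print(wide)
```

This still requires every (created, qna) pair to be unique. If two messages with the same role share a timestamp, the same ValueError is raised again.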