<h2>Using <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.io.json.json_normalize.html" rel="nofollow noreferrer">pandas.io.json.json_normalize</a>:</h2>
<h3>Data:</h3>
<ul>
<li>The data is given as a <code>list</code> of <code>dicts</code> in a file named <code>test.json</code></li>
</ul>
<pre class="lang-py prettyprint-override"><code>[{
"id": "99014576299056245",
"created_at": "2017-11-16T14:28:53.919Z",
"sensitive": false,
"spoiler_text": "",
"language": "en",
"uri": "mastodon.gamedev.place/users/jaggy/statuses/99014576299056245",
"instance": "mastodon.gamedev.place",
"content": "<p>Coding a cheeky skill before bed. Not as much as I&apos;d like but had drinks with co-workers after work so shrug ^_^</p>",
"account_id": "434",
"tag_list": [],
"media_attachments": [],
"emojis": [],
"mentions": []
}, {
"id": "99014544879467317",
"created_at": "2017-11-16T14:20:54.462Z",
"sensitive": false,
"spoiler_text": "",
"language": "en",
"uri": "mastodon.gamedev.place/users/jaggy/statuses/99014544879467317",
"instance": "mastodon.gamedev.place",
"content": "<p>Coding a cheeky skill before bed. Not as much as I&apos;d like but had drinks with co-workers after work so shrug ^_^</p>",
"account_id": "434",
"tag_list": [],
"media_attachments": [],
"emojis": [],
"mentions": []
}
]
</code></pre>
<h3>Code to read the data:</h3>
<pre class="lang-py prettyprint-override"><code>import pandas as pd
import json
from pathlib import Path
from pandas.io.json import json_normalize  # pandas >= 1.0 also exposes this as pd.json_normalize

# path to file
p = Path(r'c:\some_directory_with_data\test.json')

# read the file in and load using the json module
with p.open('r', encoding='utf-8') as f:
    data = json.loads(f.read())

# create a dataframe
df = json_normalize(data)

# dataframe view (shown as comments so the block stays runnable)
# id created_at sensitive spoiler_text language uri instance content account_id tag_list media_attachments emojis mentions
# 99014576299056245 2017-11-16T14:28:53.919Z False en mastodon.gamedev.place/users/jaggy/statuses/99014576299056245 mastodon.gamedev.place <p>Coding a cheeky skill before bed. Not as much as I&apos;d like but had drinks with co-workers after work so shrug ^_^</p> 434 [] [] [] []
# 99014544879467317 2017-11-16T14:20:54.462Z False en mastodon.gamedev.place/users/jaggy/statuses/99014544879467317 mastodon.gamedev.place <p>Coding a cheeky skill before bed. Not as much as I&apos;d like but had drinks with co-workers after work so shrug ^_^</p> 434 [] [] [] []
</code></pre>
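<p>As a rough illustration of what <code>json_normalize</code> does with a list of dicts, here is a minimal sketch that uses inline data instead of a file. The nested <code>account</code> key is hypothetical (it does not appear in the data above) and is only there to show how nested dicts are flattened into dotted column names. The sketch assumes pandas 1.0 or later, where the function is available as <code>pd.json_normalize</code>.</p>

```python
import json

import pandas as pd

# Inline stand-in for the contents of test.json.
# The nested "account" dict is a hypothetical field added for illustration only.
raw = """
[
  {"id": "99014576299056245", "sensitive": false, "account": {"id": "434"}},
  {"id": "99014544879467317", "sensitive": false, "account": {"id": "434"}}
]
"""

# json.loads converts the JSON literals false/true/null to False/True/None
data = json.loads(raw)

# one row per dict; nested dicts become dotted column names such as "account.id"
df = pd.json_normalize(data)
print(df)
```

<p>Each dict becomes one row, so the two records above give a two-row dataframe.</p>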
<h2>Option 2:</h2>
<h3>Data</h3>
<ul>
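<li>The data is stored in a file as rows of dicts
<ul>
<li>not inside a list</li>
<li>separated by newlines</li>
</ul></li>
<li>The file as a whole is not valid JSON, although each line on its own is a JSON object</li>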
</ul>
<pre class="lang-py prettyprint-override"><code>{"id": "99014576299056245", "created_at": "2017-11-16T14:28:53.919Z", "sensitive": false, "spoiler_text": "", "language": "en", "uri": "mastodon.gamedev.place/users/jaggy/statuses/99014576299056245", "instance": "mastodon.gamedev.place", "content": "<p>Coding a cheeky skill before bed. Not as much as I&apos;d like but had drinks with co-workers after work so shrug ^_^</p>", "account_id": "434", "tag_list": [], "media_attachments": [], "emojis": [], "mentions": []}
{"id": "99014544879467317", "created_at": "2017-11-16T14:20:54.462Z", "sensitive": false, "spoiler_text": "", "language": "en", "uri": "mastodon.gamedev.place/users/jaggy/statuses/99014544879467317", "instance": "mastodon.gamedev.place", "content": "<p>Coding a cheeky skill before bed. Not as much as I&apos;d like but had drinks with co-workers after work so shrug ^_^</p>", "account_id": "434", "tag_list": [], "media_attachments": [], "emojis": [], "mentions": []}
</code></pre>
<h3>Code to read this data</h3>
<ul>
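<li>Read the file in with the following code
<ul>
<li><code>data</code> will be a list of <code>str</code>, where each line of the file is one <code>str</code> in the list</li>
<li>Use <code>ast.literal_eval</code> to convert each <code>str</code> back into a dict</li>
<li><code>ast.literal_eval</code> will not work if the <code>str</code> contains values that are not valid Python literals (e.g. <code>false</code> instead of <code>False</code>, <code>true</code> instead of <code>True</code>)</li>
<li>This results in <code>ValueError: malformed node or string: &lt;_ast.Name object at 0x000002B7240B7888&gt;</code>, which is not a particularly helpful error</li>
</ul></li>
<li>I have added a <code>try-except</code> block that prints any line causing a problem; keep adding entries to the <code>values_to_fix</code> <code>dict</code> until every line parses.</li>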
</ul>
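<p>The failure described above can be reproduced on a single line. This is a minimal sketch with a shortened, made-up record (only three of the original keys). It also shows, for comparison, that <code>json.loads</code> from the standard library accepts the same string, since each line is a JSON object.</p>

```python
import json
from ast import literal_eval

# shortened sample line; the real rows carry more keys
line = '{"id": "99014576299056245", "sensitive": false, "tag_list": []}'

# literal_eval only accepts Python literals, so the bare name `false` is rejected
try:
    literal_eval(line)
    literal_eval_failed = False
except ValueError as e:
    literal_eval_failed = True
    print(e)

# json.loads understands JSON literals and maps false to False
record = json.loads(line)
print(record)
```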
<pre class="lang-py prettyprint-override"><code>import pandas as pd
import json
from pathlib import Path
from pandas.io.json import json_normalize
from ast import literal_eval

# path to file
p = Path(r'c:\some_directory_with_data\test.json')

# JSON-style values mapped to Python literals; add entries until every line parses.
# Note: str.replace also changes matching text inside string values.
values_to_fix = {'false': 'False',
                 'true': 'True',
                 'none': 'None',
                 'null': 'None'}  # JSON null

list_of_dicts = list()
with p.open('r', encoding='utf-8') as f:
    data = f.readlines()

for x in data:
    for k, v in values_to_fix.items():
        x = x.replace(k, v)
    try:
        x = literal_eval(x)
        list_of_dicts.append(x)
    except ValueError as e:
        # show the error and the line that caused it
        print(e)
        print(x)

df = json_normalize(list_of_dicts)
# this output is the same as that shown above
</code></pre>
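<p>If every line of the file is a JSON object, as in the sample above, an alternative to patching values for <code>literal_eval</code> is to parse each line with <code>json.loads</code>. The sketch below is not the approach used in the code above. The helper name <code>read_json_lines</code> is hypothetical, and an in-memory <code>io.StringIO</code> with shortened records stands in for the file.</p>

```python
import io
import json


def read_json_lines(handle):
    """Parse one JSON object per non-empty line into a list of dicts."""
    records = []
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as e:
            # report the offending line and keep going
            print(f"line {line_number}: {e}")
    return records


# in-memory stand-in for the newline-separated file (shortened records)
sample = io.StringIO(
    '{"id": "99014576299056245", "sensitive": false, "mentions": []}\n'
    '{"id": "99014544879467317", "sensitive": false, "mentions": []}\n'
)
list_of_dicts = read_json_lines(sample)
print(list_of_dicts)
```

<p>The resulting <code>list_of_dicts</code> can be passed to <code>json_normalize</code> in the same way as above. pandas also provides <code>pd.read_json(path, lines=True)</code> for newline-separated JSON, which may be enough when no flattening is needed.</p>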