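<p>Unfortunately, the <a href="https://github.com/isagalaev/ijson" rel="nofollow noreferrer">ijson</a> library (v2.3 as of March 2018) cannot parse multiple JSON objects. It handles only one top-level object, and attempting to parse a second one raises an error: <code>"ijson.common.JSONError: Additional data"</code>. See the bug reports here:</p>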
<ul>
<li><a href="https://github.com/isagalaev/ijson/issues/40" rel="nofollow noreferrer">https://github.com/isagalaev/ijson/issues/40</a></li>
<li><a href="https://github.com/isagalaev/ijson/issues/42" rel="nofollow noreferrer">https://github.com/isagalaev/ijson/issues/42</a></li>
<li><a href="https://github.com/isagalaev/ijson/issues/67" rel="nofollow noreferrer">https://github.com/isagalaev/ijson/issues/67</a></li>
<li><a href="https://stackoverflow.com/questions/34217042/python-how-do-i-parse-a-stream-of-json-arrays-with-ijson-library">python: how do I parse a stream of json arrays with ijson library</a></li>
</ul>
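<p>This is a big limitation. However, as long as there is a line break (newline character) after each JSON object, you can parse each one <em>line by line, independently</em>, like this:</p>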
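<pre><code>import io
import ijson

filename = "data.json"  # placeholder: path to a file with one JSON object per line

with open(filename, encoding="UTF-8") as json_file:
    cursor = 0  # character offset of the current line (text mode, so not a byte offset)
    for line_number, line in enumerate(json_file):
        print("Processing line", line_number + 1, "at cursor index:", cursor)
        # Wrap the line in a file-like object, since ijson parses from a file
        line_as_file = io.StringIO(line)
        # Use a new parser for each line
        json_parser = ijson.parse(line_as_file)
        for prefix, event, value in json_parser:
            print("prefix=", prefix, "type=", event, "value=", value)
        cursor += len(line)
</code></pre>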
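<p>The loop above can also be packaged as a generator so that callers consume events instead of printing them. This is only a sketch: <code>iter_line_events</code> and its <code>parse</code> parameter are hypothetical names introduced here, not part of ijson. When <code>parse</code> is left as <code>None</code>, it falls back to <code>ijson.parse</code>, as in the loop above. It also skips blank lines, on the assumption that an empty line is not a valid JSON document for the parser.</p>

```python
import io


def iter_line_events(path, parse=None, encoding="UTF-8"):
    """Yield (line_number, event) for every parser event on every non-blank line.

    `parse` is any callable that takes a file-like object and returns an
    iterable of events. It defaults to ijson.parse.
    """
    if parse is None:
        import ijson  # imported lazily, only when the default parser is needed
        parse = ijson.parse
    with open(path, encoding=encoding) as json_file:
        for line_number, line in enumerate(json_file, start=1):
            if not line.strip():
                continue  # skip blank lines between objects
            # A fresh parser per line, exactly as in the loop above
            for event in parse(io.StringIO(line)):
                yield line_number, event
```

<p>With the default parser, each <code>event</code> is the same <code>(prefix, type, value)</code> tuple that the loop above prints, so usage would look like <code>for line_number, (prefix, event, value) in iter_line_events(filename): ...</code>.</p>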
<p>You are still streaming the file rather than loading it entirely into memory (only one line is held at a time), so this works on large JSON files. It also uses the line-streaming technique from <a href="https://stackoverflow.com/questions/620367/how-to-jump-to-a-particular-line-in-a-huge-text-file">How to jump to a particular line in a huge text file?</a> and <code>enumerate()</code> from <a href="https://stackoverflow.com/questions/522563/accessing-the-index-in-python-for-loops">Accessing the index in 'for' loops?</a>.</p>
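<p>One caveat about the <code>cursor</code> counter in the loop above: the file is opened in text mode, so <code>len(line)</code> counts characters, not bytes, and the two differ as soon as a line contains non-ASCII text. If you want offsets that can later be passed to <code>seek()</code> to jump straight to a given line, one option is to read in binary mode. A minimal sketch, assuming UTF-8 input; <code>index_line_offsets</code> and <code>read_line_at</code> are hypothetical helper names, not part of ijson or the standard library.</p>

```python
def index_line_offsets(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    position = 0
    with open(path, "rb") as binary_file:
        for raw_line in binary_file:
            offsets.append(position)
            position += len(raw_line)  # bytes, including the trailing newline
    return offsets


def read_line_at(path, offset, encoding="UTF-8"):
    """Seek to a byte offset and return the single line that starts there."""
    with open(path, "rb") as binary_file:
        binary_file.seek(offset)
        return binary_file.readline().decode(encoding)
```

<p>The string returned by <code>read_line_at</code> can then be wrapped in <code>io.StringIO</code> and handed to a fresh parser, just like each <code>line</code> in the loop above.</p>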