How do I extract data from a 4GB JSON file?

Posted 2024-05-15 02:17:58

Male | A programmer who enjoys writing Python code.

I have a 4GB JSON file with the following structure:

{
    rows: [
        { id: 1, names: { first: 'john', last: 'smith' }, dates: ...},
        { id: 2, names: { first: 'tim', middle: ['james', 'andrew'], last: 'wilson' }, dates: ... },
    ]
}

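I just want to iterate over all the rows and, for each row, extract the ID, the names, and a few other details, then write them to a CSV file.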

If I try to open the file the standard way, it just hangs. I have been trying to use ijson, as follows:

[code snippet not preserved in the source]

This works fine on a short excerpt of the file, but on the full file it hangs forever.

I have also tried this ijson approach, which does seem to work on the full 4GB file:

import ijson
# ijson reads bytes, so open the file in binary mode
for prefix, the_type, value in ijson.parse(open(fname, 'rb')):
    print(prefix, value)

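But this seems to print each leaf node in turn, with no notion of each top-level row as a separate item, which gets unwieldy quickly for JSON data with an arbitrary number of leaf nodes. To get an array of all the names, I would have to do something like this: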

import ijson

names = []
name = {}
for prefix, the_type, value in ijson.parse(open(fname, 'rb')):
    print(prefix, value)
    # prefix is a dotted path such as 'rows.item.names.first'; keep the last part
    key = prefix.rsplit('.', 1)[-1]
    name[key] = value
    if 'first' in name and 'last' in name and 'middle' in name:
        # This is the last of the leaf nodes, we can add it to our list...
        # except.... how to deal with the fact that middle may not
        # always be present?
        names.append(name)
        name = {}

Is there any way to iterate over each row in turn (rather than each leaf) in a file this large?


0 answers

No answers yet.
