Extracting nested arrays from a JSONL file

"entities": { "hashtags": [ { "text": "NoJusticeNoPeace", "indices": [ 65, 82 ] }, { "text": "justiceforNaledi", "indices": [ 83, 100 ] },

2 answers

User

#1 · edited 2024-05-14 07:36:52

The json module: JSON encoder and decoder

JSON (JavaScript Object Notation), specified by RFC 7159 (which obsoletes RFC 4627) and by ECMA-404, is a lightweight data interchange format inspired by JavaScript object literal syntax (although it is not a strict subset of JavaScript)...

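I encourage you to read more in the Python documentation for the json encoder and decoder module.

Following up on my comment: the json module and json.load() do all the work for you. Just import the module and call its API.

If you are using Python 3.x: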

import json
import pprint

json_file_path = "t.json"

with open(json_file_path, 'r') as jp:
    # json.load parses the whole file into Python objects (here a dict)
    json_data = json.load(jp)

pprint.pprint(json_data)
# since hashtags is a list (JSON array), we access its elements by index:
var = json_data['entities']['hashtags'][0]['text']
print("var is : {}".format(var))
print("var type is : {}".format(type(var)))

Console output of the code above on Python 3.x:

{'entities': {'hashtags': [{'indices': [65, 82], 'text': 'NoJusticeNoPeace'},
                           {'indices': [83, 100], 'text': 'justiceforNaledi'}]}}
var is : NoJusticeNoPeace
var type is : <class 'str'>
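
To pull out every element of the nested array rather than only the first, a list comprehension over the hashtags list is enough. This is a minimal sketch, assuming the record has the same shape as the sample in the question; the names `record`, `texts`, and `spans` are only illustrative, and the JSON is parsed from an inline string so the snippet runs without a file:

```python
import json

# Sample record with the same shape as the question's data (inlined for the example).
raw = (
    '{"entities": {"hashtags": ['
    '{"text": "NoJusticeNoPeace", "indices": [65, 82]}, '
    '{"text": "justiceforNaledi", "indices": [83, 100]}]}}'
)

record = json.loads(raw)                   # json.loads parses a string, json.load a file
hashtags = record["entities"]["hashtags"]  # a Python list of dicts

# Collect one field from every element of the nested array.
texts = [tag["text"] for tag in hashtags]
# Each "indices" value is itself a two-element list; turn it into a tuple.
spans = [tuple(tag["indices"]) for tag in hashtags]

print(texts)
print(spans)
```

The same comprehension works on the `json_data` dictionary loaded from the file above.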

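On Python 2.x, the only change is dropping the parentheses from the print lines. There is, however, one major difference in the output of the script above.

On Python 3 the dictionary values are of type str and are ready to use. On Python 2 the type is <type 'unicode'>, so take note. If you need a str there, convert it with str(var); this only works for ASCII text, so for non-ASCII hashtags use var.encode('utf-8') instead.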

User
#2 · edited 2024-05-14 07:36:52

As Adam already said, you can use the json module to access this kind of file.
For example, suppose file.jsonl contains the following:
{ "entities": { "hashtags": [ { "text": "NoJusticeNoPeace", "indices": [ 65, 82 ] }, { "text": "justiceforNaledi", "indices": [ 83, 100 ] } ] } }
To access the information stored in this file, you can do the following:

import json

with open('file.jsonl', 'r') as file:
    jsonl = json.load(file)

The jsonl variable is now just a dictionary, and you can access it as usual:

hashtags = jsonl['entities']['hashtags']
print(hashtags[0]['text'])     # prints NoJusticeNoPeace
print(hashtags[1]['indices'])  # prints [83, 100]
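
One caveat: json.load parses the whole file as a single JSON document, so it only works when the file holds exactly one object. A JSONL file normally has one object per line, and json.load then fails with a JSONDecodeError ("Extra data"). Below is a minimal sketch of reading such a file line by line. The helper name `iter_hashtag_texts` is hypothetical, the second demo record is made up, and the use of `.get` assumes some records may lack `entities` or `hashtags`:

```python
import json
import os
import tempfile


def iter_hashtag_texts(path):
    """Yield the list of hashtag texts for each record in a JSONL file."""
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:  # skip blank lines
                continue
            record = json.loads(line)  # one JSON document per line
            # Assumption: some records may lack "entities" or "hashtags".
            hashtags = record.get("entities", {}).get("hashtags", [])
            yield [tag["text"] for tag in hashtags]


# Demo: a temporary two-line file standing in for file.jsonl.
# The second record is made up to show the empty-array case.
lines = [
    '{"entities": {"hashtags": [{"text": "NoJusticeNoPeace", "indices": [65, 82]}, '
    '{"text": "justiceforNaledi", "indices": [83, 100]}]}}',
    '{"entities": {"hashtags": []}}',
]
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("\n".join(lines) + "\n")

result = list(iter_hashtag_texts(tmp.name))
os.remove(tmp.name)  # clean up the temporary file
print(result)
```

Because the helper is a generator, it processes one line at a time and does not need to hold the whole file in memory.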
