Finding related entities from three lists

Posted 2024-04-25 02:24:06

I have three lists containing the following data:

Entities:  ['Ashraf', 'Afghanistan', 'Afghanistan', 'Kabul']
Relations:  ['Born', 'President', 'employee', 'Capital', 'Located', 'Lecturer', 'University']
sentence_list: ['Ashraf', 'Born', 'in', 'Kabul', '.', 'Ashraf', 'is', 'the', 'president', 'of', 'Afghanistan', '.', ...]

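sentence_list is a flat list of tokens that make up several sentences. For each sentence, I want to check whether it contains any word from Entities or Relations, and add the combination of matching words to another list. For example, the first sentence should give (Ashraf, Born, Kabul).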

What I have tried:

First incomplete solution:

import json

# read file
with open('../data/parse.txt', 'r') as myfile:
    json_data = json.load(myfile)

for i in range(len(json_data)):  # the dataset was in json format
    if json_data[i]['word'] in relation(json_data)[0]:  # I extract the relations
        print(json_data[i]['word'])
    if json_data[i]['word'] in entities(json_data)[0]:  # I extract the entities
        print(json_data[i]['word'])

Output: (Ashraf, Born, Ashraf), but I want (Ashraf, Born, Kabul).

Next incomplete solution: I stored the words from json_data in a list and then did the following:

json_data2 = []
for i in range(len(json_data)):
    json_data2.append(json_data[i]['word'])
print(json_data2)


'''
Next, I tried to find any element of the `Entities` and `Relations` lists
in each sentence of `sentence_list`, and then store the matched
entities and relations for each sentence in a list. '''

for line in json_data2:
    for rel in relation(obj):
        for ent in entities(obj):
            match = re.findall(rel,  line['word'])
            if match:
                print('word matched relations: %s ==> word: %s' % (rel,  line['address']))
            match2 = re.findall(ent, line['word'])
            if match2:
                print('word matched entities: %s ==> word: %s' % (ent,  line['address']))

Unfortunately, this did not work.


1 answer
User
#1 · Posted 2024-04-25 02:24:06

You can use the following list comprehension:

to_match = set(Entities+Relations)
l = [{j for j in to_match if j in i} 
        for i in ' '.join(sentence_list).split('.')[:-1]]

Output:

[{'Ashraf', 'Born', 'Kabul'}, {'Afghanistan', 'Ashraf'}]

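Note that I am returning a list of sets to avoid duplicate values; for example, Afghanistan appears twice in Entities. Also note that `j in i` is a case-sensitive substring test on the joined sentence, which is why 'President' is not matched against 'president' in the second sentence.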
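If whole-word, case-insensitive matching is closer to what the question needs, a variant along the following lines might help. This is only a sketch, not the approach from the answer above: the helper names `split_sentences` and `match_terms` are made up for illustration, the sample `sentence_list` is cut to the two sentences shown in the question, and it assumes a lone '.' token marks the end of each sentence.

```python
Entities = ['Ashraf', 'Afghanistan', 'Afghanistan', 'Kabul']
Relations = ['Born', 'President', 'employee', 'Capital', 'Located',
             'Lecturer', 'University']
# Only the two sentences visible in the question; the rest was elided there.
sentence_list = ['Ashraf', 'Born', 'in', 'Kabul', '.',
                 'Ashraf', 'is', 'the', 'president', 'of', 'Afghanistan', '.']


def split_sentences(tokens, end='.'):
    """Group a flat token list into sentences, using `end` as the delimiter."""
    sentences, current = [], []
    for tok in tokens:
        if tok == end:
            if current:
                sentences.append(current)
            current = []
        else:
            current.append(tok)
    if current:  # keep a trailing sentence that has no final period
        sentences.append(current)
    return sentences


def match_terms(tokens, entities, relations):
    """Return, per sentence, the entities and relations found as whole tokens."""
    # Map lowercase form -> original spelling so matching ignores case.
    ent_lookup = {e.lower(): e for e in entities}
    rel_lookup = {r.lower(): r for r in relations}
    results = []
    for sentence in split_sentences(tokens):
        ents, rels = [], []
        for tok in sentence:
            key = tok.lower()
            if key in ent_lookup and ent_lookup[key] not in ents:
                ents.append(ent_lookup[key])
            elif key in rel_lookup and rel_lookup[key] not in rels:
                rels.append(rel_lookup[key])
        results.append({'entities': ents, 'relations': rels})
    return results


for row in match_terms(sentence_list, Entities, Relations):
    print(row)
# {'entities': ['Ashraf', 'Kabul'], 'relations': ['Born']}
# {'entities': ['Ashraf', 'Afghanistan'], 'relations': ['President']}
```

Keeping entities and relations in separate lists (in order of appearance) makes it straightforward to build a tuple such as (Ashraf, Born, Kabul) from the first result, which a plain set does not guarantee because sets are unordered.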
