Removing elements from a list of tuples based on other elements

Posted 2024-03-29 09:38:25


I have the following list of tuples:

text =[('Michael', 'PERSON'), ('Jordan', 'PERSON'), ("'s", 'O'), ('legacy', 'O'), ('in', 'O'), ('the', 'O'), ('90', 'O'), ("'s", 'O'), ('shows', 'O'), ('that', 'O'), ('he', 'O'), ('was', 'O'), ('the', 'O'), ('biggest', 'O'), ('player', 'O'), ('ever', 'O'), ('in', 'O'), ('the', 'O'), ('NBA', 'ORGANIZATION'), ('.', 'O')]

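The original sentence is "Michael Jordan's legacy in the 90's shows that he was the biggest player ever in the NBA."

I need to remove the elements tagged as "PERSON".

This is what I did: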

new_text = [x for x in text if x[1] != "PERSON"]
sentence= " ".join(x[0] for x in new_text)
print(sentence)

The result I get is:

's legacy in the 90 's shows that he was the biggest player ever in the NBA .

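Note the "'s" at the beginning.

Now I'm stuck, because I need to remove the "'s" element only when the element before it is a "PERSON". There are two "'s" tokens in this example, but I only want to remove the one that immediately follows a "PERSON". Any suggestions?

Thanks for your input.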


3 Answers

You can loop over the indices with range, and when you find an O token, look at the element before it:

text =[('Michael', 'PERSON'), ('Jordan', 'PERSON'), ("'s", 'O'), ('legacy', 'O'), ('in', 'O'), ('the', 'O'), ('90', 'O'), ("'s", 'O'), ('shows', 'O'), ('that', 'O'), ('he', 'O'), ('was', 'O'), ('the', 'O'), ('biggest', 'O'), ('player', 'O'), ('ever', 'O'), ('in', 'O'), ('the', 'O'), ('NBA', 'ORGANIZATION'), ('.', 'O')]

filtered_text = []

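for idx in range(len(text)):
  # skip every token tagged PERSON
  if text[idx][1] == "PERSON":
    continue

  # skip an 'O' token "'s" whose predecessor is a PERSON
  # (the word check keeps other 'O' tokens that follow a name)
  if (text[idx][1] == 'O' and text[idx][0] == "'s"
      and idx > 0 and text[idx-1][1] == 'PERSON'):
    continue

  filtered_text.append(text[idx][0])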

sentence= " ".join(filtered_text)
print(sentence)

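It is much easier to use a plain for loop here. Note that enumerate supplies the position needed to look up the previous element (text[pos-1]); that lookup is only valid when a previous element exists (pos > 0).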

#!/usr/bin/env python3

text =[('Michael', 'PERSON'), ('Jordan', 'PERSON'), ("'s", 'O'), ('legacy', 'O'), ('in', 'O'), ('the', 'O'), ('90', 'O'), ("'s", 'O'), ('shows', 'O'), ('that', 'O'), ('he', 'O'), ('was', 'O'), ('the', 'O'), ('biggest', 'O'), ('player', 'O'), ('ever', 'O'), ('in', 'O'), ('the', 'O'), ('NBA', 'ORGANIZATION'), ('.', 'O')]


new_text = []
for pos, (word, type_) in enumerate(text):
    if type_ == "PERSON":
        # we ignore words of type PERSON
        continue
    if word == "'s" and pos > 0 and text[pos-1][1] == "PERSON":
        # ignore 's if the previous word was of type PERSON
        continue 
    new_text.append((word, type_))


sentence= " ".join(x[0] for x in new_text)
print(sentence)

Running this script produces the following text:
legacy in the 90 's shows that he was the biggest player ever in the NBA .
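
The same filter can also be written as a list comprehension, which is closer in shape to the code in the question. This is only a sketch: it assumes the possessive token is always spelled "'s" and that names are always tagged "PERSON", and the shortened `text` list below is made up for illustration.

```python
# Shortened, illustrative input (not the full list from the question).
text = [('Michael', 'PERSON'), ('Jordan', 'PERSON'), ("'s", 'O'),
        ('legacy', 'O'), ('in', 'O'), ('the', 'O'), ('90', 'O'), ("'s", 'O')]

new_text = [
    (word, type_)
    for pos, (word, type_) in enumerate(text)
    # drop PERSON tokens
    if type_ != "PERSON"
    # drop "'s" only when the token right before it is a PERSON
    and not (word == "'s" and pos > 0 and text[pos - 1][1] == "PERSON")
]

sentence = " ".join(word for word, _ in new_text)
print(sentence)  # legacy in the 90 's
```

The second "'s" survives because the token before it ('90') is tagged 'O'.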

One approach is to use zip to loop over text together with a shifted copy of it, and keep the strings based on the following conditions:

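out = []
# Pair every token j with its predecessor i; None stands in for
# "no predecessor" so the first token is not silently skipped.
for i, j in zip([None] + text[:-1], text):
    if j[1] == 'PERSON':
        continue  # drop PERSON tokens themselves
    if j[0] == "'s" and i is not None and i[1] == 'PERSON':
        continue  # drop "'s" directly after a PERSON
    out.append(j[0])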

' '.join(out)
"legacy in the 90 's shows that he was the biggest player ever in the NBA ."

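If this filtering is needed in more than one place, the idea shared by the answers above could be wrapped in a small function. The following is a minimal sketch, not code from any of the answers: the name `drop_person_tokens`, its parameters, and the `sample` sentence are all hypothetical. It remembers the previous tag instead of indexing backwards, so it works on any iterable of (word, tag) pairs.

```python
def drop_person_tokens(tagged, person_tag="PERSON", possessive="'s"):
    """Return the words of `tagged` without PERSON tokens and without a
    possessive token that directly follows a PERSON token."""
    kept = []
    prev_tag = None  # tag of the previous token; None before the first one
    for word, tag in tagged:
        is_person = tag == person_tag
        is_dangling = word == possessive and prev_tag == person_tag
        if not (is_person or is_dangling):
            kept.append(word)
        prev_tag = tag
    return kept


# Hypothetical input with the name in the middle of the sentence.
sample = [('I', 'O'), ('like', 'O'), ('Jordan', 'PERSON'), ("'s", 'O'),
          ('dunks', 'O'), ('from', 'O'), ('the', 'O'), ('90', 'O'), ("'s", 'O')]
print(" ".join(drop_person_tokens(sample)))  # I like dunks from the 90 's
```

Only the "'s" right after 'Jordan' is removed; the one after '90' stays, and words before the name are kept as well.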