Removing specific words from a list

User

Answer 1 · edited 2024-06-16 13:22:05

First of all, you should always post what you have already tried.

Using only built-ins:

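for i in range(len(lines)):        # range(0, len(lines)-1) would skip the last line
    for it in range(len(words)):   # likewise, visit every word including the last
        # str.replace removes every occurrence of the substring words[it]
        lines[i] = lines[i].replace(words[it], '')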

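Line-by-line explanation of the code:

For each item in the lines list, i is the index of the current line.
For each item in the words list, it is the index of the current word in words.
The current item in lines is set to itself with every occurrence of the current word replaced by ''.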
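One caveat with the loop above: str.replace removes substrings, so a short word such as 'a' is also stripped out of longer words. Below is a minimal sketch of a whole-word variant that still uses only built-ins. The function name remove_words and the sample data are made up for illustration and are not from the original question.

```python
def remove_words(lines, words):
    """Return a new list where each line has the given whole words removed."""
    stop = set(words)  # set membership tests are O(1) on average
    cleaned = []
    for line in lines:
        # split() breaks on whitespace, so only whole tokens are compared
        kept = [token for token in line.split() if token not in stop]
        cleaned.append(" ".join(kept))
    return cleaned


# Hypothetical sample data
lines = ["a cat and a dog", "there is a banana"]
words = ["a", "is", "and", "there", "here"]

print(remove_words(lines, words))
# str.replace, by contrast, also damages longer words:
print("banana".replace("a", ""))
```

Note that this sketch compares tokens exactly, so a word with punctuation attached (for example "a.") would not be removed, and runs of whitespace are collapsed to single spaces.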

User

Answer 2 · edited 2024-06-16 13:22:05

You can do this more efficiently without regular expressions:

lines = ['<title>The query complexity of estimating weighted averages.</title>',
         '<title>New bounds for the query complexity of an algorithm that learns DFAs with correction and equivalence queries.</title>',
         '<title>A general procedure to check conjunctive query containment.</title>']
words = {"a", "is", "and", "there", "here"}

# line[7:-8] drops the leading '<title>' (7 chars) and the trailing '</title>' (8 chars)
print([" ".join([w for line in lines
                 for w in line[7:-8].split(" ")
                 if w.lower() not in words])])


['The query complexity of estimating weighted averages. New bounds for the query complexity of an algorithm that learns DFAs with correction equivalence queries. general procedure to check conjunctive query containment.']

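If case matters, remove the w.lower() call. Also, if you are extracting the lines by parsing a web page, I suggest extracting the text from the markup before writing it to a file.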
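The one-liner above joins every title into a single string. If one cleaned string per line is wanted, something along these lines might work. This is only a sketch: filter_line, strip_title and the case_sensitive flag are hypothetical names, and it assumes the entries in words are lowercase.

```python
def filter_line(line, words, case_sensitive=False):
    """Drop tokens found in `words`; `words` is assumed to be lowercase."""
    kept = []
    for token in line.split(" "):
        key = token if case_sensitive else token.lower()
        if key not in words:
            kept.append(token)
    return " ".join(kept)


def strip_title(line):
    """Remove a surrounding <title>...</title> pair if present."""
    prefix, suffix = "<title>", "</title>"
    if line.startswith(prefix) and line.endswith(suffix):
        return line[len(prefix):-len(suffix)]
    return line


lines = ['<title>The query complexity of estimating weighted averages.</title>',
         '<title>A general procedure to check conjunctive query containment.</title>']
words = {"a", "is", "and", "there", "here"}

# One cleaned string per input line instead of one big string
cleaned = [filter_line(strip_title(line), words) for line in lines]
print(cleaned)
```

Passing case_sensitive=True keeps the capitalised "A" at the start of the second title, which corresponds to dropping the w.lower() call.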

User

Answer 3 · edited 2024-06-16 13:22:05

Use the re.sub function.

>>> import re
>>> lines = ['<title>The query complexity of estimating weighted averages.</title>', '<title>New bounds for the query complexity of an algorithm that learns DFAs with correction and equivalence queries.</title>', '<title>A general procedure to check conjunctive query containment.</title>']
>>> words = ['a', 'is', 'and', 'there', 'here']
>>> [re.sub(r'</?title>|\b(?:'+'|'.join(words)+r')\b', r'', line) for line in lines]
['The query complexity of estimating weighted averages.', 'New bounds for the query complexity of an algorithm that learns DFAs with correction  equivalence queries.', 'A general procedure to check conjunctive query containment.']

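The \b before and after the group ensures that only whole words are matched. \b is called a word boundary: it matches the position between a word character and a non-word character, without consuming any text.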
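The re.sub call above is case-sensitive (so "A" survives) and leaves a double space where a word was removed. A possible refinement is sketched below, assuming case-insensitive removal is actually wanted. The helper name clean and the second sample title are made up for illustration.

```python
import re

words = ['a', 'is', 'and', 'there', 'here']

# re.escape guards against words that contain regex metacharacters
pattern = re.compile(
    r'</?title>|\b(?:' + '|'.join(map(re.escape, words)) + r')\b',
    flags=re.IGNORECASE,  # assumption: "A" should be removed as well as "a"
)


def clean(line):
    """Strip the title tags and listed words, then normalise whitespace."""
    stripped = pattern.sub('', line)
    # collapse the extra spaces left behind by removed words
    return ' '.join(stripped.split())


print(clean('<title>A general procedure to check conjunctive query containment.</title>'))
print(clean('<title>Queries with correction and equivalence.</title>'))
```

Compiling the pattern once avoids rebuilding it for every line, which may matter when the list of lines is long.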
