Duplicates in a web page (Python)

html_example = '''
<value> this is the updated value
Keyword "previous" that tell me I don't want the next value.
<valueIdontwant> this is the previous value
<value> this value has not been updated
<value> this is the updated value
Keyword "previous" that tell me I don't want the next value.
<valueIdontwant> this is the previous value
<value> this value has not been updated
'''

def get_values(content):
    values = []
    while True:
        start_value = content.find('<')
        end_value = content.find('>', start_value + 1)
        value = content[start_value + 1:end_value]
        if value:
            values.append(value)
            content = content[end_value:]
        else:
            break
    return values

get_values(html_example)

1 answer

User

#1 · Posted 2024-04-26 04:36:54

The code is rather convoluted and not very Pythonic, but if you want indexed access to the items of a list while looping over it, look up enumerate().
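As a minimal sketch of what enumerate() gives you: it yields (index, item) pairs, so a loop can look at the neighbouring line through its index. The sample lines and the names `lines` and `pairs` are illustrative assumptions, not taken from the question.

```python
# Hypothetical sample lines, loosely modelled on the question's data.
lines = [
    "<value> this is the updated value",
    'Keyword "previous" marks the next line as unwanted.',
    "<valueIdontwant> this is the previous value",
]

# enumerate() yields (index, item) pairs, so lines[i + 1] is the next line.
pairs = []
for i, line in enumerate(lines):
    next_line = lines[i + 1] if i + 1 < len(lines) else None
    pairs.append((i, line, next_line))

for i, line, next_line in pairs:
    print(i, repr(line), "->", repr(next_line))
```

The last entry has no successor, which is why the index is checked before reading `lines[i + 1]`.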

import re

def get_values_ignore_current_line(content, keyword):
    # Drop every line that contains the keyword, then collect the remaining tags.
    content = '\n'.join(x for x in content.splitlines() if keyword not in x)
    return re.findall('<.*?>', content)

def get_values_ignore_next_line(content, keyword):
    # Keep a line unless the line before it contains the keyword
    # and does not itself start with a tag.
    lines = content.splitlines()
    if not lines:
        return []
    new_content = [lines[0]]
    for i, line in enumerate(lines[:-1]):
        if keyword not in line or re.match('<.*?>', line) is not None:
            new_content.append(lines[i + 1])
    return re.findall('<.*?>', '\n'.join(new_content))
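The same "ignore the line after the keyword" idea can also be written with a simple flag instead of index arithmetic. This is an alternative sketch, not the code from the answer above; the function name `tags_skipping_line_after_keyword`, the constant `TAG_PATTERN`, and the `sample` text are assumptions made for illustration.

```python
import re

TAG_PATTERN = re.compile(r"<.*?>")  # non-greedy, so each tag is matched separately


def tags_skipping_line_after_keyword(content, keyword):
    """Collect tags, dropping each line that directly follows a keyword line."""
    tags = []
    skip_next = False
    for line in content.splitlines():
        if skip_next:
            # The previous line contained the keyword: ignore this line,
            # and do not let it trigger another skip.
            skip_next = False
            continue
        if keyword in line:
            # The keyword line itself is not scanned for tags.
            skip_next = True
            continue
        tags.extend(TAG_PATTERN.findall(line))
    return tags


sample = '''<value> this is the updated value
Keyword "previous" that tell me I don't want the next value.
<valueIdontwant> this is the previous value
<value> this value has not been updated'''

result = tags_skipping_line_after_keyword(sample, "previous")
print(result)  # ['<value>', '<value>']
```

Because a skipped line is never checked for the keyword, the word "previous" inside the unwanted line does not cause the following line to be dropped as well.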
