I have a file from which I want to remove every line that contains any of a set of patterns. Suppose the patterns are as follows:
lineRemovalPatterns = [
    "!DOCTYPE html",
    "<html",
    "<head",
    "<meta",
    "<title",
    "<link rel>",
    "</head>",
    "<body>",
    "</body>",
    "</html>"
]
How should I loop over the file and keep only the lines that do not contain any of these patterns?
HTMLGitFileContent = ""
HTMLSVNFileName = "README_SVN.html"
# Loop over the lines of the HTML SVN file, building the resultant Git file
# content. If any of the line removal patterns are in a line, remove that
# line.
HTMLSVNFile = open(HTMLSVNFileName, "r")
for line in HTMLSVNFile:
    for lineRemovalPattern in lineRemovalPatterns:
        if lineRemovalPattern not in line:
            HTMLGitFileContent = HTMLGitFileContent + "\n" + line
            break
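You can use ^{} instead of
lineRemovalPattern not in line
to exclude lines that contain any of the substrings to be removed. That said, I echo @doctorlove: a real DOM parser would probably serve you better. Don't go too far down this road!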
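To illustrate the parser-based route, here is a minimal sketch that keeps only the markup between <body> and </body>. It assumes that this is the goal of the line filtering, and it uses the standard library's event-based html.parser module as a stand-in for a full DOM parser. The names BodyExtractor and extract_body are hypothetical, and HTML comments inside the body are dropped by this sketch.

```python
from html.parser import HTMLParser


class BodyExtractor(HTMLParser):
    """Collect the markup that appears between <body> and </body>."""

    def __init__(self):
        # Keep entities such as &amp; unconverted so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.in_body = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body:
            # get_starttag_text() returns the start tag as written in the input.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, unchanged.
        if self.in_body:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif self.in_body:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.in_body:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.in_body:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.in_body:
            self.parts.append(f"&#{name};")


def extract_body(html_text):
    """Return the inner markup of the <body> element as a string."""
    parser = BodyExtractor()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

Unlike substring matching, this does not depend on each tag sitting on its own line, and it does not discard a content line that merely happens to mention one of the patterns.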
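The following approach negates the return value of the function any, applied to a list comprehension involving the current line and the list of patterns: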
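A minimal sketch of that idea is shown here. The helper name remove_matching_lines and the in-memory sample are assumptions made so the example runs without README_SVN.html; any iterable of lines, including an open file, can be passed instead.

```python
import io

lineRemovalPatterns = [
    "!DOCTYPE html", "<html", "<head", "<meta", "<title",
    "<link rel>", "</head>", "<body>", "</body>", "</html>",
]


def remove_matching_lines(lines, patterns):
    """Return the concatenation of all lines that contain none of the patterns."""
    kept = []
    for line in lines:
        # any(...) is True if at least one pattern occurs in the line,
        # so its negation keeps only lines that match no pattern.
        if not any([pattern in line for pattern in patterns]):
            kept.append(line)
    # Each line still carries its own newline, so no extra "\n" is added.
    return "".join(kept)


# Hypothetical sample standing in for the contents of the HTML file.
sample = io.StringIO(
    "<!DOCTYPE html>\n"
    "<html>\n"
    "<head>\n"
    "<title>README</title>\n"
    "</head>\n"
    "<body>\n"
    "<h1>Project</h1>\n"
    "<p>Some text.</p>\n"
    "</body>\n"
    "</html>\n"
)
HTMLGitFileContent = remove_matching_lines(sample, lineRemovalPatterns)
```

With a real file, the call would look like `with open(HTMLSVNFileName, "r") as f: HTMLGitFileContent = remove_matching_lines(f, lineRemovalPatterns)`. Dropping the square brackets inside any turns the list comprehension into a generator expression, which lets any stop at the first matching pattern.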