Regex and csv problems in Python 2.7

import csv, re, mechanize

# br is a mechanize browser that has already opened the page (setup not shown)
htmlML = br.response().read()

# escaping ? fixed the regex match
patMemberName = re.compile(r'<a href=/foo.php\?XID=(\d+) ><font color=#000000><b>(.*) </b>')
searchMemberName = re.findall(patMemberName, htmlML)

MembersCsv = 'path-to-csv'
MemberWriter = csv.writer(open(MembersCsv, 'wb'))  # adding b fixed the \n in csv

for i in searchMemberName:
    MemberWriter.writerow(i)
    print(i)
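On the csv side, the b in 'wb' matters because csv.writer ends each row with \r\n itself. In Python 2.7 a file opened in text mode on Windows translates the \n again, which produces the stray blank lines. A minimal sketch of the write step follows. The rows and the temp-file path are made-up stand-ins for the findall result and for 'path-to-csv', and the Python 3 branch is only there so the snippet runs on both versions.

```python
import csv
import os
import sys
import tempfile

# Hypothetical stand-in for the list of (id, name) tuples that re.findall returns
rows = [('123', 'user'), ('456', 'other')]

# Hypothetical output path in a temp directory, standing in for 'path-to-csv'
path = os.path.join(tempfile.mkdtemp(), 'members.csv')

if sys.version_info[0] == 2:
    out = open(path, 'wb')             # Python 2.7: binary mode
else:
    out = open(path, 'w', newline='')  # Python 3: text mode, no newline translation

with out:
    csv.writer(out).writerows(rows)

# Read the raw bytes back to check the line endings
with open(path, 'rb') as f:
    raw = f.read()

print(repr(raw))
```

Each row should end with exactly one \r\n, with no extra \r in front of it.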

2 answers

User

Answer 1 · edited 2024-06-16 14:09:51

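For problem 1), you have to escape the ? in the pattern. Unescaped, ? is a quantifier that makes the preceding p optional, so the literal ? in the URL is never matched: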

import re

htmlML = '<a href=/foo.php?XID=123 ><font color=#000000><b>user</b>'
# \? matches a literal question mark; the raw string keeps the backslashes intact
patMemberID = re.compile(r'<a href=/foo.php\?XID=(\d*) ><font color=#000000><b>user</b>')

searchMemberID = re.findall(patMemberID, htmlML)
print(len(searchMemberID))

for i in searchMemberID:
    print(i)

You can then extract 123 from the string.
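To make the difference concrete, here is a small side-by-side of the unescaped and escaped pattern on the same sample string. The shortened patterns and the variable names are just for illustration.

```python
import re

html = '<a href=/foo.php?XID=123 ><font color=#000000><b>user</b>'

# Unescaped: "?" makes the preceding "p" optional, so the regex expects
# "XID=" right after "/foo.ph" or "/foo.php" and never consumes the literal "?".
unescaped = re.findall(r'<a href=/foo.php?XID=(\d+)', html)

# Escaped: "\?" matches the literal question mark in the URL.
escaped = re.findall(r'<a href=/foo.php\?XID=(\d+)', html)

print(unescaped)  # []
print(escaped)    # ['123']
```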

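Problem 2a)

You can use (.*?) in place of some string; the ? after .* makes the match non-greedy, so it stops at the first possible end instead of the last.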
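A quick sketch of what non-greedy changes in practice, assuming two member names end up on the same line of HTML (the sample string is invented):

```python
import re

# Hypothetical input: two bold member names on one line
html = '<b>alice </b> ... <b>bob </b>'

# Greedy: (.*) runs to the last " </b>" on the line
greedy = re.findall(r'<b>(.*) </b>', html)

# Non-greedy: (.*?) stops at the first " </b>" after each "<b>"
lazy = re.findall(r'<b>(.*?) </b>', html)

print(greedy)  # ['alice </b> ... <b>bob']
print(lazy)    # ['alice', 'bob']
```

With the greedy version both names are swallowed into a single match, which is usually not what you want when scraping a member list.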

User

Answer 2 · edited 2024-06-16 14:09:51

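Unfortunately, I can't find a suitable escape sequence for Python right now. In other regex flavors you would normally wrap the part of the expression whose metacharacters should not be interpreted in "\Q…\E", but Python's re module does not support that.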

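Try wrapping the literal parts of the string in re.escape(string), and leave the capture group outside so it still works as a group. So:

re.compile(re.escape('<font color=#000000><b>') + '(.*)' + re.escape('</b>'))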
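One thing to watch with re.escape: it escapes every metacharacter it is given, including the parentheses, dot and star of a capture group. The sketch below compares escaping the whole pattern with escaping only the literal fragments. The sample string and variable names are made up for the comparison.

```python
import re

html = '<font color=#000000><b>user</b>'

# Escaping the whole pattern turns "(.*)" into literal characters,
# so it is no longer a capture group and the sample no longer matches.
whole = re.compile(re.escape('<font color=#000000><b>(.*)</b>'))

# Escaping only the literal fragments keeps the group working.
parts = re.compile(
    re.escape('<font color=#000000><b>') + r'(.*?)' + re.escape('</b>')
)

print(whole.findall(html))  # []
print(parts.findall(html))  # ['user']
```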
