Python regex: "object has no attribute"

8 votes

4 answers

27809 views

数据工程师

Asked 2025-04-15 14:40

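I'm putting together a list of pages that need to be updated with new content (we're switching media formats). Along the way I'm also recording the pages that have already been correctly updated with the new content.

Here is roughly what I'm doing:

Walk the file structure and get a list of files
For each file, read the contents into a buffer and use a regex to find a specific tag
If there is a match, run two additional regex matches
Write the resulting match (one or the other) to the database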
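
The steps above could be sketched roughly as follows in Python 3. This is only an illustration under assumptions: the function name `scan_tree`, the "old"/"new" labels, and the UTF-8 file encoding are hypothetical and not from the question, and the database write is left out because the question does not describe the schema. The three patterns are the ones from the question.

```python
import os
import re

# Patterns copied from the question and compiled once up front.
EMBED = re.compile(r"(<embed .*?</embed>)")
OLD_ROOT = re.compile(r'data="(http://.*?/media/.*?")')
NEW_ROOT = re.compile(r'data="(http://.*?/content/.*?")')


def scan_tree(root):
    """Yield (path, kind, url) for each embed that points at a known root.

    Hypothetical helper: the caller decides what to do with each row,
    for example writing it to a database.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: pages are text files readable as UTF-8.
            with open(path, encoding="utf-8", errors="replace") as handle:
                filebuffer = handle.read()
            for embed in EMBED.findall(filebuffer):
                for kind, pattern in (("old", OLD_ROOT), ("new", NEW_ROOT)):
                    found = pattern.search(embed)
                    # search() returns None when the pattern does not match,
                    # so check before calling group().
                    if found is not None:
                        # As written in the question, the group also
                        # captures the closing double quote.
                        yield path, kind, found.group(1)
```

Iterating over `scan_tree(some_directory)` then gives one row per matching embed, and an embed that matches neither pattern is skipped instead of raising an error.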

Everything works fine up to the third regex match, where I run into this problem:

'NoneType' object has no attribute 'group'

# only interested in embedded content
pattern = "(<embed .*?</embed>)"

# matches content pointing to our old root
pattern2 = 'data="(http://.*?/media/.*?")'

# matches content pointing to our new root
pattern3 = 'data="(http://.*?/content/.*?")'

matches = re.findall(pattern, filebuffer)
for match in matches:
    if len(match) > 0:

        urla = re.search(pattern2, match)
        if urla.group(1) is not None:
            print filename, urla.group(1)

        urlb = re.search(pattern3, match)
        if urlb.group(1) is not None:
            print filename, urlb.group(1)

Thanks.


4 Answers

I ran into the same problem.

If you are using Python 2.6, you can work around it like this:

for match in matches:
    if len(match) > 0:

        urla = re.search(pattern2, match)
        try:
            # urla is None when pattern2 did not match,
            # so calling .group raises AttributeError
            print filename, urla.group(1)
        except AttributeError:
            print "Problem with", pattern2

        urlb = re.search(pattern3, match)
        try:
            print filename, urlb.group(1)
        except AttributeError:
            print "Problem with", pattern3

Answered 2025-04-15 by Python大师


You get the AttributeError because re.search and re.match return either a MatchObject or None. Only one of those two has a group method, and it is not None. So you need to do this:

url = re.search(pattern2, match)
if url is not None:
    print(filename, url.group(0))
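
To make the two possible return values concrete, here is a small Python 3 sketch. The helper name `first_group` and the example URLs are made up for illustration; only `pattern2` comes from the question.

```python
import re

# pattern2 from the question: content pointing to the old root.
pattern2 = r'data="(http://.*?/media/.*?")'


def first_group(pattern, text):
    """Return group(1) of the first match, or None when nothing matches."""
    found = re.search(pattern, text)
    if found is None:
        return None
    return found.group(1)


# Hypothetical sample embeds, one per root.
old_embed = '<embed data="http://example.com/media/a.swf"></embed>'
new_embed = '<embed data="http://example.com/content/a.swf"></embed>'

print(first_group(pattern2, old_embed))
print(first_group(pattern2, new_embed))  # None: no "/media/" in this embed
```

Wrapping the check in one helper keeps the `is not None` test in a single place, which may be convenient when several patterns are tried against the same text.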

P.S. PEP-8 recommends indenting with 4 spaces. That is not just an opinion, it is good practice. Your code is fairly hard to read.

Answered 2025-04-15 by Python大师


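The error message is telling you that urla is None. The value of urla comes from re.search, so if urla is None, re.search returned None. And re.search returns None when the string does not match the pattern you gave it.

So basically you should use: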

urla = re.search(pattern2, match)
if urla is not None:
    print filename, urla.group(1)

in place of what you have now.
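
The failure described in this answer can be reproduced in a few lines of Python 3. The sample embed string is an assumption chosen so that `pattern2` from the question cannot match it.

```python
import re

# pattern2 from the question: content pointing to the old root.
pattern2 = r'data="(http://.*?/media/.*?")'

# Hypothetical embed that points to the new root, so pattern2 finds nothing.
embed = '<embed data="http://example.com/content/a.swf"></embed>'

urla = re.search(pattern2, embed)
print(urla)  # None

try:
    urla.group(1)  # attribute lookup on None fails before group() runs
except AttributeError as exc:
    message = str(exc)
    print(message)
```

The exception is raised by the attribute lookup itself, which is why testing `urla.group(1) is not None` cannot help: the check has to be on `urla`.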

Answered 2025-04-15 by Python大师

