Unable to sum integers extracted from an HTML file

import urllib
from BeautifulSoup import *
import re

counter = 0
added = 0
url = "http://python-data.dr-chuck.net/comments_42.html"
html = urllib.urlopen(url).read()
soup = BeautifulSoup(html)

# Retrieve all of the span tags
spans = soup('span')
for comments in spans:
    print comments
    counter += 1
    #y = re.findall('(\d+)', comments) -- didnt work
    #print y
    #added += y

y = re.findall('(\d+)', str(soup))
print y
b = sum(y)
print b
print "Count", counter
print "Sum", added

2 Answers

User

Answer #1 · edited 2024-04-20 03:39:16

You are trying to sum strings. Convert the strings to integers before summing, as Pynchia said, and then print b as the Sum.

...
b = sum(map(int, y))
...
print "Count", counter
print "Sum", b

To fix the commented-out part inside the loop instead, use:

...
y = re.findall('(\d+)', str(comments))
print y
added += sum(map(int, y))  # accumulate across spans; plain = would keep only the last span's value
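For reference, here is a minimal, self-contained sketch of the whole per-span loop. It is written for Python 3 (the question uses Python 2), and the `spans` list is a hypothetical stand-in for the result of `soup('span')`, so the values are made up for illustration and are not the contents of the real page.

```python
import re

# Hypothetical stand-in for the <span> tags returned by soup('span');
# the numbers are illustrative, not taken from the real page.
spans = [
    '<span class="comments">97</span>',
    '<span class="comments">3</span>',
    '<span class="comments">42</span>',
]

counter = 0
added = 0
for comments in spans:
    counter += 1
    # re.findall returns a list of strings, so convert each one to int
    y = re.findall(r'\d+', str(comments))
    # += keeps a running total over every span
    added += sum(map(int, y))

print("Count", counter)
print("Sum", added)
```

Calling `str(comments)` matters with the real BeautifulSoup objects, because `re.findall` expects a string rather than a tag object, which is likely why the commented-out attempt in the question failed.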

User

Answer #2 · edited 2024-04-20 03:39:16

Quoting the Python documentation:

re.findall(pattern, string, flags=0)
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found.
If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.

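This expression:

y = re.findall('(\d+)', str(soup)) returns a list of every string that matches the pattern (\d+), that is, the runs of digits. So what you have is a list of strings.

Then

b = sum(y) tries to add strings rather than integers, which is why you get the error message.

Try this instead:

b = sum(map(int, y)), which converts each numeric string in y to an integer and then adds them up.

Demo: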

>>> import re
>>> s = 'Today is 31st, December, Temperature is 18 degC'
>>> y = re.findall('(\d+)', s)
>>> y
['31', '18']
>>> b = sum(map(int, y))
>>> b
49
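The same demo can be run as a script that shows both the failure and the fix. This is a sketch for Python 3, not the asker's original Python 2 environment; the generator expression is just an alternative spelling of `map(int, y)`.

```python
import re

s = 'Today is 31st, December, Temperature is 18 degC'
y = re.findall(r'\d+', s)   # list of digit strings

# sum() starts from the integer 0, so adding a string to it raises TypeError
try:
    sum(y)
    failed = False
except TypeError as exc:
    failed = True
    print("sum(y) failed:", exc)

# Convert each string to int first, then sum
total = sum(int(n) for n in y)
print(total)
```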
