Getting structured data from HTML with Python and Beautiful Soup

Posted 2024-05-16 09:16:19


I'm new to Python. I want to get the following result from the code below:

Score      Positive       Negative
  5         good            bad
  7       interesting
  3                       horrible

But my code outputs nothing. Can anyone tell me where the problem is?

from bs4 import BeautifulSoup
text = """
... <body>
    <div class="review">
        <p class="pos">good</p>
        <p class="neg">bad</p>
    </div>
    <div class="review">
        <p class="pos">interesting</p>
    </div>
    <div class="review">
        <p class="neg">horrible</p>
    </div>
... </body>"""
soup = BeautifulSoup(text)
for parent in soup.find_all('div', attrs={'class': 'review'}):
    if parent.findNextSiblings('p', attrs={'class': 'pos'}):
        postive.append(parent.get_text())
    else:
        postive.append("")
    if parent.findNextSiblings('p', attrs={'class': 'neg'}):
        negtive.append(parent.get_text())
    else:
        negtive.append("")


1 answer

User

#1 · Posted 2024-05-16 09:16:19

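The p tags are not siblings of the div tags with class review; they are children of those divs. So search inside each div with find() instead of looking at its next siblings: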

positive = []
negative = []
for div in soup.find_all('div', attrs={'class': 'review'}):
    pos = div.find('p', {'class': 'pos'})
    positive.append(pos.get_text() if pos else '')

    neg = div.find('p', {'class': 'neg'})
    negative.append(neg.get_text() if neg else '')

print(positive)
print(negative)

Output:

['good', 'interesting', '']
['bad', '', 'horrible']
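
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The names HTML, extract_reviews and sibling_matches are made up for illustration, and "html.parser" is just the standard-library parser chosen so that no extra install is needed. The Score column from the question is left out because the sample HTML does not contain any scores. The second function shows why the original loop found nothing: none of the review divs has a p tag as a following sibling.

```python
from bs4 import BeautifulSoup

# Sample markup from the question (the elided parts are left out).
HTML = """
<body>
    <div class="review">
        <p class="pos">good</p>
        <p class="neg">bad</p>
    </div>
    <div class="review">
        <p class="pos">interesting</p>
    </div>
    <div class="review">
        <p class="neg">horrible</p>
    </div>
</body>"""


def extract_reviews(html):
    """Return one (positive, negative) tuple per review div; '' when a tag is missing."""
    # Naming the parser explicitly avoids Beautiful Soup's "no parser specified" warning.
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for div in soup.find_all("div", class_="review"):
        # find() searches the div's descendants, which is where the <p> tags live.
        pos = div.find("p", class_="pos")
        neg = div.find("p", class_="neg")
        rows.append((
            pos.get_text(strip=True) if pos else "",
            neg.get_text(strip=True) if neg else "",
        ))
    return rows


def sibling_matches(html):
    """Count <p class="pos"> tags that follow each review div as siblings."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        len(div.find_next_siblings("p", class_="pos"))
        for div in soup.find_all("div", class_="review")
    ]


rows = extract_reviews(HTML)
print(f"{'Positive':<15}{'Negative':<15}")
for pos, neg in rows:
    print(f"{pos:<15}{neg:<15}")
```

Keeping each review as one tuple, rather than two parallel lists, makes it harder for the positive and negative columns to drift out of step if another field is added later.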
