Extracting HTML content with Python

Posted 2024-04-18 11:27:27


Hi, I am using the Beautiful Soup library to parse the content of an HTML page.

I use the following script to reach the part of the page I want:

review_list = soup.find(class_="review_list_score_breakdown_right")

<span class=" review_list_score_breakdown_right">
  <ul class="review_score_breakdown_list list_tighten clearfix" data-et-view="bLTQHcXJVNRCSPOMcAQJO:1 bLTQHcXJVNRCSPOMcAQJO:3 " id="review_list_score_breakdown">
    <li class="clearfix one_col" data-question="hotel_clean">
      <p class="review_score_name"> Cleanliness </p>
      <div class="score_bar"> <div class="score_bar_value" data-score="100" style="width: 100%;"> </div> </div>
      <p class="review_score_value"> 10 </p>
    </li>
    <li class="clearfix one_col" data-question="hotel_comfort">
      <p class="review_score_name"> Comfort </p>
      <div class="score_bar"> <div class="score_bar_value" data-score="100" style="width: 100%;"> </div> </div>
      <p class="review_score_value"> 10 </p>
    </li>
    <li class="clearfix one_col" data-question="hotel_services">
      <p class="review_score_name"> Facilities </p>
      <div class="score_bar"> <div class="score_bar_value" data-score="100" style="width: 100%;"> </div> </div>
      <p class="review_score_value"> 10 </p>
    </li>
    <li class="clearfix one_col" data-question="hotel_staff">
      <p class="review_score_name"> Staff </p>
      <div class="score_bar"> <div class="score_bar_value" data-score="100" style="width: 100%;"> </div> </div>
      <p class="review_score_value"> 10 </p>
    </li>
    <li class="clearfix one_col" data-question="hotel_value">
      <p class="review_score_name"> Value for money </p>
      <div class="score_bar"> <div class="score_bar_value" data-score="100" style="width: 100%;"> </div> </div>
      <p class="review_score_value"> 10 </p>
    </li>
    <li class="clearfix one_col" data-question="hotel_wifi">
      <p class="review_score_name"> Free WiFi </p>
      <div class="score_bar"> <div class="score_bar_value" data-score="100" style="width: 100%;"> </div> </div>
      <p class="review_score_value"> 10 </p>
    </li>
    <li class="clearfix one_col" data-question="hotel_location">
      <p class="review_score_name"> Location </p>
      <div class="score_bar"> <div class="score_bar_value" data-score="100" style="width: 100%;"> </div> </div>
      <p class="review_score_value"> 10 </p>
    </li>
  </ul>
</span>

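I need to extract the scores from the elements that carry a data-question attribute. For example, to get the hotel comfort score I need to access data-question="hotel_confort". I have tried the find() function, but it does not work.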


2 answers

I think what you need is an attrs lookup query. Your question is similar to "Extracting an attribute value with beautifulsoup".

Here is how it applies to your specific case.

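review = soup.find(class_="review_list_score_breakdown_right")
# Match the <li> whose data-question is "hotel_comfort" (underscore, not hyphen).
item = review.find(attrs={"data-question": "hotel_comfort"})
# The <li> has no "value" attribute; the score is the text of the nested <p>.
output = item.find(class_="review_score_value").get_text(strip=True)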

It has been a while since I last used bs4, so please debug the code.

Edit: here is some working code based on the sample string.

review = soup.find('span', {'class' : "review_list_score_breakdown_right"})
items = review.find_all(attrs={"data-question": "hotel_comfort"})
print(items)  # prints the matching HTML elements, which you can drill into further
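
To go one level further from the matched element, a minimal sketch along these lines may help. It assumes the visible score is the text of the nested <p class="review_score_value"> and that the bar's data-score attribute holds the percentage, as in the sample HTML from the question. The shortened `html` string and the variable names are illustrative only and not part of the original answer.

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample markup from the question (illustrative only).
html = """
<span class=" review_list_score_breakdown_right">
<ul id="review_list_score_breakdown">
<li class="clearfix one_col" data-question="hotel_comfort">
<p class="review_score_name"> Comfort </p>
<div class="score_bar"> <div class="score_bar_value" data-score="100" style="width: 100%;"> </div> </div>
<p class="review_score_value"> 10 </p>
</li>
</ul>
</span>
"""

soup = BeautifulSoup(html, "html.parser")
review = soup.find("span", {"class": "review_list_score_breakdown_right"})

# find() returns the first matching tag, or None if nothing matches.
item = review.find(attrs={"data-question": "hotel_comfort"})

# The visible score is text inside a nested <p>, not an attribute of the <li>.
score = item.find(class_="review_score_value").get_text(strip=True)

# Attributes such as data-score are read with dictionary-style access.
bar_score = item.find(class_="score_bar_value")["data-score"]

print(score, bar_score)
```

Both values come back as strings, so convert them with int() or float() if you need numbers.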

The HTML contains no hotel_confort value; the attribute is spelled hotel_comfort.

    review = soup.find(class_="review_list_score_breakdown_right")
    hotel = review.find(attrs={"data-question" : "hotel_comfort"})

This code returns

<li class="clearfix one_col" data-question="hotel_comfort"> ..... </li>
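
If every score is needed rather than a single one, a sketch like the following could collect them into a dictionary keyed by the data-question value. It assumes each <li> holds its score in a nested <p class="review_score_value">, as in the question's sample; the trimmed `html` string and the `scores` name are hypothetical.

```python
from bs4 import BeautifulSoup

# Trimmed version of the question's markup with two entries (illustrative only).
html = """
<ul id="review_list_score_breakdown">
<li class="clearfix one_col" data-question="hotel_clean">
<p class="review_score_name"> Cleanliness </p> <p class="review_score_value"> 10 </p>
</li>
<li class="clearfix one_col" data-question="hotel_comfort">
<p class="review_score_name"> Comfort </p> <p class="review_score_value"> 10 </p>
</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

scores = {}
# attrs={"data-question": True} matches any <li> that has the attribute at all.
for li in soup.find_all("li", attrs={"data-question": True}):
    value = li.find(class_="review_score_value")
    if value is not None:
        scores[li["data-question"]] = value.get_text(strip=True)

print(scores.get("hotel_comfort"))  # correctly spelled key
print(scores.get("hotel_confort"))  # misspelled key is simply absent
```

Looking keys up with dict.get() makes a misspelled name such as hotel_confort show up as None instead of raising an exception.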
