How to get the LABEL and ID information from the HTML below using Python Beautiful Soup

Posted 2024-05-16 23:20:20


How can I extract the ID and LABEL values (for example 10870, 7th Phase JP Nagar) from the HTML below?

<input id="filter_data" type="hidden" value="{&quot;Locality&quot; :{&quot;Top_Results_Array&quot; :{&quot;0&quot; :{&quot;ID&quot;:&quot;10870&quot;,&quot;LABEL&quot;:&quot;7th Phase JP Nagar&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:202.0},&quot;1&quot; :{&quot;ID&quot;:&quot;2259&quot;,&quot;LABEL&quot;:&quot;Electronic City&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:126.0},&quot;2&quot; :{&quot;ID&quot;:&quot;2265&quot;,&quot;LABEL&quot;:&quot;Koramangala&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:118.0},&quot;3&quot; :{&quot;ID&quot;:&quot;11646&quot;,&quot;LABEL&quot;:&quot;BTM 2nd Stage&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:118.0}},&quot;More_Locality_Array&quot; :{&quot;0&quot; :{&quot;ID&quot;:&quot;2277&quot;,&quot;LABEL&quot;:&quot;Bellandur&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:102.0},&quot;1&quot; :{&quot;ID&quot;:&quot;5467&quot;,&quot;LABEL&quot;:&quot;Hulimavu&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:95.0},&quot;2&quot; :{&quot;ID&quot;:&quot;2261&quot;,&quot;LABEL&quot;:&quot;HSR Layout&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:94.0},&quot;3&quot;: :{&quot;ID&quot;:&quot;2293&quot;,&quot;LABEL&quot;:&quot;Jigani&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:91.0},&quot;4&quot; :{&quot;ID&quot;:&quot;2249&quot;,&quot;LABEL&quot;:&quot;Bannerghatta Road&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:83.0},&quot;5&quot; :{&quot;ID&quot;:&quot;2264&quot;,&quot;LABEL&quot;:&quot;Kanakpura Road&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:83.0},&quot;6&quot;:

I tried using Python code to get the value of the input element (id=filter_data).

My output is as follows:

{"Locality":{"Top_Results_Array":{ "0":{"ID":"10870","Locality":"7th Phase JP Nagar","SELECTED":"","COUNT":202.0} ,"1":{"ID":"2259","LABEL":"Electronic City","SELECTED":"","COUNT":126.0} ,"2":{"ID":"2265","LABEL":"Koramangala","SELECTED":"","COUNT":118.0} ,"3":{"ID":"11646","LABEL":"BTM 2nd Stage","SELECTED":"","COUNT":118.0}} ,"More_Locality_Array":{"0":{ "ID":"2277","LABEL":"Bellandur","SELECTED":"","COUNT":102.0} ,"1":{"ID":"5467","LABEL":"Hulimavu","SELECTED":"","COUNT":95.0} ,"2":{"ID":"2261","LABEL":"HSR Layout","SELECTED":"","COUNT":94.0} ,"3":{"ID":"2293","LABEL":"Jigani","SELECTED":"","COUNT":91.0} ,"4":{"ID":"2249","LABEL":"Bannerghatta Road","SELECTED":"","COUNT":83.0} ,"5":{"ID":"2264","LABEL":"Kanakpura Road","SELECTED":"","COUNT":83.0}

But I need the output below:

10870 7th Phase JP Nagar

2259 Electronic City

2265 Koramangala

11646 BTM 2nd Stage

2277 Bellandur

5467 Hulimavu

2261 HSR Layout

...

Could you please help me?


Tags: code id data count filter array label jp
1 Answer
User
#1 · Posted 2024-05-16 23:20:20

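One approach I can suggest is to parse your result as JSON and then extract the information you need. The problem is that the output is a unicode string rather than a data structure. Once you have the value, you can try the code below and pull out the data in whatever way suits you: json.loads gives you dicts and lists, and you can read the values from them as needed.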

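import json

# `soup` is the BeautifulSoup object already built from the page HTML
exp = soup.find_all('input', attrs={"id": "filter_data"})
abc = exp[0].get('value')  # len(exp) == 1; &quot; entities are already converted to "
# The value is already a text string, so no .decode('utf-8') is needed
# (in Python 3, str has no decode method and the call would raise AttributeError)
result = json.loads(abc)
result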
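
To try the JSON step without the original page, here is a minimal sketch. It assumes a shortened, hypothetical copy of the value attribute (closed properly, since the sample in the question is cut off) and uses the standard library's html.unescape as a stand-in for the entity conversion that BeautifulSoup performs when it returns an attribute value.

```python
import html
import json

# Shortened, hypothetical copy of the hidden input's raw value attribute.
# It is closed properly so that it forms valid JSON.
raw_value = (
    "{&quot;Locality&quot;:{&quot;Top_Results_Array&quot;:{"
    "&quot;0&quot;:{&quot;ID&quot;:&quot;10870&quot;,"
    "&quot;LABEL&quot;:&quot;7th Phase JP Nagar&quot;,"
    "&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:202.0}}}}"
)

# Convert &quot; back into double quotes, giving plain JSON text
json_text = html.unescape(raw_value)

# Parse the JSON text into nested dicts
result = json.loads(json_text)

first = result["Locality"]["Top_Results_Array"]["0"]
print(first["ID"], first["LABEL"])
```

With BeautifulSoup itself the unescape step is not needed, because the value returned by .get('value') already contains plain double quotes.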

If you want to see the result values under Locality, look up that key in result.

Look through the dictionary and decide what you want.

dict(result)

Play around with the JSON and you will get what you want. I hope this helps.
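
For the "ID LABEL" lines asked for in the question, something along these lines may work. This is only a sketch: the result dict below is a hypothetical, shortened stand-in for what json.loads returns, the helper name id_label_lines is made up for illustration, and it assumes every group under Locality is a dict keyed by numeric strings, as in the sample data.

```python
# Hypothetical, shortened stand-in for the dict returned by json.loads
result = {
    "Locality": {
        "Top_Results_Array": {
            "0": {"ID": "10870", "LABEL": "7th Phase JP Nagar", "SELECTED": "", "COUNT": 202.0},
            "1": {"ID": "2259", "LABEL": "Electronic City", "SELECTED": "", "COUNT": 126.0},
        },
        "More_Locality_Array": {
            "0": {"ID": "2277", "LABEL": "Bellandur", "SELECTED": "", "COUNT": 102.0},
        },
    }
}


def id_label_lines(data):
    """Return 'ID LABEL' strings for every entry under data['Locality']."""
    lines = []
    # Each group is one of Top_Results_Array, More_Locality_Array, ...
    for group in data["Locality"].values():
        # Keys are numeric strings ("0", "1", ...), so sort them as integers
        for key in sorted(group, key=int):
            entry = group[key]
            lines.append(f"{entry['ID']} {entry['LABEL']}")
    return lines


lines = id_label_lines(result)
print("\n".join(lines))
```

Sorting the keys as integers keeps "10" after "9" if a group ever has more than ten entries.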
