ScraperWiki Python loop issue

import scraperwiki
import requests
import lxml.html

html = requests.get("http://www.store.com/us/a/productDetail/a/910271.htm").content
dom = lxml.html.fromstring(html)

for entry in dom.cssselect('.downloads'):
    document = {
        'title': entry.cssselect('a')[0].text_content(),
        'url': entry.cssselect('a')[0].get('href')
    }
    print document

1 Answer

User

#1 · Posted 2024-06-16 09:36:27

You need to iterate over the a tags inside the div with class downloads, rather than over the div itself:

for entry in dom.cssselect('.downloads a'):
    document = {
        'title': entry.text_content(),
        'url': entry.get('href')
    }
    print document

This prints:

{'url': '/webassets/kpna/catalog/pdf/en/1012741_4.pdf', 'title': 'Rough In/Spec Sheet'}
{'url': '/webassets/kpna/catalog/pdf/en/1012741_2.pdf', 'title': 'Installation and Care Guide with Service Parts'}
{'url': '/webassets/kpna/catalog/pdf/en/1204921_2.pdf', 'title': 'Installation and Care Guide without Service Parts'}
{'url': '/webassets/kpna/catalog/pdf/en/1011610_2.pdf', 'title': 'Installation Guide without Service Parts'}
