I am scraping data from a website using Scrapy in Python.
The data I need is inside a script tag, as shown below:
<script type="text/javascript">
getDetailsfrmBean("storePg","564","Berwyn, IL","7180 W CERMAK RD.","SPACE A1","","BERWYN","IL","US","60402","(708) 788-5097","{Monday-Saturday=10-9,sunday=11-6}","41.8507029","-87.8033709");
</script>
I can get the following using XPath:
item['lat'] = tree.xpath('//script[@type="text/javascript"]/text()').extract()[0].encode('utf-8')
item['long'] = tree.xpath('//script[@type="text/javascript"]/text()').extract()[0].encode('utf-8')
This gives:
item['lat'] = 'getDetailsfrmBean("storePg","564","Berwyn, IL","7180 W CERMAK RD.","SPACE A1","","BERWYN","IL","US","60402","(708) 788-5097","{Monday-Saturday=10-9,sunday=11-6}","41.8507029","-87.8033709");'
item['long'] = 'getDetailsfrmBean("storePg","564","Berwyn, IL","7180 W CERMAK RD.","SPACE A1","","BERWYN","IL","US","60402","(708) 788-5097","{Monday-Saturday=10-9,sunday=11-6}","41.8507029","-87.8033709");'
But how can I parse this content so that:
item['lat'] is equal to "41.8507029"
item['long'] is equal to "-87.8033709"
item['city'] is equal to "BERWYN"
item['state'] is equal to "IL"
Can I get suggestions for solving this problem?
Try this with the re module. It should produce the output you are after.
Adapted from the answer here: https://stackoverflow.com/a/23720594/5907969
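A minimal sketch of the re approach, assuming every argument is a double-quoted string with no escaped quotes inside it. The variable names script_text and args are illustrative, and a plain dict stands in for the Scrapy item:

```python
import re

# Script text copied from the question.
script_text = 'getDetailsfrmBean("storePg","564","Berwyn, IL","7180 W CERMAK RD.","SPACE A1","","BERWYN","IL","US","60402","(708) 788-5097","{Monday-Saturday=10-9,sunday=11-6}","41.8507029","-87.8033709");'

# Capture the contents of every double-quoted argument, including empty ones.
args = re.findall(r'"([^"]*)"', script_text)

# A plain dict stands in for the Scrapy item here (assumption).
item = {}
item['city'] = args[6]
item['state'] = args[7]
item['lat'] = args[-2]   # second-to-last argument
item['long'] = args[-1]  # last argument
print(item)
```

The positions 6 and 7 come from counting the arguments in the sample call; if other pages pass a different number of arguments, the indices would need to be checked.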
Because this call is also valid Python syntax, we can use the ast module. The arguments are all string literals, which makes things even simpler.
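One way this could look, as a sketch: parse the call with ast.parse, then evaluate each argument node with ast.literal_eval. It assumes the script text contains only this single call followed by a semicolon; script_text and args are illustrative names.

```python
import ast

# Script text copied from the question.
script_text = 'getDetailsfrmBean("storePg","564","Berwyn, IL","7180 W CERMAK RD.","SPACE A1","","BERWYN","IL","US","60402","(708) 788-5097","{Monday-Saturday=10-9,sunday=11-6}","41.8507029","-87.8033709");'

# Drop surrounding whitespace and the trailing ';', then parse as one expression.
call = ast.parse(script_text.strip().rstrip(';'), mode='eval').body

# Every argument is a string literal, so literal_eval evaluates it safely
# without executing any code.
args = [ast.literal_eval(node) for node in call.args]

print(call.func.id)        # name of the called function
print(args[-2], args[-1])  # latitude and longitude
print(args[6], args[7])    # city and state
```

Nothing is executed here: the function name is only read from the syntax tree, so getDetailsfrmBean does not need to exist in Python.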
You can use a simple regular expression to extract the comma-separated, quoted argument string from the call.
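For illustration, a pattern along these lines would pull out everything between the parentheses. It assumes the function name getDetailsfrmBean is fixed and that the call ends with ");"; arg_string is a hypothetical name.

```python
import re

# Script text copied from the question.
script_text = 'getDetailsfrmBean("storePg","564","Berwyn, IL","7180 W CERMAK RD.","SPACE A1","","BERWYN","IL","US","60402","(708) 788-5097","{Monday-Saturday=10-9,sunday=11-6}","41.8507029","-87.8033709");'

# Greedy match between the opening parenthesis and the final ');'.
match = re.search(r'getDetailsfrmBean\((.*)\);', script_text, re.DOTALL)
arg_string = match.group(1) if match else ''
print(arg_string)
```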
There are then several ways to parse a list of strings out of that data.
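Three possible options, sketched with the standard library only: treat the argument string as a JSON array, as a Python list literal, or as a single CSV row. The arg_string value is the argument list from the question; all variable names are illustrative.

```python
import ast
import csv
import json

# Comma-separated, quoted arguments taken from the call in the question.
arg_string = '"storePg","564","Berwyn, IL","7180 W CERMAK RD.","SPACE A1","","BERWYN","IL","US","60402","(708) 788-5097","{Monday-Saturday=10-9,sunday=11-6}","41.8507029","-87.8033709"'

# Option 1: wrap in brackets and parse as a JSON array of strings.
via_json = json.loads('[' + arg_string + ']')

# Option 2: wrap in brackets and evaluate as a Python list literal.
via_ast = ast.literal_eval('[' + arg_string + ']')

# Option 3: read it as one CSV row; quoted fields may contain commas.
via_csv = next(csv.reader([arg_string]))

lat, lng = via_json[-2], via_json[-1]
city, state = via_json[6], via_json[7]
print(lat, lng, city, state)
```

All three give the same list for this sample. They can diverge on other inputs, for example when a value contains backslash escapes, so it is worth checking against real pages before settling on one.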