ruia_motor: a Ruia plugin that uses Motor to store data
ruia-motor
A Ruia plugin that uses Motor to store data in MongoDB.
Notice: Works on ruia >= 0.5.0
Installation
pip install -U ruia-motor
Usage
ruia-motor automatically stores the data in MongoDB:
from ruia import AttrField, Item, Spider, TextField
from ruia_motor import RuiaMotor


class DoubanItem(Item):
    target_item = TextField(css_select='div.item')
    title = TextField(css_select='span.title')
    cover = AttrField(css_select='div.pic>a>img', attr='src')
    abstract = TextField(css_select='span.inq', default='')

    async def clean_title(self, title):
        if isinstance(title, str):
            return title
        else:
            return ''.join([i.text.strip().replace('\xa0', '') for i in title])


class DoubanSpider(Spider):
    start_urls = ['https://movie.douban.com/top250']
    mongodb_config = {
        'host': '127.0.0.1',
        'port': 27017,
        'db': 'ruia_motor'
    }

    async def parse(self, response):
        etree = response.html_etree
        pages = ['?start=0&filter='] + [i.get('href') for i in etree.cssselect('.paginator>a')]
        for index, page in enumerate(pages):
            url = self.start_urls[0] + page
            yield self.request(
                url=url,
                metadata={'index': index},
                callback=self.parse_item
            )

    async def parse_item(self, response):
        async for item in DoubanItem.get_items(html=response.html):
            data = item.results
            yield RuiaMotor(collection='douban250', data=data)


async def init_plugins_after_start(spider_ins):
    RuiaMotor.init_spider(spider_ins=spider_ins)


if __name__ == '__main__':
    DoubanSpider.start(after_start=init_plugins_after_start)
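The mongodb_config dict above holds a host, a port and a database name. As a rough illustration of how such a dict corresponds to a standard MongoDB connection URI, here is a minimal sketch. The helper build_mongodb_uri and the optional 'username' and 'password' keys are assumptions made for this example; they are not part of ruia-motor, and the plugin may build its connection differently.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(config):
    """Build a MongoDB connection URI from a mongodb_config-style dict.

    Hypothetical helper for illustration only; not part of ruia-motor.
    """
    host = config.get('host', '127.0.0.1')
    port = config.get('port', 27017)
    db = config.get('db', '')

    credentials = ''
    # 'username' and 'password' are assumed optional keys, not taken from the README.
    if config.get('username') and config.get('password'):
        # Percent-encode credentials so characters such as '@' or ':' stay valid in the URI.
        credentials = '{}:{}@'.format(
            quote_plus(config['username']),
            quote_plus(config['password']),
        )

    return 'mongodb://{}{}:{}/{}'.format(credentials, host, port, db)


if __name__ == '__main__':
    print(build_mongodb_uri({'host': '127.0.0.1', 'port': 27017, 'db': 'ruia_motor'}))
```

With the configuration from the spider above, the helper yields mongodb://127.0.0.1:27017/ruia_motor, which can help when checking the same database by hand from another MongoDB client.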
Enjoy it :)