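I have built a crawler with Scrapy. It crawls a website and collects the links it finds.

**Technologies used:** Python, Scrapy

**Problem:** The crawler is collecting relative URLs, so it cannot go on to crawl those pages. I want the crawler to collect only absolute URLs. Please help.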
import os

import scrapy


class MySpider(scrapy.Spider):
    name = 'feed_exporter_test'
    # this is equivalent to what you would set in settings.py file
    custom_settings = {
        'FEED_FORMAT': 'csv',
        'FEED_URI': 'file1.csv'
    }

    # Remove the output of a previous run so the CSV starts empty
    filePath = 'file1.csv'
    if os.path.exists(filePath):
        os.remove(filePath)
    else:
        print("Can not delete the file as it doesn't exists")

    start_urls = ['https://www.jamoona.com/']

    def parse(self, response):
        # Each href is yielded exactly as written in the page, so relative links stay relative
        titles = response.xpath("//a/@href").extract()
        for title in titles:
            yield {'title': title}
Answer:
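The `parse` method in the question yields every `href` exactly as it is written in the page, which is why relative links such as `/some/path` end up in the CSV. One possible approach is to resolve each link against the page URL before yielding it. The sketch below shows that resolution step with the standard library only, so it runs without Scrapy. The helper name `to_absolute` and the sample paths are made up for illustration and are not taken from the site.

```python
from urllib.parse import urljoin, urlparse


def to_absolute(base_url, hrefs):
    """Resolve each href against base_url and keep only http(s) links.

    `to_absolute` is a hypothetical helper name used for this sketch.
    """
    absolute = []
    for href in hrefs:
        # urljoin leaves absolute URLs untouched and resolves relative ones
        url = urljoin(base_url, href.strip())
        # Drop non-web links such as mailto: or javascript:
        if urlparse(url).scheme in ("http", "https"):
            absolute.append(url)
    return absolute


if __name__ == "__main__":
    # Sample hrefs are assumptions, chosen only to show each case
    sample = [
        "/collections/all",
        "about-us",
        "https://www.jamoona.com/cart",
        "mailto:info@example.com",
    ]
    for url in to_absolute("https://www.jamoona.com/", sample):
        print(url)
```

Inside the spider the same effect should be available through Scrapy's own `response.urljoin(href)`, which resolves the link against the response's URL. For example, the last line of `parse` could become `yield {'title': response.urljoin(title)}`.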