<p>This seems to work for me:</p>
<pre><code># -*- coding: utf-8 -*-
# Written for Python 2: urllib.unquote moved to urllib.parse in Python 3.
import urllib

import scrapy


class SimpleItem(scrapy.Item):
    name = scrapy.Field()
    url = scrapy.Field()


class CitiesSpider(scrapy.Spider):
    name = "cities"
    allowed_domains = ["sistercity.info"]
    start_urls = (
        'http://en.sistercity.info/countries/de.html',
    )

    def parse(self, response):
        # Emit one item per anchor, with its href percent-decoded to Unicode.
        for a in response.css('a'):
            item = SimpleItem()
            item['name'] = a.css('::text').extract_first()
            item['url'] = urllib.unquote(
                a.css('::attr(href)').extract_first().encode('ascii')
            ).decode('utf8')
            yield item
</code></pre>
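<p>For reference, the decoding step in <code>parse</code> can be tried on its own. This is a minimal sketch for Python 3, where <code>unquote</code> lives in <code>urllib.parse</code> and returns UTF-8-decoded text directly. The helper name <code>decode_href</code> and the sample path are made up for illustration.</p>

```python
from urllib.parse import unquote


def decode_href(href):
    """Turn a percent-encoded href into readable Unicode text.

    Returns None unchanged, since an anchor may have no href attribute.
    """
    if href is None:
        return None
    # unquote() interprets the percent-escapes as UTF-8 by default.
    return unquote(href)


# Hypothetical href; %C3%B6 is the UTF-8 encoding of "ö".
print(decode_href("/cities/K%C3%B6ln.html"))
```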
<p>Using the feed exporter mentioned in your question, it also worked with a different storage.</p>
<p>(Uncomment the <code>FEED_STORAGES</code> lines if necessary.)</p>
<pre><code>FEED_EXPORTERS = {
    'json': 'myproj.exporter.UnicodeJsonLinesItemExporter',
}
#FEED_STORAGES = {
#    '': 'myproj.exporter.CustomFileFeedStorage',
#}
FEED_FORMAT = 'json'
FEED_URI = "out.json"
</code></pre>
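<p>The source of <code>UnicodeJsonLinesItemExporter</code> is not shown here. Assuming it does what its name suggests, that is, writes one JSON object per line without escaping non-ASCII characters, the core idea can be sketched with the standard library alone. The function <code>export_json_lines</code> and the sample item are hypothetical and are not part of Scrapy.</p>

```python
import io
import json


def export_json_lines(items, stream):
    """Write each item as one JSON object per line, keeping Unicode readable.

    This is an illustration of the presumed behavior, not Scrapy's exporter API.
    """
    for item in items:
        # ensure_ascii=False keeps "ö" as-is instead of emitting "\u00f6".
        stream.write(json.dumps(item, ensure_ascii=False) + "\n")


buf = io.StringIO()
export_json_lines([{"name": "Köln", "url": "/cities/Köln.html"}], buf)
print(buf.getvalue(), end="")
```

<p>With the default <code>ensure_ascii=True</code>, the same item would be written with <code>\uXXXX</code> escapes, which is the symptom a Unicode-aware exporter is meant to avoid.</p>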