Scrapy spider fails with "'str' object has no attribute 'iter'"

-2 votes
1 answer
80 views
Asked 2025-04-14 15:53

I get the following error message:

AttributeError: 'str' object has no attribute 'iter'
2024-03-15 14:01:19 [scrapy.core.engine] INFO: Closing spider (finished)

It happens when I run this code:

from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class AuctionSpider(CrawlSpider):
    name = "auction"
    allowed_domains = ["auct.co.th"]
    start_urls = ["https://www.auct.co.th/products"]

    rules = (Rule(LinkExtractor(restrict_xpaths="//div[@class='p-2 card']/text()"), callback="parse_item", follow=True),)

    def parse_item(self, response):
        yield {
            'auction_date': response.xpath("//b[@id ='product_auction_date']/text()").get(),
            'price_start': response.xpath("//b[@id ='product_price_start']/text()").get(),
            'order': response.xpath("//b[@id ='product_order']/b/text()").get(),
            'product_title': response.xpath("//div[@class ='col-md-12']/b/text()").get(),
            'product_regis_id': response.xpath("//div[@class ='col-sm-12 col-md-12 col-xl-12']/b/text()").get(),
            'total_drive': response.xpath("//b[@id='product_total_drive']/text()").get(),
            'product_gear': response.xpath("//b[@id='product_gear']/text()").get(),
            'product_color': response.xpath("//b[@id='product_color']/text()").get(),
            'cc': response.xpath("//b[@id='product_engin_cc']/text()").get(),
            'regis_year': response.xpath("//b[@id='product_regis_year']/text()").get(),
            'build_year': response.xpath("//b[@id='product_build_year']/text()").get(),
            'gas_type': response.xpath("//b[@id='product_gas_type']/text()").get(),
            'vin_no': response.xpath("//b[@id='product_body_number']/text()").get(),
            'engine_no': response.xpath("//b[@id='product_engin_number']/text()").get(),
            'endtax': response.xpath("//b[@id='product_endtax']/text()").get(),
            'stock': response.xpath("//b[@id='product_oderstock']/text()").get(),
            'price': response.xpath("//b[@id='product_price_other']/text()").get(),
            'gadget': response.xpath("//b[@id='product_gadget']/text()").get(),
            'remark': response.xpath("//b[@id='product_remark']/text()").get(),
        }

The error says that a 'str' object has no attribute 'iter'.

What causes this, and how can I fix it?

1 Answer

1

This happens because your XPath ends in /text(), so it selects text nodes (plain strings) rather than elements:

restrict_xpaths="//div[@class='p-2 card']/text()"

Replace it with an XPath that selects the elements that actually contain the links, like this:

rules = (Rule(LinkExtractor(restrict_xpaths="//div[@class='p-2 card']//a"), callback="parse_item", follow=True),)
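
To see why the trailing /text() matters, here is a minimal sketch that uses the standard library's xml.etree.ElementTree as a stand-in for the lxml tree Scrapy works with. The HTML snippet, the /product/1 URL, and the variable names are made up for illustration. It assumes, as the traceback suggests, that the link extractor walks each selected node with .iter(): an element has that method, a plain string does not.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for one product card on the listing page.
card = ET.fromstring(
    '<div class="p-2 card">Lot 1 <a href="/product/1">details</a></div>'
)

# Selecting the element gives a node that can be walked for <a> tags.
element_node = card
links = [a.get("href") for a in element_node.iter("a")]

# Selecting its text (roughly what a trailing /text() does) gives a plain str.
text_node = card.text

try:
    text_node.iter("a")  # a str has no .iter(), so this raises
except AttributeError as exc:
    error_message = str(exc)

print(links)
print(error_message)
```

Running this prints the href found through the element, followed by the same AttributeError message as in the question.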

For some reason no products were listed when I ran it, so the output I got was:

{'auction_date': None, 'price_start': None, 'order': None, 'product_title': '-', 'product_regis_id': '-', 'total_drive': None, 'product_gear': None, 'product_color': None, 'cc': None, 'regis_year': None, 'build_year': None, 'gas_type': None, 'vin_no': None, 'engine_no': None, 'endtax': None, 'stock': None, 'price': None, 'gadget': None, 'remark': None}
