Scraping data from Just Dial

Posted 2024-04-18 17:50:43


# PIP requirements: requests, beautifulsoup4
import requests
from bs4 import BeautifulSoup
import json
import csv

jd_url = "http://www.justdial.com/Bangalore/Car-Hire-%3Cnear%3E-Shanthinagar"

# Strip the http/https prefix, if any
# TODO: handle URLs which don't have the CT part in the URL
jd_url = jd_url.split('http://www.justdial.com/')[-1].split('https://www.justdial.com/')[-1]
city, search, cat_id = '', '', ''
split_vals = jd_url.split('/')
if len(split_vals) == 3:
    city, search, cat_id = split_vals
    cat_id = cat_id.split('-')[-1]
elif len(split_vals) == 2:
    city, search = split_vals
search = search.replace('-', '+')
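
For what it's worth, the same URL splitting can be sketched with the standard library's `urllib.parse` (Python 3) instead of chained `split` calls. The function name `parse_jd_url` is made up for illustration, and the sketch assumes the URL always includes its `http://` or `https://` scheme; without a scheme the host name would end up in the first path segment.

```python
from urllib.parse import urlparse


def parse_jd_url(url):
    """Split a Justdial-style URL into (city, search, cat_id).

    Hypothetical helper: assumes the URL carries an http:// or https:// scheme.
    """
    # urlparse separates scheme and host from the path; drop empty segments.
    parts = [p for p in urlparse(url).path.split('/') if p]
    city = parts[0] if len(parts) > 0 else ''
    search = parts[1] if len(parts) > 1 else ''
    # When a third segment exists, keep only what follows its last hyphen.
    cat_id = parts[2].split('-')[-1] if len(parts) > 2 else ''
    return city, search.replace('-', '+'), cat_id


city, search, cat_id = parse_jd_url(
    "http://www.justdial.com/Bangalore/Car-Hire-%3Cnear%3E-Shanthinagar")
```

Missing segments simply come back as empty strings, which mirrors the `'', '', ''` defaults in the snippet above.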

I am using this script to scrape the fields mentioned in it into a CSV file. I am getting a TypeError. I am new to Python, please help.

^{pr2}$

I get a TypeError: list indices must be integers, not str. Please help me figure it out.
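
This error usually means a list is being indexed with a string key as if it were a dict, which often happens when a decoded JSON response turns out to be an array of records rather than a single object. The failing script is not shown here, so the following is only a guess at the cause: the response shape and the field names `name` and `phone` are assumptions made up for the sketch.

```python
import csv
import io
import json

# Hypothetical response body: a JSON array of listings, not a single object.
raw = '[{"name": "A Cabs", "phone": "000"}, {"name": "B Travels", "phone": "111"}]'
data = json.loads(raw)  # decodes to a Python list of dicts

try:
    data["name"]  # indexing a list with a str raises TypeError
except TypeError as exc:
    error_message = str(exc)

# Fix: loop over the list and index each dict by its key.
names = [item["name"] for item in data]

# Writing the records to CSV; io.StringIO stands in for a real file here.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "phone"])
writer.writeheader()
writer.writerows(data)
```

If this is the cause, printing `type(...)` of the object being indexed just before the failing line will show `list` where a `dict` was expected.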

