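MissingSchema error even though the link contains https:
I am trying to scrape multiple wiki pages with Python, and I have a list of wiki URLs in an Excel file.
I created a Python class that scrapes a wiki page and I run it in a for loop. When I run the code without the for loop I get the expected output, but when I put the code below inside the for loop I get a MissingSchema error.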
import re
from bs4 import BeautifulSoup
import requests
import xlrd

wb = xlrd.open_workbook('list.xls')
sheet = wb.sheet_by_index(0)

class wiki:
    def __init__(self, url):
        #self.name = name
        self.url = url
        cont = requests.get(self.url, timeout=5)
        soup = BeautifulSoup(cont.content, "html.parser")

    def urlcont(self):
        cont = requests.get(self.url, timeout=5)
        soup = BeautifulSoup(cont.content, "html.parser")
        print(soup.prettify())

    def head(self):
        cont = requests.get(self.url, timeout=5)
        soup = BeautifulSoup(cont.content, "html.parser")
        title = soup.find(class_='firstHeading').i.text
        return title

for i in range(sheet.nrows):
    url = sheet.cell_value(i, 2)
    print(url)
    data = wiki(url)
    head = data.head()
    print(head)
Error after running this code:
Traceback (most recent call last):
  File "D:\PYTHON\1click\final\alex.py", line 177, in <module>
    movie = wikimovie(movieurl)
  File "D:\PYTHON\1click\final\alex.py", line 69, in __init__
    cont = requests.get(self.url, timeout=5)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py", line 452, in prepare_request
    p.prepare(
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\models.py", line 313, in prepare
    self.prepare_url(url, params)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\models.py", line 387, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '': No schema supplied. Perhaps you meant http://?
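The message above reports the invalid URL as '' (an empty string), which suggests that at least one cell read inside the loop does not contain a URL, for example a blank or trailing row in the sheet. A minimal sketch of one way to guard against that, assuming the cell values arrive as a plain list; `is_fetchable_url`, `valid_urls`, and `sample_cells` are hypothetical names and data used only for illustration, not part of the original code:

```python
from urllib.parse import urlparse


def is_fetchable_url(value):
    """Return True if value is a string with an http(s) scheme and a host."""
    if not isinstance(value, str):
        return False  # assumption: numeric cells may come back as floats
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def valid_urls(cell_values):
    """Yield (row_index, url) for every cell that holds a usable URL.

    Skipped rows are printed with repr() so that empty strings and
    stray whitespace are easy to spot.
    """
    for row_index, value in enumerate(cell_values):
        if is_fetchable_url(value):
            yield row_index, value.strip()
        else:
            print(f"skipping row {row_index}: {value!r}")


# Hypothetical stand-in for the values read from column 2 of the sheet.
sample_cells = [
    "https://en.wikipedia.org/wiki/Example",
    "",                           # empty cell
    "   ",                        # whitespace only
    42.0,                         # non-text cell
    "en.wikipedia.org/wiki/Foo",  # no scheme
]
result = list(valid_urls(sample_cells))
```

In the loop from the question, the same check could be applied to each `sheet.cell_value(i, 2)` before constructing `wiki(url)`, so that a blank row is reported instead of raising MissingSchema.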
Output when the for loop is removed:
https://en.wikipedia.org/wiki/######
######
Output when printing all URLs (with the for loop) without calling the class:
https://en.wikipedia.org/wiki/######
https://en.wikipedia.org/wiki/######
https://en.wikipedia.org/wiki/######
When the for loop is omitted and this line is used instead: "url = sheet.cell_value(1,2)"