requests.exceptions.MissingSchema: Invalid URL ''

Posted 2024-03-28 18:23:37


MissingSchema error is raised even though the link contains https:

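I am trying to scrape multiple wiki pages with Python, and I have a list of wiki URLs in an Excel file.

I created a Python class to scrape a wiki page and I call it inside a for loop. When I run the code without the for loop I get the expected output, but when I run the code below with the for loop I get a MissingSchema error.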

import re
from bs4 import BeautifulSoup
import requests
import xlrd

wb = xlrd.open_workbook('list.xls')
sheet = wb.sheet_by_index(0)

class wiki:

    def __init__(self,url):
        #self.name =name
        self.url = url
        cont = requests.get(self.url, timeout=5)
        soup = BeautifulSoup(cont.content, "html.parser")

    def urlcont (self):
        cont = requests.get(self.url, timeout=5)
        soup = BeautifulSoup(cont.content, "html.parser")
        print (soup.prettify())
    def head(self):
        cont = requests.get(self.url, timeout=5)
        soup = BeautifulSoup(cont.content, "html.parser")
        title = soup.find(class_='firstHeading').i.text 
        return title

for i in range (sheet.nrows):
    url = sheet.cell_value(i,2)
    print (url)

    data = wiki(url)
    head = data.head()
    print (head)


Error after running this code:

Traceback (most recent call last):
  File "D:\PYTHON\1click\final\alex.py", line 177, in <module>
    movie = wikimovie(movieurl)
  File "D:\PYTHON\1click\final\alex.py", line 69, in __init__
    cont = requests.get(self.url, timeout=5)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py", line 452, in prepare_request
    p.prepare(
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\models.py", line 313, in prepare
    self.prepare_url(url, params)
  File "C:\Users\acer\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\models.py", line 387, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '': No schema supplied. Perhaps you meant http://?

Output when the for loop is removed:

https://en.wikipedia.org/wiki/######

######

Output when printing all the URLs (with the for loop) without calling the class:

https://en.wikipedia.org/wiki/######
https://en.wikipedia.org/wiki/######
https://en.wikipedia.org/wiki/######

This is with the for loop omitted and this line used instead: "url = sheet.cell_value(1,2)"
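The message in the traceback, Invalid URL '', shows that requests.get received an empty string, so at least one cell the loop reads in that column is probably blank or does not hold a URL (this is an assumption, since the spreadsheet itself is not shown). A minimal sketch of filtering such cells before fetching follows. The helper names is_fetchable and usable_urls and the sample list are hypothetical and not part of the original code.

```python
from urllib.parse import urlparse


def is_fetchable(value):
    """Return True if value looks like an absolute http(s) URL."""
    if not isinstance(value, str):
        return False  # e.g. numeric cells come back as floats, not strings
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def usable_urls(cell_values):
    """Yield stripped URLs, skipping blank or malformed cell values."""
    for value in cell_values:
        if is_fetchable(value):
            yield value.strip()


# Hypothetical sample standing in for the values of the URL column;
# the empty string mimics a blank cell in the spreadsheet.
sample = [
    "https://en.wikipedia.org/wiki/Python",
    "",
    "   ",
    42.0,
    "en.wikipedia.org/wiki/X",  # no scheme, would also raise MissingSchema
]

for url in usable_urls(sample):
    print(url)
```

With the workbook from the question, the loop could iterate over usable_urls(sheet.col_values(2)) instead of range(sheet.nrows), so the wiki class is only constructed for cells that actually contain a URL.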

