Python Beautiful Soup returns an error

Posted 2024-04-25 19:29:03

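I want to scrape the covers of various journals from the Cambridge University Press website and save each one under the journal's online ISSN. The code below works, but after one or two journals it gives me this error: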

Traceback (most recent call last):
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connection
.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\util\conne
ction.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\socket.py", line 745, in getaddr
info
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11004] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connection
pool.py", line 601, in urlopen
    chunked=chunked)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connection
pool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\http\client.py", line 1239, in r
equest
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\http\client.py", line 1285, in _
send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\http\client.py", line 1234, in e
ndheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\http\client.py", line 1026, in _
send_output
    self.send(msg)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\http\client.py", line 964, in se
nd
    self.connect()
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connection
.py", line 166, in connect
    conn = self._new_conn()
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connection
.py", line 150, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x030DB770>: Fai
led to establish a new connection: [Errno 11004] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\adapters.
py", line 440, in send
    timeout=timeout
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\connection
pool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\util\retry
.py", line 388, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='ore', port=80): Max retries exceeded with
 url: /services/aop-file-manager/file/57f386d3efeebb2f18eac486 (Caused by NewConnectionError('<urlli
b3.connection.HTTPConnection object at 0x030DB770>: Failed to establish a new connection: [Errno 110
04] getaddrinfo failed',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Boys\Documents\Python\python_work\Kudos\CUPgetcovers.py", line 19, in <module>
    f.write(requests.get("http://" + imagefound).content)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\api.py",
line 72, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\api.py",
line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\sessions.
py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\sessions.
py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\Boys\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\adapters.
py", line 508, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='ore', port=80): Max retries exceeded w
ith url: /services/aop-file-manager/file/57f386d3efeebb2f18eac486 (Caused by NewConnectionError('<ur
llib3.connection.HTTPConnection object at 0x030DB770>: Failed to establish a new connection: [Errno
11004] getaddrinfo failed',))

Process returned 1 (0x1)        execution time : 4.373 s
Press any key to continue . . .

What am I doing wrong? I couldn't find any answers on Google. It used to work fine. Thanks in advance.

Edit: the script:

^{pr2}$

listoflinks.csv:

/core/journals/journal-of-materials-research
/core/journals/journal-of-mechanics
/core/journals/journal-of-modern-african-studies
/core/journals/journal-of-navigation
/core/journals/journal-of-nutritional-science
/core/journals/journal-of-pacific-rim-psychology
/core/journals/journal-of-paleontology
/core/journals/journal-of-pension-economics-and-finance
/core/journals/journal-of-plasma-physics
/core/journals/journal-of-policy-history
/core/journals/journal-of-psychologists-and-counsellors-in-schools
/core/journals/journal-of-public-policy
/core/journals/journal-of-race-ethnicity-and-politics
/core/journals/journal-of-radiotherapy-in-practice
/core/journals/journal-of-relationships-research
/core/journals/journal-of-roman-archaeology
/core/journals/journal-of-roman-studies
/core/journals/journal-of-smoking-cessation
/core/journals/journal-of-social-policy
/core/journals/journal-of-southeast-asian-studies
/core/journals/journal-of-symbolic-logic
/core/journals/journal-of-the-american-philosophical-association
/core/journals/journal-of-the-australian-mathematical-society
/core/journals/journal-of-the-gilded-age-and-progressive-era
/core/journals/journal-of-the-history-of-economic-thought
/core/journals/journal-of-the-institute-of-mathematics-of-jussieu
/core/journals/journal-of-the-international-neuropsychological-society
/core/journals/journal-of-the-international-phonetic-association
/core/journals/journal-of-the-marine-biological-association-of-the-united-kingdom
/core/journals/journal-of-the-royal-asiatic-society
/core/journals/journal-of-the-society-for-american-music
/core/journals/journal-of-tropical-ecology
/core/journals/journal-of-tropical-psychology
/core/journals/journal-of-wine-economics
/core/journals/kantian-review
/core/journals/knowledge-engineering-review
/core/journals/language-and-cognition
/core/journals/language-in-society
/core/journals/language-teaching
/core/journals/language-variation-and-change
/core/journals/laser-and-particle-beams
/core/journals/latin-american-antiquity
/core/journals/latin-american-politics-and-society
/core/journals/law-and-history-review
/core/journals/legal-information-management
/core/journals/legal-studies
/core/journals/legal-theory
/core/journals/leiden-journal-of-international-law
/core/journals/libyan-studies
/core/journals/lichenologist
/core/journals/lms-journal-of-computation-and-mathematics
/core/journals/macroeconomic-dynamics
/core/journals/management-and-organization-review
/core/journals/mathematical-gazette
/core/journals/mathematical-proceedings-of-the-cambridge-philosophical-society
/core/journals/mathematical-structures-in-computer-science
/core/journals/mathematika
/core/journals/medical-history
/core/journals/medical-history-supplements
/core/journals/melanges-d-histoire-sociale
/core/journals/microscopy-and-microanalysis
/core/journals/microscopy-today
/core/journals/mineralogical-magazine
/core/journals/modern-american-history
/core/journals/modern-asian-studies
/core/journals/modern-intellectual-history
/core/journals/modern-italy
/core/journals/mrs-advances
/core/journals/mrs-bulletin
/core/journals/mrs-communications
/core/journals/mrs-energy-and-sustainability
/core/journals/mrs-online-proceedings-library-archive
/core/journals/nagoya-mathematical-journal
/core/journals/natural-language-engineering
/core/journals/netherlands-journal-of-geosciences
/core/journals/network-science
/core/journals/new-perspectives-on-turkey
/core/journals/new-surveys-in-the-classics
/core/journals/new-testament-studies
/core/journals/new-theatre-quarterly
/core/journals/nineteenth-century-music-review
/core/journals/nordic-journal-of-linguistics
/core/journals/numerical-mathematics-theory-methods-and-applications
/core/journals/nutrition-research-reviews
/core/journals/organised-sound
/core/journals/oryx
/core/journals/paleobiology
/core/journals/the-paleontological-society-papers
/core/journals/palliative-and-supportive-care
/core/journals/papers-of-the-british-school-at-rome
/core/journals/parasitology
/core/journals/parasitology-open
/core/journals/personality-neuroscience
/core/journals/perspectives-on-politics
/core/journals/philosophy
/core/journals/phonology
/core/journals/plainsong-and-medieval-music
/core/journals/plant-genetic-resources
/core/journals/polar-record
/core/journals/political-analysis
/core/journals/political-science-research-and-methods
/core/journals/politics-and-gender
/core/journals/politics-and-religion
/core/journals/politics-and-the-life-sciences
/core/journals/popular-music
/core/journals/powder-diffraction
/core/journals/prehospital-and-disaster-medicine
/core/journals/primary-health-care-research-and-development
/core/journals/probability-in-the-engineering-and-informational-sciences
/core/journals/proceedings-of-the-asil-annual-meeting
/core/journals/proceedings-of-the-edinburgh-mathematical-society
/core/journals/proceedings-of-the-international-astronomical-union
/core/journals/proceedings-of-the-nutrition-society
/core/journals/proceedings-of-the-prehistoric-society
/core/journals/proceedings-of-the-royal-society-of-edinburgh-section-a-mathematics
/core/journals/ps-political-science-and-politics
/core/journals/psychological-medicine
/core/journals/public-health-nutrition
/core/journals/publications-of-the-astronomical-society-of-australia
/core/journals/quarterly-reviews-of-biophysics
/core/journals/quaternary-research
/core/journals/queensland-review
/core/journals/radiocarbon
/core/journals/ramus
/core/journals/recall
/core/journals/religious-studies
/core/journals/renewable-agriculture-and-food-systems
/core/journals/review-of-international-studies
/core/journals/review-of-middle-east-studies
/core/journals/review-of-politics
/core/journals/review-of-symbolic-logic
/core/journals/revista-de-historia-economica-journal-of-iberian-and-latin-american-economic-history
/core/journals/robotica
/core/journals/royal-historical-society-camden-fifth-series
/core/journals/royal-institute-of-philosophy-supplements
/core/journals/rural-history
/core/journals/science-in-context
/core/journals/scottish-journal-of-theology
/core/journals/seed-science-research
/core/journals/slavic-review
/core/journals/social-philosophy-and-policy
/core/journals/social-policy-and-society
/core/journals/social-science-history
/core/journals/spanish-journal-of-psychology
/core/journals/studies-in-american-political-development
/core/journals/studies-in-church-history
/core/journals/studies-in-second-language-acquisition
/core/journals/tempo
/core/journals/theatre-research-international
/core/journals/theatre-survey
/core/journals/theory-and-practice-of-logic-programming
/core/journals/think
/core/journals/traditio
/core/journals/trans-trans-regional-and-national-studies-of-southeast-asia
/core/journals/transactions-of-the-royal-historical-society
/core/journals/transnational-environmental-law
/core/journals/twentieth-century-music
/core/journals/twin-research-and-human-genetics
/core/journals/urban-history
/core/journals/utilitas
/core/journals/victorian-literature-and-culture
/core/journals/visual-neuroscience
/core/journals/weed-science
/core/journals/weed-technology
/core/journals/wireless-power-transfer
/core/journals/world-politics
/core/journals/world-s-poultry-science-journal
/core/journals/world-trade-review
/core/journals/zygote

1 answer
User
#1 · Posted 2024-04-25 19:29:03

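You should simplify your code and your scraping strategy, although I can see that not all journal pages have the same structure. On most pages you can easily get the ISSN from a form value. On the others (the free-access ones, I think) you need to apply some heuristics to get the ISSN. I also don't know why you are using both httplib2 and requests, since they provide more or less the same functionality. Anyway, here is some code that does what you want... sort of (I also removed the CSV code, since it isn't needed):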

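import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup

BASE_URL = "https://www.cambridge.org"

with open('listoflinks.csv', encoding="utf8") as f:
    for line in f:
        path = line.strip()
        if not path:
            continue  # skip blank lines in the input file
        print("getting", path)
        page_url = BASE_URL + path
        response = requests.get(page_url)
        soup = BeautifulSoup(response.text, "html.parser")

        # On most pages the ISSN is available as a form value.
        issn = None
        field = soup.find("input", attrs={'name': 'productIssn'})
        if field is not None:
            issn = field.get('value')
        else:
            # Heuristic fallback: a "value" span whose text mentions "(Online)".
            for v in soup.find_all("span", class_="value"):
                if v.string and "(Online)" in v.string:
                    issn = v.string.split(" ")[0]
                    break
        if not issn:
            print("no ISSN found, skipping", path)
            continue

        print("issn:", issn)
        details_container = soup.find("div", class_="details-container")
        image = details_container.find("img")
        # src may be protocol-relative ("//host/...") or site-relative ("/core/...").
        # urljoin resolves both; slicing off two characters only suits the first.
        imgurl = urljoin(page_url, image['src'])
        print("imgurl:", imgurl)
        with open(issn + ".jpg", 'wb') as output:
            output.write(requests.get(imgurl).content)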
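
A note on the error in the question: the failing host in the traceback is 'ore' and the requested path starts with /services/aop-file-manager/..., which is consistent with an img src of the form /core/services/... losing its first two characters before "http://" was prepended. That is an inference from the traceback, not something the question's script confirms. The sketch below uses only the standard library to show how the slice behaves for the two src shapes and how urljoin resolves both. The "/core" prefix on the first path is assumed, and static.example.org is a made-up placeholder host.

```python
from urllib.parse import urljoin, urlsplit

PAGE_URL = "https://www.cambridge.org/core/journals/journal-of-mechanics"

# Two shapes an <img src="..."> value can take. The first path is taken from
# the traceback with an ASSUMED "/core" prefix; the second host is a placeholder.
SITE_RELATIVE = "/core/services/aop-file-manager/file/57f386d3efeebb2f18eac486"
PROTOCOL_RELATIVE = "//static.example.org/covers/cover.jpg"


def naive_url(src):
    """Mimic the slicing approach: drop two characters, prepend http://."""
    return "http://" + src[2:]


def resolved_url(src, page_url=PAGE_URL):
    """Resolve src against the page URL, whatever form it takes."""
    return urljoin(page_url, src)


for src in (SITE_RELATIVE, PROTOCOL_RELATIVE):
    naive = naive_url(src)
    resolved = resolved_url(src)
    print("src:     ", src)
    print("naive:   ", naive, "-> host", urlsplit(naive).hostname)
    print("resolved:", resolved, "-> host", urlsplit(resolved).hostname)
```

For the site-relative src, the naive version ends up asking DNS for a host called "ore", which matches the getaddrinfo failure in the traceback, while the resolved version keeps www.cambridge.org as the host. For the protocol-relative src both approaches reach the same host, which may be why the script worked for the first journal or two.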
