Having trouble following web links

import urllib
from BeautifulSoup import *

url = raw_input('Enter - ')
rpt=raw_input('Enter Position')
rpt=int(rpt)
cnt=raw_input('Enter Count')
cnt=int(cnt)
count=0
counts=0
tags=list()
soup=None
while x==0:
    html = urllib.urlopen(url).read()
    soup = BeautifulSoup(html)
    # Retrieve all of the anchor tags
    tags=soup.findAll('a')
    for tag in tags:
        url= tag.get('href')
        count=count + 1
        if count== rpt:
            break
    counts=counts + 1
    if counts==cnt:
        x==1
    else: continue
print url

3 Answers

User

#1 · Edited 2024-05-16 19:55:16

I took that course too, and with a friend's help I solved the problem:

import urllib
from bs4 import BeautifulSoup

url = "http://python-data.dr-chuck.net/known_by_Happy.html"
rpt = 7        # number of times to follow a link
position = 18  # 1-based position of the link to follow on each page

count = 0
while True:
    html = urllib.urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")
    # Retrieve all of the anchor tags
    tags = soup.findAll('a')
    # Lists are 0-indexed, so the link at "position" lives at position-1
    url = tags[position-1].get('href')
    count = count + 1
    if count == rpt:
        break

print url

User

#2 · Edited 2024-05-16 19:55:16

I believe this is what you want:

import urllib
from bs4 import BeautifulSoup

url = raw_input('Enter - ')
position = int(raw_input('Enter Position'))
count = int(raw_input('Enter Count'))

# Perform the loop "count" times.
for _ in xrange(0, count):
    html = urllib.urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")
    tags = soup.findAll('a')
    # If no link exists at that position, show an error and stop.
    # (Indexing past the end would raise IndexError, so check the length.)
    if position > len(tags):
        print "A link does not exist at that position."
        break
    # The link at that position exists: overwrite url so the next pass uses it.
    url = tags[position-1].get('href')
print url

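The code now loops the number of times given in the input. On each pass it takes the href at the given position and overwrites url with it, so every iteration looks one step further into the tree of links.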

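I recommend using full variable names, which makes the code easier to understand. You can also read and convert each input on a single line, which makes the start of the script easier to follow.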
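For readers on Python 3, the loop described in this answer can be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's code: it swaps BeautifulSoup for the standard library's `HTMLParser` so it runs without third-party packages, and the names `AnchorCollector`, `fetch_html`, and `follow_links` are hypothetical. Resolving relative links with `urljoin` and raising exceptions instead of printing are also additions that the original code does not have.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class AnchorCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Record None when the anchor has no href attribute.
            self.hrefs.append(dict(attrs).get("href"))


def fetch_html(url):
    """Hypothetical default fetcher: download a page and decode it as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def follow_links(url, position, count, fetch=fetch_html):
    """Follow the link at 1-based `position`, `count` times; return the final URL."""
    for _ in range(count):
        parser = AnchorCollector()
        parser.feed(fetch(url))
        if not 1 <= position <= len(parser.hrefs):
            raise IndexError("A link does not exist at that position.")
        href = parser.hrefs[position - 1]
        if href is None:
            raise ValueError("The link at that position has no href.")
        # Assumption: resolve relative links against the current page.
        url = urljoin(url, href)
    return url
```

With the values from the first answer, the call would be `follow_links("http://python-data.dr-chuck.net/known_by_Happy.html", 18, 7)`. The `fetch` parameter exists only so the function can be tried against canned HTML without network access.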

User

#3 · Edited 2024-05-16 19:55:16

Based on Jansens' answer, I found the solution:

url = tags[position-1].get('href')

did the trick for me!

Thanks for your help!
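One caveat about the `position-1` indexing quoted in this answer: Python accepts negative indices, so a position of 0 silently selects the last link instead of failing. A minimal Python 3 sketch of a guarded lookup is shown below. The helper name `link_at` is hypothetical, and it takes a plain list of href strings rather than BeautifulSoup tags.

```python
def link_at(hrefs, position):
    """Return the href at 1-based `position`, or None if it is out of range."""
    if 1 <= position <= len(hrefs):
        return hrefs[position - 1]
    return None


hrefs = ["a.html", "b.html", "c.html"]
print(link_at(hrefs, 1))  # a.html
print(link_at(hrefs, 3))  # c.html
print(link_at(hrefs, 0))  # None (plain hrefs[0 - 1] would return "c.html")
print(link_at(hrefs, 4))  # None (plain hrefs[4 - 1] would raise IndexError)
```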
