I can't move to the next page while crawling

Posted 2024-06-16 11:41:23


http://www.kif.re.kr/kif2/publication/pub_list.aspx?menuid=17

I'm writing a crawler for the page above, but I can't get it to move to the next page of results.

<a class="pagebutton" href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl12','')">2</a>

This is the HTML for the link to page 2.
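As a side note, in ASP.NET WebForms a call to `__doPostBack(target, argument)` submits the page's form with the two arguments in the `__EVENTTARGET` and `__EVENTARGUMENT` fields. A minimal sketch of pulling both values out of such an `href` follows; the helper name `parse_postback` and the regular expression are made up for illustration and assume the single-quoted form shown above.

```python
import re

# Matches href values of the form: javascript:__doPostBack('<target>','<argument>')
POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)','([^']*)'\)")


def parse_postback(href):
    """Return (event_target, event_argument) from a __doPostBack href, or None."""
    match = POSTBACK_RE.search(href)
    if match is None:
        return None
    return match.group(1), match.group(2)


href = "javascript:__doPostBack('ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl12','')"
target, argument = parse_postback(href)
print(target)          # ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl12
print(repr(argument))  # ''
```

An ordinary link that does not call `__doPostBack` yields `None`, so it can be skipped when looping over the `<a>` tags.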

Inspecting the click in the browser's developer tools shows that the request uses the POST method.

Request URL:http://www.kif.re.kr/kif2/publication/pub_list.aspx?menuid=17

Below is the form data shown in the developer tools:

__EVENTTARGET:ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl12
__EVENTARGUMENT:
__VIEWSTATE:/wEPDwUKMTg4Nzc2Nzc3OA9kFgJmD2QWAgIED2QWAgIDD2QWAmYPZBYEZg9kFgQCAQ8QZBAVBwbsoJzrqqkG7KCA7J6QDOuwnOqwhOyXsOyblAbqtoztmLgJ7J2Y66Kw7LKYBuuqqeywqAbsmpTslb0VBwgyNDAvMjQxLwgyNDAvMjQyLwgyNDAvMjU5LwgyNDAvMjYyLwgyNDAvMzM3LwgyNDAvMjYzLwgyNDAvMjY2LxQrAwdnZ2dnZ2dnZGQCAw8PZBYCHgpvbmtleXByZXNzBV5pZiAoZXZlbnQua2V5Q29kZSA9PSAxMykge19fZG9Qb3N0QmFjaygnY3RsMDAkQ29udGVudFBsYWNlSG9sZGVyMSRkYXRhX2xpc3QxJGlidFNlYXJjaCcsJycpfTsgZAIDDw9kFgIeBWFsaWduBQZjZW50ZXIWAgIDD2QWAmYPZBYCZg9kFhYCBA8PFggeBFRleHQFATEeCENzc0NsYXNzBQdjdXJyZW50HgRfIVNCAgIeB1Zpc2libGVnZGQCBg8PFggfAgUBMh8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAggPDxYIHwIFATMfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIKDw8WCB8CBQE0HwMFCnBhZ2VidXR0b24fBAICHwVnZGQCDA8PFggfAgUBNR8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAg4PDxYIHwIFATYfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIQDw8WCB8CBQE3HwMFCnBhZ2VidXR0b24fBAICHwVnZGQCEg8PFggfAgUBOB8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAhQPDxYIHwIFATkfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIWDw8WCB8CBQIxMB8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAhsPDxYCHwIFCyZuYnNwWzEvODNdZGQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgcFGWN0bDAwJG1lbnVfbmF2MSRpYnRTZWFyY2gFLmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRpYnRTZWFyY2gFMWN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRpYnRTZWFyY2hBbGwFPmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRXZWJQYWdlTmF2aWdhdG9yVjIxJGN0bDA2BT5jdGwwMCRDb250ZW50UGxhY2VIb2xkZXIxJGRhdGFfbGlzdDEkV2ViUGFnZU5hdmlnYXRvclYyMSRjdGwwOAU+Y3RsMDAkQ29udGVudFBsYWNlSG9sZGVyMSRkYXRhX2xpc3QxJFdlYlBhZ2VOYXZpZ2F0b3JWMjEkY3RsMzAFPmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRXZWJQYWdlTmF2aWdhdG9yVjIxJGN0bDMyuFjgj5nepdWXkOAwNYww+divJYtYSrYgHZpTcewu9Ds=
__VIEWSTATEGENERATOR:E95FE49A
__EVENTVALIDATION:/wEdACHcOKX2MiW8o3JKug67fnRBm/LuJNf32p7npb2HQkdSHj2jQIPNrpQqFhY2rmhcQzOr90YGqna/Dtr3eCnJKH/FRrctoJJXOcc5nzwqquFEKe/f6ybfmfBBwP5V9TZX05svUiuWBMoi40eiFXgXu/HvnPjbm91I+Oz3HACj/rejcfKu91e/rwNa3qahKk8QP//P3Ctl3lcnXTxti+MHToVFJ4X5e7akN9M5YNbryOCPFUzWTSqkhEUajNOJze2BA47TqM8vDP0IP5ki4KWYQixH1ITUrNZx490LfBrUZBBPZp6DDFbb0FBaxN5KpyeciB3wOyFRvNC7wvyrzR4zZIFKvsDwEoIoZw4QpAfkYvtGlm/erM6tYMUIO2Y+EofXRtI5fpcvmMZwp9oWz1DjjMQ7kMX3NKB1EbRuWhW/PUV26RCgECz38VETCqQlHmY2JJfazoydmTWb206Gy1R0dPzbnPz5BKeIBWlSOZDH/jTFFrzBKTtWpKGoPFsObJHPJ/aat3bwhGesAEcXWRHlLMcB7+Yj6K/9RPZv/XJ9M8z/IAbi3aAtkyVcWc7DpsPsia8+XWZOcmYS4tf4O30N13XKSyM1xB3zywxlTxuxx1lP5+GDugiF+Yf+KojuR7Az4t0LDho3RsEd/ZN7ejUxBtxfh6oqlZNMy4/Raz+OSUeRTRVfoUMGNPEUTwp88pek/ycTkyMA26w5UfW8JGdFRvrmOA59JlLF9OIGGWESn/RCnw==
ctl00$agentPlatform:1
ctl00$menu_nav1$tbxSearchWord:
ctl00$ContentPlaceHolder1$data_list1$ddlSearchItem:240/241/
ctl00$ContentPlaceHolder1$data_list1$tbxSearch:
ctl00$ContentPlaceHolder1$data_list1$hdnSearchText:
ctl00$ContentPlaceHolder1$data_list1$hdnSearchPath:240/241/
ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl00:0
ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl01:1
ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl02:821
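The `__VIEWSTATE` and `__EVENTVALIDATION` values listed above are generated by the server for each response, so values pasted from the developer tools may go stale. A common approach with WebForms pages is to read the hidden inputs from a freshly fetched page instead. The sketch below uses only the standard library; `HiddenInputCollector`, `collect_hidden_inputs` and the sample HTML are made up for illustration, and the real page may also need non-hidden fields (for example the search drop-down) added by hand.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        if (attributes.get("type") or "").lower() != "hidden":
            return
        name = attributes.get("name")
        if name:
            # A missing value attribute is treated as an empty string.
            self.fields[name] = attributes.get("value") or ""


def collect_hidden_inputs(html):
    """Return a dict of every named hidden <input> found in the given HTML."""
    collector = HiddenInputCollector()
    collector.feed(html)
    return collector.fields


# Made-up miniature of an ASP.NET form, only to show the shape of the result.
sample = """
<form method="post" action="pub_list.aspx?menuid=17">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="ctl00$menu_nav1$tbxSearchWord" />
</form>
"""
print(collect_hidden_inputs(sample))
# {'__VIEWSTATE': 'abc123', '__EVENTVALIDATION': 'xyz789'}
```

The resulting dictionary could serve as the starting point for the POST payload, with `__EVENTTARGET` and the remaining fields set on top of it.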

Below is my code. It runs without errors, but the value of r.text is not what I want.

        url = 'http://www.kif.re.kr/kif2/publication/pub_list.aspx?menuid=17'
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, 'lxml')

        pageTag = soup.findAll('td',align='center')

        inputTag = pageTag[0].findAll('a')

        for link in inputTag:
            print(link['href'])
            payload = {'__EVENTTARGET ' :'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl12',
                       '__EVENTARGUMENT' : '',
                       '__VIEWSTATE'
                       : '/wEPDwUKMTg4Nzc2Nzc3OA9kFgJmD2QWAgIED2QWAgIDD2QWAmYPZBYEZg9kFgQCAQ8QZBAVBwbsoJzrqqkG7KCA7J6QDOuwnOqwhOyXsOyblAbqtoztmLgJ7J2Y66Kw7LKYBuuqqeywqAbsmpTslb0VBwgyNDAvMjQxLwgyNDAvMjQyLwgyNDAvMjU5LwgyNDAvMjYyLwgyNDAvMzM3LwgyNDAvMjYzLwgyNDAvMjY2LxQrAwdnZ2dnZ2dnZGQCAw8PZBYCHgpvbmtleXByZXNzBV5pZiAoZXZlbnQua2V5Q29kZSA9PSAxMykge19fZG9Qb3N0QmFjaygnY3RsMDAkQ29udGVudFBsYWNlSG9sZGVyMSRkYXRhX2xpc3QxJGlidFNlYXJjaCcsJycpfTsgZAIDDw9kFgIeBWFsaWduBQZjZW50ZXIWAgIDD2QWAmYPZBYCZg9kFhYCBA8PFggeBFRleHQFATEeCENzc0NsYXNzBQdjdXJyZW50HgRfIVNCAgIeB1Zpc2libGVnZGQCBg8PFggfAgUBMh8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAggPDxYIHwIFATMfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIKDw8WCB8CBQE0HwMFCnBhZ2VidXR0b24fBAICHwVnZGQCDA8PFggfAgUBNR8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAg4PDxYIHwIFATYfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIQDw8WCB8CBQE3HwMFCnBhZ2VidXR0b24fBAICHwVnZGQCEg8PFggfAgUBOB8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAhQPDxYIHwIFATkfAwUKcGFnZWJ1dHRvbh8EAgIfBWdkZAIWDw8WCB8CBQIxMB8DBQpwYWdlYnV0dG9uHwQCAh8FZ2RkAhsPDxYCHwIFCyZuYnNwWzEvODNdZGQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgcFGWN0bDAwJG1lbnVfbmF2MSRpYnRTZWFyY2gFLmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRpYnRTZWFyY2gFMWN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRpYnRTZWFyY2hBbGwFPmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRXZWJQYWdlTmF2aWdhdG9yVjIxJGN0bDA2BT5jdGwwMCRDb250ZW50UGxhY2VIb2xkZXIxJGRhdGFfbGlzdDEkV2ViUGFnZU5hdmlnYXRvclYyMSRjdGwwOAU+Y3RsMDAkQ29udGVudFBsYWNlSG9sZGVyMSRkYXRhX2xpc3QxJFdlYlBhZ2VOYXZpZ2F0b3JWMjEkY3RsMzAFPmN0bDAwJENvbnRlbnRQbGFjZUhvbGRlcjEkZGF0YV9saXN0MSRXZWJQYWdlTmF2aWdhdG9yVjIxJGN0bDMyuFjgj5nepdWXkOAwNYww+divJYtYSrYgHZpTcewu9Ds=',
                       '__VIEWSTATEGENERATOR' : 'E95FE49A',
                       '__EVENTVALIDATION' : '/wEdACHcOKX2MiW8o3JKug67fnRBm/LuJNf32p7npb2HQkdSHj2jQIPNrpQqFhY2rmhcQzOr90YGqna/Dtr3eCnJKH/FRrctoJJXOcc5nzwqquFEKe/f6ybfmfBBwP5V9TZX05svUiuWBMoi40eiFXgXu/HvnPjbm91I+Oz3HACj/rejcfKu91e/rwNa3qahKk8QP//P3Ctl3lcnXTxti+MHToVFJ4X5e7akN9M5YNbryOCPFUzWTSqkhEUajNOJze2BA47TqM8vDP0IP5ki4KWYQixH1ITUrNZx490LfBrUZBBPZp6DDFbb0FBaxN5KpyeciB3wOyFRvNC7wvyrzR4zZIFKvsDwEoIoZw4QpAfkYvtGlm/erM6tYMUIO2Y+EofXRtI5fpcvmMZwp9oWz1DjjMQ7kMX3NKB1EbRuWhW/PUV26RCgECz38VETCqQlHmY2JJfazoydmTWb206Gy1R0dPzbnPz5BKeIBWlSOZDH/jTFFrzBKTtWpKGoPFsObJHPJ/aat3bwhGesAEcXWRHlLMcB7+Yj6K/9RPZv/XJ9M8z/IAbi3aAtkyVcWc7DpsPsia8+XWZOcmYS4tf4O30N13XKSyM1xB3zywxlTxuxx1lP5+GDugiF+Yf+KojuR7Az4t0LDho3RsEd/ZN7ejUxBtxfh6oqlZNMy4/Raz+OSUeRTRVfoUMGNPEUTwp88pek/ycTkyMA26w5UfW8JGdFRvrmOA59JlLF9OIGGWESn/RCnw==',
                       'ctl00$agentPlatform' : '1',
                       'ctl00$menu_nav1$tbxSearchWord' : '',
                       'ctl00$ContentPlaceHolder1$data_list1$ddlSearchItem' : '240/241/',
                       'ctl00$ContentPlaceHolder1$data_list1$tbxSearch' : '',
                       'ctl00$ContentPlaceHolder1$data_list1$hdnSearchText':'',
                       'ctl00$ContentPlaceHolder1$data_list1$hdnSearchPath' : '240/241/',
                       'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl00' : '0',
                       'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl01' : '1',
                       'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl02' : '821'
                       }
            r = requests.post('http://www.kif.re.kr/kif2/publication/pub_list.aspx?menuid=17', data=payload)

            print(r.text)
            return

How can I get to the next page?


1 Answer
User
#1 · Posted 2024-06-16 11:41:23
'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl01' : '1'

This control holds the page number, and it is zero-based: to go to page 2, set it to 1.

You should turn it into a variable:

for i in range(page_number):
    ...
    payload['ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl01'] = i  # zero-based: i = 1 requests page 2
    ...
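The loop above could be fleshed out roughly as follows. This is only a sketch that takes the answer's claim about the `ctl01` field at face value; `build_payload` and `fetch_pages` are hypothetical names, and `session` is assumed to be a `requests.Session` (anything with a compatible `.post()` method works). One thing worth double-checking in the question's dictionary: the key `'__EVENTTARGET '` ends with a space, so it is sent under a different field name than `__EVENTTARGET`.

```python
# Field name taken from the answer; per that answer it holds a zero-based page index.
PAGE_FIELD = 'ctl00$ContentPlaceHolder1$data_list1$WebPageNavigatorV21$ctl01'


def build_payload(base_fields, page_index):
    """Return a copy of base_fields with the page field set to page_index."""
    payload = dict(base_fields)  # copy, so the caller's dictionary is left untouched
    payload[PAGE_FIELD] = str(page_index)
    return payload


def fetch_pages(session, url, base_fields, page_count):
    """Yield the response text for page indexes 0 .. page_count - 1.

    `session` is expected to behave like requests.Session (it needs .post()).
    """
    for page_index in range(page_count):
        response = session.post(url, data=build_payload(base_fields, page_index))
        response.raise_for_status()  # stop early on an HTTP error status
        yield response.text
```

With the question's `payload` dictionary and URL, usage might look like `for html in fetch_pages(requests.Session(), url, payload, 3): print(html)`. Whether the server accepts the same `__VIEWSTATE` for every page is not confirmed here; if it does not, the hidden fields would need to be re-read from each response before the next POST.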
