Getting past an AJAX form with Python requests

Posted 2024-05-21 05:44:23

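I have been trying to use the Python requests module to get past the form page at http://dq.ndc.bsnl.co.in/bsnl-web/residentialSearch.seam.

My guess is that the problem is the AJAX in the form fields. I really don't know how to send that kind of request with Python requests. I know this can be done with Selenium, but I need to do it with requests.

Here is my current code: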

import requests
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:27.0) Gecko/20100101 Firefox/27.0'
           }
payload = {
    "residential": "residential",
    "residential:j_id12": "",
    "residential:firstField": 'a',
    "residential:criteria1": "3",
    "residential:city": "ASIND",
    "residential:button1": "residential:button1",
    "residential:suggestionBoxId_selection": "",
    "javax.faces.ViewState": "j_id1"

}
with requests.Session() as s:
    # print(s.headers)
    # The initial GET lets the session pick up the site's cookies.
    print(s.get('http://dq.ndc.bsnl.co.in/bsnl-web/residentialSearch.seam'))
    print(s.headers)
    print(s.cookies)
    resp = s.post(
        'http://dq.ndc.bsnl.co.in/bsnl-web/residentialSearch.seam',
        data=payload, headers=headers)

    print(resp.text)

1 Answer
User
#1 · Posted 2024-05-21 05:44:23

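You are already close to a complete solution. First, you need AJAXREQUEST in the payload to start the search, and then you have to follow the redirect to the first result page. The following pages are fetched with further requests. The only problem is that there is no real end-of-pages marker: after the last page, the site starts over from the first page. So I had to look for the "Page x of y" text in the content to know when to stop.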
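As a rough illustration of the redirect step, the check can be written as a plain function. This is only a sketch: the name `is_ajax_redirect` and the `REDIRECT_CODES` set are made up for this example, and the set is assumed to match the usual HTTP redirect codes. The `Ajax-Response: redirect` header is the one tested in the script that follows.

```python
# Hypothetical helper for illustration; not part of the requests library.
# Assumed to cover the usual HTTP redirect status codes.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def is_ajax_redirect(status_code, headers):
    """Return True if the response should be followed as a redirect.

    A Location header is required in both cases. The response then counts
    as a redirect if it has a normal redirect status code, or if the server
    marks it with "Ajax-Response: redirect" (as this site appears to do).
    """
    # Plain dicts are case-sensitive, so normalise the header names first.
    lowered = {name.lower(): value for name, value in headers.items()}
    if 'location' not in lowered:
        return False
    return (status_code in REDIRECT_CODES
            or lowered.get('ajax-response', '') == 'redirect')
```

The script below applies the same test by overriding `requests.Response.is_redirect`, so that the session follows the AJAX redirect on its own.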

import re
import requests
import requests.models

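# The site does not signal redirects in a standard way: besides the normal
# redirect status codes, it sends "Ajax-Response: redirect" together with a
# Location header. Patch is_redirect so that requests follows both kinds.
requests.Response.is_redirect = property(lambda self: (
    'location' in self.headers and (
        self.status_code in requests.models.REDIRECT_STATI or
        self.headers.get('Ajax-Response', '') == 'redirect'
    )
))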

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:27.0) Gecko/20100101 Firefox/27.0'
}
payload = {
    "AJAXREQUEST": "loader2",
    "residential": "residential",
    "residential:j_id12": "",
    "residential:firstField": 'a',
    "residential:criteria1": "3",
    "residential:city": "ASIND",
    "residential:button1": "residential:button1",
    "residential:suggestionBoxId_selection": "",
    "javax.faces.ViewState": "j_id1"

}

with requests.Session() as s:
    # The initial GET lets the session pick up the site's cookies.
    print(s.get('http://dq.ndc.bsnl.co.in/bsnl-web/residentialSearch.seam'))
    print(s.headers)
    print(s.cookies)
    # Submit the search; the patched is_redirect makes requests follow
    # the AJAX redirect to the first result page.
    resp = s.post(
        'http://dq.ndc.bsnl.co.in/bsnl-web/residentialSearch.seam',
        data=payload, headers=headers)

    while True:
        # do data processing: print the text after each "subscriber');" marker
        for chunk in resp.text.split("subscriber');")[1:]:
            print(chunk[2:].split('<')[0])

        # look for next page; stop on the last page or if the marker is missing
        match = re.search(r'Page (\d+) of (\d+)', resp.text)
        if match is None:
            break
        current, last = match.groups()
        if int(current) == int(last):
            break

        resp = s.post('http://dq.ndc.bsnl.co.in/bsnl-web/resSrchDtls.seam',
            data={'AJAXREQUEST':'_viewRoot',
                'j_id10':'j_id10',
                'javax.faces.ViewState':'j_id2',
                'j_id10:PGDOWNLink':'j_id10:PGDOWNLink',
            }, headers=headers)
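
The stop condition in the loop above can also be pulled out into small helpers, which makes it easy to test without touching the site. This is a minimal sketch: the names `page_position` and `has_next_page` are made up for this example, and it assumes the result page contains the literal text "Page x of y" as described in the answer.

```python
import re

# Matches the "Page x of y" text that the result pages are assumed to contain.
PAGE_RE = re.compile(r'Page (\d+) of (\d+)')


def page_position(html):
    """Return (current, last) as integers, or None if the marker is missing."""
    match = PAGE_RE.search(html)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_next_page(html):
    """Return True only if the marker is present and shows a later page."""
    position = page_position(html)
    return position is not None and position[0] < position[1]


# Made-up sample strings, not real responses from the site.
print(page_position('<span>Page 2 of 5</span>'))
print(has_next_page('<span>Page 5 of 5</span>'))
```

Treating a missing marker as "no next page" is a design choice here: it ends the loop instead of raising an error when the site returns an unexpected page.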
