How do I get the changed URL to POST data to?

Posted 2024-05-13 00:00:38


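I want to get JSON data from [website][1]. I already have the list of jobs, and now I want the details of each job. I found that the details are fetched with a POST request, so I use requests to post the data as follows: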

import requests

# `card` is a job order number taken from the job list fetched earlier.
url = "https://www2.jobs.gov.hk/1/0/WebServices/QuickviewWS.asmx/F_GetJobCardDetail"
postdata = {'p_ordNo': card, 'p_langOpt': '3', 'p_liveOnly': ''}
scode = requests.post(url, data=postdata, timeout=30).status_code

It returns status code 502, as shown below:

[![enter image description here][2]][2]

[![enter image description here][3]][3]

I found that the POST URL has changed, but I don't know how to get the changed URL.


2 Answers

I changed the request as follows:

postdata = {'p_ordNo': card,
            'p_langOpt': '3',
            'p_liveOnly': ''}

session.post(url, data=json.dumps(postdata), headers=headers).content

This works now.

The call I used before, which did not work, was:

session.post(url, data=postdata, headers=headers).content
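
The two calls differ only in the request body. When `data` is a dict, requests form-encodes it; when `data` is a string such as the output of `json.dumps`, the string is sent as-is. The sketch below uses only the standard library to show roughly what each body looks like. The order number `'12345'` is a made-up placeholder, not a real job card.

```python
import json
from urllib.parse import urlencode

# Hypothetical order number, used only for illustration.
postdata = {'p_ordNo': '12345', 'p_langOpt': '3', 'p_liveOnly': ''}

# data=postdata: a dict is form-encoded, roughly like this.
form_body = urlencode(postdata)

# data=json.dumps(postdata): the body is a JSON document.
json_body = json.dumps(postdata)

print(form_body)
print(json_body)
```

Since the headers declare `Content-Type: application/json`, the service presumably expects the second form, which would explain why the form-encoded call failed.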

I'm not sure all of these steps are necessary, but this is how I got it working. Please read the comments in the code:

import requests
from bs4 import BeautifulSoup
import json

# Use a session so that cookies persist across requests.
session = requests.Session()

# Set some headers; the server seems to need at least some of these.
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0',
    'Accept': 'application/json, text/javascript, */*',
    'Accept-Language': 'en-GB,en;q=0.8,fr;q=0.6,es;q=0.4,pl;q=0.2',
    'Accept-Encoding': 'gzip, deflate',
    'Referer': 'http://www1.jobs.gov.hk/1/0/WebForm/jobseeker/jobsearch/quickview.aspx?ResetTimeStamp=true&SearchFor=jobtype&id=1',
    'Content-Type': 'application/json; charset=utf-8',
    'X-Requested-With': 'XMLHttpRequest',
    'Connection': 'keep-alive',
}

# Make a GET request to get the value we use for p_ordNo later.
url = "http://www2.jobs.gov.hk/1/0/WebForm/jobseeker/jobsearch/quickview.aspx?ResetTimeStamp=true&SearchFor=jobtype&id=1"
page = session.get(url, headers=headers).text

# Read the value we use for p_ordNo later.
soup = BeautifulSoup(page, "lxml")
value = soup.find("input", {"name": "ctl00$ContentPlaceHolder1$uxSelectedOrdNo"})["value"]

# Make the POST request. The body is not strict JSON (the keys are unquoted),
# but this is the form that worked for me.
url = "http://www2.jobs.gov.hk/1/0/WebServices/Quickview3WS.asmx/F_GetJobCardDetail"
payload = "{{p_ordNo:'{}',p_langOpt: '3',p_liveOnly: ''}}".format(value)

# Send the request and parse the response as JSON.
result = session.post(url, headers=headers, data=payload, timeout=30).json()

# Print the result.
print(json.dumps(result, indent=4, sort_keys=True, ensure_ascii=False))
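
As the comment in the code notes, the hand-built request body is not strict JSON. The sketch below compares it with the output of `json.dumps` for the same fields, using only the standard library. The value `'12345'` and the helper `is_strict_json` are made up for illustration. Whether the service accepts both forms is not something this sketch can confirm, although the other answer reports that `json.dumps` worked.

```python
import json

# Hypothetical order number, used only for illustration.
value = '12345'

# Hand-built body: unquoted keys and single-quoted values.
hand_built = "{{p_ordNo:'{}',p_langOpt: '3',p_liveOnly: ''}}".format(value)

# The same fields serialized as strict JSON.
strict = json.dumps({'p_ordNo': value, 'p_langOpt': '3', 'p_liveOnly': ''})


def is_strict_json(text):
    """Return True if text parses with the standard json module."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(hand_built, is_strict_json(hand_built))
print(strict, is_strict_json(strict))
```

Building the body with `json.dumps` also escapes quotes or backslashes in the value, which string formatting does not.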
