How can I find the table content of this page using the Network/XHR tab?

Posted 2024-05-17 16:17:37


I am trying to use the code below to access the contents of the tables on this page: https://cdn.ime.co.ir


With this code:

import requests
with requests.Session() as s:
    data = {'ContractCode' : 'SAFOR99' }
    r = s.post('https://cdn.ime.co.ir/home/load/' , json = data ).json()
    print(r)

But the result is:

JSONDecodeError: Expecting value

Please help me understand how to read the contents of this table.


1 answer
User
Answer #1 · posted 2024-05-17 16:17:37

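The data is not in the HTML content. It is retrieved through an API, and more specifically over the WebSocket protocol. You can inspect the frames in Chrome DevTools with the WS filter to find the following URL: wss://cdn.ime.co.ir/realTimeServer/connect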
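A `JSONDecodeError: Expecting value` generally means the response body was not valid JSON (for example an HTML page or an empty body). The following is a minimal sketch for checking that before calling `.json()`. The helper name `parse_json_body` is hypothetical, and what the `/home/load/` endpoint actually returns is not confirmed here.

```python
import json

def parse_json_body(body):
    """Return (True, value) if body is valid JSON, else (False, None).

    Hypothetical helper for illustration; pass it `response.text`.
    """
    try:
        return True, json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False, None

# Example with made-up bodies: a JSON document, an HTML page, an empty body.
for sample in ['{"a": 1}', "<html></html>", ""]:
    ok, value = parse_json_body(sample)
    print(ok, value)
```

With `requests`, this could be used as `ok, value = parse_json_body(r.text)`, printing `r.status_code` and the start of `r.text` when `ok` is false.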

Several query parameters are required, including connectionToken, which is obtained through the REST API at https://cdn.ime.co.ir/realTimeServer/negotiate

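After the websocket is opened, you will not receive much data unless you make another REST request to https://cdn.ime.co.ir/realTimeServer/start with the same connectionToken value. After that, the server sends you JSON data.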

The code below performs all the tasks described above and stores the unfiltered list of data in result:

import requests
import json
import asyncio
import websockets
import urllib.parse
import random
from threading import Thread

connectionData = [{"name":"marketshub"}]

r = requests.get("https://cdn.ime.co.ir/realTimeServer/negotiate", params = {
    "clientProtocol": "2.1",
    "connectionData": json.dumps(connectionData),
})
response = r.json()

print(f'got connection token : {response["ConnectionToken"]}')

wsParams = {
    "transport": "webSockets",
    "clientProtocol": "2.1",
    "connectionToken": response["ConnectionToken"],
    "connectionData": json.dumps(connectionData),
    "tid": random.randint(0,9)
}

websocketUri = f"wss://cdn.ime.co.ir/realTimeServer/connect?{urllib.parse.urlencode(wsParams)}"

def startReceiving(arg):
    r = requests.get("https://cdn.ime.co.ir/realTimeServer/start", params = wsParams)
    print(f'started receiving data : {r.json()}')

result = []

async def websocketConnect():
    # assign to the module-level list so it is still available after the coroutine returns
    global result
    async with websockets.connect(websocketUri) as websocket:
        print('started websocket')
        # the blocking /start request runs in a thread so the event loop is not stalled
        thread = Thread(target = startReceiving, args = (0, ))
        thread.start()
        for i in range(0,10):
            print("receiving")
            data = await websocket.recv()
            jsonData = json.loads(data)
            # keep the first frame whose first hub message carries a non-empty list
            if ("M" in jsonData and len(jsonData["M"]) > 0 and "A" in jsonData["M"][0] and len(jsonData["M"][0]["A"]) > 0):
                items = jsonData["M"][0]["A"][0]
                if isinstance(items, list) and len(items) > 0:
                    result = items
                    break
        thread.join()
        print(json.dumps(result, indent=4, sort_keys=True))

asyncio.run(websocketConnect())
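The frame check inside the receive loop can also be pulled out into a small pure function, which makes it easier to try against captured frames without opening a connection. This is only a sketch: the function name `extract_items` is hypothetical, and the sample frame below is made up to mirror the `M` / `A` layout the code above relies on.

```python
import json

def extract_items(raw_frame):
    """Return the first argument of the first hub message if it is a
    non-empty list, otherwise None. Hypothetical helper for illustration."""
    try:
        frame = json.loads(raw_frame)
    except ValueError:
        return None
    if not isinstance(frame, dict):
        return None
    messages = frame.get("M")
    if not isinstance(messages, list) or not messages:
        return None
    first = messages[0]
    if not isinstance(first, dict):
        return None
    args = first.get("A")
    if not isinstance(args, list) or not args:
        return None
    items = args[0]
    if isinstance(items, list) and items:
        return items
    return None

# Made-up frame; the hub and method names are placeholders, not real values.
sample = '{"C":"d-1","M":[{"H":"someHub","M":"someMethod","A":[[{"ContractCode":"SAFOR99"}]]}]}'
print(extract_items(sample))
```

Inside the loop, `items = extract_items(data)` followed by `if items is not None:` would then replace the long condition.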

You can then get the SAFOR99 entry with:

print([i for i in result if i["ContractCode"] == "SAFOR99"])
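If some rows might lack a `ContractCode` key, the list comprehension above would raise `KeyError`. A slightly more defensive variant, as a sketch, is shown here; `find_contract` is a hypothetical name and the sample rows are made up.

```python
def find_contract(rows, contract_code):
    """Return the rows whose "ContractCode" equals contract_code,
    skipping entries that are not dicts or have no such key."""
    return [
        row for row in rows
        if isinstance(row, dict) and row.get("ContractCode") == contract_code
    ]

# Made-up rows standing in for the `result` list.
rows = [{"ContractCode": "SAFOR99", "x": 1}, {"ContractCode": "OTHER"}, {"y": 2}]
print(find_contract(rows, "SAFOR99"))
```

With the code above, this would be called as `find_contract(result, "SAFOR99")`.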
