Scraping data from a Tableau chart with Python

Posted 2024-05-14 16:48:36


I want to get the ownership data from the following website:

https://www.usnewsdeserts.com/states/california/#1536357227283-a4a9d6e4-ccf9

The code I am using is shown below:

import requests
from bs4 import BeautifulSoup
import json
import re
import random

# user_agent_list is not defined in the original snippet; this single-entry
# list is a placeholder so that the code runs.
user_agent_list = ["Mozilla/5.0"]

url = "https://public.tableau.com/vizql/w/TopOwnersCalifornia/v/Owners/bootstrapSession/sessions/5E565C4C5F7D462BBE8DFEE9246F846E-0:0"
# Pick a random User-Agent for the request headers
header = random.choice(user_agent_list)
HEADERS = {"User-Agent": header}
params = {"stickySessionKey": {"dataserverPermissions":"44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a"}}
r = requests.post(url, params=params, headers=HEADERS)
soup = BeautifulSoup(r.text, "html.parser")
print(soup)

I get:

<br/>
2020-12-12 12:41:46.829
(X9S6ik90vQizHF9Qa-S@CwAAAUk,0:0)

How can I get this data?


1 Answer
User
#1 · Posted 2024-05-14 16:48:36

I made a tableau scraper library to extract data from Tableau worksheets. You only need to find the Tableau URL in the Network tab of your browser's developer tools, which in this case is:

GET https://public.tableau.com/views/NewspapersByCountyCalifornia/Newspaperbycounty
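The session URL in the question has the shape /vizql/w/<workbook>/v/<view>/..., while the URL above has the shape /views/<workbook>/<view>. As a rough sketch, assuming the two forms name the same workbook and view (an assumption based only on these two URLs, not on documented Tableau behaviour), a views URL can be derived from a vizql URL. The helper name `vizql_to_views_url` is hypothetical and is not part of any library.

```python
import re
from urllib.parse import urlsplit

# Hypothetical helper: matches the assumed /vizql/w/<workbook>/v/<view>/ path layout.
_VIZQL_PATH = re.compile(r"^/vizql/w/(?P<workbook>[^/]+)/v/(?P<view>[^/]+)(?:/|$)")


def vizql_to_views_url(vizql_url: str) -> str:
    """Build a /views/<workbook>/<view> URL from a /vizql/w/.../v/... URL."""
    parts = urlsplit(vizql_url)
    match = _VIZQL_PATH.match(parts.path)
    if match is None:
        raise ValueError(f"not a vizql URL: {vizql_url!r}")
    return f"{parts.scheme}://{parts.netloc}/views/{match['workbook']}/{match['view']}"


# The session URL taken from the question
session_url = (
    "https://public.tableau.com/vizql/w/TopOwnersCalifornia/v/Owners"
    "/bootstrapSession/sessions/5E565C4C5F7D462BBE8DFEE9246F846E-0:0"
)
print(vizql_to_views_url(session_url))
# prints https://public.tableau.com/views/TopOwnersCalifornia/Owners
```

Whether the derived URL actually serves the ownership chart still has to be confirmed in the Network tab.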

You can extract the data with the following code:

from tableauscraper import TableauScraper as TS

url = "https://public.tableau.com/views/NewspapersByCountyCalifornia/Newspaperbycounty"

ts = TS()
ts.loads(url)
dashboard = ts.getDashboard()

for t in dashboard.worksheets:
    # show the worksheet name
    print(f"WORKSHEET NAME : {t.name}")
    # show the dataframe for this worksheet
    print(t.data)
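Since each worksheet exposes a name and a dataframe, the loop can write files instead of printing. The following is a minimal sketch, assuming `t.data` offers a pandas-style `to_csv(path, index=False)` method. The helpers `safe_filename` and `export_worksheets` and the default directory `tableau_export` are made up for this illustration and are not part of tableauscraper.

```python
import re
from pathlib import Path


def safe_filename(name: str) -> str:
    """Replace runs of characters that are awkward in file names with '_'."""
    cleaned = re.sub(r"[^\w.-]+", "_", name).strip("_")
    return cleaned or "worksheet"


def export_worksheets(worksheets, out_dir="tableau_export"):
    """Write each worksheet to <out_dir>/<name>.csv and return the paths.

    Assumes every item has a .name string and a .data object with a
    pandas-style to_csv(path, index=False) method.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for ws in worksheets:
        path = out / f"{safe_filename(ws.name)}.csv"
        ws.data.to_csv(path, index=False)
        paths.append(path)
    return paths
```

With the dashboard loaded as above, `export_worksheets(dashboard.worksheets)` would write one CSV per worksheet. Two worksheets whose names sanitise to the same string would overwrite each other, so check the returned paths if names are similar.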

