<p>Since you are using Selenium, I wrote some logic to scrape the table values from the given URL. I used the <code>csv</code> module to export the data table from this page. I believe this can then be converted into a pandas DataFrame.</p>
<pre><code>import csv
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()  # or reuse your existing driver instance

csvFile = open('DataDetails.csv', 'w', newline='')
writer = csv.writer(csvFile)
link = 'http://webapps2.rrc.texas.gov/EWA/productionQueryAction.do?pager.pageSize=10&pager.offset=10&methodToCall=search&searchArgs.paramValue=|1=Operator|2=01|3=2019|4=01|5=2019|8=production|9=Operator|10=13|101=Both|102=01|204=district&rrcActionMan=H4sIAAAAAAAAAMWQy07DMBBFv6ZsKkUeJ6nSxSxCgS1PwSLqwsQmsZTW0cThIfnjmTiVimi3iN31vHzvCSAEygACEC6I6rL21u0fatKV2GKYuo9GUd0uN2S9Iavi7Id5VX0_yIRXEm8-1ZA07n2RloXgvsSFvLl-KVmmk-zJ6TEevh8Nfc1_JNpxP8Od8a3TT26juo4LOY77oTe1fbNG87tAceKrGqKlkpoh6RWp3bPqRjPZxdvekPKOgkQBIUUpYB2ySeezLvDoJqyP8yAQ0pjt0vk2huAlyQa1HTzZ2v-yAQc8V4c2h1_yxl_TmaOzXMUgd6ox9APCWWKwrc7NRmCcEUEwKciyYiLFoCDLw4qrp2f-Bfw3Uvd3sqQCAAA'
wait = WebDriverWait(driver, 10)
driver.set_page_load_timeout(10)
driver.get(link)
# Find the table headers
Headers = driver.find_elements_by_xpath("//table[@class='DataGrid']//tr/td[@class='PagerBanner']/parent::tr/following::tr[1]//a")
items = []
for header in Headers:
    items.append(header.text)
writer.writerow(items)
for i in range(10):  # assuming the table size is 10 rows per page
    values = driver.find_elements_by_xpath("//table[@class='DataGrid']//tr/td[@class='PagerBanner']/parent::tr/following::tr[1]/following::tr["+str(i+1)+"]/td")
    rows = []
    for value in values:
        rows.append(value.text)
    writer.writerow(rows)  # write the row values
csvFile.close()
</code></pre>
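<p>As a minimal sketch of the conversion step mentioned above: once the script has written <code>DataDetails.csv</code>, <code>pandas.read_csv</code> can load it into a DataFrame. The sample rows below are hypothetical stand-ins for the scraped output, used only so the snippet runs on its own.</p>
<pre><code>import csv
import pandas as pd

# Hypothetical sample mimicking the CSV the Selenium script produces
with open('DataDetails.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Operator Name', 'Operator No.', 'Oil (BBL)'])
    writer.writerow(['ABACO CO.', '894', '0'])
    writer.writerow(['ABRAXAS PETROLEUM', '3125', '9356'])

# Load the exported CSV into a pandas DataFrame
df = pd.read_csv('DataDetails.csv')
print(df.shape)  # (2, 3): two data rows, three columns
</code></pre>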
<p><strong>This is the output:</strong></p>
<p>Operator Name Operator No. Oil (BBL) Casinghead (MCF) GW Gas (MCF) Condensate (BBL)</p>
<p>4 SWIFT SERVICES LLC 953799 0 0</p>
<p>94 OPERATING, LP 966260 0</p>
<p>A&C OIL, LLC 214 329 0 0</p>
<p>A.N. MAC DIARMID CO. 572 108 0 0</p>
<p>AAA OIL CO. 148 25 1 0 0</p>
<p>AARONMARK SERVICES LLC 891 38 0 0</p>
<p>AB RESERVE LLC 893 0 0 0</p>
<p>ABACO CO. 894 0 0 4370 83</p>
<p>ABRAXAS PETROLEUM 3125 9356 10706 0</p>
<p>ACCORD GR ENERGY 3422 0 0 0</p>