I have successfully scraped data from the website https://fcainfoweb.nic.in/Reports/Report_Menu_Web.aspx. I created an Excel file containing the results for one commodity. After scraping the data for a second commodity, I cannot add another sheet to the same Excel file. Any help would be appreciated. Thanks in advance. Here is my code:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from bs4 import BeautifulSoup
import pandas as pd
import time
chrome_path = r"C:\Users\user\Desktop\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://fcainfoweb.nic.in/Reports/Report_Menu_Web.aspx")
html_source = driver.page_source
results=[]
driver.find_element_by_xpath("""//*[@id="ctl00_MainContent_Rbl_Rpt_type_1"]""").click()
element_variation = driver.find_element_by_id ("ctl00_MainContent_Ddl_Rpt_Option1")
drp_variation = Select(element_variation)
drp_variation.select_by_visible_text("Daily Variation")
driver.find_element_by_id("ctl00_MainContent_Txt_FrmDate").send_keys("01/05/2020")
driver.find_element_by_id("ctl00_MainContent_Txt_ToDate").send_keys("27/05/2020")
element_commodity = driver.find_element_by_id ("ctl00_MainContent_Lst_Commodity")
drp_commodity = Select(element_commodity)
drp_commodity.select_by_visible_text("Rice")
driver.find_element_by_xpath("""//*[@id="ctl00_MainContent_btn_getdata1"]""").click()
soup = BeautifulSoup(driver.page_source, 'html.parser')
table = pd.read_html(driver.page_source)[2]  # the table we want is the third one (index 2)
print(len(table))
print(table)
results.append(table)
driver.back()
time.sleep(1)
with pd.ExcelWriter(r'C:\Users\user\Desktop\python.xlsx') as writer:
    table.to_excel(writer, sheet_name="rice", index=False)  # Rice results on a sheet named "rice"
    writer.save()
driver.find_element_by_xpath("""//*[@id="btn_back"]""").click()
driver.find_element_by_xpath("""//*[@id="ctl00_MainContent_Rbl_Rpt_type_1"]""").click()
element_variation = driver.find_element_by_id ("ctl00_MainContent_Ddl_Rpt_Option1")
drp_variation = Select(element_variation)
drp_variation.select_by_visible_text("Daily Variation")
driver.find_element_by_id("ctl00_MainContent_Txt_FrmDate").send_keys("01/05/2020")
driver.find_element_by_id("ctl00_MainContent_Txt_ToDate").send_keys("27/05/2020")
element_commodity = driver.find_element_by_id ("ctl00_MainContent_Lst_Commodity")
drp_commodity = Select(element_commodity)
drp_commodity.select_by_visible_text("Wheat")
driver.find_element_by_xpath("""//*[@id="ctl00_MainContent_btn_getdata1"]""").click()
soup = BeautifulSoup(driver.page_source, 'html.parser')
table = pd.read_html(driver.page_source)[2]  # the table we want is the third one (index 2)
print(len(table))
print(table)
results.append(table)
driver.back()
time.sleep(1)
with pd.ExcelWriter(r'C:\Users\user\Desktop\python.xlsx') as writer:
    table.to_excel(writer, sheet_name="wheat", index=False)  # Wheat results on a sheet named "wheat"
    writer.save()
For some file types you may have to read all the data into memory, add the new data, and then save everything back to the file. For other file types you can use "append" mode.

See the documentation for ExcelWriter. It has the option mode="a" to append to an existing file. Alternatively, you can write both sheets inside a single with block, with no appending needed.

BTW: I found that append mode does not work with the engine xlsxwriter; I had to use the engine openpyxl (which also means installing the module openpyxl with pip). I found the available engines in the question "Engines available for to_excel function in pandas".
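A minimal sketch of the append-mode approach, using dummy DataFrames in place of the scraped tables and a temporary file path in place of the Desktop path (names here are illustrative assumptions):

```python
# Sketch: adding a second sheet to an existing .xlsx with mode="a".
# Append mode requires the openpyxl engine (pip install openpyxl);
# it does not work with xlsxwriter.
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "python.xlsx")  # stand-in for the real path

rice = pd.DataFrame({"Date": ["01/05/2020"], "Price": [40]})   # dummy data
wheat = pd.DataFrame({"Date": ["01/05/2020"], "Price": [25]})  # dummy data

# First write creates the file with one sheet.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    rice.to_excel(writer, sheet_name="rice", index=False)

# Second write appends a new sheet instead of overwriting the file.
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    wheat.to_excel(writer, sheet_name="wheat", index=False)

sheets = pd.ExcelFile(path).sheet_names
print(sheets)  # ['rice', 'wheat']
```

The second with block opens the existing workbook and adds a sheet to it; without mode="a", the second write would recreate the file and the "rice" sheet would be lost, which is exactly the symptom in the question.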
Full working code:
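A sketch of the no-append alternative: collect each commodity's table while scraping, then write every sheet inside a single with block. Dummy DataFrames stand in for the tables that pd.read_html would return (the dict contents and file path are illustrative assumptions):

```python
# Sketch: one ExcelWriter, one file, many sheets -- no append mode needed.
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "python.xlsx")  # stand-in for the real path

# In the real script, each value would be pd.read_html(driver.page_source)[2]
# for the corresponding commodity.
results = {
    "rice":  pd.DataFrame({"Commodity": ["Rice"],  "Price": [40]}),
    "wheat": pd.DataFrame({"Commodity": ["Wheat"], "Price": [25]}),
}

# One writer keeps the file open, so every to_excel call adds its own sheet.
with pd.ExcelWriter(path) as writer:
    for name, table in results.items():
        table.to_excel(writer, sheet_name=name, index=False)

sheet_names = pd.ExcelFile(path).sheet_names
print(sheet_names)  # ['rice', 'wheat']
```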