Writing Python Selenium output to Excel
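I wrote a script that scrapes product information from the web. My goal is to write this information to an Excel file. Because my knowledge of Python is limited, the only way I know to export the data is Out-File in PowerShell. The result, though, is that each piece of a product's information is printed on a separate line. I would like all the information for a product to be on a single row.
The output I want is shown in the picture. I would prefer my output to look like the second version, but the first is also acceptable.
Here is my code: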
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException

url = "http://www.strem.com/"
cas = ['16940-92-4','29796-57-4','13569-57-8','15635-87-7']

for i in cas:
    driver = webdriver.Firefox()
    driver.get(url)
    driver.find_element_by_id("selectbox_input").click()
    driver.find_element_by_id("selectbox_input_cas").click()
    inputElement = driver.find_element_by_name("keyword")
    inputElement.send_keys(i)
    inputElement.submit()

    # Check if a particular element exists; returns True/False
    def check_exists_by_xpath(xpath):
        try:
            driver.find_element_by_xpath(xpath)
        except NoSuchElementException:
            return False
        return True

    xpath1 = ".//div[@class = 'error']" # element containing error message
    xpath2 = ".//table[@class = 'product_list tiles']" # element containing table to select product from
    #xpath3 = ".//div[@class = 'catalog_number']" # when selection is needed, returns the first catalog number

    if check_exists_by_xpath(xpath1):
        print "cas# %s is not found on Strem." %i
        driver.quit()
    else:
        if check_exists_by_xpath(xpath2):
            catNum = driver.find_element_by_xpath(".//div[@class = 'catalog_number']")
            catNum.click()
            country = driver.find_element_by_name("country")
            for option in country.find_elements_by_tag_name('option'):
                if option.text == "USA":
                    option.click()
            country.submit()
            name = driver.find_element_by_id("header_description").text
            prodNum = driver.find_element_by_class_name("catalog_number").text
            print(i)
            print(name.encode("utf-8"))
            print(prodNum)
            skus_by_xpath = WebDriverWait(driver, 10).until(
                lambda driver : driver.find_elements_by_xpath(".//td[@class='size']")
            )
            for output in skus_by_xpath:
                print(output.text)
            prices_by_xpath = WebDriverWait(driver, 10).until(
                lambda driver : driver.find_elements_by_xpath(".//td[@class='price']")
            )
            for result in prices_by_xpath:
                print(result.text[3:]) #To remove last three characters, use :-3
            driver.quit()
        else:
            country = driver.find_element_by_name("country")
            for option in country.find_elements_by_tag_name('option'):
                if option.text == "USA":
                    option.click()
            country.submit()
            name = driver.find_element_by_id("header_description").text
            prodNum = driver.find_element_by_class_name("catalog_number").text
            print(i)
            print(name.encode("utf-8"))
            print(prodNum)
            skus_by_xpath = WebDriverWait(driver, 10).until(
                lambda driver : driver.find_elements_by_xpath(".//td[@class='size']")
            )
            for output in skus_by_xpath:
                print(output.text)
            prices_by_xpath = WebDriverWait(driver, 10).until(
                lambda driver : driver.find_elements_by_xpath(".//td[@class='price']")
            )
            for result in prices_by_xpath:
                print(result.text[3:]) #To remove last three characters, use :-3
            driver.quit()
2 Answers
0
I usually find that writing the data to a CSV file is the safest way to get it into Excel. The code I use looks roughly like this:
import time
import datetime
from os import fsync

ts = time.time()  # get the time, to use in a filename
ds = datetime.datetime.fromtimestamp(ts).strftime('%Y%m%d%H%M')  # format the time for the filename
f2 = open('OutputLog_' + ds + '.csv', 'w')  # my file is OutputLog_ + the date-time stamp
f2.write('Column1DataPoint' + ',' + 'Column2DataPoint' + '\n')  # write your text, separating your data with commas; '\n' ends the row
# if you're running a long loop and want to keep your file up to date with the process, do these two steps in your loop too
f2.flush()
fsync(f2.fileno())
# once the loop is finished and the data is written, close your file
f2.close()
In your case, I think you could change the write call in the code above to something like this:
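# build the whole row as text first, then encode it once, so non-ASCII names do not break the concatenation
f2.write((i + ',' + name + ',' + prodNum + ',' + output.text + '\n').encode("utf-8"))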
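If a product name itself contains a comma, joining fields by hand will split it across columns. Here is a minimal sketch of the same idea using the standard csv module instead, assuming Python 3 (the code in this thread is Python 2). The helper names product_row and write_products, the file name products.csv, and the sample values are placeholders made up for illustration, not anything taken from the site.

```python
import csv


def product_row(cas, name, prod_num, sizes, prices):
    """Flatten one product into a single list: fixed fields, then size/price pairs."""
    row = [cas, name, prod_num]
    for size, price in zip(sizes, prices):
        row.extend([size, price])
    return row


def write_products(path, products):
    """Write one CSV row per product; csv.writer quotes fields that contain commas."""
    # newline="" is what the csv module documentation recommends for files given to csv.writer
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for product in products:
            writer.writerow(product_row(*product))


# Placeholder values only (hypothetical, not real catalog data)
sample = [
    ("00000-00-0", "Example compound, 98%", "00-0000", ["1g", "5g"], ["10.00", "40.00"]),
]
write_products("products.csv", sample)
```

In the scraping loop, you would collect the texts of the size and price elements into two lists and append one tuple per CAS number, rather than printing each value as it is found.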
1
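There is a tutorial for openpyxl, a Python library that lets you work with Excel files from Python. Other libraries exist, but this is the one I personally prefer.
First, import the Workbook class from the library and create a workbook: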
from openpyxl import Workbook
wb = Workbook()
Then use the methods it provides to write your data, and finally save the file with:
wb.save(filename)
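It is very easy to get started with.
There is also a PDF tutorial on using xlwt and xlrd, although I don't use those two modules much myself. You can find it here: http://www.simplistix.co.uk/presentations/python-excel.pdf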
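To make the "write your data" step concrete, here is a rough sketch of writing one worksheet row per product with openpyxl, assuming Python 3 and that openpyxl is installed. The function name save_products, the header labels, the file name products.xlsx, and the sample values are all hypothetical; only Workbook, wb.active, ws.append and wb.save come from the library.

```python
from openpyxl import Workbook


def save_products(path, products):
    """Write one worksheet row per product and save the workbook to path."""
    wb = Workbook()
    ws = wb.active  # the worksheet that is created together with the workbook
    # Header for the fixed columns; size/price pairs follow in the remaining cells
    ws.append(["CAS", "Name", "Catalog number"])
    for cas, name, prod_num, sizes, prices in products:
        row = [cas, name, prod_num]
        for size, price in zip(sizes, prices):
            row.extend([size, price])
        ws.append(row)  # append() puts the list into the next empty row
    wb.save(path)


# Placeholder values only (hypothetical, not real catalog data)
sample = [
    ("00000-00-0", "Example compound, 98%", "00-0000", ["1g", "5g"], ["10.00", "40.00"]),
]
save_products("products.xlsx", sample)
```

Because each product is appended as a single list, all of its fields end up in one row, which is the layout the question asks for.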