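Can someone help me understand why my function below does not return every URL from the list of URLs I pass in as an argument, and why I get the output shown below? I am just trying to return, for each item, its URL together with a list of all the images that belong to that item.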
from time import sleep

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

beta_test_items = ['https://www.facebook.com/marketplace/item/2009940172578816',
                   'https://www.facebook.com/marketplace/item/1591865710899243']


def scrape_item_details(beta_test_items):
    #finish this function
    for url in beta_test_items:
        images = []
        driver.get(url)
        sleep(3)
        # Grab the image that is currently displayed for the item.
        image_element = driver.find_element_by_xpath('//img[contains(@class, "_5m")]')
        images = [image_element.get_attribute('src')]
        try:
            # Index 0 is the "previous" button, index 1 is the "next" button.
            previous_and_next_buttons = driver.find_elements_by_xpath("//i[contains(@class, '_3ffr')]")
            next_image_button = previous_and_next_buttons[1]
            print(next_image_button.text)
            if next_image_button.is_displayed():
                next_image_button.click()
                image_element = driver.find_element_by_xpath('//img[contains(@class, "_5m")]')
                print(image_element.get_attribute('src'))
                sleep(2)
                # Only keep image URLs that have not been collected yet.
                if image_element.get_attribute('src') in images:
                    pass
                else:
                    images.append(image_element.get_attribute('src'))
            else:
                pass
        except:
            pass
        yield(url, images)


if __name__ == '__main__':
    # The post does not show how `driver` was created; Chrome is an assumption.
    driver = webdriver.Chrome()
    print(list(scrape_item_details(beta_test_items)))
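When I run it as it currently stands, I get the following output, and I don't know why it stops at the first URL after the second photo has been appended to the images list: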
In [46]: scrape_item_details(beta_items_list)
['https://scontent-atl3-1.xx.fbcdn.net/v/t1.0-9/27750896_2002108023449096_2229019388723795634_n.jpg?oh=26d3fe06595affdcbd142754766fe934&oe=5B0933C9']
Next
https://scontent-atl3-1.xx.fbcdn.net/v/t1.0-9/27655331_2002108026782429_4575620607831413757_n.jpg?oh=a7c94bc2b8ef8b39bc65291b641f7953&oe=5B0A11DD
Out[46]:
('https://www.facebook.com/marketplace/item/2009940172578816',
['https://scontent-atl3-1.xx.fbcdn.net/v/t1.0-9/27750896_2002108023449096_2229019388723795634_n.jpg?oh=26d3fe06595affdcbd142754766fe934&oe=5B0933C9',
'https://scontent-atl3-1.xx.fbcdn.net/v/t1.0-9/27655331_2002108026782429_4575620607831413757_n.jpg?oh=a7c94bc2b8ef8b39bc65291b641f7953&oe=5B0A11DD'])
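A note on the output above: the update further down says the function originally used return rather than yield. A return inside a for loop ends the function on the first iteration, which would explain a single tuple for the first URL only. The sketch below illustrates the difference with placeholder data; `first_only`, `every_item` and the `urls` values are hypothetical names made up for this example and are not part of the original scraper.

```python
def first_only(urls):
    # `return` inside the loop ends the function on the first iteration.
    for url in urls:
        return (url, [])


def every_item(urls):
    # `yield` hands back one result per iteration, then the loop resumes.
    for url in urls:
        yield (url, [])


urls = ["item/1", "item/2"]  # placeholder values, not real listings
print(first_only(urls))
print(list(every_item(urls)))
```

Calling a generator function on its own only creates a generator object; the loop body runs once it is iterated, for example by wrapping the call in list().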
----Update----
I changed return to yield, and running list(scrape_item_details(beta_test_items)) gives the following output:
[('https://www.facebook.com/marketplace/item/2009940172578816',
['https://scontent-atl3-1.xx.fbcdn.net/v/t1.0-9/27750896_2002108023449096_2229019388723795634_n.jpg?oh=26d3fe06595affdcbd142754766fe934&oe=5B0933C9',
'https://scontent-atl3-1.xx.fbcdn.net/v/t1.0-9/27655331_2002108026782429_4575620607831413757_n.jpg?oh=a7c94bc2b8ef8b39bc65291b641f7953&oe=5B0A11DD',
'https://scontent-atl3-1.xx.fbcdn.net/v/t1.0-9/27973017_1685674758138175_781683034741350935_n.jpg?oh=e2aa32aa73f3bb9061e861bd1ea306cb&oe=5B0741FF']),
('https://www.facebook.com/marketplace/item/1591865710899243',
['https://scontent-atl3-1.xx.fbcdn.net/v/t1.0-9/27750896_2002108023449096_2229019388723795634_n.jpg?oh=26d3fe06595affdcbd142754766fe934&oe=5B0933C9',
'https://scontent-atl3-1.xx.fbcdn.net/v/t1.0-9/27655331_2002108026782429_4575620607831413757_n.jpg?oh=a7c94bc2b8ef8b39bc65291b641f7953&oe=5B0A11DD',
'https://scontent-atl3-1.xx.fbcdn.net/v/t1.0-9/27973017_1685674758138175_781683034741350935_n.jpg?oh=e2aa32aa73f3bb9061e861bd1ea306cb&oe=5B0741FF'])]
I'm not sure why the images from the first URL are repeated as the result for the second URL.
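One possible cause of the repeated images (an assumption; nothing in the post confirms it) is that the src attribute is read before the page has finished switching to the new item or image, so a stale value is picked up. Below is a minimal sketch of polling until a value appears that has not been seen yet, instead of relying on a fixed sleep. `wait_for_new_value` and `read_value` are hypothetical names, and the browser is replaced by a stand-in iterator so the example runs without Selenium.

```python
import time


def wait_for_new_value(read_value, seen, timeout=5.0, interval=0.1):
    """Poll read_value() until it returns something that is not in `seen`.

    Returns the new value, or None if nothing new shows up before the timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = read_value()
        if value not in seen:
            return value
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Stand-in for the browser: the first two reads still return the old src.
fake_reads = iter(["a.jpg", "a.jpg", "b.jpg"])
images = ["a.jpg"]
new_src = wait_for_new_value(lambda: next(fake_reads), images,
                             timeout=1.0, interval=0.01)
if new_src is not None:
    images.append(new_src)
print(images)
```

In the scraper, `read_value` would wrap the existing lookup of the image element and its get_attribute('src') call. Selenium's WebDriverWait, which the post imports but never uses, offers a similar polling mechanism through its until() method.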