Sending loop output to a CSV file

Posted 2024-03-29 05:01:15


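The script reads each URL from a CSV file, opens it, and scrolls to the bottom of the page. I am trying to scrape the company URLs from those pages, but I can't get it to work. Right now, if I use a single standalone URL instead of pulling it from the CSV, the results print to PowerShell. I still can't get them written to a CSV.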

Here are a few of the URLs I am using:

https://www.facebook.com/search/pages/?q=Los%20Angeles%20remodeling
https://www.facebook.com/search/pages/?q=Boston%20remodeling

My thinking is that it has to be a loop inside a loop. Or it could be an if/elif. I'm not sure at this point. Any suggestions would be appreciated.

import csv
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

# Block the browser notification prompt so it does not cover the page.
chrome_options = webdriver.ChromeOptions()
prefs = {"profile.default_content_setting_values.notifications": 2}
chrome_options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(options=chrome_options)

# Log in before visiting the search pages.
driver.get('https://www.facebook.com')
username = driver.find_element(By.ID, "email")
password = driver.find_element(By.ID, "pass")
username.send_keys("*****")
password.send_keys("******")
driver.find_element(By.ID, 'loginbutton').click()
time.sleep(2)

scroll_js = "window.scrollTo(0, document.body.scrollHeight); return document.body.scrollHeight;"

with open('fb_urls.csv', newline='') as f_input, open('fb_profile_urls.csv', 'w', newline='') as f_output:
    csv_input = csv.reader(f_input)
    csv_output = csv.writer(f_output)
    for url in csv_input:
        if not url:
            continue  # skip blank lines in the input file
        driver.get(url[0])
        time.sleep(5)
        # Keep scrolling until the page height stops growing.
        lenOfPage = driver.execute_script(scroll_js)
        match = False
        while not match:
            lastCount = lenOfPage
            time.sleep(1)
            lenOfPage = driver.execute_script(scroll_js)
            if lastCount == lenOfPage:
                match = True
        # Look up the elements only after the page has fully loaded.
        # Assumes the '_32mo' elements are the links that carry the href.
        elems = driver.find_elements(By.CLASS_NAME, '_32mo')
        for elem in elems:
            csv_output.writerow([elem.get_attribute('href')])
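
The loop-inside-a-loop idea can be tried out without a browser. The following is only a minimal sketch under stated assumptions: `write_links`, `collect_links`, and `fake_collect_links` are hypothetical names that do not appear in the question, and the `example.com` URLs are placeholder data. The outer loop walks the input rows, the inner loop writes one CSV row per link found for that URL.

```python
import csv
import io


def write_links(url_rows, out_file, collect_links):
    """Write one CSV row per (search URL, link) pair and return the row count.

    collect_links is a hypothetical callable: given a URL, it returns the
    list of hrefs found on that page (the Selenium scrolling would live there).
    """
    writer = csv.writer(out_file)
    count = 0
    for row in url_rows:                 # outer loop: one search URL per input row
        if not row:                      # skip blank lines in the input CSV
            continue
        url = row[0]
        for href in collect_links(url):  # inner loop: one output row per link
            writer.writerow([url, href])
            count += 1
    return count


def fake_collect_links(url):
    # Stand-in data for illustration only; not real scraped output.
    return [url + "#page-1", url + "#page-2"]


source = io.StringIO("https://example.com/a\n\nhttps://example.com/b\n")
target = io.StringIO()
written = write_links(csv.reader(source), target, fake_collect_links)
```

With a real driver, `collect_links` would open the URL, scroll to the bottom, and return the `href` of each matching element. Keeping the search URL in the first column is a design choice that makes it possible to tell which search produced each link.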
