Trying to get BeautifulSoup to download some data. No errors, but nothing is downloaded


I'm simply trying to run the code below. I get no error messages, but no data is actually written to the CSV. I inspected the site and found the snapshot-td2-cp and snapshot-td2 elements. When I remove the writer.writerow statements and use print statements instead, all I see is six '2' characters, and that's it.

import csv
import requests
from bs4 import BeautifulSoup

url_base = "https://finviz.com/quote.ashx?t="
tckr = ['SBUX','MSFT','AAPL']
url_list = [url_base + s for s in tckr]

with open('C:/Users/Excel/Desktop/today.csv', 'a', newline='') as f:
    writer = csv.writer(f)

    for url in url_list:
        try:
            fpage = requests.get(url)
            fsoup = BeautifulSoup(fpage.content, 'html.parser')

            # write header row
            writer.writerow(map(lambda e : e.text, fsoup.find_all('td', {'class':'snapshot-td2-cp'})))

            # write body row
            writer.writerow(map(lambda e : e.text, fsoup.find_all('td', {'class':'snapshot-td2'})))            
        except:
            print("{} - not found".format(url))

Using SBUX as an example, this is the table I want to pull the data from:

[screenshot of the Finviz snapshot table for SBUX]

I tested this code a few months ago and everything worked fine. Can someone point out my mistake? I don't see it. Thanks.


Tags: csv, code, import, url, error, snapshot, statement, requests
1 Answer

To get the data, specify a User-Agent header in the request; the site doesn't return the quote table to the default python-requests client:

import csv
import requests
from bs4 import BeautifulSoup

url_base = "https://finviz.com/quote.ashx?t="
tckr = ['SBUX', 'MSFT', 'AAPL']
url_list = [(s, url_base + s) for s in tckr]

# Without a browser-like User-Agent the site won't serve the quote page.
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0'}

with open('data.csv', 'w', newline='') as f_out:
    writer = csv.writer(f_out, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for t, url in url_list:
        print('Scraping ticker {}...'.format(t))
        soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')
        # one row holding just the ticker, then every row of the snapshot table
        writer.writerow([t])
        for row in soup.select('.snapshot-table2 tr'):
            writer.writerow([td.text for td in row.select('td')])

This prints:

Scraping ticker SBUX...
Scraping ticker MSFT...
Scraping ticker AAPL...

and saves data.csv (screenshot from LibreOffice):

[screenshot of data.csv opened in LibreOffice]
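
As a usage note, the file groups each ticker's snapshot rows under a one-cell ticker row. A minimal sketch for reading it back into a dict per ticker, assuming each table row alternates label and value cells (which the snapshot-td2-cp/snapshot-td2 pairing in the question suggests); the 'P/E' field name is just an illustrative key:

import csv

# Rebuild {'SBUX': {'Index': ..., 'P/E': ..., ...}, ...} from data.csv.
data = {}
with open('data.csv', newline='') as f:
    for row in csv.reader(f):
        if len(row) == 1:        # a lone cell is a ticker marker row
            ticker = row[0]
            data[ticker] = {}
        elif row:                # table rows alternate label, value, label, value, ...
            for label, value in zip(row[::2], row[1::2]):
                data[ticker][label] = value

print(data['SBUX'].get('P/E'))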
