How can I use Python to extract daily closing prices from WSJ?

Here is the code:

import requests
import pandas as pd

url = 'https://quotes.wsj.com/index/COMP/historical-prices'
jsonData = requests.get(url).json()

final_df = pd.DataFrame()
for row in jsonData['data']:
    #row = jsonData['data'][1]
    data_row = []
    for idx, colspan in enumerate(row['colspan']):
        colspan_int = int(colspan[0])
        data_row.append(row['td'][idx] * colspan_int)
    flat_list = [item for sublist in data_row for item in sublist]
    temp_row = pd.DataFrame([flat_list])
    final_df = final_df.append(temp_row, sort=True).reset_index(drop=True)

wait2 = input("PRESS ENTER TO CONTINUE.")

#
url = 'https://quotes.wsj.com/index/HK/XHKG/HSI/historical-prices/download?num_rows=15&range_days=15&endDate=12/06/2019'
response = requests.get(url)
open('HSI.csv', 'wb').write(response.content)
read_file = pd.read_csv(r'C:\A-CEO\REPORTS\STOCKS\PROFILE\Python\HSI.csv')
read_file.to_excel(r'C:\A-CEO\REPORTS\STOCKS\PROFILE\Python\HSI.xlsx', index=None, header=True)
#
url = 'https://quotes.wsj.com/index/SPX/historical-prices/download?num_rows=15&range_days=15&endDate=12/06/2019'
response = requests.get(url)
open('SPX.csv', 'wb').write(response.content)
read_file = pd.read_csv(r'C:\A-CEO\REPORTS\STOCKS\PROFILE\Python\SPX.csv')
read_file.to_excel(r'C:\A-CEO\REPORTS\STOCKS\PROFILE\Python\SPX.xlsx', index=None, header=True)
#
url = 'https://quotes.wsj.com/index/COMP/historical-prices/download?num_rows=15&range_days=15&endDate=12/06/2019'
response = requests.get(url)
open('COMP.csv', 'wb').write(response.content)
read_file = pd.read_csv(r'C:\A-CEO\REPORTS\STOCKS\PROFILE\Python\COMP.csv')
read_file.to_excel(r'C:\A-CEO\REPORTS\STOCKS\PROFILE\Python\COMP.xlsx', index=None, header=True)

1 Answer

User

#1 · Posted 2024-04-18 07:49:43

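The URL is wrong. Once you have downloaded the file, you can do "Get Info" on it (if you are on a Mac) and look at "Where from:". You will see that the URL has the form below.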

import requests
import pandas as pd
import io

#original URL had a bunch of other parameters I omitted, only these seem to matter but YMMV
url = 'https://quotes.wsj.com/index/COMP/historical-prices/download?num_rows=360&range_days=360&endDate=11/06/2019'

response = requests.get(url)

#do this if you want the CSV written to your machine
with open('test_file.csv', 'wb') as f:
    f.write(response.content)

# this decodes the content of the downloaded response and presents it to pandas
df_test = pd.read_csv(io.StringIO(response.content.decode('utf-8')))
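
To get from the downloaded CSV to the daily closing prices themselves, something along these lines might work. It is only a sketch: it assumes the file has `Date` and `Close` columns, a space after each comma in the header, and dates in MM/DD/YY form, none of which I have verified against the live site. `closing_prices` is a hypothetical helper name, and the sample rows are made-up placeholder values, not real quotes.

```python
import io

import pandas as pd


def closing_prices(csv_text):
    """Return a Series of closing prices indexed by date, oldest first."""
    # skipinitialspace strips the space after each comma, so a header such as
    # "Date, Open, High, Low, Close" yields clean column names.
    df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
    # Assumed column names and date format; adjust them to the real file.
    df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%y')
    return df.set_index('Date')['Close'].sort_index()


# Made-up placeholder rows, only to show the expected shape of the input.
sample = (
    "Date, Open, High, Low, Close\n"
    "01/03/19, 10.0, 12.0, 9.0, 11.0\n"
    "01/02/19, 9.0, 11.0, 8.0, 10.0\n"
)
closes = closing_prices(sample)
print(closes)
```

With a real download you would pass `response.text` instead of `sample`.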

To answer your other question, you can simply loop over a list of tickers or symbols, for example:

base_url = 'https://quotes.wsj.com/index/{ticker_name}/historical-prices/download?num_rows=360&range_days=360&endDate=11/06/2019'


ticker_list = ['COMP','SPX','HK/XHKG/HSI']

for ticker in ticker_list:
    response = requests.get(base_url.format(ticker_name = ticker))
    #do this if you want the CSV written to your machine
    with open('prices_'+ticker.replace('/','-')+'.csv', 'wb') as f:
        f.write(response.content)

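Note that for HK/XHKG/HSI we need to replace the slashes with hyphens, otherwise it is not a valid filename. You can also use this pattern to build DataFrames.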
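
As a rough sketch of the DataFrame variant, the loop could collect one DataFrame per ticker in a dict instead of writing files. The names `download_csv` and `frames_for` are hypothetical helpers introduced here for illustration, and `skipinitialspace=True` assumes the CSV header has a space after each comma. The site may also have changed since this was written, so treat the URL pattern as an assumption too.

```python
import io

import pandas as pd
import requests

BASE_URL = ('https://quotes.wsj.com/index/{ticker_name}/historical-prices/'
            'download?num_rows=360&range_days=360&endDate=11/06/2019')


def download_csv(ticker):
    """Fetch the CSV text for one ticker (needs network access)."""
    response = requests.get(BASE_URL.format(ticker_name=ticker))
    response.raise_for_status()  # fail loudly on an HTTP error status
    return response.text


def frames_for(tickers, fetch=download_csv):
    """Map each ticker to a DataFrame parsed from its CSV text.

    `fetch` is any callable taking a ticker and returning CSV text, so the
    parsing can be tried offline with canned text.
    """
    frames = {}
    for ticker in tickers:
        csv_text = fetch(ticker)
        frames[ticker] = pd.read_csv(io.StringIO(csv_text),
                                     skipinitialspace=True)
    return frames
```

Calling `frames_for(['COMP', 'SPX', 'HK/XHKG/HSI'])` would then give a dict keyed by ticker. Since the keys are not used as filenames, the slashes need no replacing here.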
