从yahoo finance python立即下载多个股票

2024-05-14 03:30:15 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个关于雅虎财务使用熊猫数据阅读器的功能的问题。几个月来,我一直在使用一个股票行情表,并按以下行执行:

import pandas_datareader as pdr
import datetime

stocks = ["stock1","stock2",....]
start = datetime.datetime(2012,5,31)
end = datetime.datetime(2018,3,1)

f = pdr.DataReader(stocks, 'yahoo',start,end)

从昨天开始,我得到了一个错误“index error:list index out of range”(索引器错误:列表索引超出范围),只有当我尝试获取多个股票时才会出现这个错误。

最近几天我有什么需要考虑的变化吗?或者你有更好的办法解决我的问题吗?


Tags: 数据import功能pandasdatetimeindex错误start
3条回答

您可以将新的Python YahooFinancials模块与pandas一起使用。YahooFinancials构建得很好,通过散列每个Yahoo Financial网页中的数据存储对象来获取数据,因此它速度很快,既不依赖于旧的中断api,也不像scraper那样依赖于Web驱动程序。数据以JSON格式返回,您可以通过传入股票/指数股票的列表来初始化YahooFinancials类,一次拉取任意数量的股票。

$pip安装yahoofinancials

用法示例:

from yahoofinancials import YahooFinancials
import pandas as pd

# Select Tickers and stock history dates
ticker = 'AAPL'
ticker2 = 'MSFT'
ticker3 = 'INTC'
index = '^NDX'
freq = 'daily'
start_date = '2012-10-01'
end_date = '2017-10-01'


# Function to clean data extracts
def clean_stock_data(stock_data_list):
    new_list = []
    for rec in stock_data_list:
        if 'type' not in rec.keys():
            new_list.append(rec)
    return new_list

# Construct yahoo financials objects for data extraction
aapl_financials = YahooFinancials(ticker)
mfst_financials = YahooFinancials(ticker2)
intl_financials = YahooFinancials(ticker3)
index_financials = YahooFinancials(index)

# Clean returned stock history data and remove dividend events from price history
daily_aapl_data = clean_stock_data(aapl_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[ticker]['prices'])
daily_msft_data = clean_stock_data(mfst_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[ticker2]['prices'])
daily_intl_data = clean_stock_data(intl_financials
                                     .get_historical_stock_data(start_date, end_date, freq)[ticker3]['prices'])
daily_index_data = index_financials.get_historical_stock_data(start_date, end_date, freq)[index]['prices']
stock_hist_data_list = [{'NDX': daily_index_data}, {'AAPL': daily_aapl_data}, {'MSFT': daily_msft_data},
                        {'INTL': daily_intl_data}]


# Function to construct data frame based on a stock and it's market index
def build_data_frame(data_list1, data_list2, data_list3, data_list4):
    data_dict = {}
    i = 0
    for list_item in data_list2:
        if 'type' not in list_item.keys():
            data_dict.update({list_item['formatted_date']: {'NDX': data_list1[i]['close'], 'AAPL': list_item['close'],
                                                            'MSFT': data_list3[i]['close'],
                                                            'INTL': data_list4[i]['close']}})
            i += 1
    tseries = pd.to_datetime(list(data_dict.keys()))
    df = pd.DataFrame(data=list(data_dict.values()), index=tseries,
                      columns=['NDX', 'AAPL', 'MSFT', 'INTL']).sort_index()
    return df

一次多个股票数据示例(返回每个股票代码的JSON对象列表):

from yahoofinancials import YahooFinancials

tech_stocks = ['AAPL', 'MSFT', 'INTC']
bank_stocks = ['WFC', 'BAC', 'C']

yahoo_financials_tech = YahooFinancials(tech_stocks)
yahoo_financials_banks = YahooFinancials(bank_stocks)

tech_cash_flow_data_an = yahoo_financials_tech.get_financial_stmts('annual', 'cash')
bank_cash_flow_data_an = yahoo_financials_banks.get_financial_stmts('annual', 'cash')

banks_net_ebit = yahoo_financials_banks.get_ebit()
tech_stock_price_data = tech_cash_flow_data.get_stock_price_data()
daily_bank_stock_prices = yahoo_financials_banks.get_historical_stock_data('2008-09-15', '2017-09-15', 'daily')

JSON输出示例:

代码:

yahoo_financials = YahooFinancials('WFC')
print(yahoo_financials.get_historical_stock_data("2017-09-10", "2017-10-10", "monthly"))

JSON返回:

{
    "WFC": {
        "prices": [
            {
                "volume": 260271600,
                "formatted_date": "2017-09-30",
                "high": 55.77000045776367,
                "adjclose": 54.91999816894531,
                "low": 52.84000015258789,
                "date": 1506830400,
                "close": 54.91999816894531,
                "open": 55.15999984741211
            }
        ],
        "eventsData": [],
        "firstTradeDate": {
            "date": 76233600,
            "formatted_date": "1972-06-01"
        },
        "isPending": false,
        "timeZone": {
            "gmtOffset": -14400
        },
        "id": "1mo15050196001507611600"
    }
}

雅虎财经已经无法使用,因为雅虎已经改变了格式,修复雅虎财经已经足够好,可以下载数据。不过,要进行解析,您需要其他库,下面是一个简单的工作示例:


import numpy as np #python library for scientific computing
import pandas as pd #python library for data manipulation and analysis
import matplotlib.pyplot as plt #python library for charting
import fix_yahoo_finance as yf #python library to scrap data from yahoo finance
from pandas_datareader import data as pdr #extract data from internet sources into pandas data frame

yf.pdr_override()

data = pdr.get_data_yahoo(‘^DJI’, start=”2006–01–01")
data2 = pdr.get_data_yahoo(“MSFT”, start=”2006–01–01")
data3 = pdr.get_data_yahoo(“AAPL”, start=”2006–01–01")
data4 = pdr.get_data_yahoo(“BB.TO”, start=”2006–01–01")

ax = (data[‘Close’] / data[‘Close’].iloc[0] * 100).plot(figsize=(15, 6))
(data2[‘Close’] / data2[‘Close’].iloc[0] * 100).plot(ax=ax, figsize=(15,6))
(data3[‘Close’] / data3[‘Close’].iloc[0] * 100).plot(ax=ax, figsize=(15,6))
(data4[‘Close’] / data5[‘Close’].iloc[0] * 100).plot(ax=ax, figsize=(15,6))

plt.legend([‘Dow Jones’, ‘Microsoft’, ‘Apple’, ‘Blackberry’], loc=’upper left’)
plt.show()

有关代码的说明,请访问https://medium.com/@gerrysabar/charting-stocks-price-from-yahoo-finance-using-fix-yahoo-finance-library-6b690cac5447

如果您阅读了Pandas数据阅读器的documentation,它们会立即对多个数据源API(其中之一是Yahoo!)!金融。

v0.6.0 (January 24, 2018)

Immediate deprecation of Yahoo!, Google Options and Quotes and EDGAR. The end points behind these APIs have radically changed and the existing readers require complete rewrites. In the case of most Yahoo! data the endpoints have been removed. PDR would like to restore these features, and pull requests are welcome.

这可能是导致您获得IndexError错误(或任何其他通常不存在的错误)的罪魁祸首。


然而,还有一个Python包的目标是修复对Yahoo!的支持!Finance for Pandas数据读取器,您可以在此处找到该包:

https://pypi.python.org/pypi/fix-yahoo-finance

根据他们的文件:

Yahoo! finance has decommissioned their historical data API, causing many programs that relied on it to stop working.

fix-yahoo-finance offers a temporary fix to the problem by scraping the data from Yahoo! finance using and return a Pandas DataFrame/Panel in the same format as pandas_datareader’s get_data_yahoo().

By basically “hijacking” pandas_datareader.data.get_data_yahoo() method, fix-yahoo-finance’s implantation is easy and only requires to import fix_yahoo_finance into your code.

您只需添加以下内容:

from pandas_datareader import data as pdr
import fix_yahoo_finance as yf

yf.pdr_override() 

stocks = ["stock1","stock2", ...]
start = datetime.datetime(2012,5,31)
end = datetime.datetime(2018,3,1)

f = pdr.get_data_yahoo(stocks, start=start, end=end)

甚至不需要Pandas数据读取器:

import fix_yahoo_finance as yf

stocks = ["stock1","stock2", ...]
start = datetime.datetime(2012,5,31)
end = datetime.datetime(2018,3,1)
data = yf.download(stocks, start=start, end=end)

相关问题 更多 >