Pandas concat gives "FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated"
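I have a pandas DataFrame containing a list of stock symbols. I want to loop through that list and call a function that fetches more information about each symbol from an API. The API only handles one symbol at a time, so I need to end up with a single new DataFrame holding all of the profiles, because the data needs cleaning and I can only do that once everything is together.
The problem is that sometimes, when I try to concatenate the DataFrames, I get a FutureWarning: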
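"FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation. pd_profiles = pd.concat([pd_profiles, pd_temp], axis=0)"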
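I have tried to work out what is going on and added some code to guard against it, but nothing has worked.
If anyone knows why I am getting this FutureWarning, or knows a better way to do this, please let me know.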
import pandas as pd

def update_exchage_data(self):
    key = api_key
    # Here I create my list of equities that I get from an API, but in this example
    # I'm hard-coding them. There are also other columns besides the symbol, but I
    # wanted to keep the code simple.
    equities = ["APPL", "MSFT", "NVDA", "GOOG"]
    pd_equities = pd.DataFrame(equities, columns=["symbol"])
    # Here I create the list of columns that make up the final equities profile
    # dataframe, and a blank dataframe that I will concatenate each returned
    # profile to.
    profile_columns = ["symbol", "name", "exchange", "sector", "industry"]
    pd_profiles = pd.DataFrame(columns=profile_columns)
    # This is an indicator to see if pd_profiles is empty, in order to avoid
    # concatenating an empty dataframe.
    first_record = 1
    for index, row in pd_equities.iterrows():
        pd_temp = self.sub_get_profile(key, row["symbol"], profile_columns)
        # This is to check whether the returned dataframe is empty.
        if not pd_temp.empty and pd_temp.notnull and len(pd_temp) >= 1:
            if first_record == 1:
                first_record = 0
                pd_profiles = pd_temp
            else:
                pd_profiles = pd.concat([pd_profiles, pd_temp], axis=0)
    pd_profiles = pd_profiles.reset_index(drop=True)

def sub_get_profile(self, key, symbol, df_cols):
    # I didn't add any of the API code as it works fine and would just take up
    # extra room. The point is that it returns the data as JSON in a variable
    # called json_api_response.
    # Here I create an empty dataframe so that even if the API call doesn't return
    # anything, my function still returns an empty dataframe.
    pd_data = pd.DataFrame(columns=df_cols)
    if not pd.DataFrame.from_dict(json_api_response, orient="columns").empty:
        pd_data = pd.DataFrame.from_dict(json_api_response, orient="columns")
    return pd_data
Thanks in advance.
1 Answer
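Try using
    if not pd_temp.empty and pd_temp.notnull().any().any() and len(pd_temp) >= 1:
instead of
    if not pd_temp.empty and pd_temp.notnull and len(pd_temp) >= 1:
In the original line, pd_temp.notnull is missing its parentheses, so it refers to the method itself. A method object is always truthy, so that condition never actually checks whether the frame is all-NA.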
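As for a "better way", one option is to collect the per-symbol frames in a list, drop the empty or all-NA ones, and call pd.concat once at the end instead of inside the loop. The following is only a minimal sketch of that idea, not a definitive fix: has_data, combine_profiles, and the sample rows are hypothetical names and stand-in data made up for illustration, and the real frames would come from sub_get_profile. It filters out frames that are wholly empty or wholly NA; a frame where only some columns are all-NA may still need separate handling.

```python
import pandas as pd

PROFILE_COLUMNS = ["symbol", "name", "exchange", "sector", "industry"]


def has_data(df: pd.DataFrame) -> bool:
    """Return True only if the frame has rows and at least one non-NA value."""
    # notnull() is called (note the parentheses) and reduced to a single bool:
    # first across rows per column, then across the columns.
    return (not df.empty) and bool(df.notnull().any().any())


def combine_profiles(frames, columns=PROFILE_COLUMNS) -> pd.DataFrame:
    """Drop empty / all-NA frames, then concatenate the rest in one call."""
    usable = [df for df in frames if has_data(df)]
    if not usable:
        # Nothing usable came back: return an empty frame with the expected columns.
        return pd.DataFrame(columns=columns)
    return pd.concat(usable, axis=0, ignore_index=True)


# Hypothetical stand-ins for what sub_get_profile might return.
frames = [
    pd.DataFrame(
        [["MSFT", "Microsoft", "NASDAQ", "Technology", "Software"]],
        columns=PROFILE_COLUMNS,
    ),
    pd.DataFrame(columns=PROFILE_COLUMNS),                # empty frame
    pd.DataFrame([[None] * 5], columns=PROFILE_COLUMNS),  # one all-NA row
    pd.DataFrame(
        [["NVDA", "NVIDIA", "NASDAQ", "Technology", "Semiconductors"]],
        columns=PROFILE_COLUMNS,
    ),
]

pd_profiles = combine_profiles(frames)
```

In the question's loop, this would amount to appending each pd_temp to a list and calling combine_profiles on that list after the loop. Using ignore_index=True also makes the separate reset_index(drop=True) step unnecessary.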