Pandas concat gives "FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated"

2 votes
1 answer
174 views
Asked 2025-04-12 17:39

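I have a pandas DataFrame containing a list of stock symbols. I want to iterate over it and call a function that fetches more information about each symbol through an API. Because I can only process one symbol at a time, I need to end up with a single new pandas DataFrame holding all the profile information: the data needs cleaning, and I can only do that once it is all in one place.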

The problem is that sometimes, when concatenating the DataFrames, I get a FutureWarning:

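"FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation. pd_profiles = pd.concat([pd_profiles, pd_temp], axis=0)"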

I tried to understand the warning and added some code to work around it, but nothing has helped.

If anyone knows why I am getting this FutureWarning, or knows a better approach, please let me know.

import pandas as pd

def update_exchage_data(self):
    key = api_key
    # Here I create my list of equities that I get from an API, but in the example
    # I'm hard coding them in. There are also other columns than the symbol, but I
    # wanted to keep the code simple.
    equities = ["APPL", "MSFT", "NVDA", "GOOG"]
    pd_equities = pd.DataFrame(equities, columns=["symbol"])
    # Here I create a list of columns that make up the final equities profile
    # dataframe and create a blank dataframe that I will concatenate each returned
    # profile to.
    profile_columns = ["symbol", "name", "exchange", "sector", "industry"]
    pd_profiles = pd.DataFrame(columns=profile_columns)

    # This is an indicator to see if pd_profiles is empty in order to avoid
    # concatenating an empty dataframe
    first_record = 1
    for index, row in pd_equities.iterrows():
        pd_temp = self.sub_get_profile(key, row["symbol"], profile_columns)
        # This is to check and see if the returned dataframe is empty
        if not pd_temp.empty and pd_temp.notnull and len(pd_temp) >= 1:
            if first_record == 1:
                first_record = 0
                pd_profiles = pd_temp
            else:
                pd_profiles = pd.concat([pd_profiles, pd_temp], axis=0)
    pd_profiles = pd_profiles.reset_index(drop=True)


def sub_get_profile(self, key, symbol, df_cols):
    # I didn't add any of the API code as it works fine and would just be taking up
    # extra room. The point is that it returns the data in a json called
    # json_api_response

    # Here I create an empty dataframe so that even if the API call doesn't return
    # anything, my function still returns an empty dataframe.
    pd_data = pd.DataFrame(columns=df_cols)
    # Build the response frame once and use it only if it has rows
    pd_response = pd.DataFrame.from_dict(json_api_response, orient="columns")
    if not pd_response.empty:
        pd_data = pd_response
    return pd_data

Thanks in advance.

1 Answer

0

Try using

if not pd_temp.empty and pd_temp.notnull().any().any() and len(pd_temp) >= 1:

instead of

if not pd_temp.empty and pd_temp.notnull and len(pd_temp) >= 1:
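
The difference matters because `pd_temp.notnull` without parentheses is just a reference to the method, and a method object is always truthy, so that part of the condition never filters anything out. Calling it and reducing with `.any().any()` asks whether at least one cell actually holds a value. A minimal sketch of the difference follows; the sample frames and the helper name `has_data` are made up for illustration and are not part of the question's code.

```python
import pandas as pd

cols = ["symbol", "name"]

# Three hypothetical results a profile lookup might return
empty = pd.DataFrame(columns=cols)                          # no rows at all
all_na = pd.DataFrame({"symbol": [None], "name": [None]})   # one row, nothing in it
good = pd.DataFrame({"symbol": ["MSFT"], "name": ["Microsoft"]})

# Without parentheses this is a bound method object, which is truthy
# regardless of what the frame contains.
method_is_truthy = bool(all_na.notnull)


def has_data(df: pd.DataFrame) -> bool:
    """Return True if at least one cell in the frame is not NA."""
    # notna() gives a boolean frame; the first any() reduces each column,
    # the second any() reduces across columns to a single value.
    return bool(df.notna().any().any())


print(method_is_truthy)
print(has_data(empty), has_data(all_na), has_data(good))
```

With a check like `has_data(pd_temp)`, both the empty frame and the all-NA frame are skipped before they reach `pd.concat`.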

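
Another way to follow the warning's own advice ("exclude the relevant entries before the concat operation") is to collect the per-symbol frames in a list, drop empty frames and all-NA columns from each one, and concatenate once at the end. This is only a sketch under assumptions: `combine_profiles` is a hypothetical helper, the sample rows are invented, and it assumes every profile frame uses the `profile_columns` from the question.

```python
import pandas as pd

profile_columns = ["symbol", "name", "exchange", "sector", "industry"]


def combine_profiles(frames, columns):
    """Concatenate profile frames, skipping empty frames and all-NA columns."""
    usable = []
    for df in frames:
        # Drop columns that contain no values in this particular frame
        trimmed = df.dropna(axis=1, how="all")
        # Skip frames that have no rows or no columns left
        if not trimmed.empty:
            usable.append(trimmed)
    if not usable:
        return pd.DataFrame(columns=columns)
    # A single concat at the end also avoids re-copying the growing frame
    combined = pd.concat(usable, axis=0, ignore_index=True)
    # Restore the full column set; columns dropped above come back as NaN
    return combined.reindex(columns=columns)


# Invented sample data standing in for the API results
frames = [
    pd.DataFrame(columns=profile_columns),
    pd.DataFrame({"symbol": ["MSFT"], "name": ["Microsoft"], "exchange": ["NASDAQ"],
                  "sector": ["Technology"], "industry": ["Software"]}),
    pd.DataFrame({"symbol": ["NVDA"], "name": ["NVIDIA"], "exchange": ["NASDAQ"],
                  "sector": [None], "industry": [None]}),
]
pd_profiles = combine_profiles(frames, profile_columns)
print(pd_profiles)
```

In the question's loop this would mean appending each `pd_temp` to a list and calling the helper once after the loop, which also removes the need for the `first_record` flag.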