Removing empty DataFrames from pandas.core.groupby.generic.DataFrameGroupBy

Posted 2024-05-14 09:23:14


How do I remove empty DataFrames from a pandas.core.groupby.generic.DataFrameGroupBy?

My aggregation code:

cols = ["col1", "col2","col3","col4"]  
joined = pd.concat(df.reset_index() for df in collectData)
joined = joined.replace({np.nan:1, 0:1})
joined[cols] = joined[cols].mask(joined[cols] < 0, 1)

df = joined.set_index('sensor').groupby(pd.Grouper(freq='D'))

The data after grouping:

list(df)

[(Timestamp('2020-02-04 00:00:00+0000', tz='UTC', freq='D'),
                                 col1       col2      col3    col4  
  sensor                                                                   
  2020-02-04 00:00:00+00:00    2.586569   0.015321  0.000149    0.884470   
  2020-02-04 00:00:00+00:00    4.429571   4.049798  1.820845    2.882445   
  2020-02-04 00:00:00+00:00   12.883314   6.900607  1.002138    3.613021    
  ...                               ...        ...       ...         ...    
  2020-02-04 23:45:00+00:00    3.798017   1.605979  0.176515    2.400820   
  2020-02-04 23:45:00+00:00    5.546771   2.232437  0.233292    3.750547   
  2020-02-04 23:45:00+00:00    4.910360   3.730932  0.985459    1.238469       
  
  [48945 rows x 4 columns]),
 (Timestamp('2020-02-05 00:00:00+0000', tz='UTC', freq='D'),
  Empty DataFrame
  Columns: [col1, col2, col3, col4]
  Index: []),
 (Timestamp('2020-02-06 00:00:00+0000', tz='UTC', freq='D'),
  Empty DataFrame
  Columns: [col1, col2, col3, col4]
  Index: []),
 (Timestamp('2020-02-07 00:00:00+0000', tz='UTC', freq='D'),
                                 col1       col2      col3    col4  
  sensor                                                                   
  2020-02-07 00:00:00+00:00   17.065174   3.065422  0.171053    9.048574   
  2020-02-07 00:00:00+00:00   30.181997  20.651204  4.413567   15.200674   
  2020-02-07 00:00:00+00:00    1.864378   1.726365  0.819459    1.441588   
  ...                               ...        ...       ...         ...   
  2020-02-07 23:45:00+00:00   39.644320   0.234830  0.002289   13.642480   
  2020-02-07 23:45:00+00:00   30.778517  10.540318  0.944788   13.165241   
  2020-02-07 23:45:00+00:00   34.610439  25.342142  6.184292   22.725937      
  
  [50112 rows x 4 columns]),]

Output of df.size():

sensor
2020-02-02 00:00:00+00:00    47574
2020-02-03 00:00:00+00:00    49353
2020-02-04 00:00:00+00:00    48945
2020-02-05 00:00:00+00:00        0
2020-02-06 00:00:00+00:00        0
                             ...  
2020-09-26 00:00:00+00:00    83680
2020-09-27 00:00:00+00:00    84293
2020-09-28 00:00:00+00:00    84873
2020-09-29 00:00:00+00:00    84306
2020-09-30 00:00:00+00:00    84875
Freq: D, Length: 242, dtype: int64

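I need to remove the empty DataFrames before applying std = df.apply(gstd). I do not know in advance where the empty DataFrames are.

The approaches in https://stackoverflow.com/a/51052536/14338086 and https://stackoverflow.com/a/16916611/14338086 both return errors. Likewise, df.filter(lambda x: x.size() != 0) returns TypeError: 'numpy.int64' object is not callable, and dropna() is not available on the grouped object.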
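The TypeError above arises because DataFrame.size is an attribute holding an integer, not a method, so calling x.size() fails. As a minimal sketch of one way to skip the empty groups, the example below iterates over the grouped object and keeps only groups that contain rows. The sample data and the names frame, grouped, and non_empty are hypothetical and only stand in for the real sensor data.

```python
import pandas as pd

# Hypothetical sample data: readings on two days with a two-day gap between them.
idx = pd.DatetimeIndex(
    ["2020-02-04 00:00", "2020-02-04 23:45", "2020-02-07 00:00", "2020-02-07 23:45"],
    tz="UTC",
    name="sensor",
)
frame = pd.DataFrame(
    {"col1": [1.0, 2.0, 3.0, 4.0], "col2": [5.0, 6.0, 7.0, 8.0]},
    index=idx,
)

# Daily bins, as in the question; the days in the gap become empty groups.
grouped = frame.groupby(pd.Grouper(freq="D"))

# DataFrame.size is an integer attribute, so x.size() is not callable.
# DataFrame.empty (or len(x) == 0) tests for a group without rows instead.
non_empty = {day: group for day, group in grouped if not group.empty}
```

Each value in non_empty is an ordinary DataFrame, so a function such as gstd could then be applied per day without ever seeing an empty frame.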


1 answer
User
#1 · Posted 2024-05-14 09:23:14

I solved the problem with the code below; maybe it will help someone.

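import numpy as np
import pandas as pd
# gmean and gstd are not defined in the post; presumably scipy.stats.gmean and scipy.stats.gstd

cols = ["col1", "col2", "col3", "col4"]

joined = pd.concat(df.reset_index() for df in collectData)
# Replace NaN, zero, and negative values with 1
joined = joined.replace({np.nan: 1, 0: 1})
joined[cols] = joined[cols].mask(joined[cols] < 0, 1)

df = joined.set_index('sensor').groupby(pd.Grouper(freq='D'))
# Stitch the daily groups back together; empty groups contribute no rows
dff = pd.concat(map(lambda x: x[1], df))
# Grouping on the floored timestamps only yields days that actually contain data
means = dff.groupby(dff.index.floor('D')).agg(gmean)
std = dff.groupby(dff.index.floor('D')).agg(gstd)

df_result = pd.merge(left=means, right=std, how='left', on='sensor')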
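The step that avoids the empty groups appears to be grouping on the floored index rather than on pd.Grouper(freq='D'): keys are taken from the timestamps that exist, so days without data never become groups. The sketch below illustrates this on hypothetical sample data. It uses the built-in "mean" and "std" aggregations as stand-ins for gmean and gstd, and joins with suffixes instead of merging, which are assumptions made to keep the example self-contained.

```python
import pandas as pd

# Hypothetical sample data: two days of readings separated by a two-day gap.
idx = pd.DatetimeIndex(
    ["2020-02-04 00:00", "2020-02-04 23:45", "2020-02-07 00:00", "2020-02-07 23:45"],
    tz="UTC",
    name="sensor",
)
frame = pd.DataFrame({"col1": [1.0, 3.0, 5.0, 9.0]}, index=idx)

# Floor each timestamp to its day and group on the result.
# Only days that occur in the index become groups, so none are empty.
by_day = frame.groupby(frame.index.floor("D"))

# Stand-ins for gmean / gstd from the answer (assumption: any reducer works the same way).
means = by_day.agg("mean")
stds = by_day.agg("std")

# Combine both statistics side by side, one row per day that has data.
result = means.join(stds, lsuffix="_mean", rsuffix="_std")
```

Here result has one row for 2020-02-04 and one for 2020-02-07, with nothing for the two days in between.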
