Select the column values of a DataFrame that fall within a range and put them in the corresponding column of another DataFrame

Posted 2024-06-02 04:58:26


I have a CSV file that looks like this:

date,mean,min,max,std
2018-03-15,3.9999999999999964,inf,0.0,100.0
2018-03-16,0.46403712296984756,90.0,0.0,inf
2018-03-17,2.32452732452731,,0.0,143.2191767899579
2018-03-18,2.8571428571428523,inf,0.0,100.0
2018-03-20,0.6928406466512793,100.0,0.0,inf
2018-03-22,2.8675703858185635,,0.0,119.05383697172658

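I want to select the column values that are greater than 20 and less than 500 (i.e. in the range 20 to 500) and put those values, together with their date, into the columns of another DataFrame: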

Date        percentage_change  location
2018-02-14  23.44              BOM

So I want to take the date and the value from the CSV and add them to the appropriate columns of that DataFrame:

Date        percentage_change   location
2018-02-14  23.44               BOM
2018-03-15  100.0               NaN
2018-03-16  90.0                NaN
2018-03-17  143.2191767899579   NaN
....        ....                ....

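I know about functions such as df.max(axis=1) and df.min(axis=1), which give the row-wise maximum and minimum, but I am not sure whether values can be looked up by range. How can I do this?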


1 answer
User
Answer #1 · posted 2024-06-02 04:58:26

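Given DataFrames df1 and df2, you can do this by aligning the column names, cleaning the numeric data, and then appending the result with pd.DataFrame.append (on pandas 2.0 and later, where append was removed, use pd.concat instead):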
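The answer's code assumes df1 and df2 already exist. A minimal sketch of one way to build them from the sample data in the question is shown here; reading the CSV text through io.StringIO and the variable name csv_text are assumptions made for illustration, not part of the original answer.

```python
import io

import pandas as pd

# Sample CSV text copied from the question.
csv_text = """date,mean,min,max,std
2018-03-15,3.9999999999999964,inf,0.0,100.0
2018-03-16,0.46403712296984756,90.0,0.0,inf
2018-03-17,2.32452732452731,,0.0,143.2191767899579
2018-03-18,2.8571428571428523,inf,0.0,100.0
2018-03-20,0.6928406466512793,100.0,0.0,inf
2018-03-22,2.8675703858185635,,0.0,119.05383697172658
"""

# df1: the CSV data. read_csv parses "inf" as float infinity
# and empty fields as NaN.
df1 = pd.read_csv(io.StringIO(csv_text))

# df2: the existing target frame shown in the question.
df2 = pd.DataFrame({
    'Date': ['2018-02-14'],
    'percentage_change': [23.44],
    'location': ['BOM'],
})
```

With a real file, the read_csv call would take the file path instead of the StringIO object.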

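import numpy as np
import pandas as pd

# df1 holds the CSV data; df2 is the existing frame
# with the columns Date, percentage_change and location.
# Keep the relevant columns, align the date column name with df2,
# and replace inf and NaN with 0 so they never win the maximum below.
df_app = df1.loc[:, ['date', 'mean', 'min', 'std']]\
            .rename(columns={'date': 'Date'})\
            .replace(np.inf, 0)\
            .fillna(0)

print(df_app)

# Take the larger of min and std as the value for each date.
df_app['percentage_change'] = np.maximum(df_app['min'], df_app['std'])

print(df_app)

# Keep rows whose value lies between 20 and 500
# (between() includes both endpoints by default).
df_app = df_app[df_app['percentage_change'].between(20, 500)]

# DataFrame.append was removed in pandas 2.0; pd.concat replaces it.
res = pd.concat([df2, df_app.loc[:, ['Date', 'percentage_change']]])

print(res)

# Output as given in the original answer
# (column order may differ between pandas versions):
#          Date location  percentage_change
# 0  2018-02-14      BOM          23.440000
# 0  2018-03-15      NaN         100.000000
# 1  2018-03-16      NaN          90.000000
# 2  2018-03-17      NaN         143.219177
# 3  2018-03-18      NaN         100.000000
# 4  2018-03-20      NaN         100.000000
# 5  2018-03-22      NaN         119.053837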
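The code above only compares min and std, and between() includes the endpoints 20 and 500. A possible variant, sketched here as an alternative rather than as the original answer's method, checks every numeric column against the strict range from the question (greater than 20 and less than 500). The names csv_text, long_form, in_range and new_rows are chosen for illustration only.

```python
import io

import pandas as pd

# Sample CSV text copied from the question.
csv_text = """date,mean,min,max,std
2018-03-15,3.9999999999999964,inf,0.0,100.0
2018-03-16,0.46403712296984756,90.0,0.0,inf
2018-03-17,2.32452732452731,,0.0,143.2191767899579
2018-03-18,2.8571428571428523,inf,0.0,100.0
2018-03-20,0.6928406466512793,100.0,0.0,inf
2018-03-22,2.8675703858185635,,0.0,119.05383697172658
"""
df1 = pd.read_csv(io.StringIO(csv_text))
df2 = pd.DataFrame({
    'Date': ['2018-02-14'],
    'percentage_change': [23.44],
    'location': ['BOM'],
})

# Reshape to long form: one row per (date, statistic) pair.
long_form = df1.melt(id_vars='date', var_name='stat',
                     value_name='percentage_change')

# Strict range check. Comparisons with NaN are False and inf is not
# below 500, so neither needs to be replaced beforehand.
value = long_form['percentage_change']
in_range = long_form[(value > 20) & (value < 500)]

# Align the column name with df2 and keep only the columns df2 needs.
new_rows = (in_range.rename(columns={'date': 'Date'})
                    .sort_values('Date')[['Date', 'percentage_change']])

# location is absent from new_rows, so it is filled with NaN.
res = pd.concat([df2, new_rows], ignore_index=True)
print(res)
```

If a date had more than one value in range, this sketch would emit one row per value; the sample data has exactly one per date.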
