DataFrame: using the previous non-NaN value of the same column in a mask

Posted 2024-04-19 03:26:11


I have the following DataFrame:

Trajectory Direction Resulting_Direction
STRAIGHT   NORTH     NORTH
STRAIGHT   NaN       NORTH
LEFT       NaN       WEST
LEFT       NaN       WEST
LEFT       NaN       WEST
STRAIGHT   NaN       WEST
STRAIGHT   NaN       WEST
RIGHT      NaN       NORTH
RIGHT      NaN       NORTH
RIGHT      NaN       NORTH
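
For anyone who wants to reproduce this, here is a minimal sketch that builds the frame above. The variable names `df` and `expected` are only illustrative, and `Resulting_Direction` is kept out of the frame because it is the column to be computed.

```python
import numpy as np
import pandas as pd

# Input columns from the table above; only the first Direction is known.
df = pd.DataFrame({
    "Trajectory": ["STRAIGHT", "STRAIGHT", "LEFT", "LEFT", "LEFT",
                   "STRAIGHT", "STRAIGHT", "RIGHT", "RIGHT", "RIGHT"],
    "Direction": ["NORTH"] + [np.nan] * 9,
})

# Desired Resulting_Direction, kept separately so a solution can be checked.
expected = ["NORTH", "NORTH", "WEST", "WEST", "WEST",
            "WEST", "WEST", "NORTH", "NORTH", "NORTH"]
```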

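My goal is to change the direction when the same trajectory occurs three times in a row. In this example, my new column would be Resulting_Direction (assume it is not in the df initially).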

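Currently I do this with row-by-row if statements, but that is painfully slow and inefficient. I would like to use a mask to set the resulting direction at the rows where it changes, and then call fillna(method="ffill"). Here is my attempt: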

df.loc[:,'direction'] = np.NaN
df.loc[df.index == 0, "direction"] = "WEST"
# mask finds the first row of a new trajectory that then stays the same for three rows
mask = (df.trajectory != df.trajectory.shift(1)) & (df.trajectory == df.trajectory.shift(-1)) & (df.trajectory == df.trajectory.shift(-2))
df.loc[(mask) & (df['trajectory'] == 'LEFT') & (df['direction'].dropna().shift() == "WEST"),'direction'] = 'SOUTH'
df.loc[(mask) & (df['trajectory'] == 'LEFT') & (df['direction'].dropna().shift() == "SOUTH"),'direction'] = 'EAST'
df.loc[(mask) & (df['trajectory'] == 'LEFT') & (df['direction'].dropna().shift() == "EAST"),'direction'] = 'NORTH'
df.loc[(mask) & (df['trajectory'] == 'LEFT') & (df['direction'].dropna().shift() == "NORTH"),'direction'] = 'WEST'
df.loc[(mask) & (df['trajectory'] == 'RIGHT') & (df['direction'].dropna().shift() == "WEST"),'direction'] = 'NORTH'
df.loc[(mask) & (df['trajectory'] == 'RIGHT') & (df['direction'].dropna().shift() == "SOUTH"),'direction'] = 'WEST'
df.loc[(mask) & (df['trajectory'] == 'RIGHT') & (df['direction'].dropna().shift() == "EAST"),'direction'] = 'SOUTH'
df.loc[(mask) & (df['trajectory'] == 'RIGHT') & (df['direction'].dropna().shift() == "NORTH"),'direction'] = 'EAST'
df.loc[:,'direction'] = df.direction.fillna(method="ffill")
print(df[['trajectory','direction']])

I believe my problem is in df['direction'].dropna().shift(). How can I find the previous non-NaN value in the same column?
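
On the narrower point of reading the previous non-NaN value, one possible sketch is to forward-fill first and shift afterwards. The series `s` below is made-up sample data, not the frame from the question. It also shows why `dropna().shift()` does not line up with the original rows.

```python
import numpy as np
import pandas as pd

s = pd.Series(["WEST", np.nan, np.nan, "SOUTH", np.nan], name="direction")

# dropna() removes rows, so shift() only moves values between the surviving
# rows, and the result has no entry at all for the dropped rows.
dropped = s.dropna().shift()

# ffill() keeps every row, so after shift() each row holds the most recent
# non-NaN value found strictly before it.
prev_valid = s.ffill().shift()
```

Note that even with `ffill().shift()`, each `df.loc[...] = ...` statement evaluates its mask once before assigning, so a row cannot see a direction that the same statement sets on an earlier row. Carrying a running state, for example a cumulative sum of turns, is one way around that.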


1 answer
User
#1 · Posted 2024-04-19 03:26:11

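IIUC, the problem is to detect where the direction changes, assuming the change happens at the start of 3 consecutive identical turn commands: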

thresh = 3
# label each run of consecutive identical trajectory commands
blocks = df.Trajectory.ne(df.Trajectory.shift()).cumsum()


# group by blocks
groups = df.groupby(blocks)

# enumerate each block
df['mask'] = groups.cumcount()

# shift each block up by thresh - 1 rows so the marker lands on the first row of a full run,
# then take mod thresh to split longer blocks into sub-blocks of length thresh
df['mask'] = groups['mask'].shift(1-thresh) % thresh

# convert each turn command to a step around the compass:
changes = {'LEFT': -1,'RIGHT':1}

# all the directions, in clockwise order
directions = ['NORTH', 'EAST', 'SOUTH', 'WEST']

# rotate the lookup so that offset 0 maps to the start direction
# (enumerating from start_idx would give the wrong mapping for EAST or WEST)
start = df['Direction'].iloc[0]
start_idx = directions.index(start)
directions = {k: directions[(start_idx + k) % 4] for k in range(4)}


# update direction changes
direction_changes = (df.Trajectory
                     .where(df['mask'].eq(thresh - 1))   # keep only rows where a change happens
                     .map(changes)                       # replace each turn command with its step
                     .fillna(0)                          # rows without a direction change get 0
                    )
# take the running total mod 4 (for the 4 directions)
# and map it back to a direction name
df['Resulting_Direction'] = (direction_changes.cumsum() % 4).map(directions)

Output:

  Trajectory Direction Resulting_Direction  mask
0   STRAIGHT     NORTH               NORTH   NaN
1   STRAIGHT       NaN               NORTH   NaN
2       LEFT       NaN                WEST   2.0
3       LEFT       NaN                WEST   NaN
4       LEFT       NaN                WEST   NaN
5   STRAIGHT       NaN                WEST   NaN
6   STRAIGHT       NaN                WEST   NaN
7      RIGHT       NaN               NORTH   2.0
8      RIGHT       NaN               NORTH   NaN
9      RIGHT       NaN               NORTH   NaN
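
The steps above can also be wrapped into a single function. The following is only a sketch under a few assumptions: the names `resulting_direction`, `COMPASS` and `TURNS` are hypothetical, the starting heading is passed in explicitly, and the turn-start test is written with positions inside each run rather than a shifted `mask` column, which should mark the same rows.

```python
import pandas as pd

COMPASS = ["NORTH", "EAST", "SOUTH", "WEST"]   # clockwise order
TURNS = {"LEFT": -1, "RIGHT": 1}               # step around COMPASS per turn


def resulting_direction(trajectory, start, thresh=3):
    """Hypothetical helper: heading on each row, given the starting heading."""
    # Label runs of identical consecutive commands.
    blocks = trajectory.ne(trajectory.shift()).cumsum()
    # Position of each row inside its run: 0, 1, 2, ...
    pos = trajectory.groupby(blocks).cumcount()
    # Number of rows left in the run after the current one.
    remaining = pos.groupby(blocks).transform("max") - pos
    # A turn starts on every thresh-th row that still has thresh - 1 rows after it.
    turn_starts = (pos % thresh == 0) & (remaining >= thresh - 1)
    # Signed step at each turn start, 0 elsewhere (STRAIGHT maps to NaN, then 0).
    steps = trajectory.where(turn_starts).map(TURNS).fillna(0).astype(int)
    # Running heading index, offset by the start and wrapped to 4 compass points.
    idx = (steps.cumsum() + COMPASS.index(start)) % 4
    return idx.map(lambda i: COMPASS[i])


trajectory = pd.Series(["STRAIGHT", "STRAIGHT", "LEFT", "LEFT", "LEFT",
                        "STRAIGHT", "STRAIGHT", "RIGHT", "RIGHT", "RIGHT"])
result = resulting_direction(trajectory, start="NORTH")
```

Passing the start heading as an argument makes it easy to check other cases, for instance `start="EAST"`, where a left turn should lead to NORTH.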
