Creating difference columns from one-hot encoded columns

   F1  F2  F3  F4  S1  S2  S3  S4
1   0   1   1   0   0   2   1   0
2   1   0   0   1   1   0   0   3
3   1   0   0   0   1   0   0   0
4   0   0   0   1   0   0   0   4

1 answer

User

#1 · Posted 2024-04-20 14:16:34

You can do this:

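import numpy as np
import pandas as pd

# input frame (F1..F4 values as shown in the output below)
df = pd.DataFrame({'F1': [0, 1, 1, 0], 'F2': [1, 0, 0, 0],
                   'F3': [1, 0, 0, 0], 'F4': [0, 1, 0, 1]})


def func(x):
    # result array, same length as the row, filled with zeros
    # (np.int was removed in NumPy 1.24; the builtin int is equivalent)
    result = np.zeros(x.shape, dtype=int)

    # positions of the non-zero entries
    w = np.argwhere(x).ravel()

    # a row without any non-zero entry stays all zeros
    if w.size == 0:
        return result

    # differences between consecutive positions, preceded by the first position + 1
    array = np.hstack(([w[0] + 1], np.ediff1d(w)))

    # write the values at the non-zero positions
    np.put(result, w, array)

    return result


columns = ['S{}'.format(i) for i in range(1, 5)]
# reuse df's index so that concat aligns the rows correctly
s = pd.DataFrame(df.ne(0).apply(func, axis=1).values.tolist(),
                 columns=columns, index=df.index)

result = pd.concat([df, s], axis=1)
print(result)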

Output:

   F1  F2  F3  F4  S1  S2  S3  S4
0   0   1   1   0   0   2   1   0
1   1   0   0   1   1   0   0   3
2   1   0   0   0   1   0   0   0
3   0   0   0   1   0   0   0   4

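Note that func needs numpy to be imported (import numpy as np). The idea is to find the positions of the non-zero entries in a row, compute the differences between consecutive positions, set the first value to its position + 1 (so that it counts from the start of the row), and repeat this for every row.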
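
To make the steps concrete, here is a minimal sketch that traces them by hand on a single row (F1..F4 = 1, 0, 0, 1, the second row of the example). The variable names gaps and values are only illustrative and do not appear in the answer's code.

```python
import numpy as np

# One row from the example: F1..F4 = 1, 0, 0, 1
row = np.array([1, 0, 0, 1])

# Step 1: 0-based positions of the non-zero entries
w = np.argwhere(row).ravel()

# Step 2: gaps between consecutive non-zero positions
gaps = np.ediff1d(w)

# Step 3: the first non-zero entry gets its 1-based position
values = np.hstack(([w[0] + 1], gaps))

# Step 4: write the values back at the non-zero positions
result = np.zeros(row.shape, dtype=int)
np.put(result, w, values)

print(result.tolist())  # [1, 0, 0, 3]
```

The non-zero entries sit at positions 0 and 3, so the first one becomes 0 + 1 = 1 and the second becomes 3 - 0 = 3, which matches S1..S4 for that row in the output above.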
