Python: add a new column and fill it with values based on a condition on another column

2 answers

User

#1 · edited 2024-04-25 00:08:21

I found three ways to do this: np.where, .loc and .apply (plus the suggestion from @OO7here).

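import numpy as np
import pandas as pd

def using_where(df):
    # Nested np.where: 'Out' above 5, 'In' below 5, otherwise 5.
    # Because the other branches are strings, NumPy stores the fallback as the string '5'.
    df['col3'] = np.where(df['col2'] > 5, 'Out', np.where(df['col2'] < 5, 'In', 5))
    return df

def using_apply(df):
    # Evaluate the condition row by row with a Python-level lambda.
    df['col3'] = df['col2'].apply(lambda x: 5 if x == 5 else ('In' if x < 5 else 'Out'))
    return df

def using_loc(df):
    # Start from the default value, then overwrite the rows that match each mask.
    # Note: recent pandas versions warn about, and may reject, writing strings into an integer column.
    df['col3'] = 5
    df.loc[df['col2'] > 5, 'col3'] = 'Out'
    df.loc[df['col2'] < 5, 'col3'] = 'In'
    return df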
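
A quick sanity check that the three approaches label rows the same way can be sketched as below. The function names (label_where, label_apply, label_loc) and the sample values are illustrative assumptions, and the fallback is written as the string '5' so that every branch has the same type.

```python
import numpy as np
import pandas as pd


def label_where(col):
    # Nested np.where: the outer condition is tested first.
    return np.where(col > 5, 'Out', np.where(col < 5, 'In', '5'))


def label_apply(col):
    # Row-by-row evaluation with a Python lambda.
    return col.apply(lambda x: '5' if x == 5 else ('In' if x < 5 else 'Out'))


def label_loc(col):
    # Start from the fallback label, then overwrite rows matching each mask.
    out = pd.Series('5', index=col.index, dtype=object)
    out.loc[col > 5] = 'Out'
    out.loc[col < 5] = 'In'
    return out


df = pd.DataFrame({'col2': [3, 4, 5, 6, 9]})
expected = ['In', 'In', '5', 'Out', 'Out']

results = [
    label_where(df['col2']).tolist(),
    label_apply(df['col2']).tolist(),
    label_loc(df['col2']).tolist(),
]
all_agree = all(r == expected for r in results)
print(all_agree)
```

If the three disagree on your own data, it is worth checking how each one treats the boundary value and missing values before comparing speed.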

I profiled them, and their relative performance seems to depend on the size of the DataFrame:

size = 10**4
df = pd.DataFrame({'col1': np.random.randint(0, 10, size), 'col2': np.random.randint(0, 10, size)})
%timeit using_where(df)
%timeit using_apply(df)
%timeit using_loc(df)

Output with size = 10**4:

1000 loops, best of 3: 1.97 ms per loop
100 loops, best of 3: 2.11 ms per loop
100 loops, best of 3: 4.14 ms per loop

Output with size = 10**5:

100 loops, best of 3: 18.6 ms per loop
100 loops, best of 3: 17.5 ms per loop
100 loops, best of 3: 11.9 ms per loop

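In short, I'd suggest running this benchmark yourself and picking the fastest method for your application: in the runs above, np.where and apply were ahead at 10**4 rows, while .loc was fastest at 10**5. Hope this helps.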
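
Since %timeit only works inside IPython or Jupyter, here is a rough sketch of the same kind of measurement with the standard-library timeit module, for running as a plain script. The helper best_ms, the sizes, and the repeat counts are assumptions chosen to keep the run short, and only two of the methods are included, with the fallback written as the string '5'.

```python
import timeit

import numpy as np
import pandas as pd


def using_where(df):
    # The fallback is the string '5' so every branch has the same type.
    df['col3'] = np.where(df['col2'] > 5, 'Out',
                          np.where(df['col2'] < 5, 'In', '5'))
    return df


def using_apply(df):
    df['col3'] = df['col2'].apply(
        lambda x: '5' if x == 5 else ('In' if x < 5 else 'Out'))
    return df


def best_ms(func, size, repeat=3, number=5):
    """Best average time per call in milliseconds (hypothetical helper)."""
    rng = np.random.default_rng(0)  # fixed seed so each call sees the same data
    df = pd.DataFrame({'col2': rng.integers(0, 10, size)})
    # timeit.repeat returns the total seconds for `number` calls, once per repeat.
    totals = timeit.repeat(lambda: func(df), repeat=repeat, number=number)
    return min(totals) / number * 1000


timings = {
    (func.__name__, size): best_ms(func, size)
    for size in (10**3, 10**4)
    for func in (using_where, using_apply)
}
for (name, size), ms in timings.items():
    print(f"{name:12s} size={size:<6d} {ms:.3f} ms per call")
```

The numbers will differ between machines and library versions, so treat them as relative rather than absolute.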

User

#2 · edited 2024-04-25 00:08:21

Try this:

import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 10, 9], 'col2': [3, 4, 5, 6]})
# '5' when col2 equals 5, 'In' below 5, 'Out' above 5
df['col3'] = df['col2'].apply(lambda x: '5' if x == 5 else ('In' if x < 5 else 'Out'))
df
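
For comparison, the same labelling can probably be written without a Python-level lambda by using np.select. This is not one of the approaches from the answers, just a sketch on the same sample frame; the names conditions and choices are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 10, 9], 'col2': [3, 4, 5, 6]})

# Conditions are checked in order; rows matching none of them get `default`.
conditions = [df['col2'] < 5, df['col2'] > 5]
choices = ['In', 'Out']
df['col3'] = np.select(conditions, choices, default='5')

print(df['col3'].tolist())  # ['In', 'In', '5', 'Out']
```

Keeping every label a string, including the default, avoids mixing types in the new column.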
