Replacing certain values in a column with a string in Python

User

Reply #1 · edited 2024-05-14 07:28:00

Try this (and then group by Country):

import numpy as np

df["Country"]=np.where(df["Country"].eq("Mainland China"), "Mainland China", "Other")
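As a minimal sketch of the full idea (relabel, then group), the snippet below applies the same np.where call to a small made-up frame and sums per label. The sample rows and the "Confirmed" column are assumptions for illustration only; they do not come from the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the "Confirmed" column is an assumption for illustration.
df = pd.DataFrame({
    "Country": ["Mainland China", "Italy", "Mainland China", "Spain", "Italy"],
    "Confirmed": [100, 20, 50, 10, 5],
})

# Keep "Mainland China" as-is and collapse every other value into one label.
df["Country"] = np.where(df["Country"].eq("Mainland China"), "Mainland China", "Other")

# Aggregate per label after the relabeling.
totals = df.groupby("Country")["Confirmed"].sum()
print(totals)
```

After the relabeling, the group-by sees only two keys, so the aggregation yields one row for "Mainland China" and one for "Other".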

Edit

Timing with timeit (note that I did not include the .loc[] approach, because a lambda doesn't support assignment; feel free to suggest a way to add it):

import pandas as pd
import numpy as np
from timeit import Timer

# Build a DataFrame with roughly the same proportions as in the OP's question

df = pd.DataFrame({"Country": ["Mainland China"] * 398 + ["a", "b", "c"] * 124})

df["otherCol"] = 2
df["otherCol2"] = 3

# Shuffle the rows

df2 = df.copy().sample(frac=1)
df3 = df2.copy()
df4 = df3.copy()

op2 = Timer(lambda: np.where(df2["Country"].eq("Mainland China"), "Mainland China", "Other"))
op3 = Timer(lambda: df3.Country.map(lambda x: x if x == "Mainland China" else "Others"))
op4 = Timer(lambda: df4["Country"].apply(lambda x: x if x == "Mainland China" else "Others"))

print(op2.timeit(number=1000))
print(op3.timeit(number=1000))
print(op4.timeit(number=1000))

Output:

2.1856687490362674 #numpy
2.2388894270407036 #map
2.4437739049317315 #apply
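One possible way to time the .loc approach, which the note above leaves open, is to wrap the assignment in an ordinary function and hand that function to Timer. This is only a sketch and not part of the original benchmark: the helper name replace_with_loc is hypothetical and the random_state seed is arbitrary. Each run copies the frame first so that every iteration starts from the unmodified values, which means the measured time includes the cost of the copy and is not directly comparable with the figures above.

```python
from timeit import Timer

import pandas as pd

# Same proportions as the benchmark in this reply (398 matching rows, 372 others).
df = pd.DataFrame({"Country": ["Mainland China"] * 398 + ["a", "b", "c"] * 124})
df = df.sample(frac=1, random_state=0)  # shuffle; the seed is an arbitrary choice


def replace_with_loc(frame):
    """Return a copy of frame with every non-matching country set to 'Others'."""
    out = frame.copy()  # copy so each timed run starts from the original values
    out.loc[out["Country"] != "Mainland China", "Country"] = "Others"
    return out


op_loc = Timer(lambda: replace_with_loc(df))
elapsed = op_loc.timeit(number=1000)
print(elapsed)  # seconds for 1000 runs, copy overhead included
```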

User

Reply #2 · edited 2024-05-14 07:28:00

I think the fastest way to change the values is to use .loc rather than apply, since .loc is optimized in pandas:

df.loc[df.Country != 'Mainland China', 'Country'] = 'Others'
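To make the mechanics of that one-liner visible, here is a small sketch that builds the boolean mask separately before assigning through .loc. The sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"Country": ["Mainland China", "Italy", "Spain", "Mainland China"]})

# Boolean mask: True for every row that should be relabeled.
mask = df["Country"] != "Mainland China"
print(mask.tolist())  # [False, True, True, False]

# .loc[row_selector, column_label] assigns in place to the selected cells only.
df.loc[mask, "Country"] = "Others"
print(df["Country"].tolist())  # ['Mainland China', 'Others', 'Others', 'Mainland China']
```

Note that the assignment modifies df itself, so work on a copy if the original values are still needed.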

User

Reply #3 · edited 2024-05-14 07:28:00

Try using apply:

dataframe["Country"] = dataframe["Country"].apply(lambda x: x if x == "Mainland China" else "Others")
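As a quick sanity check, the sketch below runs the apply version on a small made-up frame and compares it with a vectorized np.where call that uses the same "Others" label. The sample rows are assumptions for illustration; the point is only that both produce the same labels, with apply calling the lambda once per value.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the variable name follows the reply above.
dataframe = pd.DataFrame({"Country": ["Mainland China", "Italy", "Spain", "Mainland China"]})

# Element-wise: the lambda runs once per value in the column.
via_apply = dataframe["Country"].apply(lambda x: x if x == "Mainland China" else "Others")

# Vectorized: one comparison over the whole column.
via_where = np.where(dataframe["Country"].eq("Mainland China"), "Mainland China", "Others")

print(via_apply.tolist())
print(via_apply.tolist() == via_where.tolist())
```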
