import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0, 10, (2000, 2)), columns=['A', 'B'])
frequencies = df['A'].value_counts()
condition = frequencies < 200  # define the threshold however you want
mask_obs = frequencies[condition].index
mask_dict = dict.fromkeys(mask_obs, 'miscellaneous')
df['A'] = df['A'].replace(mask_dict)  # or assign to a copy to leave the original data unmodified
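You can extract the values to mask from the index of value_counts and map them to "miscellaneous" with replace, as in the code above. Calling value_counts again then groups every value below the threshold under miscellaneous.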
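To make the effect of the masking step concrete, here is a minimal sketch that repeats the approach on a copy and checks the resulting counts. The fixed seed, the `threshold` name, and the `masked` variable are assumptions introduced for illustration; they are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Fixed seed (an assumption) so the example is reproducible
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, (2000, 2)), columns=['A', 'B'])

threshold = 200  # hypothetical cutoff, same value as in the code above
frequencies = df['A'].value_counts()
rare_values = frequencies[frequencies < threshold].index

# Map every rare value to one label; the result is a new Series,
# so the original column is left untouched
masked = df['A'].replace(dict.fromkeys(rare_values, 'miscellaneous'))

# Rare values now appear as a single 'miscellaneous' row
new_counts = masked.value_counts()
print(new_counts)
```

Because the result is assigned to `masked` rather than back to `df['A']`, the original column can still be compared against the grouped one.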
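^{pr2}$
I think you need:
^{pr2}$
If you need to sum all values below threshold:
But if you need to rename the index values that fall below the threshold:
If you need to replace the original column with ^{}, use ^{}: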
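The three variants described above (summing the rare counts, renaming the rare index entries, and replacing the original column) can be sketched as follows. This is a reconstruction under assumptions, not the answer's original code: the `threshold` value, the `A_grouped` column name, and the choice of `groupby(...).transform('size')` with `Series.where` are all illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, an assumption
df = pd.DataFrame(rng.integers(0, 10, (2000, 2)), columns=['A', 'B'])
threshold = 200  # hypothetical cutoff

counts = df['A'].value_counts()
below = counts < threshold  # boolean Series indexed by the values of A

# 1) Sum the counts of every value that falls below the threshold
rare_total = counts[below].sum()

# 2) Rename the index entries below the threshold, then merge the
#    duplicated 'miscellaneous' labels into a single row
renamed = counts.rename(index=lambda v: 'miscellaneous' if below[v] else v)
merged = renamed.groupby(level=0, sort=False).sum()

# 3) Replace values row by row: attach each row's group size,
#    then keep only the values whose group is large enough
sizes = df.groupby('A')['A'].transform('size')
df['A_grouped'] = df['A'].astype(object).where(sizes >= threshold, 'miscellaneous')
```

Writing the result to a separate `A_grouped` column keeps the original `A` available; assign to `df['A']` instead if the replacement should happen in place.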
Another solution:
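One possible variant, which may differ from what this answer originally showed, looks up each row's frequency with `Series.map` and keeps only the frequent values with `Series.where`. The `threshold`, `row_counts`, and `grouped` names are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, an assumption
df = pd.DataFrame(rng.integers(0, 10, (2000, 2)), columns=['A', 'B'])
threshold = 200  # hypothetical cutoff

# Map every value in A to how often that value occurs in the column
row_counts = df['A'].map(df['A'].value_counts())

# Keep frequent values as they are; everything else becomes 'miscellaneous'
grouped = df['A'].astype(object).where(row_counts >= threshold, 'miscellaneous')
```

This avoids building an intermediate dictionary, at the cost of computing a per-row count Series.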