Sample once every n values, or the closest match to every n-th value

2 answers

User

Answer 1 · edited 2024-04-25 22:56:02

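You can build the bins dynamically with a list comprehension, use pd.cut to assign each row to a group, and then use groupby with sample(1) to pull one record for every 32 values of "X".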

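import numpy as np
import pandas as pd

df = pd.DataFrame({'X': np.random.randint(0, 100, 5000),
                   'Y': np.random.choice(list('ABCDEF'), 5000)})

# Bin edges every 32 units starting at the minimum of X; the last bin is open-ended.
bins = [i for i in np.arange(df.X.min(), df.X.max(), 32)] + [np.inf]

# right=False makes the bins left-closed, so rows equal to df.X.min() are not dropped.
df.groupby(pd.cut(df.X, bins=bins, right=False), as_index=False).apply(lambda x: x.sample(1).values)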

Output:

[[15 'F']
 [51 'A']
 [90 'C']
 [98 'A']]
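
As a hedged variant of the binning idea above, the sketch below seeds the random data and uses `groupby(...).sample(n=1)` (available in pandas 1.1 and later) instead of `apply`, so the result stays a regular DataFrame. The names `step`, `edges`, `groups` and `picked` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
df = pd.DataFrame({
    "X": rng.integers(0, 100, 5000),
    "Y": rng.choice(list("ABCDEF"), 5000),
})

step = 32  # illustrative name for the spacing between samples

# Left-closed bin edges: [min, min+32), [min+32, min+64), ..., [last, inf)
edges = list(np.arange(df["X"].min(), df["X"].max(), step)) + [np.inf]
groups = pd.cut(df["X"], bins=edges, right=False)

# One random row per non-empty bin; observed=True skips bins with no rows.
picked = df.groupby(groups, observed=True).sample(n=1, random_state=0)
print(picked)
```

Because the bins are left-closed, the smallest value of "X" lands in the first bin rather than being excluded.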

User

Answer 2 · edited 2024-04-25 22:56:02

import pandas as pd

df = pd.DataFrame({'x': [1, 1, 1, 3, 3, 3, 6, 6, 6],
                   'y': ['a', 'b', 'c'] * 3})

x = [0, 10, 32, 39, 64, 70, 73, 74, 97, 100, 110, 129]
spacer = 32

X = pd.Series(x)
# For each target `n` in 0, 32, 64, 96, 128 (multiples of `spacer` below x[-1]),
# find the index of the nearest value in X via `X.sub(n).abs().idxmin()`,
# then use those index labels to look up the actual values in X via `loc`.
target_vals = X.loc[[X.sub(n).abs().idxmin()
                     for n in range(0, x[-1], spacer)]].tolist()  # `xrange` in Python 2.
print(target_vals)
# [0, 32, 64, 97, 129]

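# Sample one row per target value. Note that the toy `df` above contains none of
# the values in `target_vals`, so the result is empty here; with real data,
# `X` would presumably be built from the values of `df['x']`.
df[df['x'].isin(target_vals)].groupby('x').apply(lambda group: group.sample(1))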
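
To show the nearest-match approach end to end, here is a minimal sketch in which the lookup values come from the DataFrame itself, so the final filter actually selects rows. The data, and the names `uniq`, `targets` and `sampled`, are assumptions made for illustration; `groupby(...).sample` requires pandas 1.1 or later.

```python
import numpy as np
import pandas as pd

# Illustrative data: the x values from the answer, with 10 and 97 duplicated.
df = pd.DataFrame({
    "x": [0, 10, 10, 32, 39, 64, 70, 73, 74, 97, 97, 100, 110, 129],
    "y": list("abcdefghijklmn"),
})
spacer = 32

# Sorted unique x values to search for nearest matches.
uniq = pd.Series(np.sort(df["x"].unique()))

# Targets 0, 32, 64, ... strictly below the largest x value.
targets = range(0, int(uniq.iloc[-1]), spacer)

# For each target, take the unique x value with the smallest absolute distance.
target_vals = [int(uniq.loc[uniq.sub(n).abs().idxmin()]) for n in targets]
print(target_vals)

# Keep rows whose x is a matched value, then draw one row per matched value.
sampled = (df[df["x"].isin(target_vals)]
           .groupby("x")
           .sample(n=1, random_state=0))
print(sampled)
```

Note that `range` excludes its stop value, so a maximum that is an exact multiple of `spacer` would not be used as a target unless the stop is extended by one.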
