python/pandas: random sampling does not work inside a for loop

Posted 2024-04-25 07:38:09


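I am working with a pandas DataFrame and trying to subset it so that the cumulative sum of a column (qty) is no greater than 18 and yellow makes up at least 65% of the selection. I then want to run the same procedure over several iterations. However, the loop sometimes runs forever, and when it does produce a result, every iteration returns the same one.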

Everything from the while loop onward is taken from the following post: Python random sample selection based on multiple conditions

import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'B', 'C', 'D', 'E', 'G', 'H', 'I', 'J', 'k', 'l', 'm', 'n', 'o'],
    'color': ['red', 'red', 'orange', 'red', 'red', 'red', 'red',
              'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow'],
    'qty': [5, 2, 3, 4, 7, 6, 8, 1, 5, 2, 3, 4, 7, 6],
})

df_sample = df

for x in range(2):
    # Shuffle all rows, then keep those whose running qty total is at most 30.
    sample_s = df.sample(n=df.shape[0])
    sample_s = sample_s[sample_s.qty.cumsum() <= 30]
    sample_size = len(sample_s)
    # Resample until the qty total is at most 18 (note: df itself is reassigned here).
    while sum(df['qty']) > 18:
        yellow_size = 0.65
        df_yellow = df[df['color'] == 'yellow'].sample(int(yellow_size * sample_size))
        others_size = 1 - yellow_size
        df_others = df[df['color'] != 'yellow'].sample(int(others_size * sample_size))
        df = pd.concat([df_yellow, df_others]).sample(frac=1)
    print(df)

This is the output I get when the two results are identical.

   color id  qty
     red  H    2
  yellow  n    3
  yellow  J    5
     red  G    2
  yellow  I    1
     red  D    4

    color id  qty
     red  H    2
  yellow  n    3
  yellow  J    5
     red  G    2
  yellow  I    1
     red  D    4
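
A likely reason for the identical output is that the loop in the question reassigns df itself, so the second pass of the for loop starts from the already reduced frame and the while condition is false from the start. The sketch below reproduces that effect on a tiny made-up frame; the three rows, the threshold of 10, and the name loop_body_runs are illustrative assumptions, not the original data.

```python
import pandas as pd

# Hypothetical three-row frame, used only to illustrate the effect.
df = pd.DataFrame({'color': ['yellow', 'yellow', 'red'], 'qty': [5, 3, 4]})

loop_body_runs = []
for x in range(2):
    ran = False
    while df['qty'].sum() > 10:
        # Overwrites the source frame, as the loop in the question does.
        df = df[df['color'] == 'yellow']
        ran = True
    loop_body_runs.append(ran)

print(loop_body_runs)  # [True, False]: the second pass never enters the while loop

# Sampling every row of a frame only reorders it, so the qty total cannot change.
reshuffled = df.sample(n=len(df))
print(reshuffled['qty'].sum() == df['qty'].sum())  # True
```

The last two lines hint at the endless loop as well: once df has shrunk to exactly the number of rows being requested, each new sample returns the same rows, so a total above 18 can never drop.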

我真的希望有人能帮忙解决这个问题。你知道吗
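
One possible approach, sketched here under stated assumptions, is to draw every candidate from the untouched original frame and to cap the number of attempts so the loop cannot run forever. The function name draw_subset and its parameters are hypothetical, and the 65% is interpreted as a share of rows (as in the question's code), not a share of qty.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'B', 'C', 'D', 'E', 'G', 'H', 'I', 'J', 'k', 'l', 'm', 'n', 'o'],
    'color': ['red', 'red', 'orange', 'red', 'red', 'red', 'red',
              'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow'],
    'qty': [5, 2, 3, 4, 7, 6, 8, 1, 5, 2, 3, 4, 7, 6],
})


def draw_subset(source, max_qty=18, min_yellow_share=0.65, max_tries=1000, seed=None):
    """Hypothetical helper: return a random subset of `source` whose qty total
    is <= max_qty and whose share of yellow rows is >= min_yellow_share,
    or None if no such subset is found within max_tries attempts."""
    rng = np.random.RandomState(seed)
    for _ in range(max_tries):
        # Always shuffle the original frame; never overwrite it.
        shuffled = source.sample(frac=1, random_state=rng)
        # Keep the leading rows whose running qty total stays within the limit.
        candidate = shuffled[shuffled['qty'].cumsum() <= max_qty]
        if candidate.empty:
            continue
        yellow_share = (candidate['color'] == 'yellow').mean()
        if yellow_share >= min_yellow_share:
            return candidate
    return None


for i in range(2):
    subset = draw_subset(df, seed=i)
    print(subset)
```

Because each call reshuffles the full frame, the iterations are independent draws. Omitting seed gives non-reproducible results; passing it makes a run repeatable.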

