Any way to avoid creating individual files when using groupby and sort_values in pandas

Posted 2024-05-29 09:43:45


This is a small part of my dataset, which contains thousands of rows:

designation   names                  runs   wickets catches
batsman       brendon mccullum        78       0       12
bowler        shane bond              0        3       0   
bowler        mitchell mcclenaghan    20       1       1 
batsman       kane williamson         192      0       7
wicketkeeper  brendon mccullum        78       0       12
batsman       daniel vettori          65       11      3
wicketkeeper  luke ronchi             7        0       4
bowler        daniel vettori          65       11      3
batsman       martin guptill          120      0       2

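I need to split the dataset by name, calculate weights for each column, and then append the results to a single Excel worksheet. This is my code: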

df1 = df.sort_values('names')
for i, g in df1.groupby('names'):
    g.to_csv('{}'.format(i) + '-names'+ '.csv', header=True, index_label=True)

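This code splits the main file into one intermediate file per name. I then run a for loop that performs the calculations on every intermediate file: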

import glob
import pandas as pd

filenames = glob.glob('*-names.csv')

for files_ in filenames:
    df2 = pd.read_csv(files_)

    ### perform required calculations

    df.to_excel(writer, 'Sheet1', index=False, header=True)
    writer.save()

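This code works for me, but it creates a large number of intermediate files. I would like to know whether there is any way to skip the file-creation step.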


1 answer

User

#1 · Posted 2024-05-29 09:43:45

It seems you need to process each group and then write to Excel:

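df1 = df.sort_values('names')
results = []
for i, g in df1.groupby('names'):
    print(g)
    # perform required calculations with g
    results.append(g)
# write all groups in one call so later groups do not overwrite earlier ones
# (writer is the pd.ExcelWriter created earlier in your script)
pd.concat(results).to_excel(writer, sheet_name='Sheet1', index=False, header=True)
writer.close()  # writer.save() was removed in pandas 2.0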
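
A self-contained sketch of the in-memory approach may make this more concrete. It rebuilds the sample rows from the question, loops over the groups without writing any CSV files, and combines the pieces into one frame. The question does not say how the "weights" are computed, so the definition here (each value's share of its column total) and the names `stat_cols`, `pieces`, and `write_result` are assumptions for illustration only.

```python
import pandas as pd

# Sample rows copied from the question's excerpt.
df = pd.DataFrame({
    "designation": ["batsman", "bowler", "bowler", "batsman", "wicketkeeper",
                    "batsman", "wicketkeeper", "bowler", "batsman"],
    "names": ["brendon mccullum", "shane bond", "mitchell mcclenaghan",
              "kane williamson", "brendon mccullum", "daniel vettori",
              "luke ronchi", "daniel vettori", "martin guptill"],
    "runs": [78, 0, 20, 192, 78, 65, 7, 65, 120],
    "wickets": [0, 3, 1, 0, 0, 11, 0, 11, 0],
    "catches": [12, 0, 1, 7, 12, 3, 4, 3, 2],
})

stat_cols = ["runs", "wickets", "catches"]
# ASSUMPTION: "weight" means the value's share of the column's overall total.
totals = df[stat_cols].sum()

pieces = []
# groupby sorts the group keys by default, so no prior sort_values is needed.
for name, g in df.groupby("names"):
    g = g.copy()  # work on a copy so the original frame is left untouched
    for col in stat_cols:
        g[col + "_weight"] = g[col] / totals[col]
    pieces.append(g)

# One combined frame, ordered by name, with no intermediate files.
result = pd.concat(pieces, ignore_index=True)


def write_result(frame, path):
    """Write the combined frame to a single sheet.

    Requires an Excel engine such as openpyxl to be installed.
    """
    with pd.ExcelWriter(path) as writer:
        frame.to_excel(writer, sheet_name="Sheet1", index=False)
```

Calling `write_result(result, "output.xlsx")` (the file name is a placeholder) would then write everything to one worksheet in a single pass. Using `pd.ExcelWriter` as a context manager closes the file automatically.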

Or you may need to apply a custom function to each group:

def f(x):
    print (x)
    # perform required calculations with x
    return x

df2 = df1.groupby('names').apply(f)
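
As a rough illustration of the `apply` route, the sketch below returns one summary row per name instead of the unchanged group. The function `summarize` and its choice of calculation (the per-name maximum of each stat) are hypothetical stand-ins for whatever the real calculation is. Selecting the stat columns before calling `apply` keeps the grouping column out of the function's input.

```python
import pandas as pd

# Sample rows copied from the question's excerpt.
df = pd.DataFrame({
    "designation": ["batsman", "bowler", "bowler", "batsman", "wicketkeeper",
                    "batsman", "wicketkeeper", "bowler", "batsman"],
    "names": ["brendon mccullum", "shane bond", "mitchell mcclenaghan",
              "kane williamson", "brendon mccullum", "daniel vettori",
              "luke ronchi", "daniel vettori", "martin guptill"],
    "runs": [78, 0, 20, 192, 78, 65, 7, 65, 120],
    "wickets": [0, 3, 1, 0, 0, 11, 0, 11, 0],
    "catches": [12, 0, 1, 7, 12, 3, 4, 3, 2],
})

stat_cols = ["runs", "wickets", "catches"]


def summarize(group):
    # HYPOTHETICAL calculation: the largest value of each stat for this name.
    # Replace with the real per-group computation.
    return group.max()


# Each group yields a Series, so the result is a DataFrame with one row
# per name (as the index) and one column per stat.
summary = df.groupby("names")[stat_cols].apply(summarize)
```

Here `summary` is indexed by name, so a single lookup such as `summary.loc["daniel vettori"]` returns that player's row, and the whole frame can be written to Excel with one `to_excel` call.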
