Removing a substring and merging rows in python/pandas

               description  total  average  number
0   NFL football (white) L  49693       66    1007
1  NFL football (white) XL  79682       74    1198
2  NFL football (white) XS  84943       81    3792
3   NFL football (white) S  78371       73    3974
4    NFL football (blue) L  99482       92    3978
5    NFL football (blue) M  32192       51    3135
6   NFL football (blue) XL  75343       71    2879
7   NFL football (red) XXL  84391       79    1192
8    NFL football (red) XS  34727       57     992
9     NFL football (red) L  44993       63    1562

            description   total  average  number
0  NFL football (white)  292689       74    9971
1   NFL football (blue)  207017       71    9992
2    NFL football (red)  164111       66    3746

2 answers

User

#1 · edited 2024-05-15 03:45:15

Using replace works, but you could also use rsplit to strip the last word from the description and then do a groupby:

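# Drop the trailing size token (L, XL, ...) by splitting once from the right.
df['description'] = df['description'].apply(lambda x: x.rsplit(' ', 1)[0])

# Sum total and number per description (the average column is not aggregated here).
df.groupby(by='description')[['total', 'number']].sum()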
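For anyone who wants to try this end to end, here is a minimal self-contained sketch of the rsplit approach. The rows are copied from the question; the variable names rows and result are only illustrative, and the "(blue) XL" row is assumed to have its closing parenthesis, since the expected totals only add up that way. It uses the vectorized .str.rsplit() accessor instead of apply, which should behave the same on this data.

```python
import pandas as pd

# Sample rows from the question. Assumption: the "(blue) XL" row carries
# its closing parenthesis, as the expected totals imply.
rows = [
    ("NFL football (white) L", 49693, 66, 1007),
    ("NFL football (white) XL", 79682, 74, 1198),
    ("NFL football (white) XS", 84943, 81, 3792),
    ("NFL football (white) S", 78371, 73, 3974),
    ("NFL football (blue) L", 99482, 92, 3978),
    ("NFL football (blue) M", 32192, 51, 3135),
    ("NFL football (blue) XL", 75343, 71, 2879),
    ("NFL football (red) XXL", 84391, 79, 1192),
    ("NFL football (red) XS", 34727, 57, 992),
    ("NFL football (red) L", 44993, 63, 1562),
]
df = pd.DataFrame(rows, columns=["description", "total", "average", "number"])

# Split once from the right and keep the left part, dropping the size token.
df["description"] = df["description"].str.rsplit(" ", n=1).str[0]

# Sum the additive columns per product; sort=False keeps first-seen order.
result = (
    df.groupby("description", sort=False)[["total", "number"]]
      .sum()
      .reset_index()
)
print(result)
```

Note that this overwrites the description column in place and leaves out the average column, so it covers only part of the desired output.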

User

#2 · edited 2024-05-15 03:45:15

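You can group by a reformatted version of the description field without modifying the original column. The reformatting splits on spaces with .str.split(), drops the last part, and rejoins the rest with .str.join(). Then aggregate with .agg().

To match the desired output, round with .round() and convert to integers with .astype().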

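(df.groupby(
        # group key: description without its last space-separated token
        df['description'].str.split(' ').str[:-1].str.join(' ')
    )
   .agg({'total': 'sum', 'average': 'mean', 'number': 'sum'})
   .round(0)      # round the mean to a whole number
   .astype(int)   # convert every column to integers
   .reset_index()
)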

Result:

            description   total  average  number
0   NFL football (blue)  207017       71    9992
1    NFL football (red)  164111       66    3746
2  NFL football (white)  292689       74    9971
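The same idea as a runnable sketch, in case it helps to see it with the data from the question. The names rows, key and summary are only illustrative, and the "(blue) XL" row is assumed to have its closing parenthesis. The grouping key is built as a separate Series, so df itself is left untouched.

```python
import pandas as pd

# Sample rows from the question. Assumption: the "(blue) XL" row carries
# its closing parenthesis, as the expected totals imply.
rows = [
    ("NFL football (white) L", 49693, 66, 1007),
    ("NFL football (white) XL", 79682, 74, 1198),
    ("NFL football (white) XS", 84943, 81, 3792),
    ("NFL football (white) S", 78371, 73, 3974),
    ("NFL football (blue) L", 99482, 92, 3978),
    ("NFL football (blue) M", 32192, 51, 3135),
    ("NFL football (blue) XL", 75343, 71, 2879),
    ("NFL football (red) XXL", 84391, 79, 1192),
    ("NFL football (red) XS", 34727, 57, 992),
    ("NFL football (red) L", 44993, 63, 1562),
]
df = pd.DataFrame(rows, columns=["description", "total", "average", "number"])

# Grouping key: every space-separated token except the last, rejoined.
key = df["description"].str.split(" ").str[:-1].str.join(" ")

summary = (
    df.groupby(key)
      .agg({"total": "sum", "average": "mean", "number": "sum"})
      .round(0)      # round the per-group mean to a whole number
      .astype(int)   # cast all aggregated columns to integers
      .reset_index()
)
print(summary)
```

One thing to keep in mind: 'mean' here is the plain unweighted mean of the per-size averages, which is what the desired output in the question appears to use.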
