我对pandas
很陌生。如果他们有相同的名字,我需要聚合'Names'
,然后取'Rating'
和'NumsHelpful'
的平均值(不计算NaN
)。'Review'
应该连接起来,而'Weight(Pounds)'
应该保持不变:
col names: ['Brand', 'Name', 'NumsHelpful', 'Rating', 'Weight(Pounds)', 'Review']
Name 'Brand' 'Name'
1534 Zing Zang Zing Zang Bloody Mary Mix, 32 fl oz
1535 Zing Zang Zing Zang Bloody Mary Mix, 32 fl oz
1536 Zing Zang Zing Zang Bloody Mary Mix, 32 fl oz
1537 Zing Zang Zing Zang Bloody Mary Mix, 32 fl oz
1538 Zing Zang Zing Zang Bloody Mary Mix, 32 fl oz
1539 Zing Zang Zing Zang Bloody Mary Mix, 32 fl oz
1540 Zing Zang Zing Zang Bloody Mary Mix, 32 fl oz
'NumsHelpful' 'Rating' 'Weight'
1534 NaN 2 4.5
1535 NaN 2 4.5
1536 NaN NaN 4.5
1537 NaN NaN 4.5
1538 2 NaN 4.5
1539 3 5 4.5
1540 5 NaN 4.5
'Review'
1534 Yummy - Delish
1535 The best Bloody Mary mix! - The best Bloody Ma...
1536 Best Taste by far - I've tried several if not ...
1537 Best bloody mary mix ever - This is also good ...
1538 Outstanding - Has a small kick to it but very ...
1539 OMG! So Good! - Spicy, terrific Bloody Mary mix!
1540 Good stuff - This is the best
所以输出应该是这样的:
'Brand' 'Name' 'NumsHelpful' 'Rating'
Zing Zang Zing Zang Bloody Mary Mix, 32 fl oz 3.33 3
'Weight' 'Review'
4.5 Review1 / Review2 / ... / ReviewN
我该怎么走?谢谢。
目前没有回答
相关问题 更多 >
编程相关推荐