Labeling customers from a transaction DataFrame (pandas)

import pandas as pd

# One tuple per purchased item: (custID, transacID, orderDate, itemDescription)
records = [
    (1, 111, '2016-01-10', 'A'),
    (1, 112, '2016-02-02', 'B'),
    (1, 112, '2016-02-02', 'C'),
    (1, 113, '2016-04-10', 'X'),
    (2, 211, '2016-02-02', 'X'),
    (3, 311, '2016-04-05', 'X'),
    (4, 411, '2016-02-05', 'X'),
    (4, 411, '2016-02-05', 'C'),
    (4, 412, '2016-03-10', 'E'),
    (4, 413, '2016-07-14', 'E'),
]
labels = ['custID', 'transacID', 'orderDate', 'itemDescription']
df = pd.DataFrame.from_records(records, columns=labels)
df

   custID  transacID   orderDate itemDescription
0       1        111  2016-01-10               A
1       1        112  2016-02-02               B
2       1        112  2016-02-02               C
3       1        113  2016-04-10               X
4       2        211  2016-02-02               X
5       3        311  2016-04-05               X
6       4        411  2016-02-05               X
7       4        411  2016-02-05               C
8       4        412  2016-03-10               E
9       4        413  2016-07-14               E

Desired output:

   custID  transacID   orderDate itemDescription    label
0       1        111  2016-01-10               A    great
1       1        112  2016-02-02               B    great
2       1        112  2016-02-02               C    great
3       1        113  2016-04-10               X    great
4       2        211  2016-02-02               X      boo
5       3        311  2016-04-05               X      boo
6       4        411  2016-02-05               X  awesome
7       4        411  2016-02-05               C  awesome
8       4        412  2016-03-10               E  awesome
9       4        413  2016-07-14               E  awesome

1 Answer

User

#1 · Posted 2024-04-19 01:15:36

Here is a solution that uses groupby and apply with a custom function:

def categorize(items):
    # items holds one customer's itemDescription values, in row order
    if len(items) > 1 and items.iloc[0] != 'X':
        return 'great'      # several rows, first item is not X
    elif len(items) > 1 and items.iloc[0] == 'X':
        return 'awesome'    # several rows, first item is X
    else:
        return 'boo'        # a single row

# One label per customer, then mapped back onto every row of that customer
labels_by_customer = df.groupby('custID')['itemDescription'].apply(categorize)
df['label'] = df['custID'].map(labels_by_customer)
df
#    custID  transacID   orderDate itemDescription    label
# 0       1        111  2016-01-10               A    great
# 1       1        112  2016-02-02               B    great
# 2       1        112  2016-02-02               C    great
# 3       1        113  2016-04-10               X    great
# 4       2        211  2016-02-02               X      boo
# 5       3        311  2016-04-05               X      boo
# 6       4        411  2016-02-05               X  awesome
# 7       4        411  2016-02-05               C  awesome
# 8       4        412  2016-03-10               E  awesome
# 9       4        413  2016-07-14               E  awesome

There is very likely a better solution.
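One possible alternative is a vectorized version that avoids calling a Python function once per customer. The sketch below broadcasts each customer's row count and first item back onto the rows with groupby(...).transform, then picks the label with numpy.select. The function name label_customers is made up for illustration. The label rules are an assumption read off the desired output in the question: more than one row with X as the first item gives 'awesome', more than one row otherwise gives 'great', and a single row gives 'boo'.

```python
import numpy as np
import pandas as pd

records = [
    (1, 111, '2016-01-10', 'A'), (1, 112, '2016-02-02', 'B'),
    (1, 112, '2016-02-02', 'C'), (1, 113, '2016-04-10', 'X'),
    (2, 211, '2016-02-02', 'X'), (3, 311, '2016-04-05', 'X'),
    (4, 411, '2016-02-05', 'X'), (4, 411, '2016-02-05', 'C'),
    (4, 412, '2016-03-10', 'E'), (4, 413, '2016-07-14', 'E'),
]
df = pd.DataFrame.from_records(
    records,
    columns=['custID', 'transacID', 'orderDate', 'itemDescription'],
)


def label_customers(frame):
    """Return a copy of frame with a per-customer 'label' column (hypothetical helper)."""
    items = frame.groupby('custID')['itemDescription']
    # transform keeps the original row index, so each row sees its customer's values
    n_rows = items.transform('size')         # number of rows for the customer
    first_item = items.transform('first')    # first non-null item for the customer
    repeat = n_rows > 1
    # Conditions are checked in order; rows matching none get the default
    label = np.select(
        [repeat & (first_item == 'X'), repeat],
        ['awesome', 'great'],
        default='boo',
    )
    return frame.assign(label=label)


print(label_customers(df))
```

Like len(g) in the apply version, this counts rows rather than distinct transactions, so a customer with one transaction spanning two items would be treated as having more than one row.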
