Deriving a new column in a DataFrame using a list comprehension and a string Series

Posted 2024-04-29 13:51:03


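I am trying to derive a new column in a DataFrame using a list comprehension (involving strings). I don't know what I'm doing wrong, and I can't find the error in my code.

I have a list as follows: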


buyout_deals = ['BIMBO', 'EBO', 'IBI', 'IBO', 'MBI', 'MBO', 'Secondary buyout', 'Take Private']

I am trying to derive a new column in my DataFrame using the list above and a column named Deal_Type, which contains strings separated by ','.

Announced_Date  Deal_Nature Deal_Type
0   2019-05-14  Recommended Acquisition,Cross border,Private
1   2019-05-14  Recommended Acquisition,Buy & Build,Domestic,Private
2   2019-05-14  Recommended Acquisition,Domestic,Insolvency,Private
3   2019-05-14  Recommended Acquisition,Domestic,Private
4   2019-05-14  Recommended Acquisition,Buy & Build,Cross border,Private,T...
5   2019-05-14  Recommended Acquisition,Domestic,IBO,Private
6   2019-05-14  Recommended Acquisition,Cross border,Private,Transatlantic
7   2019-05-14  Recommended Acquisition,Domestic,MBO,Private
8   2019-05-14  Recommended Acquisition,Domestic,Exit,MBO,Private,Secondar...
9   2019-05-14  Recommended Acquisition,Cross border,Divestment,Private

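I want to check whether the Deal_Type column contains any one keyword from the buyout_deals list. If it does, the new column should show 'Buyout'; otherwise it should show 'Non-Buyout'.

Below is the function I tried (along with many other approaches), but I could not get the desired result: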

def buyout_nonbuyout(row):
    if row['Deal_Type'] in buyout_deals:
        return 'Buyout'
    else:
        return 'Non-Buyout'

df = df.assign(Buyout_NonBuyout=df.apply(buyout_nonbuyout, axis=1))

df.head(10)

In the output I get, the rows at index 5, 7, and 8 are labeled Non-Buyout. The expected result is Buyout for those rows, because each contains at least one keyword from the buyout_deals list.

Can anyone help me? I also tried a for loop, but did not get the correct result. Thanks.


2 answers
df = df.assign(
    Buyout_NonBuyout=df.Deal_Type.apply(
        # Substring test: 'Buyout' if any keyword occurs anywhere in the string
        lambda x: 'Buyout' if any(e in x for e in buyout_deals) else 'Non-Buyout'
    )
)


    Announced_Date  Deal_Nature         Deal_Type                               Buyout_NonBuyout
0   2019-05-14  Recommended Acquisition,Cross border,Private                    Non-Buyout
1   2019-05-14  Recommended Acquisition,Buy & Build,Domestic,Private            Non-Buyout
2   2019-05-14  Recommended Acquisition,Domestic,Insolvency,Private             Non-Buyout
3   2019-05-14  Recommended Acquisition,Domestic,Private                        Non-Buyout
4   2019-05-14  Recommended Acquisition,Buy & Build,Cross border,Private,T...   Non-Buyout
5   2019-05-14  Recommended Acquisition,Domestic,IBO,Private                    Buyout
6   2019-05-14  Recommended Acquisition,Cross border,Private,Transatlantic      Non-Buyout
7   2019-05-14  Recommended Acquisition,Domestic,MBO,Private                    Buyout
8   2019-05-14  Recommended Acquisition,Domestic,Exit,MBO,Private,Secondar...   Buyout
9   2019-05-14  Recommended Acquisition,Cross border,Divestment,Private         Non-Buyout
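
Note that the answer above tests whether each keyword occurs anywhere in the string, so a keyword embedded in a longer token would also count as a match. If whole-token matching is what is wanted (an assumption, since the question does not say), a minimal sketch could split on ',' and compare sets. The `sample` frame and the helper name `label_tokens` are made up for illustration; they are not part of the original post.

```python
import pandas as pd

buyout_deals = ['BIMBO', 'EBO', 'IBI', 'IBO', 'MBI', 'MBO',
                'Secondary buyout', 'Take Private']
buyout_set = set(buyout_deals)

# Small illustrative frame (hypothetical; modeled on rows 0, 5 and 7 of the question)
sample = pd.DataFrame({
    'Deal_Type': [
        'Acquisition,Cross border,Private',
        'Acquisition,Domestic,IBO,Private',
        'Acquisition,Domestic,MBO,Private',
    ]
})

def label_tokens(deal_type):
    """Return 'Buyout' if any comma-separated token is a buyout keyword."""
    # Strip whitespace so 'IBO' and ' IBO' are treated the same
    tokens = {t.strip() for t in deal_type.split(',')}
    return 'Non-Buyout' if buyout_set.isdisjoint(tokens) else 'Buyout'

sample['Buyout_NonBuyout'] = sample['Deal_Type'].apply(label_tokens)
print(sample['Buyout_NonBuyout'].tolist())
# ['Non-Buyout', 'Buyout', 'Buyout']
```

With this approach a token such as 'EBOX' would not be labeled Buyout, whereas the substring test would label it Buyout because it contains 'EBO'.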

You might want to try the following:

def buyout_nonbuyout(row):
    deal_types = row['Deal_Type'].split(',')
    for deal_type in deal_types:
        if deal_type in buyout_deals:
            return 'Buyout'
    return 'Non-Buyout'
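
Why the split makes a difference can be seen in plain Python: `in` on a list compares the left-hand value against whole elements, so the full comma-separated string never equals a single keyword. The snippet below is only a sketch of that behavior, using the Deal_Type value of row 5 from the question.

```python
buyout_deals = ['BIMBO', 'EBO', 'IBI', 'IBO', 'MBI', 'MBO',
                'Secondary buyout', 'Take Private']

deal_type = 'Acquisition,Domestic,IBO,Private'  # row 5 in the question

# List membership compares the whole string with each list element
whole_string_match = deal_type in buyout_deals
print(whole_string_match)  # False

# After splitting, each token can be compared with the keywords
tokens = deal_type.split(',')
token_match = any(token in buyout_deals for token in tokens)
print(token_match)  # True
```

The function is then applied row by row exactly as in the question, for example `df.assign(Buyout_NonBuyout=df.apply(buyout_nonbuyout, axis=1))`.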
