import pandas as pd
df = pd.DataFrame({'Reference': ["PO: TK42-8",
                                 "PO GQ5-42",
                                 "PO:HEA-238/239",
                                 "PO: 4501005609 Purchaser: Mariana Toledo Blanco",
                                 "FITN7-26",
                                 "PO#CP4-62",
                                 "PO 4501004752 Purchaser Yang Gao / Split from S94964",
                                 "GUANGDONG YOULONG ELECTRICAL APPLIANCES CO.,LTD // PO#GQY6-17"]
                   })
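From the df above, I have been trying for a while, with minimal success, to extract two pieces of information (when available) and put them into 2 new columns, as shown in the desired df below: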
df2 = pd.DataFrame({'Reference': ["PO: TK42-8",
                                  "PO GQ5-42",
                                  "PO:HEA-238/239",
                                  "PO: 4501005609 Purchaser: Mariana Toledo Blanco",
                                  "FITN7-26",
                                  "PO#CP4-62",
                                  "PO 4501004752 Purchaser Yang Gao / Split from S94964",
                                  "GUANGDONG YOULONG ELECTRICAL APPLIANCES CO.,LTD // PO#GQY6-17"],
                    "PO": ["TK42-8", "GQ5-42", "HEA-238/239", "4501005609", "FITN7-26", "CP4-62", "4501004752", "GQY6-17"],
                    "Purchaser": ["", "", "", "Mariana Toledo Blanco", "", "", "Yang Gao", ""],
                    })
So far I have had some success with the following:
df['PO'] = df['Reference'].str.extract(r"PO:.*?([ \w.\S-]+)")
df['Purchaser'] = df['Reference'].str.extract(r"Purchaser.*?([ \w.*]+)")
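However, I can't figure out how to write the pattern inside each function's parentheses so that it handles every one of these cases correctly.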
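One possible approach, offered as a sketch rather than a definitive answer: match an optional separator (`:` or `#`) after a standalone `PO`, capture the run of word characters, `/` and `-` that follows, and fall back to the whole string when the reference has no `PO` marker at all (as in "FITN7-26"). For the purchaser, capture space-separated alphabetic words after `Purchaser`. The character classes and the fallback rule are assumptions inferred from the eight sample rows only; names with hyphens, apostrophes or accented letters, or PO codes with other punctuation, would need a wider pattern.

```python
import pandas as pd

df = pd.DataFrame({'Reference': ["PO: TK42-8",
                                 "PO GQ5-42",
                                 "PO:HEA-238/239",
                                 "PO: 4501005609 Purchaser: Mariana Toledo Blanco",
                                 "FITN7-26",
                                 "PO#CP4-62",
                                 "PO 4501004752 Purchaser Yang Gao / Split from S94964",
                                 "GUANGDONG YOULONG ELECTRICAL APPLIANCES CO.,LTD // PO#GQY6-17"]})

# Standalone "PO", optional ":" or "#", optional spaces, then the code itself.
# Assumption: a PO code only contains letters, digits, "_", "/" and "-".
po_marked = df['Reference'].str.extract(r"\bPO\b\s*[:#]?\s*([\w/-]+)", expand=False)

# Fallback (assumption): a reference that is a single code-like token
# with no "PO" marker is itself the PO number.
po_bare = df['Reference'].str.extract(r"^([\w/-]+)$", expand=False)

df['PO'] = po_marked.fillna(po_bare)

# "Purchaser", optional ":", then one or more space-separated alphabetic words.
# The match stops at the first token that is not purely letters (e.g. "/").
df['Purchaser'] = (
    df['Reference']
    .str.extract(r"Purchaser\s*:?\s*([A-Za-z]+(?: [A-Za-z]+)*)", expand=False)
    .fillna("")
)

print(df[['PO', 'Purchaser']])
```

The original `PO:.*?(...)` pattern requires a literal colon, so it cannot match rows such as "PO GQ5-42" or "PO#CP4-62"; making the separator optional with `[:#]?` covers those variants. Rows without a purchaser come back as NaN from `str.extract`, which is why the last step fills them with an empty string to match the desired df.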