Python Pandas: Best way to get unique strings from columns

import numpy as np
import pandas as pd

df = pd.DataFrame({'cust mobile no': ['1', '2', '3'],
                   'cust home phone': [np.nan, '2', 'x'],
                   'cust nextkin phone': ['1', '2', 'g'],
                   'cust fax': [np.nan, '4', '5'],
                   'cust id': ['001', '002', '003']})

  cust mobile no cust home phone cust nextkin phone cust fax cust id
0              1             NaN                  1      NaN     001
1              2               2                  2        4     002
2              3               x                  g        5     003

  cust id cust phone 1 cust phone 2 cust phone 3 cust phone 4
0     001            1          NaN          NaN          NaN
1     002            2            4          NaN          NaN
2     003            3            x            g            5

1 Answer

User

#1 · Posted on 2024-05-15 11:01:20

First, define a function that implements the desired logic using all four columns:

from itertools import zip_longest

import numpy as np
import pandas as pd

input_keys = ["cust mobile no", "cust home phone", "cust nextkin phone", "cust fax"]
output_keys = [f"cust phone {n}" for n in range(1, 5)]

def assign_phone_nrs(row):
    # keep non-missing values only; pd.notna catches real NaN, which comparing against the string "nan" does not
    phones = [row[k] for k in input_keys if pd.notna(row[k])]
    phones = list(dict.fromkeys(phones))  # remove duplicates, keep order
    output_phone_nrs = dict(zip_longest(output_keys, phones, fillvalue=np.nan))  # pad with NaN and put into a dict
    output_phone_nrs["cust id"] = row["cust id"]  # add original id
    return pd.Series(output_phone_nrs)
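
The two idioms inside the function (order-preserving de-duplication and padding) can be tried on a plain list, without pandas. This is only a minimal sketch: `dedupe_and_pad` is a hypothetical helper name, and `None` stands in for the `np.nan` fill value used in the answer.

```python
from itertools import zip_longest

def dedupe_and_pad(values, slots, fill=None):
    """Hypothetical helper: drop repeats (first occurrence wins), then pad to len(slots)."""
    unique = list(dict.fromkeys(values))  # dicts keep insertion order (Python 3.7+)
    # zip_longest pairs each slot with a value and uses `fill` once the values run out
    return dict(zip_longest(slots, unique, fillvalue=fill))

slots = [f"cust phone {n}" for n in range(1, 5)]
row_values = ["2", "2", "2", "4"]  # non-missing phone values of the second sample row
padded = dedupe_and_pad(row_values, slots)
print(padded)
# {'cust phone 1': '2', 'cust phone 2': '4', 'cust phone 3': None, 'cust phone 4': None}
```

The sketch assumes there are never more unique values than slots, which holds here because four input columns feed four output columns.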

Now apply it to the input DataFrame:

>>> df.apply(assign_phone_nrs, axis=1)
  cust phone 1 cust phone 2 cust phone 3 cust phone 4 cust id
0            1          NaN          NaN          NaN     001
1            2            4          NaN          NaN     002
2            3            x            g            5     003
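
The result above has `cust id` as the last column, while the desired layout in the question lists it first. One possible way to reorder, sketched here on a hand-built copy of the printed result so the snippet runs on its own (the names `result`, `phone_cols`, and `reordered` are illustrative, not from the original answer):

```python
import numpy as np
import pandas as pd

# Hand-built copy of the DataFrame printed above, so this runs standalone.
result = pd.DataFrame({
    "cust phone 1": ["1", "2", "3"],
    "cust phone 2": [np.nan, "4", "x"],
    "cust phone 3": [np.nan, np.nan, "g"],
    "cust phone 4": [np.nan, np.nan, "5"],
    "cust id": ["001", "002", "003"],
})

phone_cols = [f"cust phone {n}" for n in range(1, 5)]
# Selecting with a list of column names returns the columns in that order.
reordered = result[["cust id"] + phone_cols]
print(reordered)
```

With the real data, the same selection could be applied directly to the output of `df.apply(assign_phone_nrs, axis=1)`.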
