Recursive loop over a DataFrame

Posted 2024-05-16 12:16:46


Input:

| Company | Employee Number |
|---------|-----------------|
| 1       | 12              |
| 2       | 34, 12          |
| 3       | 56, 34, 78      |
| 4       | 90              |

Goal:

Find every employee number that belongs to the same employee across all companies.

Expected result:

| Company | Employee Number |
|---------|-----------------|
| 1       | 12, 34, 56, 78  |
| 2       | 12, 34, 56, 78  |
| 3       | 12, 34, 56, 78  |
| 4       | 90              |

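As the result shows, the first three rows belong to the same employee. We know this because employee number "12" from row 1 also appears in row 2, and employee number "34" appears in both row 2 and row 3. So rows 1, 2 and 3 all refer to the same employee, even though rows 1 and 3 share no number directly. We therefore combine the distinct employee numbers of those rows and display them as shown above.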
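
To make the linking rule concrete, here is a small sketch that checks which rows of the input table share an employee number. The `rows` dictionary and the `shared` helper are illustrative names, not part of the original question; the data is copied from the input table.

```python
# Employee numbers per company, copied from the input table above
rows = {
    1: {"12"},
    2: {"34", "12"},
    3: {"56", "34", "78"},
    4: {"90"},
}


def shared(a, b):
    """Return the employee numbers that companies a and b have in common."""
    return rows[a] & rows[b]


print(shared(1, 2))  # {'12'}
print(shared(2, 3))  # {'34'}
print(shared(1, 3))  # set()
print(shared(3, 4))  # set()
```

Companies 1 and 3 have nothing in common directly, but both overlap with company 2, so all three end up in the same group. Company 4 overlaps with nobody and stays on its own.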

Note: a row can contain anywhere from 0 to N employee numbers.

Is there a recursive way to do this? If not, what solution would you suggest?


1 answer
User
#1 · Posted 2024-05-16 12:16:46

Here is how I would do it (explained in the comments):

# Replace NaN in df["Employee Number"] with empty string
df["Employee Number"] = df["Employee Number"].fillna("")

# Add a column with sets that contain the individual employee numbers
df["EN_Sets"] = df["Employee Number"].str.findall(r"\d+").apply(set)

# Build the maximal distinct employee number sets
en_sets = []
for en_set in df.EN_Sets:
    union_sets = []
    keep_sets = []
    for s in en_sets:
        if s.isdisjoint(en_set):
            keep_sets.append(s)
        else:
            union_sets.append(s)
    en_sets = keep_sets + [en_set.union(*union_sets)]

# Build a dictionary with the replacement strings as keys and the distinct
# sets as values (sort numerically so that e.g. "9" comes before "10")
en_sets = {", ".join(sorted(s, key=int)): s for s in en_sets}

# Function for .apply() that replaces the original employee number strings
def setting_en_numbers(s):
    for en_set_str, en_set in en_sets.items():
        if not s.isdisjoint(en_set):
            return en_set_str
    # Rows without any employee number overlap with nothing: keep them empty
    return ""

# Apply the function to df["Employee Number"]
df["Employee Number"] = df.EN_Sets.apply(setting_en_numbers)
df = df[["Company", "Employee Number"]]

Result (the input `df` first, then the output):

df:
   Company Employee Number
0        1              12
1        2          34, 12
2        3      56, 34, 78
3        4              90
4        5             NaN

   Company Employee Number
0        1  12, 34, 56, 78
1        2  12, 34, 56, 78
2        3  12, 34, 56, 78
3        4              90
4        5                
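
Since the question asks specifically about recursion, here is a rough sketch of a recursive alternative. It is not the approach used in the code above: it treats rows as nodes that are connected whenever they share an employee number and walks each connected group with a recursive depth-first search. The function name `merge_employee_numbers` and its helpers are hypothetical, and the input is assumed to be a plain list of sets (for example something like `df.EN_Sets.tolist()`).

```python
def merge_employee_numbers(rows):
    """Merge rows that are linked by shared employee numbers.

    rows: list of sets of employee-number strings, one set per company.
    Returns one comma-joined string per row, in the original row order.
    """
    # Map each employee number to the indices of the rows that mention it
    rows_by_number = {}
    for i, numbers in enumerate(rows):
        for n in numbers:
            rows_by_number.setdefault(n, []).append(i)

    group_of = {}  # row index -> set shared by every row in its group

    def visit(i, group):
        """Recursively add row i and every row reachable from it to group."""
        group_of[i] = group
        group.update(rows[i])
        for n in rows[i]:
            for j in rows_by_number[n]:
                if j not in group_of:
                    visit(j, group)

    for i in range(len(rows)):
        if i not in group_of:
            visit(i, set())

    # Sort numerically so that e.g. "9" comes before "10"
    return [", ".join(sorted(group_of[i], key=int)) for i in range(len(rows))]


rows = [{"12"}, {"34", "12"}, {"56", "34", "78"}, {"90"}, set()]
print(merge_employee_numbers(rows))
# ['12, 34, 56, 78', '12, 34, 56, 78', '12, 34, 56, 78', '90', '']
```

One caveat with the recursive version: a very long chain of linked rows can run into Python's recursion limit, in which case an iterative approach such as the loop in the answer above is the safer choice.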
