How to count occurrences of matching values in a text file

Posted 2024-04-19 13:41:30

My problem: each employee has a unique identifier (e.g. KCUTD_41). From a file, I have built a dictionary that collects the employee IDs of each company, like this:

{    'Company 1' :['KCUTD_41',
                   'KCTYU_48',
                   'VTSYC_48',
                      ......],
     'Company 2' :['PORUH_21',
                   'PUSHB_10',
                    ....... ],
     'Company 3' :['STEYRU_69']}

I have several companies in total.

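In another file, I have several lines. Each line corresponds to one collaboration group made up of employees from several companies and PhD students (d215485, etc.).

The file looks like this: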

PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225 ...
d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10 ...
etc ....

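What I want is, for each company, the number of its employees per line and the number of groups (lines) in which they appear, to get something like this:

Output: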

Company 1 : (number of employees from company 1 per line ) : number of groups or line where it appears in total 
Company 2 : (number of employees per line from company2) : nb of groups or line where the employees from company2 appears in total
Company 3 : ......

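I would like to use a condition that checks whether an ID in the file matches one of the values stored under a dictionary key and, if it does, counts the occurrence.

I hope this is clearer now.

Thanks in advance for any help.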


1 Answer

Answer 1 · Posted 2024-04-19 13:41:30

I'm not quite sure what you want the output to look like, but this code may help you get where you want to go:

import re

companies = {
    'Company 1' :['KCUTD_41','KCTYU_48','VTSYC_48'],
    'Company 2' :['PORUH_21','PUSHB_10'],
    'Company 3' :['STEYRU_69']
     }

finalout = {}
for k,v in companies.items():
    finalout[k] = {"number_in_company":len(v)}
print (finalout)

lines_from_file = [
    "PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225", 
    "d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10"
]


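# Raw strings keep the regex escapes intact; \b anchors each pattern to a whole token.
pattern_groups    = r"\bd\d+\b"        # d-prefixed IDs such as d215487
pattern_employees = r"\b[A-Z]+_\d+\b"  # employee IDs such as PORUH_21
for line in lines_from_file:
    print("          -")
    print(line)
    print("Groups per line:", len(re.findall(pattern_groups, line)))
    print("Employees per line:", len(re.findall(pattern_employees, line)))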

Output:

{'Company 1': {'number_in_company': 3}, 'Company 2': {'number_in_company': 2}, 'Company 3': {'number_in_company': 1}}
          -
PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225
Groups per line: 4
Employees per line: 2
          -
d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10
Groups per line: 5
Employees per line: 2
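
If what you are after is a per-company tally, here is a minimal sketch of one way to do it. It assumes that each line of the file is one group, that IDs are separated by whitespace, and that an employee ID belongs to only one company. The helper name count_by_company is made up for this example, and the sample data is the same as above (the entries elided in the question are not filled in).

```python
from collections import defaultdict

# Same sample data as above; hypothetical stand-in for the real files.
companies = {
    'Company 1': ['KCUTD_41', 'KCTYU_48', 'VTSYC_48'],
    'Company 2': ['PORUH_21', 'PUSHB_10'],
    'Company 3': ['STEYRU_69'],
}
lines_from_file = [
    "PORUH_21 d215487 d215489 d213654 KCTYU_48 d154225",
    "d25548 d89852 VTSYC_48 d254548 d121154 d258774 PUSHB_10",
]

def count_by_company(companies, lines):
    # Invert the dictionary: employee ID -> company name
    # (assumes an ID appears under exactly one company).
    company_of = {emp: name for name, ids in companies.items() for emp in ids}
    per_line = {name: [] for name in companies}  # employee count on each line
    for line in lines:
        hits = defaultdict(int)
        for token in line.split():
            # The "condition": does this token match a known employee ID?
            if token in company_of:
                hits[company_of[token]] += 1
        for name in companies:
            per_line[name].append(hits[name])
    # For each company: (employees per line, number of lines with at least one employee)
    return {name: (counts, sum(1 for c in counts if c > 0))
            for name, counts in per_line.items()}

result = count_by_company(companies, lines_from_file)
for name, (counts, n_groups) in result.items():
    print(f"{name} : {counts} : {n_groups}")
```

The inverted dictionary makes each lookup a single membership test instead of a loop over every company's list. To run it on the real file, you could pass the open file object (or a list of its lines) as the second argument.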
