How to extract a common name from multiple file names and drop the parts I don't want

Posted 2024-03-28 15:31:13


For example, I have 7 files named:

g18_84pp_2A_MVP1_GoodiesT0-HKJ-DFG_MIX-CMVP1_Y1000-MIX.txt
g18_84pp_2A_MVP2_GoodiesT0-HKJ-DFG_MIX-CMVP2_Y1000-MIX.txt
g18_84pp_2A_MVP3_GoodiesT0-HKJ-DFG_MIX-CMVP3_Y1000-MIX.txt
g18_84pp_2A_MVP4_GoodiesT0-HKJ-DFG_MIX-CMVP4_Y1000-MIX.txt
g18_84pp_2A_MVP5_GoodiesT0-HKJ-DFG_MIX-CMVP5_Y1000-MIX.txt
g18_84pp_2A_MVP6_GoodiesT0-HKJ-DFG_MIX-CMVP6_Y1000-MIX.txt
g18_84pp_2A_MVP7_GoodiesT0-HKJ-DFG_MIX-CMVP7_Y1000-MIX.txt

From these files I want to derive a single common name:

 g18_84pp_2A_MVP_GoodiesT0_MIX.txt

Any ideas? Thanks.

Is it possible to rely on the underscores alone?

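For example, split the file name into

"g18_84pp_2A_MVP2", "_", "GoodiesT0-HKJ-DFG", "_", "MIX-CMVP2_Y1000-MIX", ".txt".

From "g18_84pp_2A_MVP2" drop the digit 2, from "GoodiesT0-HKJ-DFG" take "GoodiesT0", and from "MIX-CMVP2_Y1000-MIX" take the first "MIX". I have many files whose names differ in these parts, so I would like the solution to be general as well.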


1 answer
User
#1 · Posted 2024-03-28 15:31:13
import re
names = ['g18_84pp_2A_MVP1_GoodiesT0-HKJ-DFG_MIX-CMVP1_Y1000-MIX.txt',
'g18_84pp_2A_MVP2_GoodiesT0-HKJ-DFG_MIX-CMVP2_Y1000-MIX.txt',
'g18_84pp_2A_MVP3_GoodiesT0-HKJ-DFG_MIX-CMVP3_Y1000-MIX.txt',
'g18_84pp_2A_MVP4_GoodiesT0-HKJ-DFG_MIX-CMVP4_Y1000-MIX.txt',
'g18_84pp_2A_MVP5_GoodiesT0-HKJ-DFG_MIX-CMVP5_Y1000-MIX.txt',
'g18_84pp_2A_MVP6_GoodiesT0-HKJ-DFG_MIX-CMVP6_Y1000-MIX.txt',
'g18_84pp_2A_MVP7_GoodiesT0-HKJ-DFG_MIX-CMVP7_Y1000-MIX.txt']

# Capture the parts that vary between file names (raw string keeps \. as a regex escape)
f = lambda x: re.findall(r'g18_84pp_2A_MVP(.*?)_GoodiesT0(.*?)_MIX(.*?)\.txt', x)

for x in names:
    print(f(x))

This produces:

[('1', '-HKJ-DFG', '-CMVP1_Y1000-MIX')]
[('2', '-HKJ-DFG', '-CMVP2_Y1000-MIX')]
[('3', '-HKJ-DFG', '-CMVP3_Y1000-MIX')]
[('4', '-HKJ-DFG', '-CMVP4_Y1000-MIX')]
[('5', '-HKJ-DFG', '-CMVP5_Y1000-MIX')]
[('6', '-HKJ-DFG', '-CMVP6_Y1000-MIX')]
[('7', '-HKJ-DFG', '-CMVP7_Y1000-MIX')]

To filter out names that do not match this pattern:

names = list(filter(f, names))

Since it is not clear what you are trying to do, this should be a good starting point.
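As a minimal sketch of why passing f to filter works: re.findall returns an empty list for a name that does not match, and filter drops items whose result is falsy. The name 'notes.txt' and the helper find_parts are hypothetical, added only for illustration.

```python
import re

# Same pattern as in the answer, written as a raw string.
PATTERN = r'g18_84pp_2A_MVP(.*?)_GoodiesT0(.*?)_MIX(.*?)\.txt'

def find_parts(name):
    """Return the captured groups; the list is empty when the name does not match."""
    return re.findall(PATTERN, name)

candidates = [
    'g18_84pp_2A_MVP1_GoodiesT0-HKJ-DFG_MIX-CMVP1_Y1000-MIX.txt',
    'notes.txt',  # hypothetical non-matching name, for illustration only
]

# filter() keeps the items for which find_parts returns a non-empty (truthy) list.
matching = list(filter(find_parts, candidates))
print(matching)
```

Only the first name should survive the filter.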

Update

The question has been updated. Here is what you are (probably) trying to achieve:

import re
names = ['g18_84pp_2A_MVP1_GoodiesT0-HKJ-DFG_MIX-CMVP1_Y1000-MIX.txt',
'g18_84pp_2A_MVP2_GoodiesT0-HKJ-DFG_MIX-CMVP2_Y1000-MIX.txt',
'g18_84pp_2A_MVP3_GoodiesT0-HKJ-DFG_MIX-CMVP3_Y1000-MIX.txt',
'g18_84pp_2A_MVP4_GoodiesT0-HKJ-DFG_MIX-CMVP4_Y1000-MIX.txt',
'g18_84pp_2A_MVP5_GoodiesT0-HKJ-DFG_MIX-CMVP5_Y1000-MIX.txt',
'g18_84pp_2A_MVP6_GoodiesT0-HKJ-DFG_MIX-CMVP6_Y1000-MIX.txt',
'g18_84pp_2A_MVP7_GoodiesT0-HKJ-DFG_MIX-CMVP7_Y1000-MIX.txt']

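expression = r'g18_84pp_2A_MVP(.*?)_Goodies(.*?)_MIX(.*?)\.txt'
f = lambda x: re.findall(expression, x)
# findall returns one 3-tuple per match, so test for a non-empty result
# rather than for a list of length 3
_f = lambda x: bool(re.findall(expression, x))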

for x in names:
    print(f(x))

Output:

[('1', 'T0-HKJ-DFG', '-CMVP1_Y1000-MIX')]
[('2', 'T0-HKJ-DFG', '-CMVP2_Y1000-MIX')]
[('3', 'T0-HKJ-DFG', '-CMVP3_Y1000-MIX')]
[('4', 'T0-HKJ-DFG', '-CMVP4_Y1000-MIX')]
[('5', 'T0-HKJ-DFG', '-CMVP5_Y1000-MIX')]
[('6', 'T0-HKJ-DFG', '-CMVP6_Y1000-MIX')]
[('7', 'T0-HKJ-DFG', '-CMVP7_Y1000-MIX')]
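The groups printed above are the parts that vary, whereas the name asked for in the question is made of the parts that stay fixed. One possible way to build it, offered as a sketch rather than as the answer's own method, is to capture the fixed parts instead and join them. The names KEEP and common_name are hypothetical, and the pattern assumes every file shares the literal prefix g18_84pp_2A_MVP.

```python
import re

# The seven file names from the question, generated to keep the example short.
names = [
    'g18_84pp_2A_MVP%d_GoodiesT0-HKJ-DFG_MIX-CMVP%d_Y1000-MIX.txt' % (i, i)
    for i in range(1, 8)
]

# Groups capture the fixed parts; the text matched between them is discarded.
KEEP = re.compile(r'(g18_84pp_2A_MVP)\d+(_GoodiesT0)[^_]*(_MIX).*(\.txt)')

def common_name(name):
    """Join the fixed parts of a matching name; return None when it does not match."""
    match = KEEP.fullmatch(name)
    if match is None:
        return None
    return ''.join(match.groups())

# All seven names should reduce to the same common name.
unique = {common_name(n) for n in names}
print(unique)
```

Collecting the results in a set is a quick check that every file maps to one and the same name.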

If you need to filter the original list:

names = list(filter(_f, names))
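The question also asks for something that relies on the delimiters rather than on a hard-coded prefix. The sketch below is one reading of that request, not a definitive solution. It assumes that the leading underscore-separated tokens contain no hyphen, that the number to drop sits at the end of that leading part, and that only the first two hyphenated tokens contribute to the name. The function generic_common_name and its keep_hyphenated parameter are hypothetical.

```python
import os
import re

def generic_common_name(filename, keep_hyphenated=2):
    """Build a common name from the underscores and hyphens in a file name."""
    stem, ext = os.path.splitext(filename)
    tokens = stem.split('_')
    # Index of the first token that contains a hyphen (or the end if none does).
    first = next((i for i, t in enumerate(tokens) if '-' in t), len(tokens))
    # The leading tokens form the prefix; drop the digits at its very end.
    prefix = re.sub(r'\d+$', '', '_'.join(tokens[:first]))
    # From the next few tokens keep only the text before the first hyphen.
    kept = [t.split('-')[0] for t in tokens[first:first + keep_hyphenated]]
    parts = [p for p in [prefix] + kept if p]
    return '_'.join(parts) + ext

print(generic_common_name(
    'g18_84pp_2A_MVP2_GoodiesT0-HKJ-DFG_MIX-CMVP2_Y1000-MIX.txt'))
```

Because the rule only looks at where underscores and hyphens fall, it can be tried on files with other prefixes, but names that break the assumptions above will need a different rule.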
