How to sort data from a list

servername01A-2015-05-15-13-42-25
servernameB02-2018-03-25-05-32-35
pt-clark-2018-09-25-14-10-05
PT-Peter-2019-01-01-12-12-05
G4535-2017-07-14-11-29-25
G4535-2017-07-14-11-29-25
g4535-2017-07-14-11-29-25
pc-rescue-2013-11-11-11-12-05

import re

# exclusion file
exclusion = open("./exclusion.list", "r")
# data in
data_in = open("./list_in", "r")

# read files
exclusion_lines = exclusion.readlines()
data_lines = data_in.readlines()

# start: split each line into name and timestamp, then filter by name
for a in data_lines:
    Z = re.split("(.*)-([0-9]{4}.*)", a.strip())
    matchPCPT = re.search("^([Pp][TtCc]-*)", Z[1])
    matchG = re.search("^([Gg][0-9]{4})", Z[1])
    if not matchPCPT and not matchG:
        print(Z[1])

mailsrv1a-2015-05-15-13-42-25
mailsrv1b-2015-05-15-13-42-25
mailsrv1c-2015-05-15-13-42-25
mailsrv1a-2015-05-15-13-42-25
datasrvA2-2016-05-15-23-25-25
datasrvB2-2016-05-15-23-25-25
datasrvB2-2016-05-15-23-25-25
g4535-2017-07-14-11-29-25
pc-rescue-2013-11-11-11-12-05
PT-Peter-2019-01-01-12-12-05
pt-clark-2018-09-25-14-10-05
G4535-2017-07-14-11-29-25
benchsrv01rt-2017-07-14-11-29-25
benchsrv02rt-2017-07-14-11-29-25
esxsrv01-2017-07-14-11-29-25
esxsrv02-2017-07-14-11-29-25
solaris10g-2017-07-14-11-29-25
solaris10g-2017-07-14-11-29-25
solaris30g-2017-07-14-11-29-25
test1t-2017-07-14-11-29-25
test2t-2017-07-14-11-29-25
test3t-2017-07-14-11-29-25
test4t-2017-07-14-11-29-25
test5t-2017-07-14-11-29-25

2 answers

User

Answer 1 · edited 2024-06-16 10:23:48

You can handle this with the standard string methods.

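The second part (the timestamp together with its leading hyphen) always has 20 characters, so you can use the slice [:-20] to get the first part.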
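As a quick check of the slicing idea, the sketch below strips the suffix from one of the sample entries. The variable names are illustrative only, and it assumes every entry ends in a timestamp of the form -YYYY-MM-DD-HH-MM-SS.

```python
# One entry from the question's sample data.
entry = "servername01A-2015-05-15-13-42-25"

# The timestamp plus its leading hyphen is always 20 characters long.
suffix = entry[-20:]
name = entry[:-20]

print(suffix)  # -2015-05-15-13-42-25
print(name)    # servername01A
```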

With text.lower().startswith(("g", "pt-", "pc-")) you can skip some of the names.
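A minimal sketch of that prefix test, assuming the timestamps have already been removed. The list and variable names here are illustrative, not part of the original answer.

```python
names = ["G4535", "g4535", "pt-clark", "PT-Peter", "pc-rescue", "servername01A"]

# str.startswith accepts a tuple of prefixes; lower() makes the test case-insensitive.
skipped_prefixes = ("g", "pt-", "pc-")
kept = [n for n in names if not n.lower().startswith(skipped_prefixes)]

print(kept)  # ['servername01A']
```

Note that the bare "g" prefix skips every name beginning with g or G, not only those followed by four digits.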

You can add the accepted names to a list (here, result) and check whether a name is already in that list in order to skip duplicates.
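For long lists, testing membership in a list on every iteration can get slow. One possible alternative, shown here only as a sketch with illustrative variable names, is to let dict keys do the deduplication while keeping the original order.

```python
# Names after the timestamp has been stripped; duplicates are possible.
names = ["mailsrv1a", "mailsrv1b", "mailsrv1a", "datasrvB2", "datasrvB2"]

# dict keys are unique and keep insertion order (Python 3.7+).
unique = list(dict.fromkeys(names))

print(unique)  # ['mailsrv1a', 'mailsrv1b', 'datasrvB2']
```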

text = ''' servername01A-2015-05-15-13-42-25
 servernameB02-2018-03-25-05-32-35
 pt-clark-2018-09-25-14-10-05
 PT-Peter-2019-01-01-12-12-05
 G4535-2017-07-14-11-29-25
 G4535-2017-07-14-11-29-25
 g4535-2017-07-14-11-29-25
 pc-rescue-2013-11-11-11-12-05
 example-2013-11-11-11-12-05'''

data = text.split('\n')

excluded = ['benchsrv01rt', 'benchsrv02rt', 'solaris30g', 'solaris10g']

result = []

for name in data:
    name = name.strip()
    name = name[:-20]
    if not name.lower().startswith(('g', 'pc-', 'pt-')):
        if name not in excluded and name not in result:
            result.append(name)

print(result)

The only problem is the digits in G4535: if you really need the digits to identify such names, you may need a regex.

import re

    # inside the for loop, in place of the startswith() test
    if not re.match('g[0-9]{4}|pc-|pt-', name, re.IGNORECASE):
        if name not in excluded and name not in result:
            result.append(name)
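To see what the regex changes compared with a plain "g" prefix, here is a small sketch. The name gateway01 is a made-up example that does not appear in the question's data.

```python
import re

# Hypothetical names chosen to show the difference; "gateway01" is invented.
names = ["G4535", "g4535", "gateway01", "pt-clark", "PC-rescue"]

pattern = re.compile(r"g[0-9]{4}|pc-|pt-", re.IGNORECASE)

# re.match only tests at the start of the string, so these are prefix tests.
kept = [n for n in names if not pattern.match(n)]

print(kept)  # ['gateway01']
```

With startswith("g") the name gateway01 would have been skipped as well; the regex skips only g followed by four digits.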

Edit: another problem may be the test*t names, which may also need a regex.

import re

text = '''mailsrv1a-2015-05-15-13-42-25
mailsrv1b-2015-05-15-13-42-25
mailsrv1c-2015-05-15-13-42-25
mailsrv1a-2015-05-15-13-42-25
datasrvA2-2016-05-15-23-25-25
datasrvB2-2016-05-15-23-25-25
datasrvB2-2016-05-15-23-25-25
g4535-2017-07-14-11-29-25
pc-rescue-2013-11-11-11-12-05
PT-Peter-2019-01-01-12-12-05
pt-clark-2018-09-25-14-10-05
G4535-2017-07-14-11-29-25
benchsrv01rt-2017-07-14-11-29-25
benchsrv02rt-2017-07-14-11-29-25
esxsrv01-2017-07-14-11-29-25
esxsrv02-2017-07-14-11-29-25
solaris10g-2017-07-14-11-29-25
solaris10g-2017-07-14-11-29-25
solaris30g-2017-07-14-11-29-25
test1t-2017-07-14-11-29-25
test2t-2017-07-14-11-29-25
test3t-2017-07-14-11-29-25
test4t-2017-07-14-11-29-25
test5t-2017-07-14-11-29-25'''

data = text.split('\n')

excluded = ['benchsrv01rt', 'benchsrv02rt', 'solaris30g', 'solaris10g']

result = []

for name in data:
    name = name.strip()
    name = name[:-20]
    if not re.match('g[0-9]{4}|pc-|pt-|test[0-9]t', name, re.IGNORECASE):
        if name not in excluded and name not in result:
            result.append(name)

print(result)

Edit: you can also put the patterns into the excluded list and build a single regex with excluded = '|'.join(excluded), which you can then use in re.match().

excluded = [
    'benchsrv01rt',
    'benchsrv02rt',
    'solaris30g',
    'solaris10g',
    'g[0-9]{4}',
    'pc-',
    'pt-',
    'test[0-9]t',
]

excluded = '|'.join(excluded)
#print(excluded)

result = []

for name in data:
    name = name.strip()
    name = name[:-20]
    if not re.match(excluded, name, re.IGNORECASE):
        if name not in result:
            result.append(name)

print(result)
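One caveat of joining everything into a single pattern: re.match only tests the start of the name, so a literal entry such as solaris10g would also exclude any longer name that begins with it. The sketch below is one possible way around that, keeping exact names and prefix patterns apart. The function filter_names, its parameters, and the entry solaris10g-backup are hypothetical and not part of the original answer.

```python
import re

def filter_names(lines, exact_names, prefix_patterns):
    """Return unique server names, skipping exact names and prefix matches.

    Assumes both lists are non-empty; an empty prefix pattern would
    match every name.
    """
    # Escape the literal names and require them to match the whole string.
    exact = re.compile("|".join(map(re.escape, exact_names)), re.IGNORECASE)
    # Prefix patterns are regex fragments tested at the start of the name.
    prefix = re.compile("|".join(prefix_patterns), re.IGNORECASE)

    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        name = line[:-20]  # drop "-YYYY-MM-DD-HH-MM-SS"
        if exact.fullmatch(name) or prefix.match(name):
            continue
        if name not in result:
            result.append(name)
    return result

lines = [
    "mailsrv1a-2015-05-15-13-42-25",
    "mailsrv1a-2015-05-15-13-42-25",
    "solaris10g-2017-07-14-11-29-25",
    "solaris10g-backup-2017-07-14-11-29-25",  # hypothetical entry
    "test1t-2017-07-14-11-29-25",
    "G4535-2017-07-14-11-29-25",
    "esxsrv01-2017-07-14-11-29-25",
]

result = filter_names(
    lines,
    exact_names=["benchsrv01rt", "benchsrv02rt", "solaris30g", "solaris10g"],
    prefix_patterns=["g[0-9]{4}", "pc-", "pt-", "test[0-9]t"],
)
print(result)  # ['mailsrv1a', 'solaris10g-backup', 'esxsrv01']
```

The same function could be fed the lines read from the question's ./list_in and ./exclusion.list files, provided the exclusion file holds one literal name per line.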

User

Answer 2 · edited 2024-06-16 10:23:48

Here is an example (it uses some Python list comprehensions).

# init list
datalist = [ 'servername01A-2015-05-15-13-42-25',
             'servernameB02-2018-03-25-05-32-35',
             'pt-clark-2018-09-25-14-10-05',
             'PT-Peter-2019-01-01-12-12-05',
             'G4535-2017-07-14-11-29-25',
             'G4535-2017-07-14-11-29-25',
             'g4535-2017-07-14-11-29-25',
             'pc-rescue-2013-11-11-11-12-05' ]

# 1. remove duplicates
datalist = list(set(datalist))

# 2. remove second part of ID
for i,data in enumerate(datalist):
    tmp = '-'.join([tmp_str for tmp_str in data.split('-') if not tmp_str.isdigit()]) 
    datalist[i] = tmp

# 3. remove some servers
# I skipped this step since you did not provide the list of servers to exclude

# 4. remove all computers whose names start with G or g
datalist = [d for d in datalist if not d.startswith("G") and not d.startswith("g") ]


# 5. remove all computers whose names start with pt-, PT-, PC- or pc-
for prefix in ['pt-', 'PT-', 'PC-', 'pc-']:
    datalist = [d for d in datalist if not d.startswith(prefix) ]

# 6. sort
datalist = sorted(datalist)

The final output (the value of datalist) is:

['servername01A', 'servernameB02']
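In the code above, duplicates are removed before the timestamp is stripped, so the same server listed with two different timestamps would end up in the result twice. A possible variant, sketched under the assumption that every entry ends in -YYYY-MM-DD-HH-MM-SS, strips first and deduplicates afterwards. The second servername01A entry is a made-up example, and the variable names are illustrative.

```python
import re

datalist = [
    "servernameB02-2018-03-25-05-32-35",
    "servername01A-2015-05-15-13-42-25",
    "servername01A-2016-01-01-00-00-00",  # hypothetical: same server, later timestamp
    "pt-clark-2018-09-25-14-10-05",
    "G4535-2017-07-14-11-29-25",
]

# Matches only the trailing "-YYYY-MM-DD-HH-MM-SS" part.
timestamp = re.compile(r"-\d{4}(?:-\d{2}){5}$")

# Strip first, then deduplicate with a set comprehension.
names = {timestamp.sub("", d) for d in datalist}

# Case-insensitive prefix filter, then sort.
skipped = ("g", "pt-", "pc-")
result = sorted(n for n in names if not n.lower().startswith(skipped))

print(result)  # ['servername01A', 'servernameB02']
```

Anchoring the pattern at the end of the string also leaves purely numeric segments inside a name untouched, which the isdigit() filter would remove.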
