Python: sorting alphabetically by name, suffix, and length

Posted 2024-04-26 12:00:48


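I have been trying to get a list of metals sorted alphabetically by name, suffix, and length, but I can only seem to sort by length. I can't tell where I'm going wrong.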

metals.csv

list of names with date and suffix
name,date,suffix
copper.abc,2017-10-06,abc
gold.xyz,2017-10-06,xyz
19823.efg,2017-10-06,efg
silver.abc,2017-10-06,abc
iron.efg,2017-10-06,efg
unknown9258.xyz,2017-10-06,xyz
nickel.xyz,2017-10-06,xyz
bronze.abc,2017-10-06,abc
platinum.abc,2017-10-06,abc
unknown--23.efg,2017-10-06,efg

filter_sort.py

#!/usr/bin/python
# -*- coding: utf-8 -*-

import enchant
import re
from operator import itemgetter, attrgetter

pattern = re.compile(u"([^0-9-]+\..*),(.*,.*)", flags=re.UNICODE)

original = open('metals.csv', 'r')
with open('output.txt', 'a') as newfile:
    for line in original.readlines():
        m = pattern.match(line)
        if m:
            repl = m.group(1)
            newfile.write(m.group(1)+"\n")
newfile.close()

d = enchant.Dict("en_US")

output = []

infile = open("output.txt", "r")
with open('filtered.txt', 'a') as filtered:
    for line in infile.readlines():
        word = line.strip('\n').split('.')[0]
        if d.check(word) is True:
            if len(word) <= 8:
                output.append("{0}.{1}".format(word, line.strip('\n').split('.')[1]))
    for name in sorted(output, key=len):
        filtered.write(str(name+"\n"))
filtered.close()

The result is:

gold.xyz
iron.efg
copper.abc
silver.abc
nickel.xyz
bronze.abc
platinum.abc

What I want:

bronze.abc
copper.abc
silver.abc
platinum.abc
iron.efg
gold.xyz
nickel.xyz

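I start with a list, filter out the names that contain digits or dashes, and save the result to a new file. Next, I try to sort the resulting list and save it again to another file. I'm new to Python, so this is obviously inefficient. Any tips would be appreciated, thanks in advance!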


2 answers

A complete, optimized solution:

import csv
import re

def multi_sort(s):
    # Sort key: suffix first, then total length, then the name itself.
    name, suffix = s.split('.', 1)
    return (suffix, len(s), name)

with open('metals.csv', 'r', newline='') as inp, open('output.txt', 'w', newline='') as out:
    reader = csv.DictReader(inp)  # the first line (name,date,suffix) is used as the header
    names = []
    for row in reader:
        # re.match anchors at the start, so names containing digits or dashes are skipped
        if re.match(r'[^0-9-]+\..*', row['name']):
            names.append(row['name'])
    names.sort(key=multi_sort)

    writer = csv.writer(out)
    for n in names:
        writer.writerow((n,))

Contents of output.txt:

bronze.abc
copper.abc
silver.abc
platinum.abc
iron.efg
gold.xyz
nickel.xyz
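
To try the same filter-and-sort logic without touching any files, here is a minimal sketch that reads the sample data from a string. The function name filter_and_sort and the max_len parameter are made up for illustration; the length limit of 8 comes from the question's code, and the English dictionary check done with enchant in the question is left out here on the assumption that the regex alone is enough for this sample.

```python
import csv
import io
import re

# Sample data copied from metals.csv in the question.
SAMPLE = """name,date,suffix
copper.abc,2017-10-06,abc
gold.xyz,2017-10-06,xyz
19823.efg,2017-10-06,efg
silver.abc,2017-10-06,abc
iron.efg,2017-10-06,efg
unknown9258.xyz,2017-10-06,xyz
nickel.xyz,2017-10-06,xyz
bronze.abc,2017-10-06,abc
platinum.abc,2017-10-06,abc
unknown--23.efg,2017-10-06,efg
"""

# Same pattern as in the question: no digits or dashes before the dot.
NAME_RE = re.compile(r'[^0-9-]+\..*')


def filter_and_sort(csv_text, max_len=8):
    """Return the accepted names sorted by suffix, then length, then name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    names = []
    for row in reader:
        name = row['name']
        stem = name.split('.', 1)[0]
        # Keep names that match the pattern and whose stem is short enough.
        if NAME_RE.match(name) and len(stem) <= max_len:
            names.append(name)

    def sort_key(s):
        stem, suffix = s.split('.', 1)
        return (suffix, len(s), stem)

    return sorted(names, key=sort_key)


for name in filter_and_sort(SAMPLE):
    print(name)
```

Keeping the logic in a function that takes text rather than a file name makes it easy to check against the expected list before wiring it up to metals.csv and output.txt.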

You are asking sorted to use only the length as the key:

for name in sorted(output, key=len):

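Instead, sort the list with a lambda that returns a tuple. The tuple must contain len(k), not the bare len function, and the length has to come before the prefix so that silver ends up ahead of platinum:

for name in sorted(output, key=lambda k: (k.split('.')[1], len(k), k.split('.')[0])):

This sorts first by suffix (e.g. abc), then by length, and finally by prefix (e.g. bronze). Output: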

bronze.abc
copper.abc
silver.abc
platinum.abc
iron.efg
gold.xyz
nickel.xyz
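
As an alternative to a tuple key, the same ordering can be reached with several passes, because Python's sort is stable: sort by the least significant key first and by the most significant key last. The following is only a sketch on the filtered names from the question; the helper names prefix and suffix are made up for illustration.

```python
# Filtered names from the question, in their original order.
output = ['copper.abc', 'gold.xyz', 'silver.abc', 'iron.efg',
          'nickel.xyz', 'bronze.abc', 'platinum.abc']


def prefix(name):
    return name.split('.', 1)[0]


def suffix(name):
    return name.split('.', 1)[1]


# Stable sorting: apply the keys from least to most significant.
by_passes = sorted(output, key=prefix)      # tie-breaker: alphabetical by name
by_passes = sorted(by_passes, key=len)      # then by length
by_passes = sorted(by_passes, key=suffix)   # primary key: suffix

# Single pass with a tuple key, for comparison.
by_tuple = sorted(output, key=lambda k: (suffix(k), len(k), prefix(k)))

for name in by_passes:
    print(name)
```

Both approaches give the same list here. The tuple key is usually shorter, while separate passes can be handy when one of the keys needs reverse=True.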
