Sorting a file by movie rating

2 answers

User

#1 · edited 2024-06-11 18:48:45

Let's solve your problem step by step:

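Your problem has two parts:

First, get the data out of the file in the right format.
Then sort the movies by their rating.

For the first part I tried two approaches:

The first approach uses a hand-written generator.

First, open the file: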

with open('dsda') as f:
    # Split the first non-empty line into whitespace-separated tokens.
    data = [line.strip().split() for line in f if line != '\n'][0]

For this I needed an isdigit that works on floats, but str.isdigit only recognizes integer strings, so I came up with this:

def isfloat(point):
    try:
        float(point)
        return True
    except ValueError:
        return False
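
To see why str.isdigit is not enough here, the following minimal sketch compares it with isfloat. The helper is redefined so the snippet runs on its own, and the sample tokens are made up for illustration rather than read from the file.

```python
def isfloat(point):
    """Return True if the string can be parsed by float()."""
    try:
        float(point)
        return True
    except ValueError:
        return False

# Illustrative tokens (assumed, not taken from the real file).
tokens = ['7.8', '8', 'Azkaban', ',']

isdigit_results = [t.isdigit() for t in tokens]
isfloat_results = [isfloat(t) for t in tokens]

print(isdigit_results)  # [False, True, False, False]
print(isfloat_results)  # [True, True, False, False]
```

Keep in mind that float() also accepts strings such as 'nan' and 'inf', and that a bare number inside a title would be taken for a rating, so this check assumes titles contain no such tokens.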

Now let's use the generator approach to get the data into a suitable form:

def generator_approach(data_):
    storage = []
    for word in data_:
        storage.append(word)
        # A float-like token is the rating, which ends the current movie.
        if isfloat(word):
            yield storage
            storage = []


closure_ = generator_approach(data)
print(list(closure_))

Output:

[['Harry', 'Potter', 'and', 'the', 'Prisoner', 'of', 'Azkaban', ',', '7.8'], ['Lord', 'of', 'the', 'Rings:', 'The', 'Two', 'Towers', ',', '8.7'], ['Spider', 'Man', ',', '7.3'], ['Alice', 'in', 'Wonderland', ',', '6.5'], ['The', 'Good', 'Dinosaur', ',', '6.7'], ['Kung', 'Fu', 'Panda', ',', '7.6']]
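
If a (title, rating) pair is more convenient than a token list, each group can be collapsed afterwards. This is only a sketch: the helper name to_pair is hypothetical, and it assumes the layout shown in the output above (title words, a lone ',', then the rating).

```python
# Two groups copied from the output above.
groups = [
    ['Spider', 'Man', ',', '7.3'],
    ['Alice', 'in', 'Wonderland', ',', '6.5'],
]

def to_pair(tokens):
    """Join the title tokens and convert the trailing rating to float."""
    *title_tokens, rating = tokens
    # Drop the separator token if it is present.
    if title_tokens and title_tokens[-1] == ',':
        title_tokens = title_tokens[:-1]
    return ' '.join(title_tokens), float(rating)

pairs = [to_pair(g) for g in groups]
print(pairs)  # [('Spider Man', 7.3), ('Alice in Wonderland', 6.5)]
```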

Now let's try the second approach, using a regex:

import re
pattern = r'\w.+?[0-9.]+'

with open('dsda') as f:
    # Search the whole file at once so a later line cannot overwrite earlier matches.
    data_r = [match.split() for match in re.findall(pattern, f.read())]

Output:

[['Harry', 'Potter', 'and', 'the', 'Prisoner', 'of', 'Azkaban', ',', '7.8'], ['Lord', 'of', 'the', 'Rings:', 'The', 'Two', 'Towers', ',', '8.7'], ['Spider', 'Man', ',', '7.3'], ['Alice', 'in', 'Wonderland', ',', '6.5'], ['The', 'Good', 'Dinosaur', ',', '6.7'], ['Kung', 'Fu', 'Panda', ',', '7.6']]
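
A variation on the regex idea is to capture the title and the rating as separate groups, so no splitting is needed afterwards. The pattern below is my own sketch, not the one used above, and the sample text is an assumed excerpt of the file; it relies on each title being followed by a comma and a number.

```python
import re

# Assumed sample of the file contents: all entries on one line.
text = ('Harry Potter and the Prisoner of Azkaban , 7.8 '
        'Lord of the Rings: The Two Towers , 8.7 Spider Man , 7.3')

# Group 1: the title (lazy), group 2: an integer or decimal rating.
pair_pattern = r'\s*(.+?)\s*,\s*([0-9]+(?:\.[0-9]+)?)'

movies = [(title, float(rating))
          for title, rating in re.findall(pair_pattern, text)]

for title, rating in movies:
    print(title, rating)
```

With two capture groups, re.findall returns a list of (title, rating) string tuples, which is why the comprehension can unpack them directly.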

As you can see, both approaches produce the same output, and sorting it by rating is now straightforward:

print(sorted(data_r,key=lambda x:float(x[-1])))

Output:

[['Alice', 'in', 'Wonderland', ',', '6.5'], ['The', 'Good', 'Dinosaur', ',', '6.7'], ['Spider', 'Man', ',', '7.3'], ['Kung', 'Fu', 'Panda', ',', '7.6'], ['Harry', 'Potter', 'and', 'the', 'Prisoner', 'of', 'Azkaban', ',', '7.8'], ['Lord', 'of', 'the', 'Rings:', 'The', 'Two', 'Towers', ',', '8.7']]
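
The float conversion in the key matters because ratings compared as strings sort character by character. The sketch below illustrates that and also shows reverse=True for a highest-first list; the three entries are taken from the output above, while the '10.0' value is a made-up example.

```python
# String comparison puts '10.0' before '9.5' because '1' < '9'.
as_strings = sorted(['9.5', '10.0'])
as_floats = sorted(['9.5', '10.0'], key=float)
print(as_strings)  # ['10.0', '9.5']
print(as_floats)   # ['9.5', '10.0']

data_r = [
    ['Spider', 'Man', ',', '7.3'],
    ['Alice', 'in', 'Wonderland', ',', '6.5'],
    ['Kung', 'Fu', 'Panda', ',', '7.6'],
]

# Highest rating first; x[:-2] drops the ',' token and the rating.
best_first = sorted(data_r, key=lambda x: float(x[-1]), reverse=True)
titles = [' '.join(x[:-2]) for x in best_first]
print(titles)  # ['Kung Fu Panda', 'Spider Man', 'Alice in Wonderland']
```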

User

#2 · edited 2024-06-11 18:48:45

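The problem is that the "for x in file" loop reads lines from the file, so the title array contains the file's lines as strings. The key argument of sorted therefore receives those strings and returns the third character of each one (rating[2]); note that "New List" is indeed sorted by third character: e, i, i, n, r, r. To fix this, you can parse the file's lines into tuples of the form (title, rating) and store them in an array. Sorting by rating is then as simple as picking the rating out of the tuple in the key argument to sorted.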
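
The effect can be reproduced in a few lines. This sketch assumes the key had the form lambda rating: rating[2], as the description above implies; the question's own code is not reproduced here, and the three sample lines are illustrative.

```python
# Raw lines as a file loop would deliver them (assumed sample data).
lines = [
    'Kung Fu Panda, 7.6\n',
    'Spider Man, 7.3\n',
    'Alice in Wonderland, 6.5\n',
]

# Each element is a whole line, so index 2 is the third character
# of the title ('n', 'i', 'i'), not a rating.
by_third_char = sorted(lines, key=lambda rating: rating[2])

# Parse into (title, rating) tuples first, then sort on the rating.
parsed = []
for line in lines:
    title, _, rating = line.rpartition(',')
    parsed.append((title.strip(), float(rating)))

by_rating = sorted(parsed, key=lambda movie: movie[1])

print([line.strip() for line in by_third_char])
print(by_rating)
```

rpartition splits on the last comma, which keeps titles that themselves contain a comma intact.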

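However, it looks to me as though you want to write your own sorting implementation rather than use the built-in sorted. It appears you are implementing an insertion sort, and the indentation got mangled when you posted it here. The function has the same problem of not parsing the file's lines, and in the second loop you need to iterate over numeric indices of array rather than over f. The logic can also be improved somewhat by moving the if into the while condition and assigning the compared rating only to its final position instead of swapping.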

from collections import namedtuple

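def ratings_sort(movies):
    # In-place insertion sort by rating, lowest first.
    for index in range(1, len(movies)):
        movie = movies[index]
        i = index - 1
        # Shift higher-rated movies one slot to the right.
        while i >= 0 and movie.rating < movies[i].rating:
            movies[i + 1] = movies[i]
            i -= 1
        movies[i + 1] = movie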


filename = "movie_ratings.txt"

Movie = namedtuple("Movie", "title rating")
movies = list()

with open(filename) as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines, which float() could not parse
        part = line.partition(",")  # gives a tuple: ("movie title", ",", "rating")
        movies.append(Movie(title=part[0].strip(), rating=float(part[2])))

print("Old List:\n", movies, "\n")

# Sort using sorted
sorted_movies = sorted(movies, key=lambda movie:movie.rating)
# Sort using ratings_sort (modifies movies array unlike sorted)
ratings_sort(movies)

print("New List (using sorted):\n", sorted_movies, "\n")
print("New List (using ratings_sort):\n", movies, "\n")

Note that I renamed some variables for clarity and used a namedtuple. I also moved the file handling out of ratings_sort so that it can be compared with sorted as an example.
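
To check the insertion sort without needing movie_ratings.txt, it can be run on an in-memory list and compared against sorted. This is a small self-contained sketch; the four movies are sample data taken from the first answer's output, and ratings_sort repeats the logic shown above.

```python
from collections import namedtuple

Movie = namedtuple("Movie", "title rating")

def ratings_sort(movies):
    """In-place insertion sort by rating (same logic as above)."""
    for index in range(1, len(movies)):
        movie = movies[index]
        i = index - 1
        while i >= 0 and movie.rating < movies[i].rating:
            movies[i + 1] = movies[i]
            i -= 1
        movies[i + 1] = movie

movies = [
    Movie("Spider Man", 7.3),
    Movie("Alice in Wonderland", 6.5),
    Movie("Kung Fu Panda", 7.6),
    Movie("The Good Dinosaur", 6.7),
]

# sorted returns a new list, so compute it before sorting in place.
expected = sorted(movies, key=lambda m: m.rating)
ratings_sort(movies)

matches = movies == expected
print(matches)  # True
```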
