Python error-checking script is Sup

import csv

input_csv = "LOCATION_ID.csv"
input2 = "CITIES.csv"
output_csv = "OUTPUT_CITIES.csv"

with open(input_csv, "rb") as infile:
    input_fields = ("ID", "CITY_DECODED", "CITY", "STATE", "COUNTRY", "SPELL1", "SPELL2", "SPELL3")
    reader = csv.DictReader(infile, fieldnames = input_fields)
    with open(input2, "rb") as infile2:
        input_fields2 = ("Latitude", "Longitude", "City")
        reader2 = csv.DictReader(infile2, fieldnames = input_fields2)
        next(reader2)
        words = []
        for next_row in reader2:
            words.append(next_row["City"])
        with open(output_csv, "wb") as outfile:
            output_fields = ("EXISTS","ID", "CITY_DECODED", "CITY", "STATE", "COUNTRY", "SPELL1", "SPELL2", "SPELL3")
            writer = csv.DictWriter(outfile, fieldnames = output_fields)
            writer.writerow(dict((h,h) for h in output_fields))
            next(reader)
            for next_row in reader:
                search_term = next_row["CITY_DECODED"]
                #I think the problem is here where I run through every city
                #in "words", even though all I want to know is if the city
                #in "search_term" exists in "words"
                for item in words:
                    if search_term in words:
                        next_row["EXISTS"] = 1
                writer.writerow(next_row)

1 answer

User

#1 · Posted 2024-04-24 07:55:45

Let's start with one inefficiency I see:

for next_row in reader:
    search_term = next_row["CITY_DECODED"]
    for item in words:
        if search_term in words:
            next_row["EXISTS"] = 1

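That is 14k iterations of the outer for loop. The nested for loop then runs about 6k iterations on every pass. On top of that, each if search_term in words test performs still more iterations, because it scans the words list until it finds a match or reaches the end.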

I haven't thought much about what this algorithm is actually doing, but at the very least you should remove the duplicates from words (i.e. words = list(set(words))).

I was just about to post about that little for item in words loop. Why you do it confuses me: item is never used, so the for loop is a big waste of time.

It can most likely be simplified to:

for next_row in reader:
    search_term = next_row["CITY_DECODED"]
    if search_term in words:
        next_row["EXISTS"] = 1
    writer.writerow(next_row)
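Going one step further than the simplification above, the membership test itself can be made cheap by keeping the city names in a set instead of a list. The following is only a sketch under stated assumptions: it targets Python 3 (the question's "rb"/"wb" modes suggest Python 2), it assumes the first row of each file is a header with the column names used in the question, and the names mark_existing and known_cities are hypothetical.

```python
import csv
import io


def mark_existing(locations_file, cities_file, out_file):
    """Copy location rows to out_file, setting EXISTS=1 when the city is known.

    Hypothetical helper: assumes both inputs start with a header row.
    """
    # Build the lookup once. A set also removes duplicate city names,
    # and membership tests on it take constant time on average.
    known_cities = {row["City"] for row in csv.DictReader(cities_file)}

    reader = csv.DictReader(locations_file)
    fieldnames = ["EXISTS"] + list(reader.fieldnames)
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        # EXISTS stays empty for unknown cities, as in the original script.
        row["EXISTS"] = 1 if row["CITY_DECODED"] in known_cities else ""
        writer.writerow(row)


# Small in-memory usage example with made-up rows.
cities = io.StringIO("Latitude,Longitude,City\n1.0,2.0,Paris\n")
locations = io.StringIO("ID,CITY_DECODED\n1,Paris\n2,Atlantis\n")
out = io.StringIO()
mark_existing(locations, cities, out)
```

With real files, the same function could be called with handles opened via open(path, newline=""), which is the mode the csv module expects in Python 3.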

So, let's tally up all of your iterations:

~6k iterations for the loop for next_row in reader2: words.append(next_row["City"])

~14k iterations of for next_row in reader: multiplied by sum(i, 1..6000), which is about 18 million, for roughly 252 billion in total.

Removing that extraneous loop brings it down to about 84 million iterations, which is... well, much better.
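The figures above can be checked with a few lines of arithmetic. This sketch assumes the answer's approximate sizes (about 14k location rows and about 6k cities) and reads the sum in the estimate as 1 + 2 + ... + 6000; the variable names are illustrative only.

```python
outer_rows = 14_000  # approximate rows in LOCATION_ID.csv, per the answer
city_count = 6_000   # approximate rows in CITIES.csv, per the answer

# 1 + 2 + ... + 6000, the assumed reading of the sum in the estimate
triangular = city_count * (city_count + 1) // 2

with_extra_loop = outer_rows * triangular    # nested loop plus list scan
without_extra_loop = outer_rows * city_count  # one list scan per row

print(f"{with_extra_loop:,}")     # 252,042,000,000
print(f"{without_extra_loop:,}")  # 84,000,000
```

If the list is replaced by a set, each row needs only one average constant-time lookup, so the count drops to roughly the 14k rows themselves.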
