Iterating over the lines of a text file and returning line numbers and occurrences?

def index(fileName, wordList):
    infile = open(fileName,'r')
    i = 0
    lineNumber = 0
    while True:
        for line in infile:
            lineNumber += 1
            if wordList[i] in line.split():
                print(wordList[i], lineNumber)
        i += 1
        lineNumber = 0

fileName = 'index.txt'
wordList = eval(input("Enter a list of words to search for: \n"))
index(fileName,wordList)

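3 answers

User

#1 · Edited 2024-04-25 16:40:28

If you try to loop over a file object more than once, every attempt after the first starts at the end of the file and stops immediately. There are several ways to handle this: you can change the algorithm so that it works in a single pass over the file; you can save the file's contents in another data structure and analyze that instead of the file; or you can call infile.seek(0) between loops to return to the start of the file.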
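
The first two options can be combined. The following is only a minimal sketch, assuming the words are separated by whitespace: it reads the lines once and records, for each word of interest, the line numbers where it appears. The name `build_index` and the sample lines are hypothetical and do not come from the question.

```python
from collections import defaultdict


def build_index(lines, words):
    """Map each wanted word to the 1-based line numbers where it occurs.

    `lines` can be an open file or any iterable of strings; it is read once.
    """
    wanted = set(words)
    index = defaultdict(list)
    for line_number, line in enumerate(lines, 1):
        # intersection() returns a set, so a word repeated on one line
        # is recorded only once for that line
        for word in wanted.intersection(line.split()):
            index[word].append(line_number)
    return dict(index)


# Hypothetical sample data standing in for the contents of index.txt
sample = ["bird\n", "bird\n", "dog\n", "cat\n", "bird\n"]
result = build_index(sample, ["bird", "cat"])
print(result)
```

Because the file is consumed exactly once, no rewinding is needed, and the resulting dictionary can be queried for any number of words afterwards.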

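User

#2 · Edited 2024-04-25 16:40:28

Once the end of the file has been reached, any further attempt to read from it yields an empty string, which is why the program fails. One way to solve this is to use file.readlines and store the lines in a list: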

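with open('test.txt') as f:
    lines = f.readlines()  # read every line into a list once

wordInput = [input(), input()]  # capture two search words
for word in wordInput:
    # enumerate(lines, 1) numbers the lines starting at 1
    for counter, line in enumerate(lines, 1):
        if word in line.split():  # match whole words, not substrings
            print(word, counter)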

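However, this is somewhat inefficient for large files, because it loads the entire file into memory. As an alternative, you can loop over the lines and then call file.seek(0) when you are done. That moves the position back to the start of the file, so you can loop over it again. It works like this: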

>>> with open('test.txt') as f:
        for line in f:
            print(line)
        f.seek(0)
        for line in f:
            print(line)


bird 

bird 

dog 

cat 

bird
0  # f.seek(0) returns the new file position
bird 

bird 

dog 

cat 

bird
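
Applied to the word search from the question, the same rewinding idea might look like the sketch below. It assumes whitespace-separated words; the function name `find_words` and the temporary demo file are hypothetical and only mirror the contents of the session above.

```python
import os
import tempfile


def find_words(path, words):
    """Return (word, line_number) pairs, rescanning the file once per word."""
    hits = []
    with open(path) as f:
        for word in words:
            f.seek(0)  # rewind before every pass, including the first
            for line_number, line in enumerate(f, 1):
                if word in line.split():
                    hits.append((word, line_number))
    return hits


# Hypothetical demo file with the same contents as the session above
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("bird\nbird\ndog\ncat\nbird\n")
try:
    hits = find_words(tmp.name, ["bird", "cat"])
finally:
    os.remove(tmp.name)
print(hits)
```

This keeps memory use low at the cost of reading the file once per search word.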

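Also, as @falsetru mentions in his answer, avoid eval(input(...)): it evaluates whatever expression is typed in, so unexpected input can cause problems. Instead, separate the values with some delimiter something and then do wordList = input().split(something).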
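
As a rough illustration of replacing eval, the sketch below parses a delimited string and, for input typed as a Python list literal, uses the standard library's `ast.literal_eval`, which accepts literals only. The helper names `parse_words` and `parse_list_literal` and the comma delimiter are assumptions made for this example.

```python
import ast


def parse_words(raw, sep=","):
    """Split on `sep` and drop surrounding whitespace and empty items."""
    return [item.strip() for item in raw.split(sep) if item.strip()]


def parse_list_literal(raw):
    """Accept a Python list literal such as "['bird', 'cat']" without eval."""
    # literal_eval raises ValueError or SyntaxError for anything
    # that is not a plain literal (for example a function call)
    value = ast.literal_eval(raw)
    if not isinstance(value, list):
        raise ValueError("expected a list")
    return value


print(parse_words("bird, cat ,dog"))
print(parse_list_literal("['bird', 'cat']"))
```

In the original program, either helper could be given the string returned by input() in place of the eval call.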

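Hope this helps!

User

#3 · Edited 2024-04-25 16:40:28

Reading from a file changes the current file position. Once the position reaches the end of the file, reading yields an empty string.

You need to use file.seek to reset the file position.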
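
The behaviour described above can be checked with a small sketch. Here `io.StringIO` stands in for an open text file, which is an assumption made to keep the demo self-contained; a file opened with open() responds to read, readline and seek in the same way.

```python
import io

# io.StringIO stands in for an open text file here (an assumption for the demo)
f = io.StringIO("bird\ndog\n")

first = f.read()      # consumes everything; the position is now at the end
second = f.read()     # nothing left, so this is the empty string
f.seek(0)             # move the position back to the start
again = f.readline()  # reading works again

print(repr(first), repr(second), repr(again))
```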

However, rather than rewinding, I would prefer to do it as follows (using enumerate and the in operator):

def index(filename, words):
    with open(filename) as f:
        for line_number, line in enumerate(f, 1):
            word = line.strip()
            if word in words:
                print(word, line_number)

fileName = 'index.txt'
wordList = ['bird', 'cat'] # input().split()
words = set(wordList)
index(fileName, words)

eval executes arbitrary expressions. Instead of using eval, why not use input().split()?
