Python XML string search

Posted 2024-04-26 12:32:55


I am trying to build my own string.find() method/function in Python. I am doing this for a computer science class.

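Basically, the program opens a text file, takes the text the user wants to search for in that file, and prints the line number where the string occurs, or "not found" if the string is not in the file.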

However, it takes about 34 seconds to get through 250,000 lines of XML.

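Where is the bottleneck in my code? I did this in C++, and it runs in about 0.3 seconds for 4.5 million lines. I also ran the same search with Python's built-in string.find(), which takes about 4 seconds for 250,000 lines of XML. So I am trying to understand why my version is so slow. https://github.com/zach323/Python/blob/master/XML_Finder.py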

fhand = open('C:\\Users\\User\\filename')
import time
str  = input('Enter string you would like to locate: ') #string to be located in file
start = time.time()
delta_time = 0

def find(str):
    time.sleep(0.01)
    found_str ='' #initialize placeholder for found string
    next_index = 0 #index for comparison checking
    line_count = 0 #incremented before each line is scanned, so the first line is reported as 1
    for line in fhand: #each line in file
        line_count = line_count +1
        for letter in line: #each letter in line
            if letter == str[next_index]: #compare current letter index to beginning index of string you want to find

                found_str += letter #if a match, concatenate to string placeholder

                #print(found_str) #print for visualization of inline search per iteration
                next_index = next_index + 1


                if found_str == str: #if complete match is found, report it and return
                    print('Result is: ', found_str, ' on line %s '%(line_count))
                    print (line)
                    return found_str #return string to function caller
            else:
                #if a partial match was in progress but this letter does not match, reset and try again.
                next_index=0 # reset index back to zero
                found_str = '' #reset string back to empty

        if found_str == str:

            print(line)

if str != "":
    result = find(str)
    delta_time = time.time() - start
    print(result)
    print('Seconds elapsed: ', delta_time)  
else:
    print('sorry, empty string')

2 answers

Try this:

with open(filename) as f:
    for row in f:
        if string in row:
            print(row)
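
A slightly fuller sketch of the same idea, assuming you also want the line number of each hit. The name `find_lines` is made up for illustration, and an in-memory `io.StringIO` stands in for the opened file so the example runs on its own; an open file object can be passed in the same way. In CPython the `in` test on strings runs in compiled code, so there is no per-character Python loop.

```python
import io


def find_lines(lines, needle):
    """Yield (line_number, line) for every line that contains needle.

    `lines` can be any iterable of strings, e.g. an open text file.
    """
    for line_number, line in enumerate(lines, start=1):
        if needle in line:  # substring test, no manual character loop
            yield line_number, line.rstrip("\n")


# Stand-in for a real file; with a file you would write:
#     with open(filename) as f:
#         for number, text in find_lines(f, needle): ...
sample = io.StringIO("<root>\n  <item>needle</item>\n  <item>hay</item>\n</root>\n")

hits = list(find_lines(sample, "needle"))
if hits:
    for number, text in hits:
        print("line %d: %s" % (number, text))
else:
    print("not found")
```

Because it is a generator, it stops reading as soon as you stop asking for results, so `next(find_lines(f, needle), None)` gives just the first hit.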

I ran the following code on a text file of a size comparable to yours. Your code does not run too slowly on my computer.

fhand = open('test3.txt')

import time
string = input('Enter string you would like to locate: ') #string to be located in file
start = time.time()
delta_time = 0


def find(string):
    next_index_to_match = 0 
    sl = len(string)
    ct = 0

    for line in fhand: #each line in file
        ct += 1
        for letter in line: #each letter in line
            if letter == string[next_index_to_match]: #compare current letter index to beginning index of string you want to find
                # print(line)
                next_index_to_match += 1

                if sl == next_index_to_match: #if complete match is found, break out of loop.
                    print('Result is: ', string, ' on line %s '%(ct))
                    print (line)
                    return True

            else:
                #if a partial match was in progress but this letter does not match, reset and try again.
                next_index_to_match=0 # reset index back to zero
    return False

if string != "":   
    find(string)
    delta_time = time.time() - start
    print('Seconds elapsed: ', delta_time)  
else:
    print('sorry, empty string')
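
One caveat with the single-pass loops above: when a partial match fails, the index is reset to zero and the scan moves on, so the letter that caused the mismatch is never compared against the first letter of the search string. Searching for "ab" in "aab" misses the match for that reason. Below is a minimal sketch of the difference, assuming a non-empty pattern; `reset_to_zero_find` and `naive_find` are hypothetical names, and `naive_find` is the plain quadratic search, not what Python's built-in uses.

```python
def reset_to_zero_find(text, pattern):
    """Mimic the single-pass loop: on a mismatch, reset and move to the next letter."""
    next_index = 0
    for i, letter in enumerate(text):
        if letter == pattern[next_index]:
            next_index += 1
            if next_index == len(pattern):
                return i - len(pattern) + 1  # start index of the match
        else:
            next_index = 0  # the mismatching letter is not re-checked
    return -1


def naive_find(text, pattern):
    """Try every start position in turn; return the first match index or -1."""
    n, m = len(text), len(pattern)
    for start in range(n - m + 1):
        matched = 0
        while matched < m and text[start + matched] == pattern[matched]:
            matched += 1
        if matched == m:
            return start
    return -1


print(reset_to_zero_find("aab", "ab"))  # -1: the match is missed
print(naive_find("aab", "ab"))          # 1
print("aab".find("ab"))                 # 1, the built-in agrees
```

Called once per line, `naive_find` gives the same answers as the built-in, though a pure-Python character loop will still be much slower than it.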
