Python is truncating my file contents

Posted 2024-04-26 00:50:15


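I have been set a task in Python: encode a long text file so that the letters of the alphabet become 1-26 and non-alphanumeric characters become 26+. See the code below: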

#open the file,read the contents and print out normally
my_file = open("timemachine.txt")
my_text = my_file.read()
print (my_text)

print ""
print ""

#open the file and read each line, taking out the eol chars
with open("timemachine.txt","r") as myfile:
    clean_text = "".join(line.rstrip() for line in myfile)

#close the file to prevent memory hogging
my_file.close()

#print out the result all in lower case 
clean_text_lower = clean_text.lower()
print clean_text_lower

print ""
print ""

#establish a lowercase alphabet as a list   
my_alphabet_list = []
my_alphabet = """ abcdefghijklmnopqrstuvwxyz.,;:-_?!'"()[]  %/1234567890"""+"\n"+"\xef"+"\xbb"+"\xbf"

#find the index for each lowercase letter or non-alphanumeric
for letter in my_alphabet:
    my_alphabet_list.append(letter)
print my_alphabet_list,
print my_alphabet_list.index

print ""
print ""

#go through the text and find the corresponding letter of the alphabet
for letter in clean_text_lower:
    posn = my_alphabet_list.index(letter)
print posn,

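When I run this I should get (1) the original text, (2) the text reduced to lowercase with no spaces, (3) the index used for the codes, and finally (4) the converted codes. However, I only get the second half of the original text, or, if I comment out (4), it prints all of the text. Why?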


1 answer
User
#1 · Posted 2024-04-26 00:50:15

The bit at the end:

for letter in clean_text_lower:
    posn = my_alphabet_list.index(letter)
print posn,

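keeps reassigning posn on every pass without ever doing anything with the value, and the print sits outside the loop. As a result you only get my_alphabet_list.index(letter) for the last letter in the cleaned text.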
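To see the effect in isolation, here is a minimal sketch in Python 3 syntax. The names `alphabet` and `text` are hypothetical stand-ins for `my_alphabet_list` and `clean_text_lower`, and the short alphabet is an assumption made to keep the example small:

```python
# Hypothetical stand-ins for the question's variables (not the original data).
alphabet = list(" abcdefghijklmnopqrstuvwxyz")
text = "time"

for letter in text:
    # posn is overwritten on every iteration; earlier values are lost.
    posn = alphabet.index(letter)

# The print is outside the loop, so it runs once, after the last assignment.
print(posn)  # 5, the index of the final letter "e"
```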

There are a few things you could do to fix this. The first that comes to mind is to initialize a list and append each value to it, i.e.:

posns = []
for letter in clean_text_lower:
    posns.append(my_alphabet_list.index(letter))

print posns,
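For readers on Python 3, the same idea could be sketched as follows. This is one possible variant, not the answerer's code: the `encode` helper, the trimmed `ALPHABET` string, and the choice to return -1 for characters missing from the alphabet are all assumptions made for illustration (the original `list.index` call would raise `ValueError` instead).

```python
# Simplified alphabet (an assumption): index 0 is a space, 1-26 are a-z,
# and punctuation and digits follow.
ALPHABET = " abcdefghijklmnopqrstuvwxyz.,;:-_?!'\"()[]%/1234567890\n"


def encode(text, alphabet=ALPHABET):
    """Return the alphabet index of every character in text (hypothetical helper)."""
    lookup = {}
    for i, ch in enumerate(alphabet):
        # setdefault keeps the first index of a repeated character,
        # which matches what list.index() would return.
        lookup.setdefault(ch, i)
    # -1 marks characters that are not in the alphabet (an assumed convention).
    return [lookup.get(ch, -1) for ch in text.lower()]


codes = encode("The Time Machine")
print(*codes)
```

Building the dictionary once avoids rescanning the alphabet for every character, which may matter for a long text file.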
