How do I get all 3-grams from a line read from a text file in Python?

Posted 2024-05-16 21:11:48


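I read a line from a text file and generate its 3-grams, but the last gram printed for each line has only 2 characters. For example, the input line is cswisceduwwt and the output is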

csw
swi
wis
isc
sce
ced
edu
dup
upa
par
ara
rad
ady
dyn
yn

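At the end of the line it produces a 2-gram (2 characters). The last gram is "yn", and I think a space is being added to it. I don't want "yn". How do I remove the final 2-character gram from each line? The code is below.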

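from nltk.util import ngrams  # assumed: the original snippet does not show its import

def extract_n_grams(line):
    # ngrams() yields tuples of 3 consecutive characters from the line
    ngram = ngrams(line, 3)
    for item in ngram:
        result = item[0] + item[1] + item[2]
        print(result)

with open('C:/Users/Dania/Desktop/MS 2nd sem/preprocessed.txt') as corpus:
    for line in corpus:
        extract_n_grams(line)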


1 Answer

User

#1 · Posted 2024-05-16 21:11:48

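The last gram appears to have only two characters because its third character is the newline at the end of the line, which prints as whitespace. I removed the trailing newline from each line with this statement: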

for line in corpus:
    rem_line = line.rstrip('\n')  # remove the newline at the end of the line
    extract_n_grams(rem_line)
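
The effect can be reproduced without the original file. The following is a minimal sketch, not the asker's exact code: it assumes a hypothetical slice-based helper, char_ngrams, in place of NLTK's ngrams, and uses io.StringIO with made-up lines as a stand-in for preprocessed.txt. Printing the last gram with repr() makes the hidden newline visible.

```python
import io

def char_ngrams(text, n=3):
    """Return every run of n consecutive characters in text (hypothetical helper)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

# Stand-in for the opened file; lines read from a file keep their trailing "\n".
corpus = io.StringIO("cswisc\nedu\n")

for line in corpus:
    raw = char_ngrams(line)                 # last gram ends with the newline
    clean = char_ngrams(line.rstrip("\n"))  # newline stripped before splitting
    print(repr(raw[-1]), clean)
```

Note that rstrip("\n") removes only newline characters. If the lines may also end with spaces or tabs, calling rstrip() with no argument strips all trailing whitespace instead.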
