Replacing synonyms in a corpus using WordNet and NLTK (Python)

Traceback (most recent call last):
  File "C:\Users\Nedim\Documents\sinon2.py", line 21, in <module>
    change(word)
  File "C:\Users\Nedim\Documents\sinon2.py", line 4, in change
    synonym = wn.synset(word + ".n.01").lemma_names
TypeError: can only concatenate list (not "str") to list

from nltk.corpus import wordnet as wn

def change(word):
    synonym = wn.synset(word + ".n.01").lemma_names
    if word in synonym:
        filename = open("C:/Users/tester/Desktop/test.txt").read()
        writeSynonym = filename.replace(str(word), str(synonym[0]))
        f = open("C:/Users/tester/Desktop/test.txt", 'w')
        f.write(writeSynonym)
        f.close()

f = open("C:/Users/tester/Desktop/test.txt")
lines = f.readlines()
for i in range(len(lines)):
    word = lines[i].split()
    change(word)

2 Answers

User

Answer 1 · edited 2024-05-23 18:08:38

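This is not very efficient, and it will not replace a word with a single synonym, because each word may have several synonyms. You can collect them all and choose from the result: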

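from nltk.corpus import wordnet as wn
from nltk.corpus.reader.plaintext import PlaintextCorpusReader

# Directory that contains test.txt; the pattern '.*' matches every file in it.
corpus_root = 'C://Users//tester//Desktop//'
wordlists = PlaintextCorpusReader(corpus_root, '.*')

for word in wordlists.words('test.txt'):
    # Collect the lemma names from every synset the word belongs to.
    synonymList = set()
    wordNetSynset = wn.synsets(word)
    for synSet in wordNetSynset:
        # lemma_names is a method in NLTK 3 (it was an attribute in older releases).
        for synWords in synSet.lemma_names():
            synonymList.add(synWords)
    print(synonymList)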
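
To make the "choose from the result" step concrete, here is a minimal sketch of one possible selection rule. It is not the answerer's method: it simply walks the candidates in alphabetical order and returns the first one that differs from the original word. The helper name `pick_synonym` and the sample set are hypothetical; in practice the set would be the one built by the loop above. WordNet lemma names join multiword entries with underscores, so the sketch turns those back into spaces.

```python
def pick_synonym(word, candidates):
    """Hypothetical helper: return one replacement for `word` from `candidates`.

    Falls back to `word` itself when no different candidate exists.
    """
    for name in sorted(candidates, key=str.lower):
        if name.lower() != word.lower():
            # WordNet writes multiword lemmas with underscores.
            return name.replace("_", " ")
    return word


# Illustrative sample set standing in for the collected synonyms.
sample = {"dog", "domestic_dog", "Canis_familiaris"}
chosen = pick_synonym("dog", sample)

# A word with no candidates is returned unchanged.
unchanged = pick_synonym("xyzzy", set())

print(chosen)
print(unchanged)
```

Alphabetical order is only there to make the choice deterministic; any other rule (first sense, most frequent lemma, random choice) could be swapped in.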

User

Answer 2 · edited 2024-05-23 18:08:38

Two things. First, you can change the file-reading part to:

for line in open("C:/Users/tester/Desktop/test.txt"):
    word = line.split()

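Second, .split() returns a list of strings, whereas your change function appears to operate on only one word at a time. That is what causes the exception: your word is actually a list.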
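
The failure can be reproduced without NLTK or the file. The short sketch below uses a made-up sample line and evaluates the same concatenation that appears inside `change`; the variable names other than `word` are only for illustration.

```python
line = "the dog barks"
word = line.split()  # ['the', 'dog', 'barks'], a list rather than a str

try:
    name = word + ".n.01"  # same expression as inside change()
except TypeError as exc:
    message = str(exc)

print(message)
```

The message matches the last line of the traceback in the question, which confirms that the list produced by `.split()` is what reaches `change`.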

If you want to process every word on that line, make it look like this:

for line in open("C:/Users/tester/Desktop/test.txt"):
    words = line.split()
    for word in words:
        change(word)
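
With this loop, the `change` function from the question still rereads and rewrites the whole file once per word. As a hedged alternative, the sketch below reads the file once, replaces words in memory, and writes the result back once. The names `replace_words`, `rewrite_file`, `demo_lookup` and `demo_synonyms` are hypothetical, and the lookup is a plain dictionary stand-in so the example runs without NLTK; a function that queries WordNet could be passed in its place. The demo works on a temporary file rather than the path from the question.

```python
import os
import re
import tempfile


def replace_words(text, lookup):
    """Replace every alphabetic token in `text` with lookup(token)."""
    return re.sub(r"[A-Za-z]+", lambda m: lookup(m.group(0)), text)


def rewrite_file(path, lookup):
    """Read the file once, replace in memory, then write it back once."""
    with open(path) as f:
        text = f.read()
    with open(path, "w") as f:
        f.write(replace_words(text, lookup))


# Stand-in synonym table for the demo (an assumption, not WordNet output).
demo_synonyms = {"dog": "hound"}


def demo_lookup(word):
    # Return the word itself when no synonym is known.
    return demo_synonyms.get(word, word)


# Demo on a temporary file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("the dog barks\nbig dog\n")

rewrite_file(path, demo_lookup)

with open(path) as f:
    result = f.read()
os.remove(path)

print(result)
```

Matching whole tokens with a regular expression also avoids a side effect of `str.replace` in the question's code, which would rewrite the word wherever it occurs as a substring of a longer word.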
