I have a .txt file that contains data like this:
math,mathematics
data,machine-learning-model
machine-learning,statistics,unsupervised-learning,books
orange,lda
machine-learning,deep-learning,keras,tensorflow
keras,similarity,distance,features
Now I want to store each line as a list inside one larger list.
The expected output for sentences is:
sentences = [['math', 'mathematics'],
['data', 'machine-learning-model'],
['machine-learning', 'statistics', 'unsupervised-learning', 'books'],
['orange', 'lda']]
This is what I tried:
temp_tokens = []
sentences = []
fp = open('tags.txt')
lines = fp.readlines()
for line in lines:
    temp_tokens.clear()
    for word in line.split(','):
        if word.strip('\n'):
            temp_tokens.append(word)
            temp_tokens = [e.replace('\n','') for e in temp_tokens]
    print(temp_tokens)
    sentences.append(temp_tokens)
print(sentences)
Now, when I print(temp_tokens), I get the following output:
['math', 'mathematics']
['data', 'machine-learning-model']
['machine-learning', 'statistics', 'unsupervised-learning', 'books']
['orange', 'lda']
['machine-learning', 'deep-learning', 'keras', 'tensorflow']
['keras', 'similarity', 'distance', 'features']
['machine-learning']
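That is fine. However, the individual lists are not appended correctly to the list sentences.
When I print sentences, it contains only a single token from each line instead of the line itself as a list: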
[['data'], ['machine-learning'], ['orange'], ['machine-learning'], ['keras'], ['machine-learning\n'], ['machine-learning'], ['dataset'], ['lstm\n'], ['python'], ['python'], ['reinforcement-learning'], ['machine-learning'], ['machine-learning'], ['machine-learning'], ['machine-learning'], ['overfitting'], ['machine-learning'], ['machine-learning'], ['time-series'], ['machine-learning'], ['linear-regression'], ['python'], ['keras'], ['python'], ['python'], ['pytorch\n'], ['machine-learning'], ['machine-learning'], ['machine-learning'], ['machine-learning'], ['gradient-descent\n'], ['python'], ['image'], ['dataset'], ['python'], ['neural-network'], ['machine-learning'], ['feature-selection'], ['nlp'], ['machine-learning'], ['python'], ['machine-learning'], ['cnn'], ['machine-learning'], ['neural-network'], ['machine-learning'], ['machine-learning'], ['deep-learning'], ['machine-learning'], ['python'], ['tensorflow'], ['machine-learning'], ['machine-learning'], ['machine-learning']
Can someone tell me what is wrong with my code? Why is the list temp_tokens not appended to sentences as a complete list, but only as individual tokens?
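A likely explanation, sketched under the assumption that the list comprehension sits inside the inner loop (which is what the printed output suggests): `sentences.append(temp_tokens)` stores a reference to the list, not a copy. On the next line, `temp_tokens.clear()` empties that stored list, the first word is appended to it, and then the comprehension rebinds `temp_tokens` to a new list. Each stored list is therefore left holding only the first raw token of the following line. The reduced reproduction below uses in-memory sample lines instead of tags.txt.

```python
# Reduced reproduction with in-memory sample lines (assumed data, not the real file).
lines = ['math,mathematics\n', 'data,machine-learning-model\n', 'orange,lda\n']

temp_tokens = []
sentences = []
for line in lines:
    # Empties the list object that was appended on the previous pass.
    temp_tokens.clear()
    for word in line.split(','):
        temp_tokens.append(word)
        # Rebinding: temp_tokens now names a brand-new list object,
        # so later appends no longer reach the list that was cleared above.
        temp_tokens = [e.replace('\n', '') for e in temp_tokens]
    # Stores a reference to the current list, not a copy of it.
    sentences.append(temp_tokens)

print(sentences)
# [['data'], ['orange'], ['orange', 'lda']]
```

Only the last entry is intact, because nothing clears it afterwards.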
Alternatively, use
This should work:
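A minimal sketch of one possible fix: build a fresh list for every line instead of clearing and reusing a single one, so nothing already stored in `sentences` is modified later. The helper name `read_tag_lines` is hypothetical, and the sample is read from an in-memory `io.StringIO` so the snippet runs on its own; with the real file, `open('tags.txt')` would take its place.

```python
import io

def read_tag_lines(fp):
    """Return one list of tags per non-empty line of fp (hypothetical helper)."""
    sentences = []
    for line in fp:
        tokens = []                          # new list object for every line
        for word in line.strip().split(','):
            if word:                         # skip empty fields
                tokens.append(word)
        if tokens:                           # skip blank lines
            sentences.append(tokens)
    return sentences

# In-memory stand-in for the file, using rows from the question.
sample = io.StringIO('math,mathematics\ndata,machine-learning-model\norange,lda\n')
print(read_tag_lines(sample))
# [['math', 'mathematics'], ['data', 'machine-learning-model'], ['orange', 'lda']]
```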
Or, more cleanly:
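One compact variant, offered as a sketch rather than the original answer's code: a list comprehension inside a `with` block, which also closes the file automatically. The temporary file below is only there so the snippet runs standalone; in practice the path would be 'tags.txt'.

```python
import os
import tempfile

# Write a small sample file so the sketch is self-contained (assumed data).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('math,mathematics\norange,lda\n\nkeras,similarity,distance,features\n')
    path = tmp.name

with open(path) as fp:
    # strip() removes the trailing newline; blank lines are skipped.
    sentences = [line.strip().split(',') for line in fp if line.strip()]

os.remove(path)
print(sentences)
# [['math', 'mathematics'], ['orange', 'lda'], ['keras', 'similarity', 'distance', 'features']]
```

Each iteration of the comprehension produces its own list, so the aliasing problem cannot occur.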