KeyError:“[1 2]不在索引中”

2024-05-29 11:01:56 发布

您现在位置:Python中文网/ 问答频道 /正文

我正在尝试标记评论(好的和坏的),但是Pandas崩溃了,出现了KeyError:“[12]不在索引中”enter image description here

这是我的密码:

from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
from pymorphy2 import tokenizers
import nltk, string, json, pymorphy2
import pandas as pd

p_data = pd.read_json(open('C:\\Creme\\good.json'))
n_data = pd.read_json(open('C:\\Creme\\bad.json'))

dataset = pd.concat([p_data, n_data])
dataset = dataset[[1, 2]]

dataset.columns = ['text', 'label']

morph = pymorphy2.MorphAnalyzer()

def tokenize_me(file_text):
    file_text = file_text.lower()
    tokens = tokenizers.simple_word_tokenize(file_text)

    tokens = [morph.parse(w)[0].normal_form for w in tokens]

    #deleting punctuation symbols
    tokens = [i for i in tokens if (i not in string.punctuation)]

    #deleting stop_words
    stop_words = stopwords.words('russian')
    stop_words.extend(['что', 'это', 'так', 'вот', 'быть', 'как', 'в', '—', 'к', 'на', '...'])
    tokens = [i for i in tokens if (i not in stop_words)]

    return ' '.join(tokens)

dataset['text'] = dataset['text'].apply(tokenize_me)
dataset.to_csv('C:\\Creme\\cleaned_data2.csv')

帮我修复错误和代码


Tags: textinfromimportjsondatadatasetfile

热门问题