Here is the code I am running:
import logging
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import PCA
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     cross_val_score)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.svm import SVC
from lda_classification.model import TomotopyLDAVectorizer
from lda_classification.preprocess.spacy_cleaner import SpacyCleaner
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s',
                    level=logging.INFO)
workers = 6
# Change this to false if you want to search for the
# number of topics via c_v score (super slow)
cats = ["rec.autos", "rec.motorcycles", "rec.sport.baseball",
        "rec.sport.hockey"]
docs, target = fetch_20newsgroups(subset='all', return_X_y=True,
                                  categories=cats)
y_true = LabelEncoder().fit_transform(target)
processor = SpacyCleaner(chunksize=1000, workers=workers)
docs = processor.transform(docs)
Whenever I run docs = processor.transform(docs), I get the error
TypeError: cannot pickle '_thread.RLock' object. Does anyone know how I can fix this?
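For context, this message usually means that something being pickled holds a thread lock as an attribute. A likely trigger here, though I have not verified it against the lda_classification source, is that SpacyCleaner sends work to worker processes when workers > 1, and Python pickles whatever it sends. Below is a minimal sketch that reproduces the same kind of error using only the standard library and shows one common workaround (dropping the lock in __getstate__). The Holder and PicklableHolder classes and the try_pickle helper are hypothetical and are not part of lda_classification.

```python
import pickle
import threading


class Holder:
    """Hypothetical stand-in for any object that keeps a lock as an attribute."""

    def __init__(self):
        self.lock = threading.RLock()  # RLock objects cannot be pickled
        self.data = [1, 2, 3]


class PicklableHolder(Holder):
    """Same object, but the lock is left out of the pickled state."""

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["lock"]  # drop the unpicklable attribute
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.lock = threading.RLock()  # recreate a fresh lock after loading


def try_pickle(obj):
    """Return None if obj pickles cleanly, else the TypeError message."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as exc:
        return str(exc)


# Pickling the plain holder fails with a TypeError that mentions RLock.
error_message = try_pickle(Holder())
print(error_message)

# The variant that excludes the lock survives a pickle round trip.
restored = pickle.loads(pickle.dumps(PicklableHolder()))
print(restored.data)
```

If the lock lives inside a third-party object that you cannot modify, a quick diagnostic is to check whether the error goes away with workers = 1. This assumes that a single worker avoids multiprocessing in this library, which I have not confirmed.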
No answers yet.