Hyperopt on text sequence data: ValueError: zero-dimensional arrays cannot be concatenated

Posted 2024-04-28 21:17:06


I am using the hyperopt package (version 0.1.2) and I get the following error:

Traceback (most recent call last):
  File "C:\Users\PycharmProjects\script\my_script.py", line 34, in <module>
    estim.fit(xTrain, yTrain)
  File "C:\Users\PycharmProjects\venv\lib\site-packages\hpsklearn\estimator.py", line 746, in fit
    fit_iter.send(increment)
  File "C:\Users\PycharmProjects\script\my_script\venv\lib\site-packages\hpsklearn\estimator.py", line 657, in fit_iter
    return_argmin=False, # -- in case no success so far
  File "C:\Users\PycharmProjects\script\my_script\venv\lib\site-packages\hyperopt\fmin.py", line 388, in fmin
    show_progressbar=show_progressbar,
  File "C:\Users\PycharmProjects\script\my_script\venv\lib\site-packages\hyperopt\base.py", line 639, in fmin
    show_progressbar=show_progressbar)
  File "C:\Users\PycharmProjects\script\my_script\venv\lib\site-packages\hyperopt\fmin.py", line 407, in fmin
    rval.exhaust()
  File "C:\Users\PycharmProjects\script\my_script\venv\lib\site-packages\hyperopt\fmin.py", line 262, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "C:\Users\PycharmProjects\script\my_script\venv\lib\site-packages\hyperopt\fmin.py", line 227, in run
    self.serial_evaluate()
  File "C:\Users\PycharmProjects\script\my_script\venv\lib\site-packages\hyperopt\fmin.py", line 141, in serial_evaluate
    result = self.domain.evaluate(spec, ctrl)
  File "C:\Users\PycharmProjects\script\my_script\venv\lib\site-packages\hyperopt\base.py", line 844, in evaluate
    rval = self.fn(pyll_rval)
  File "C:\Users\PycharmProjects\script\my_script\venv\lib\site-packages\hpsklearn\estimator.py", line 620, in fn_with_timeout
    raise fn_rval[1]
ValueError: zero-dimensional arrays cannot be concatenated

The raw X and y data are plain Python lists. Here is the code:

from sklearn.model_selection import train_test_split
from hpsklearn import HyperoptEstimator, tfidf, svc_poly
from hyperopt import tpe


if __name__ == '__main__':

    X_data = ['this is an example text 1', 'this is an example text 2', 'this is an example text 3', 'this is an example text 4'] * 25
    y_data = ['this is just and example label 1', 'this is just and example label 2'] * 50
    print(len(X_data), len(y_data))

    xTrain, xTest, yTrain, yTest = train_test_split(
        X_data, y_data,
        test_size=0.22,
        random_state=33)

    estim = HyperoptEstimator(
        classifier=[svc_poly('my_poly', degree=3)],
        preprocessing=[tfidf('tfidf')],
        algo=tpe.suggest,
        trial_timeout=180
    )

    estim.fit(xTrain, yTrain) # error pops up here

I have already tried passing a list of numpy arrays instead, but the result is always this same error.
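For reference, here is a minimal sketch of how NumPy alone produces this exact message, without hpsklearn. The sample lists are shortened stand-ins for the ones above, and the names `X_arr`, `y_arr`, `scalar_arr` and `message` are hypothetical. Whether a zero-dimensional array is really what hpsklearn ends up concatenating here is an assumption; the traceback does not confirm it.

```python
import numpy as np

# Shortened, hypothetical stand-ins for the lists in the question.
X_data = ['this is an example text 1', 'this is an example text 2'] * 2
y_data = ['label 1', 'label 2'] * 2

# A list of strings converts to a one-dimensional array of strings.
X_arr = np.array(X_data)
y_arr = np.array(y_data)
print(X_arr.ndim, y_arr.ndim)

# A single string (or any scalar) converts to a zero-dimensional array.
scalar_arr = np.array(X_data[0])
print(scalar_arr.ndim)

# np.concatenate refuses zero-dimensional inputs and raises ValueError.
try:
    np.concatenate([scalar_arr, scalar_arr])
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```

Checking `ndim` on whatever is passed to `estim.fit` would at least rule out the inputs themselves being zero-dimensional.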

