LightGBM fit throws "ValueError: Circular reference detected" with a categorical feature from pd.cut

import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.model_selection import train_test_split

rows = 100
fcols = 5
ccols = 5

# Let's define some ascii readable names for convenience
fnames = ['Float_' + chr(97 + n) for n in range(fcols)]
cnames = ['Cat_' + chr(97 + n) for n in range(ccols)]

# The dataset is built by concatenation of the float and the int blocks
dff = pd.DataFrame(np.random.rand(rows, fcols), columns=fnames)
dfc = pd.DataFrame(np.random.randint(0, 20, (rows, ccols)), columns=cnames)
df = pd.concat([dfc, dff], axis=1)

# Target column with random output
df['Target'] = (np.random.rand(rows) > 0.5).astype(int)

# Conversion into categorical
df[cnames] = df[cnames].astype('category')
df['Float_a'] = pd.cut(x=df['Float_a'], bins=10)

# Dataset split
X = df.drop('Target', axis=1)
y = df['Target'].astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

# Model instantiation
lgbmc = lgb.LGBMClassifier(objective='binary',
                           boosting_type='gbdt',
                           is_unbalance=True,
                           metric=['binary_logloss'])
lgbmc.fit(X_train, y_train)

1 Answer

User

#1 · Posted on 2024-04-23 20:23:10

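As in here, your problem is related to JSON serialization. The serializer "does not like" the labels of the categories created by pd.cut (labels like '(0.109, 0.208]'). Without an explicit labels argument, pd.cut uses pandas Interval objects as category labels rather than plain strings or numbers.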
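The error message itself can be reproduced with the standard json module alone. The sketch below is a minimal illustration, not LightGBM's actual code: the Interval class is a hypothetical stand-in for a pandas Interval label, and passthrough_default is an assumed fallback hook that returns unknown objects unchanged. When json receives the same object back from its default hook, it reports a circular reference.

```python
import json


class Interval:
    """Hypothetical stand-in for a pandas Interval label such as (0.109, 0.208]."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __str__(self):
        return f"({self.left}, {self.right}]"


def passthrough_default(obj):
    # Assumed fallback hook: hands unknown objects back unchanged.
    return obj


def try_dump(labels, default):
    """Return the JSON string, or the error text if serialization fails."""
    try:
        return json.dumps(labels, default=default)
    except (TypeError, ValueError) as exc:
        return f"{type(exc).__name__}: {exc}"


labels = [Interval(0.109, 0.208), Interval(0.208, 0.307)]

# The hook returns the very object json could not encode, so the encoder
# sees it a second time and raises a circular reference error.
failure = try_dump(labels, passthrough_default)

# Converting each label to a string first makes the list serializable.
success = try_dump([str(label) for label in labels], passthrough_default)

print(failure)
print(success)
```

Under these assumptions, any category label that is neither a string nor a number would trigger the same failure, which is why replacing the labels with plain strings avoids it.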

You can override the generated labels with the optional labels parameter of the cut function (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.cut.html).

In your example, you can replace the following line:

df['Float_a'] = pd.cut(x=df['Float_a'],bins=10)

with the line:

^{pr2}$
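The replacement code did not survive in this copy of the answer. As a minimal sketch of what passing labels to pd.cut could look like, assuming string labels of the form 'bin_0' to 'bin_9' (the names n_bins and bin_labels and the label format are illustrative choices, not taken from the original answer):

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's Float_a column.
rng = np.random.default_rng(0)
df = pd.DataFrame({"Float_a": rng.random(100)})

n_bins = 10
# Hypothetical label names; any list of n_bins unique strings would do.
bin_labels = [f"bin_{i}" for i in range(n_bins)]

# Same binning as in the question, but with explicit string labels.
df["Float_a"] = pd.cut(x=df["Float_a"], bins=n_bins, labels=bin_labels)

# The categories are now plain strings instead of Interval objects.
categories = list(df["Float_a"].cat.categories)
print(categories)
```

The number of labels has to match the number of bins, so deriving both from a single variable keeps them in sync.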
