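I'm trying out a new Python package, MERF (Mixed Effects Random Forest). When I fit data with 200,000+ rows and a small number of clusters (<100), the model always fails with a memory error. I think the problem is the number of clusters: when I use a very large number of clusters (>10,000), it produces valid output. Any ideas?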
import numpy as np
import pandas as pd
from merf import MERF

merf = MERF()

# df_train and y_train are DataFrames loaded earlier (not shown).
# Assign each row to a cluster based on its 'fss' band; 0 means "no band matched".
clusters_train = np.array([])
for i in np.arange(len(df_train)):
    if 1570 <= df_train['fss'][i] <= 1875:
        clusters_train = np.append(clusters_train, 1)
    elif 1510 <= df_train['fss'][i] <= 1569:
        clusters_train = np.append(clusters_train, 2)
    elif 1450 <= df_train['fss'][i] <= 1509:
        clusters_train = np.append(clusters_train, 3)
    elif 1340 <= df_train['fss'][i] <= 1449:
        clusters_train = np.append(clusters_train, 4)
    elif 1001 <= df_train['fss'][i] <= 1339:
        clusters_train = np.append(clusters_train, 5)
    else:
        clusters_train = np.append(clusters_train, 0)
clusters_train = pd.Series(clusters_train)

X_train = df_train[['ccs', 'pydx', 'gbr']]    # fixed-effect features
Z_train = df_train[['pct30p', 'pct90p']]      # random-effect covariates
y_train = y_train['bad']                      # target column
merf.fit(X_train, Z_train, clusters_train, y_train)
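One way to test the suspicion about cluster count is to look at cluster sizes rather than the number of clusters. The sketch below rests on an assumption that the post does not confirm: that fitting builds a dense n_i × n_i float64 matrix for each cluster of n_i rows, so memory grows with the square of the largest cluster. Under that assumption, 200,000 rows split across six clusters (about 33,000 rows each) would need roughly 8 GiB for a single such matrix, while 10,000+ clusters would leave only about 20 rows per cluster. The names `BANDS`, `assign_clusters`, and `dense_block_gib` are hypothetical helpers written for this illustration, not part of MERF. The first helper also replaces the row-by-row loop with a vectorized labelling that uses the same bands.

```python
import numpy as np
import pandas as pd

# (low, high, label) bands copied from the if/elif chain; both ends inclusive.
BANDS = [
    (1570, 1875, 1),
    (1510, 1569, 2),
    (1450, 1509, 3),
    (1340, 1449, 4),
    (1001, 1339, 5),
]


def assign_clusters(fss: pd.Series) -> pd.Series:
    """Label each row by its fss band; rows matching no band get 0."""
    conditions = [fss.between(low, high).to_numpy() for low, high, _ in BANDS]
    labels = [label for _, _, label in BANDS]
    return pd.Series(np.select(conditions, labels, default=0), index=fss.index)


def dense_block_gib(cluster_sizes: pd.Series) -> pd.Series:
    """GiB for one dense float64 n_i x n_i matrix per cluster (assumed cost model)."""
    n = cluster_sizes.astype("float64")
    return n * n * 8 / 2**30


# Toy values standing in for df_train['fss'].
fss = pd.Series([1600, 1520, 1460, 1400, 1100, 900, 1875, 1001])
clusters = assign_clusters(fss)
sizes = clusters.value_counts().sort_index()
print(pd.DataFrame({"rows": sizes, "dense_GiB": dense_block_gib(sizes)}))
```

On the real data, `assign_clusters(df_train['fss'])` would stand in for the loop (it returns integer labels instead of floats), and the printed table shows which clusters would dominate memory if the assumed cost model holds.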