I got the following code from a video tutorial.
Calling code:
X_train, X_test = remove_q_qc_dpl_feat(X_train, X_test, y_train, y_test)
# A single fit is enough; the second sel.fit(...) call was redundant
sel = SelectPercentile(mutual_info_classif, percentile=10).fit(X_train, y_train)
features = X_train.columns[sel.get_support()]
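Problem: features comes back as a list of column indices, such as [0, 6, 23]. What I need is the column names.
Here is the function being called: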
def remove_q_qc_dpl_feat(X_train, X_test, y_train, y_test):
    # Remove constant and quasi-constant features
    constant_filter = VarianceThreshold(threshold=0.01)
    constant_filter.fit(X_train)
    X_train_filter = constant_filter.transform(X_train)
    X_test_filter = constant_filter.transform(X_test)
    # Remove duplicate features
    X_train_T = X_train_filter.T
    X_test_T = X_test_filter.T
    X_train_T = pd.DataFrame(X_train_T)
    X_test_T = pd.DataFrame(X_test_T)
    duplicated_features = X_train_T.duplicated()
    features_to_keep = [not index for index in duplicated_features]
    X_train_unique = X_train_T[features_to_keep].T
    X_test_unique = X_test_T[features_to_keep].T
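You can get the column names by slicing the DataFrame with .iloc[:, [0, 6, 23]] and reading .columns from the result.
Explanation:
.iloc[] gives you a slice of the DataFrame, selected by position. The ":" means all rows, so the result is a slice containing every row of columns [0, 6, 23].
.columns gives you the list of column names of a DataFrame, in this case of that slice.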
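To make the explanation above concrete, here is a minimal sketch on a toy DataFrame. The frame df, its column names, and the positions in idx are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical stand-in for a DataFrame that still has its column names
df = pd.DataFrame(
    {"age": [1, 2], "height": [3, 4], "weight": [5, 6], "score": [7, 8]}
)

# Positional indices, playing the role of [0, 6, 23] in the question
idx = [0, 2]

# ":" selects every row; idx selects columns by position
subset = df.iloc[:, idx]

# .columns on the slice holds the names of just those columns
names = list(subset.columns)
print(names)  # ['age', 'weight']
```

Keep in mind that positions only make sense against the DataFrame they were computed from. If columns were dropped in between, position 6 in the filtered data is not necessarily position 6 in the original.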
Edit, to show how I implemented your suggestion:
The last line throws an error: IndexError: Item wrong length 1 instead of 27
You need to do this:
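As a sketch only, and not necessarily what this answer originally showed: one way to sidestep the index-to-name mapping is to never lose the names in the first place. In the question's helper, transform() returns a NumPy array and pd.DataFrame(...) then assigns integer column labels, which is why the later selection shows numbers. The variant below is a hypothetical rewrite (the name remove_q_qc_dpl_feat_named and the toy data are assumptions) that uses get_support() to pick columns by name instead.

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold


def remove_q_qc_dpl_feat_named(X_train, X_test, threshold=0.01):
    """Hypothetical name-preserving variant of the question's helper."""
    # Drop constant / quasi-constant columns; the boolean mask from
    # get_support() indexes the original column labels directly
    vt = VarianceThreshold(threshold=threshold).fit(X_train)
    kept = X_train.columns[vt.get_support()]
    X_train, X_test = X_train[kept], X_test[kept]

    # Drop duplicated columns: transpose so columns become rows,
    # then flag every row that repeats an earlier one
    dup = X_train.T.duplicated()
    unique_cols = X_train.columns[~dup.to_numpy()]
    return X_train[unique_cols], X_test[unique_cols]


# Toy data (made up): one constant column and one exact duplicate
X_train = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0],
        "const": [5.0, 5.0, 5.0, 5.0],
        "a_copy": [1.0, 2.0, 3.0, 4.0],
        "b": [4.0, 1.0, 3.0, 2.0],
    }
)
X_test = X_train.copy()

X_train_f, X_test_f = remove_q_qc_dpl_feat_named(X_train, X_test)
print(list(X_train_f.columns))  # ['a', 'b']
```

Because the returned frames keep their labels, an expression like X_train.columns[sel.get_support()] from the calling code would then yield names rather than integers.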