Extracting column names from a list of index numbers

Posted 2024-03-29 11:47:25


I got the following code from a video tutorial:

Calling code:

   X_train, X_test = remove_q_qc_dpl_feat(X_train, X_test, y_train, y_test)
   # .fit() returns the fitted selector, so a second sel.fit(...) call is not needed
   sel = SelectPercentile(mutual_info_classif, percentile=10).fit(X_train, y_train)

   features = X_train.columns[sel.get_support()]  

Problem: features comes back as a list of column indices, such as [0, 6, 23]. What I need are the column names.

Here is the function being called:

  def remove_q_qc_dpl_feat(X_train, X_test, y_train, y_test):

    # Remove constant and quasi constant
    constant_filter = VarianceThreshold(threshold=0.01)
    constant_filter.fit(X_train)
    X_train_filter = constant_filter.transform(X_train)
    X_test_filter = constant_filter.transform(X_test)

    #Remove duplicate
    X_train_T = X_train_filter.T
    X_test_T = X_test_filter.T

    X_train_T = pd.DataFrame(X_train_T)
    X_test_T = pd.DataFrame(X_test_T)  

    duplicated_features = X_train_T.duplicated()

    features_to_keep = [not index for index in duplicated_features]

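    X_train_unique = X_train_T[features_to_keep].T
    X_test_unique = X_test_T[features_to_keep].T

    # The call site unpacks two values, so the function must return both frames
    return X_train_unique, X_test_unique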

1 answer
User
#1 · Posted 2024-03-29 11:47:25

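You can get the column names like this:

df.iloc[:,[0, 6, 23]].columns

Explanation:

.iloc[] gives you a slice of the DataFrame, selected by integer position.

.iloc[0, 1] would mean row 0 in column 1.

: means all rows.

So we get a slice containing all rows of columns [0, 6, 23].

.columns gives the list of column names of a DataFrame, in this case of the slice.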
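To make the explanation above concrete, here is a minimal sketch on a toy DataFrame. The frame and its column names are made up for illustration; only .iloc and .columns come from the answer. It also shows that indexing df.columns directly gives the same names without building a slice first.

```python
import pandas as pd

# Hypothetical toy frame; the column names are invented for this example.
df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    columns=["age", "height", "weight", "score"],
)

positions = [0, 2]  # integer positions, playing the role of [0, 6, 23]

# Route 1: slice all rows of those columns, then read the names off the slice.
names_via_iloc = list(df.iloc[:, positions].columns)

# Route 2: index the column Index directly by position.
names_direct = list(df.columns[positions])

print(names_via_iloc)
print(names_direct)
```

Both routes should give the same list of names; the second avoids creating an intermediate slice.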

Edit, showing how I applied your suggestion:

   selected_columns = sel.get_support()

   features = X_train.iloc[:,[selected_columns]].columns

The last line throws an error: IndexError: Item wrong length 1 instead of 27

You need to do this:

# get_support(indices=True) returns integer positions, which is what .iloc expects
selected_columns = sel.get_support(indices=True)
X_train.iloc[:, selected_columns].columns
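A likely reason the question sees integers such as [0, 6, 23] is that constant_filter.transform() returns a NumPy array by default, and pd.DataFrame(...) built from it gets default integer column labels, so the original names are already gone before SelectPercentile runs. The sketch below is one possible way to keep the labels, not the tutorial's code: drop_constant_and_duplicate is a hypothetical replacement for remove_q_qc_dpl_feat, the data is made up, and random_state is fixed only so the sketch is repeatable.

```python
from functools import partial

import numpy as np
import pandas as pd
from sklearn.feature_selection import (
    SelectPercentile,
    VarianceThreshold,
    mutual_info_classif,
)


def drop_constant_and_duplicate(X_train, X_test, threshold=0.01):
    """Hypothetical label-preserving variant of remove_q_qc_dpl_feat."""
    # Keep columns whose training variance exceeds the threshold.
    vt = VarianceThreshold(threshold=threshold).fit(X_train)
    kept = X_train.columns[vt.get_support()]
    X_train_f, X_test_f = X_train[kept], X_test[kept]

    # duplicated() on the transpose flags columns that repeat an earlier one.
    is_dup = X_train_f.T.duplicated().to_numpy()
    return X_train_f.loc[:, ~is_dup], X_test_f.loc[:, ~is_dup]


# Made-up data: four random columns, one constant column, one duplicate.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 4)), columns=["a", "b", "c", "d"])
X["const"] = 1.0
X["a_copy"] = X["a"]
y = (X["a"] > 0).astype(int)

X_train, X_test = X.iloc[:150], X.iloc[150:]
y_train = y.iloc[:150]

X_train_u, X_test_u = drop_constant_and_duplicate(X_train, X_test)

# Fixed random_state is an assumption made for repeatability only.
score_func = partial(mutual_info_classif, random_state=0)
sel = SelectPercentile(score_func, percentile=50).fit(X_train_u, y_train)

# The boolean mask and the integer positions both index the column labels.
names_by_mask = list(X_train_u.columns[sel.get_support()])
names_by_pos = list(X_train_u.columns[sel.get_support(indices=True)])
print(names_by_mask)
```

Because the frames keep their labels all the way through, X_train.columns[sel.get_support()] from the question returns names rather than positions.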
