How do I find the importance of features for a logistic regression model?

1 Answer

User

#1 · Posted 2024-05-27 12:43:24

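In a linear classification model (logistic regression is one of them), one of the simplest ways to get a feel for the "influence" of a given parameter is to take the magnitude of its coefficient and multiply it by the standard deviation of the corresponding parameter in the data.

For example: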

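import numpy as np
from sklearn.linear_model import LogisticRegression

# Three features with the same true coefficient (1) but different spreads
x1 = np.random.randn(100)
x2 = 4*np.random.randn(100)
x3 = 0.5*np.random.randn(100)
# Draw one noise value per sample (randn() with no argument returns a single scalar)
y = (3 + x1 + x2 + x3 + 0.2*np.random.randn(100)) > 0
X = np.column_stack([x1, x2, x3])

m = LogisticRegression()
m.fit(X, y)

# The true coefficients are all equal (1 each), so the raw estimates
# say little about which feature matters most:
print(m.coef_)

# Scaling each coefficient by its feature's standard deviation, however,
# shows that the second parameter is more influential
print(np.std(X, 0)*m.coef_)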

Another way to get a similar result is to examine the coefficients of the model fit on standardized parameters:

m.fit(X / np.std(X, 0), y)
print(m.coef_)
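
The same standardize-then-fit idea can also be written as a scikit-learn pipeline. This is only a minimal sketch, not the answer's own code: the fixed seed, the variable names `pipe`, `std_coef` and `ranking`, and the choice of `StandardScaler` are assumptions made here for illustration. Note that `StandardScaler` also centers each feature, which `X / np.std(X, 0)` does not do; centering should mainly affect the intercept rather than the coefficients.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Fixed seed (an assumption, only so the sketch is reproducible)
rng = np.random.default_rng(0)
n = 100

# Same kind of synthetic data as in the example above
x1 = rng.standard_normal(n)
x2 = 4 * rng.standard_normal(n)
x3 = 0.5 * rng.standard_normal(n)
y = (3 + x1 + x2 + x3 + 0.2 * rng.standard_normal(n)) > 0
X = np.column_stack([x1, x2, x3])

# Standardize the features, then fit the logistic regression on them
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X, y)

# Coefficients on the standardized scale, one per feature
std_coef = pipe.named_steps["logisticregression"].coef_.ravel()

# Feature indices ordered from most to least influential
ranking = np.argsort(-np.abs(std_coef))

print(std_coef)
print(ranking)
```

Keeping the scaler inside the pipeline means any new data passed to `pipe.predict` is scaled the same way as the training data.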

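Note that this is the most basic approach, and that many other techniques exist for finding feature importance or parameter influence (using p-values, bootstrap scores, various "discriminative indices", etc.).

I am pretty sure you would get more interesting answers at https://stats.stackexchange.com/.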
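
As one possible reading of the "bootstrap scores" mentioned above, the sketch below refits the model on resampled data and looks at how much the scaled coefficients vary. It is an illustration under assumptions, not an established recipe from the answer: the seed, the number of resamples `n_boot`, the 95% percentile interval, and all variable names are choices made here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Fixed seed (an assumption, only so the sketch is reproducible)
rng = np.random.default_rng(0)
n = 100

# Same kind of synthetic data as in the example above
x1 = rng.standard_normal(n)
x2 = 4 * rng.standard_normal(n)
x3 = 0.5 * rng.standard_normal(n)
y = (3 + x1 + x2 + x3 + 0.2 * rng.standard_normal(n)) > 0
X = np.column_stack([x1, x2, x3])

n_boot = 200  # number of bootstrap resamples (arbitrary choice)
boot_scores = []
while len(boot_scores) < n_boot:
    # Sample row indices with replacement
    idx = rng.integers(0, n, size=n)
    # Skip the rare resample that contains only one class
    if np.unique(y[idx]).size < 2:
        continue
    m = LogisticRegression().fit(X[idx], y[idx])
    # Same score as before: coefficient times feature standard deviation
    boot_scores.append(np.std(X[idx], axis=0) * m.coef_.ravel())

boot_scores = np.array(boot_scores)

# Percentile interval of the score for each feature
low, high = np.percentile(boot_scores, [2.5, 97.5], axis=0)

print(boot_scores.mean(axis=0))
print(low)
print(high)
```

A feature whose interval sits well away from zero has a score that is stable across resamples; a wide interval that straddles zero suggests the estimated influence is not reliable.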
