I can't figure out what is wrong with my code:
import pandas as pd
import numpy as np
woe = [1.1147295474833758,0.364043491078754,-0.05525053172192353,-0.3950007109750665,-0.6784658191115104,-0.9522135140050229,-1.1441658353033486]
iv = [0.29078213954085946,0.29078213954085946,0.29078213954085946,0.29078213954085946,0.29078213954085946,0.29078213954085946,0.29078213954085946]
lis = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
fin = [lis,woe,iv]
fin = np.array(fin).T
df_disc = pd.DataFrame(fin,columns=['Label','WoE','IV'])
print(df_disc)
df_disc = df_disc.sort_values(by=['WoE'])
df_disc = df_disc.reset_index(drop=True)
print(df_disc)
Result:
Label WoE IV
0 A 1.1147295474833758 0.29078213954085946
1 B 0.364043491078754 0.29078213954085946
2 C -0.05525053172192353 0.29078213954085946
3 D -0.3950007109750665 0.29078213954085946
4 E -0.6784658191115104 0.29078213954085946
5 F -0.9522135140050229 0.29078213954085946
6 G -1.1441658353033486 0.29078213954085946
Label WoE IV
0 C -0.05525053172192353 0.29078213954085946
1 D -0.3950007109750665 0.29078213954085946
2 E -0.6784658191115104 0.29078213954085946
3 F -0.9522135140050229 0.29078213954085946
4 G -1.1441658353033486 0.29078213954085946
5 B 0.364043491078754 0.29078213954085946
6 A 1.1147295474833758 0.29078213954085946
I expected the labels to come out in the order G, F, E, D, C, B, A, but the sorted result appears to be wrong.
As noted above, the column contains strings. To preserve precision, convert the series to Decimal before sorting.
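A minimal sketch of the Decimal approach, reusing the construction from the question. Using Series.map(Decimal) for the conversion is an assumption here; it is one way to do it, not necessarily the one the answer had in mind.

```python
from decimal import Decimal

import numpy as np
import pandas as pd

woe = [1.1147295474833758, 0.364043491078754, -0.05525053172192353,
       -0.3950007109750665, -0.6784658191115104, -0.9522135140050229,
       -1.1441658353033486]
iv = [0.29078213954085946] * 7
lis = ['A', 'B', 'C', 'D', 'E', 'F', 'G']

# Same construction as in the question: every cell ends up as a string.
df_disc = pd.DataFrame(np.array([lis, woe, iv]).T, columns=['Label', 'WoE', 'IV'])

# Parse each string into a Decimal so values compare numerically,
# not character by character.
df_disc['WoE'] = df_disc['WoE'].map(Decimal)
df_disc['IV'] = df_disc['IV'].map(Decimal)

df_disc = df_disc.sort_values(by=['WoE']).reset_index(drop=True)
print(df_disc['Label'].tolist())
```

The columns remain object dtype (they hold Decimal instances), so this suits cases where exact decimal digits matter more than vectorized float arithmetic.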
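The problem is that the DataFrame columns are filled with objects, not numbers.
In your code, mixing strings and numeric values in a single array converts every value to an object (here, a string).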
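This can be checked directly. The sketch below uses shortened sample values (an assumption, only to keep the demonstration small) and shows that the array holds text, which is why the sort is lexicographic.

```python
import numpy as np
import pandas as pd

# Shortened sample values, assumed here only to keep the demonstration small.
labels = ['A', 'B', 'C', 'D']
woe = [1.11, 0.36, -0.05, -0.39]

# Mixing strings and floats forces NumPy to pick one common type: text.
arr = np.array([labels, woe]).T
print(arr.dtype)

df = pd.DataFrame(arr, columns=['Label', 'WoE'])
print(df.dtypes)  # WoE is not a numeric dtype

# Strings compare character by character, so '-0.05' sorts before '-0.39'.
as_text = sorted(df['WoE'])
print(as_text)
```

This mirrors the order in the question, where C (-0.055...) comes before G (-1.144...) and the positive values land at the end.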
The solution is to build a dictionary keyed by column name and pass it to the DataFrame constructor.
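A sketch of the dictionary approach with the data from the question. It skips the NumPy array entirely, so each column keeps its own dtype.

```python
import pandas as pd

woe = [1.1147295474833758, 0.364043491078754, -0.05525053172192353,
       -0.3950007109750665, -0.6784658191115104, -0.9522135140050229,
       -1.1441658353033486]
iv = [0.29078213954085946] * 7
lis = ['A', 'B', 'C', 'D', 'E', 'F', 'G']

# Each dictionary key becomes a column; WoE and IV stay float64.
df_disc = pd.DataFrame({'Label': lis, 'WoE': woe, 'IV': iv})
print(df_disc.dtypes)

df_disc = df_disc.sort_values(by=['WoE']).reset_index(drop=True)
print(df_disc)
```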
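If you pass a dictionary to the DataFrame constructor, you can prevent this.
Your columns WoE and IV have dtype object. They need to be converted to float to get the correct sort.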
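If the DataFrame has already been built the way the question builds it, the conversion can be sketched as follows. Using DataFrame.astype with a per-column mapping is one option; pd.to_numeric would be another.

```python
import numpy as np
import pandas as pd

woe = [1.1147295474833758, 0.364043491078754, -0.05525053172192353,
       -0.3950007109750665, -0.6784658191115104, -0.9522135140050229,
       -1.1441658353033486]
iv = [0.29078213954085946] * 7
lis = ['A', 'B', 'C', 'D', 'E', 'F', 'G']

# Same construction as in the question: WoE and IV hold strings.
df_disc = pd.DataFrame(np.array([lis, woe, iv]).T, columns=['Label', 'WoE', 'IV'])

# Convert only the numeric columns; Label stays as text.
df_disc = df_disc.astype({'WoE': float, 'IV': float})

df_disc = df_disc.sort_values(by=['WoE']).reset_index(drop=True)
print(df_disc)
```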