How do I import my dataset (MS SQL database) into a Python prediction model and split it?

Posted 2024-04-26 01:23:54


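I am using a SQL Server database as the dataset for a prediction model written in Python. The model uses XGBoost to report feature importance. I can import the dataset, but I would like to write code that splits it.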

    from numpy import loadtxt
    from xgboost import XGBClassifier
    from matplotlib import pyplot
    from xgboost import plot_importance
    import pyodbc

    # load data
    # raw strings keep the backslash in the instance name literal
    cnxn = pyodbc.connect(r"Driver={SQL Server};"
                          r"server=MyInstance\CHURN;"
                          "Database=MyDB;"
                          "Trusted_Connection=yes;"
                          "user=sa;"
                          "password=Mypassword;")

    cursor = cnxn.cursor()
    # triple quotes let the SQL statement span several lines
    dataset = cursor.execute('''SELECT TOP 1000 [msno],[msnoid],
        [payment_method_id],[payment_plan_days],[plan_list_price],
        [actual_amount_paid],[is_auto_renew],[transaction_date],
        [membership_expire_date],[is_cancel],[date],[num_25],[num_50],[num_75],
        [num_985],[num_100],[num_unq],[total_secs]
        FROM [Churn_pred].[dbo].[Features]''')

I use the code below to load and split the data from a CSV dataset, and it works fine. I need to do the same with the SQL database described above.

    # load data (raw string so the backslashes in the Windows path stay literal)
    dataset = loadtxt(r'D:\dataset\half_pima-indians-diabetes for testing.csv',
                      delimiter=",")
    # split data into X and y
    X = dataset[:, 0:17]
    y = dataset[:, 17]
    # fit model on training data (the model must be created before calling fit)
    model = XGBClassifier()
    model.fit(X, y)
    # feature importance
    print(model.feature_importances_)
    # plot the importances as a bar chart
    pyplot.bar(range(len(model.feature_importances_)),
               model.feature_importances_)
    pyplot.xlabel('Smarts')
    pyplot.ylabel('Probability')
    pyplot.show()
    # built-in XGBoost importance plot
    plot_importance(model)
    pyplot.show()
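
One possible way to get the same X / y split from a database cursor is sketched below. It is only an illustration under several assumptions: `sqlite3` with an in-memory table stands in for the `pyodbc` connection so the snippet runs without a SQL Server instance (both follow the Python DB-API, so `execute`, `description` and `fetchall` behave alike); the helper name `fetch_features_and_target` is hypothetical; the toy rows are invented; and `is_cancel` is only assumed to be the target column, since the question does not say which column is the label. It also assumes every selected column is numeric, so identifier and date columns such as `msno` or `transaction_date` would need to be dropped or encoded first.

```python
import sqlite3  # stand-in for pyodbc so the sketch runs anywhere (assumption)
import numpy as np


def fetch_features_and_target(cursor, query, target_column):
    """Run `query`, then split the result into a feature matrix X and a target vector y."""
    cursor.execute(query)
    # cursor.description has one entry per result column; item 0 is the column name
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    # Convert each row object to a plain tuple before building the array
    data = np.array([tuple(row) for row in rows], dtype=float)
    target_idx = columns.index(target_column)
    X = np.delete(data, target_idx, axis=1)  # every column except the target
    y = data[:, target_idx]                  # the target column only
    feature_names = [c for c in columns if c != target_column]
    return X, y, feature_names


# Toy in-memory table reusing a few column names from the question (values invented)
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE Features ("
    "payment_plan_days REAL, plan_list_price REAL, "
    "is_auto_renew REAL, is_cancel REAL)"
)
cur.executemany(
    "INSERT INTO Features VALUES (?, ?, ?, ?)",
    [(30, 149, 1, 0), (30, 99, 0, 1), (7, 0, 0, 1), (30, 149, 1, 0)],
)

X, y, feature_names = fetch_features_and_target(
    cur,
    "SELECT payment_plan_days, plan_list_price, is_auto_renew, is_cancel FROM Features",
    "is_cancel",
)
print(X.shape, y.shape)  # (4, 3) (4,)
```

With a real `pyodbc` cursor, the same helper could be called with the `SELECT TOP 1000 ...` query, and the resulting `X` and `y` passed to `model.fit(X, y)` as in the CSV version.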
