pandas, scikit-learn and xgboost integration
Detailed description of the pandas-ml Python project
Overview
Integration of pandas, scikit-learn, and xgboost.
Installation
$ pip install pandas_ml
Example
>>> import pandas_ml as pdml
>>> import sklearn.datasets as datasets

# create ModelFrame instance from sklearn.datasets
>>> df = pdml.ModelFrame(datasets.load_digits())
>>> type(df)
<class 'pandas_ml.core.frame.ModelFrame'>

# binarize data (features), not touching target
>>> df.data = df.data.preprocessing.binarize()
>>> df.head()
   .target  0  1  2  3  4  5  6  7  8  ...  54  55  56  57  58  59  60  61  62  63
0        0  0  0  1  1  1  1  0  0  0  ...   0   0   0   0   1   1   1   0   0   0
1        1  0  0  0  1  1  1  0  0  0  ...   0   0   0   0   0   1   1   1   0   0
2        2  0  0  0  1  1  1  0  0  0  ...   1   0   0   0   0   1   1   1   1   0
3        3  0  0  1  1  1  1  0  0  0  ...   1   0   0   0   1   1   1   1   0   0
4        4  0  0  0  1  1  0  0  0  0  ...   0   0   0   0   0   1   1   1   0   0

[5 rows x 65 columns]

# split to training and test data
>>> train_df, test_df = df.model_selection.train_test_split()

# create estimator (accessor is mapped to sklearn namespace)
>>> estimator = df.svm.LinearSVC()

# fit to training data
>>> train_df.fit(estimator)

# predict test data
>>> test_df.predict(estimator)
0      4
1      2
2      7
...
448    5
449    8
Length: 450, dtype: int64

# Evaluate the result
>>> test_df.metrics.confusion_matrix()
Predicted   0   1   2   3   4   5   6   7   8   9
Target
0          52   0   0   0   0   0   0   0   0   0
1           0  37   1   0   0   1   0   0   3   3
2           0   2  48   1   0   0   0   1   1   0
3           1   1   0  44   0   1   0   0   3   1
4           1   0   0   0  43   0   1   0   0   0
5           0   1   0   0   0  39   0   0   0   0
6           0   1   0   0   1   0  35   0   0   0
7           0   0   0   0   2   0   0  42   1   0
8           0   2   1   0   1   0   0   0  33   1
9           0   2   1   2   0   0   0   0   1  38
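For comparison, the same steps can be sketched with plain pandas and scikit-learn calls, which may help show what the ModelFrame accessors wrap. This is a rough equivalent under stated assumptions, not code from the pandas-ml project: the variable names are illustrative, and `random_state=0` and `max_iter=10000` are arbitrary choices made here for repeatability and convergence, so the exact counts will differ from the session above.

```python
import pandas as pd
from sklearn import datasets
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import binarize
from sklearn.svm import LinearSVC

# Load the digits data set and binarize the features only (target untouched).
digits = datasets.load_digits()
features = pd.DataFrame(binarize(digits.data))  # values > 0 become 1
target = pd.Series(digits.target, name="target")

# Split into training and test data (default test share is 25%).
# random_state=0 is an assumption added here for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    features, target, random_state=0
)

# Create the estimator and fit it to the training data.
# max_iter=10000 is an assumption to give the solver room to converge.
estimator = LinearSVC(max_iter=10000)
estimator.fit(X_train, y_train)

# Predict the test data, keeping the test rows' index.
predicted = pd.Series(estimator.predict(X_test), index=X_test.index)

# Evaluate the result as a labelled confusion matrix.
labels = list(range(10))
cm = pd.DataFrame(
    confusion_matrix(y_test, predicted, labels=labels),
    index=pd.Index(labels, name="Target"),
    columns=pd.Index(labels, name="Predicted"),
)
print(cm)
```

In the pandas-ml session, features and target travel together in one ModelFrame, so the separate `X_*` and `y_*` variables used here are not needed.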
Supported packages
- scikit-learn
- patsy
- xgboost