Python TextFeatureSelection package - PyPI

Implements several filter-method feature selection algorithms for text features.

TextFeatureSelection project description

What is it?

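TextFeatureSelection is a Python package that scores text tokens using filter methods for feature selection. You can then set a threshold on the scores to decide which words to include. Four metrics are available to aid feature selection.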

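Chi-square: measures the lack of independence between a term (t) and a class (c). It has a natural value of zero if t and c are independent; a higher value means the term depends on the class. It is not reliable for low-frequency terms.
Mutual information: rare terms score higher than common terms. For multi-class problems, the MI value is calculated for every class and the maximum MI value across all classes is taken for each word.
Proportional difference: how close two numbers are to being equal. It helps find unigrams that occur mostly in one class of documents or the other.
Information gain: gives the discriminatory power of a word.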
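To make the four metrics concrete, here is a minimal sketch that computes them for one term and one class from document counts. It uses common textbook formulations (in the style of the papers listed under References) and a naive whitespace tokenizer; the function names are hypothetical, and the package's own implementation and exact values may differ.

```python
import math

def term_class_counts(docs, labels, term, cls):
    """Count documents by (contains term?, belongs to class?).

    Returns (a, b, c, d):
      a = in class, has term      b = other classes, has term
      c = in class, lacks term    d = other classes, lacks term
    """
    a = b = c = d = 0
    for text, label in zip(docs, labels):
        has_term = term in text.lower().split()  # naive tokenizer (assumption)
        if has_term and label == cls:
            a += 1
        elif has_term:
            b += 1
        elif label == cls:
            c += 1
        else:
            d += 1
    return a, b, c, d

def chi_square(a, b, c, d):
    """Zero when term and class are independent; larger means more dependent."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def mutual_information(a, b, c, d):
    """log(P(t, c) / (P(t) * P(c))); -inf if the term never occurs in the class."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")
    return math.log(a * n / ((a + c) * (a + b)))

def max_mutual_information(docs, labels, term):
    """Multi-class case: take the maximum MI over all classes for the term."""
    return max(
        mutual_information(*term_class_counts(docs, labels, term, cls))
        for cls in set(labels)
    )

def proportional_difference(a, b):
    """+1 if the term occurs only in the class, -1 if only outside it."""
    return (a - b) / (a + b) if (a + b) else 0.0

def _entropy(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gain(a, b, c, d):
    """Reduction in class entropy from knowing whether the term is present."""
    n = a + b + c + d
    if n == 0:
        return 0.0
    h_class = _entropy((a + c) / n)
    h_with = _entropy(a / (a + b)) if (a + b) else 0.0
    h_without = _entropy(c / (c + d)) if (c + d) else 0.0
    return h_class - ((a + b) / n) * h_with - ((c + d) / n) * h_without

docs = [
    'i am content with this location',
    'i am having the time of my life',
    'you cannot learn machine learning without linear algebra',
    'i want to go to mars',
]
labels = [1, 1, 0, 1]

counts = term_class_counts(docs, labels, term="am", cls=1)
print("counts:", counts)
print("CHI:", chi_square(*counts))
print("MI:", mutual_information(*counts))
print("PD:", proportional_difference(counts[0], counts[1]))
print("IG:", information_gain(*counts))
```

With only four documents the counts are tiny, which also illustrates why chi-square is described as unreliable for low-frequency terms.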

Input parameters

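target: list of class labels. For multiple categories there is no need to dummy-code the labels; provide label-encoded values as a list instead.
input_doc_list: list of texts, where each element is one document. No tokenization is needed, because the module tokenizes the text during processing. target and input_doc_list must have the same length.
stop_words: words for which metric values should not be calculated. Default is empty.
metric_list: list of the metrics to compute. Four metrics are available: 'MI', 'CHI', 'PD', 'IG'. Specify one or more as a list. Default is ['MI', 'CHI', 'PD', 'IG']. Chi-square (CHI), mutual information (MI), proportional difference (PD) and information gain (IG) are calculated for each tokenized word in the corpus to aid feature selection.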
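The parameter descriptions above imply a few preparation steps: labels given as one list entry per document (label-encoded rather than dummy-coded), equal lengths for target and input_doc_list, and stop words excluded from scoring. The sketch below illustrates these ideas with hypothetical helpers (`encode_labels`, `tokenize`); they are not part of the package, which does its own tokenization internally.

```python
def encode_labels(labels):
    """Map each distinct label to an integer code, in order of first appearance."""
    codes = {}
    encoded = [codes.setdefault(label, len(codes)) for label in labels]
    return encoded, codes

def tokenize(text, stop_words=()):
    """Naive lowercase whitespace tokenizer that drops stop words (assumption)."""
    return [word for word in text.lower().split() if word not in stop_words]

input_doc_list = [
    'i am very happy',
    'i just had an awesome weekend',
    'this is a very difficult terrain to trek. i wish i stayed back at home.',
    'i just had lunch',
    'Do you want chips?',
]
raw_target = ['Positive', 'Positive', 'Negative', 'Neutral', 'Neutral']

# One label per document: both lists must have the same length.
if len(raw_target) != len(input_doc_list):
    raise ValueError("target and input_doc_list must have the same length")

target, label_codes = encode_labels(raw_target)
tokens = tokenize(input_doc_list[3], stop_words={'i', 'just'})

print(target)       # one integer code per document
print(label_codes)  # label -> code mapping
print(tokens)       # remaining words after stop-word removal
```

Label encoding keeps the target as a single flat list, which is the shape the target parameter is described as expecting.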

How to use it?

    from TextFeatureSelection import TextFeatureSelection

    # Multiclass classification problem
    input_doc_list = [
        'i am very happy',
        'i just had an awesome weekend',
        'this is a very difficult terrain to trek. i wish i stayed back at home.',
        'i just had lunch',
        'Do you want chips?',
    ]
    target = ['Positive', 'Positive', 'Negative', 'Neutral', 'Neutral']
    fsOBJ = TextFeatureSelection(target=target, input_doc_list=input_doc_list)
    result_df = fsOBJ.getScore()
    print(result_df)

    # Binary classification
    input_doc_list = [
        'i am content with this location',
        'i am having the time of my life',
        'you cannot learn machine learning without linear algebra',
        'i want to go to mars',
    ]
    target = [1, 1, 0, 1]
    fsOBJ = TextFeatureSelection(target=target, input_doc_list=input_doc_list)
    result_df = fsOBJ.getScore()
    print(result_df)
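Once per-word scores are available, the package description says a threshold decides which words to keep. The sketch below shows that filtering step on a plain dict of made-up scores rather than on the actual result of getScore(), whose column layout is not shown here; `select_words` is a hypothetical helper.

```python
def select_words(scores, threshold):
    """Keep words whose score is at least `threshold`, highest score first."""
    kept = {word: score for word, score in scores.items() if score >= threshold}
    return sorted(kept, key=kept.get, reverse=True)

# Illustrative values only, not real output from the package.
scores = {"awesome": 0.9, "happy": 0.7, "the": 0.05, "lunch": 0.4}

selected = select_words(scores, threshold=0.4)
print(selected)  # ['awesome', 'happy', 'lunch']
```

A suitable threshold depends on the metric and the corpus, so it is usually chosen by inspecting the score distribution or by validating downstream model performance.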

Where to get it?

pip install TextFeatureSelection

Dependencies

numpy
pandas
scikit-learn
collections

References

A Comparative Study on Feature Selection in Text Categorization, by Yiming Yang and Jan O. Pedersen
Entropy based feature selection for text categorization, by Christine Largeron, Christophe Moulin and Mathias Géry
Categorical Proportional Difference: A Feature Selection Method for Text Categorization, by Mondelle Simeon and Robert J. Hilderman
Feature Selection and Weighting Methods in Sentiment Analysis, by Tim O'Keefe and Irena Koprinska


TextFeatureSelection 0.0.8
