old-fashioned-nlp Python package - PyPI

Basic NLP models built on sklearn

Detailed project description of old-fashioned-nlp

Old-Fashioned NLP

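The goal of this package is to bring old-fashioned NLP pipelines back into your modeling workflow, giving you a baseline reference before you move on to transformer models.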

Installation

pip install git+https://github.com/ChenghaoMou/old-fashioned-nlp.git

Usage

Classification

Currently, we have TfidfLinearSVC and TfidfLDALinearSVC.
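For orientation, the kind of baseline these class names suggest can be sketched with plain scikit-learn: a TF-IDF vectorizer feeding a linear SVM. This is an assumption about the general approach, not the package's actual implementation, and the toy texts and labels are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data, invented for illustration only.
train_texts = [
    "the match ended with a late goal",
    "the striker scored twice in the final",
    "the team won the league title",
    "stocks fell after the earnings report",
    "the central bank raised interest rates",
    "investors sold shares as markets dropped",
]
train_labels = ["sports", "sports", "sports", "business", "business", "business"]

# TF-IDF features followed by a linear SVM classifier.
baseline = make_pipeline(TfidfVectorizer(), LinearSVC())
baseline.fit(train_texts, train_labels)

predictions = baseline.predict(
    ["the goalkeeper saved a penalty", "the bank cut rates"]
)
print(list(predictions))
```

A pipeline like this trains in well under a second on small data, which is what makes it convenient as a first reference point.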

Sequence tagging

For now, we only have CharTfidfTagger.

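import nltk
from old_fashioned_nlp.tagging import CharTfidfTagger

nltk.download('conll2002')

# Each sentence is a list of (token, POS tag, NER tag) triples.
train_sents = list(nltk.corpus.conll2002.iob_sents('esp.train'))
train_tokens, train_pos, train_ner = zip(*[zip(*e) for e in train_sents])

# The original snippet never defined the test split; 'esp.testb' is assumed here.
test_sents = list(nltk.corpus.conll2002.iob_sents('esp.testb'))
test_tokens, test_pos, test_ner = zip(*[zip(*e) for e in test_sents])

model = CharTfidfTagger()
model.fit(train_tokens, train_pos)
model.score(test_tokens, test_pos)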
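To make the idea behind a character-level TF-IDF tagger concrete, here is a minimal sketch built only from scikit-learn. It assumes each token is tagged independently from its character n-grams; how CharTfidfTagger actually builds its features is not documented here, so treat this as an illustration of the technique rather than the package's code. The sentences and tags are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy tagged sentences (invented): parallel tuples of tokens and tags.
sent_tokens = [("the", "dog", "runs"), ("a", "cat", "sleeps"), ("the", "bird", "sings")]
sent_tags = [("DET", "NOUN", "VERB"), ("DET", "NOUN", "VERB"), ("DET", "NOUN", "VERB")]

# Flatten the sentences so that every token becomes one training example.
flat_tokens = [tok for sent in sent_tokens for tok in sent]
flat_tags = [tag for sent in sent_tags for tag in sent]

# Character 1- to 3-gram TF-IDF features per token, then a linear SVM.
tagger = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),
    LinearSVC(),
)
tagger.fit(flat_tokens, flat_tags)


def tag_sentence(tokens):
    """Tag each token on its own, ignoring the surrounding sentence."""
    return list(zip(tokens, tagger.predict(list(tokens))))


tagged = tag_sentence(("the", "cat", "runs"))
print(tagged)
```

Because this sketch ignores context, it cannot disambiguate a word that takes different tags in different sentences; it is only meant as a cheap baseline.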

Regression

Similar to classification, we have TfidfLinearSVR and {}.
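The regression counterpart can be sketched the same way with scikit-learn alone: TF-IDF features followed by a linear support vector regressor. This is an assumed equivalent for illustration, not the package's implementation; the review texts and ratings are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVR

# Toy review texts with numeric ratings, invented for illustration only.
texts = [
    "great food and friendly staff",
    "excellent service, will come again",
    "cold food and rude staff",
    "terrible service, never again",
]
ratings = [5.0, 4.5, 1.0, 1.5]

# TF-IDF features followed by a linear support vector regressor.
regressor = make_pipeline(
    TfidfVectorizer(),
    LinearSVR(random_state=0, max_iter=10000),
)
regressor.fit(texts, ratings)

predicted = regressor.predict(["friendly staff and great service"])
print(float(predicted[0]))
```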

Text cleaning

CleanTextTransformer can be plugged into any sklearn pipeline.

transformer = CleanTextTransformer(
    replace_dates_with='DATE',
    replace_times_with='TIME',
    replace_emails_with='EMAIL',
    replace_numbers_with='NUMBER',
    replace_percentages_with='PERCENT',
    replace_money_with='MONEY',
    replace_hashtags_with='HASHTAG',
    replace_handles_with='HANDLE',
    expand_contractions=True,
)
transformer.transform(["#now @me I'll log 80% entries are due by January 4th, 2017at 8:00pm contact me at chenghao@armorblox.com send me $500.00 now 3,415"])
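What "plugged into any sklearn pipeline" means in practice is that the cleaner exposes fit and transform. The sketch below uses a hypothetical stand-in, SimpleCleaner, that handles only e-mail addresses and hashtags with simple regular expressions; it is not the package's CleanTextTransformer, and the regexes are deliberately simplistic.

```python
import re

from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline


class SimpleCleaner(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in: replaces e-mail addresses and hashtags only."""

    def __init__(self, replace_emails_with="EMAIL", replace_hashtags_with="HASHTAG"):
        self.replace_emails_with = replace_emails_with
        self.replace_hashtags_with = replace_hashtags_with

    def fit(self, X, y=None):
        return self  # stateless: there is nothing to learn

    def transform(self, X):
        cleaned = []
        for text in X:
            # Replace e-mails first so the address is treated as one unit.
            text = re.sub(r"\S+@\S+\.\S+", self.replace_emails_with, text)
            text = re.sub(r"#\w+", self.replace_hashtags_with, text)
            cleaned.append(text)
        return cleaned


cleaned = SimpleCleaner().transform(["#now contact me at someone@example.com"])
print(cleaned)  # ['HASHTAG contact me at EMAIL']

# The same cleaner used as the first step of a feature pipeline.
features = make_pipeline(SimpleCleaner(), TfidfVectorizer()).fit_transform(
    ["#now contact me at someone@example.com", "no tags here"]
)
```

Placing the cleaner before the vectorizer means placeholder tokens such as EMAIL become ordinary vocabulary entries, so all addresses share one feature.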

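Benchmarks

Classification

All scores are test scores on datasets from Huggingface's nlp library. See the benchmarks directory for details.

Sogou News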

              precision    recall  f1-score   support

           0       0.96      0.95      0.95     12000
           1       0.93      0.95      0.94     12000
           2       0.95      0.97      0.96     12000
           3       0.95      0.96      0.96     12000
           4       0.96      0.92      0.94     12000

    accuracy                           0.95     60000
   macro avg       0.95      0.95      0.95     60000
weighted avg       0.95      0.95      0.95     60000

GLUE/CoLA

              precision    recall  f1-score   support

           0       0.00      0.00      0.00       322
           1       0.69      1.00      0.82       721

    accuracy                           0.69      1043
   macro avg       0.35      0.50      0.41      1043
weighted avg       0.48      0.69      0.57      1043
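The all-zero row for class 0 combined with a recall of 1.00 for class 1 is the pattern produced when every example is predicted as class 1. The check below uses only the support counts from the table above (322 and 721) and reproduces the same averages with scikit-learn; that the model behaved exactly this way is an inference from the table, not something the benchmark states.

```python
from sklearn.metrics import accuracy_score, classification_report, f1_score

# Support counts taken from the table: 322 examples of class 0, 721 of class 1.
y_true = [0] * 322 + [1] * 721
# Majority-class predictor: every example is labeled 1.
y_pred = [1] * len(y_true)

accuracy = accuracy_score(y_true, y_pred)
macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
weighted_f1 = f1_score(y_true, y_pred, average="weighted", zero_division=0)

print(classification_report(y_true, y_pred, zero_division=0))
```

An accuracy equal to the majority-class share (721 / 1043) means the baseline has learned nothing beyond the label distribution on this task.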

GLUE/SST2

              precision    recall  f1-score   support

           0       0.84      0.77      0.80       428
           1       0.79      0.86      0.82       444

    accuracy                           0.81       872
   macro avg       0.82      0.81      0.81       872
weighted avg       0.82      0.81      0.81       872

Yelp

              precision    recall  f1-score   support

           0       0.94      0.94      0.94     19000
           1       0.94      0.94      0.94     19000

    accuracy                           0.94     38000
   macro avg       0.94      0.94      0.94     38000
weighted avg       0.94      0.94      0.94     38000

AG News

              precision    recall  f1-score   support

           0       0.94      0.91      0.92      1900
           1       0.96      0.98      0.97      1900
           2       0.90      0.89      0.89      1900
           3       0.89      0.91      0.90      1900

    accuracy                           0.92      7600
   macro avg       0.92      0.92      0.92      7600
weighted avg       0.92      0.92      0.92      7600

Allocine

              precision    recall  f1-score   support

           0       0.93      0.93      0.93     10408
           1       0.92      0.93      0.92      9592

    accuracy                           0.93     20000
   macro avg       0.93      0.93      0.93     20000
weighted avg       0.93      0.93      0.93     20000

Tagging

Default CharTfidfTagger

POS: 18458943595 CoNLL score: 0.15840812513116917

old-fashioned-nlp 0.1.3
