A fast, multithreaded C++ implementation of NLTK BLEU with a Python wrapper.


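fast-bleu Package

This is a fast, multithreaded C++ implementation of NLTK BLEU with a Python wrapper. It computes BLEU and SelfBLEU scores against a fixed reference set, and it can efficiently return (Self)BLEU for several (maximum) n-gram orders at the same time (e.g. BLEU-2, BLEU-3, etc.).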

Installation

Linux and WSL

Install the latest stable release from PyPI:

pip install --user fast-bleu

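macOS

macOS uses clang, which does not support OpenMP. One workaround is to first install gcc with brew install gcc. This adds gcc-specific binaries (for example, gcc-10 and g++-10).

To change the default compiler, add an option to the installation command. You can then install the latest stable release from PyPI with the following command: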

^{pr2}$

Windows

Not tested yet!

Sample Usage

Here is an example that computes BLEU-2, BLEU-3, SelfBLEU-2, and SelfBLEU-3:

>>> from fast_bleu import BLEU, SelfBLEU
>>> ref1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that',
...         'ensures', 'that', 'the', 'military', 'will', 'forever',
...         'heed', 'Party', 'commands']
>>> ref2 = ['It', 'is', 'the', 'guiding', 'principle', 'which',
...         'guarantees', 'the', 'military', 'forces', 'always',
...         'being', 'under', 'the', 'command', 'of', 'the', 'Party']
>>> ref3 = ['It', 'is', 'the', 'practical', 'guide', 'for', 'the',
...         'army', 'always', 'to', 'heed', 'the', 'directions',
...         'of', 'the', 'party']
>>> hyp1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which',
...         'ensures', 'that', 'the', 'military', 'always',
...         'obeys', 'the', 'commands', 'of', 'the', 'party']
>>> hyp2 = ['he', 'read', 'the', 'book', 'because', 'he', 'was',
...         'interested', 'in', 'world', 'history']
>>> list_of_references = [ref1, ref2, ref3]
>>> hypotheses = [hyp1, hyp2]
>>> weights = {'bigram': (1/2., 1/2.), 'trigram': (1/3., 1/3., 1/3.)}
>>> bleu = BLEU(list_of_references, weights)
>>> bleu.get_score(hypotheses)
{'bigram': [0.7453559924999299, 0.0191380231127159], 'trigram': [0.6240726901657495, 0.013720869575946234]}

which means:

  • BLEU-2 for hyp1 is 0.7453559924999299

  • BLEU-2 for hyp2 is 0.0191380231127159

  • BLEU-3 for hyp1 is 0.6240726901657495

  • BLEU-3 for hyp2 is 0.013720869575946234

>>> self_bleu = SelfBLEU(list_of_references, weights)
>>> self_bleu.get_score()
{'bigram': [0.25819888974716115, 0.3615507630310936, 0.37080992435478316], 'trigram': [0.07808966062765045, 0.20140620205719248, 0.21415334758254043]}

which means:

  • SelfBLEU-2 for ref1 is 0.25819888974716115

  • SelfBLEU-2 for ref2 is 0.3615507630310936

  • SelfBLEU-2 for ref3 is 0.37080992435478316

  • SelfBLEU-3 for ref1 is 0.07808966062765045

  • SelfBLEU-3 for ref2 is 0.20140620205719248

  • SelfBLEU-3 for ref3 is 0.21415334758254043

Caution: during computation, each token of the reference set is converted to its string form.
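A practical consequence is that tokens of different types can end up identical once converted. The sketch below makes the conversion explicit before any scoring; as_string_tokens is a hypothetical helper, and it assumes the conversion behaves like Python's built-in str(), which the description above does not spell out.

```python
def as_string_tokens(sentence):
    """Convert every token to str so that comparisons are type-independent."""
    return [str(token) for token in sentence]


mixed = ['room', 101, 'costs', 9.5]
text_only = ['room', '101', 'costs', '9.5']

print(as_string_tokens(mixed))                                  # ['room', '101', 'costs', '9.5']
print(as_string_tokens(mixed) == as_string_tokens(text_only))   # True
```

Normalizing tokens yourself, for example when working with integer vocabulary ids, keeps references and hypotheses in the same representation regardless of what the library does internally.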

For further details, refer to the documentation provided in the source code.

Citation

Please cite our paper if it helps with your research.

@inproceedings{alihosseini-etal-2019-jointly,
    title = {Jointly Measuring Diversity and Quality in Text Generation Models},
    author = {Alihosseini, Danial  and
      Montahaei, Ehsan  and
      Soleymani Baghshah, Mahdieh},
    booktitle = {Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation},
    month = {jun},
    year = {2019},
    address = {Minneapolis, Minnesota},
    publisher = {Association for Computational Linguistics},
    url = {https://www.aclweb.org/anthology/W19-2311},
    doi = {10.18653/v1/W19-2311},
    pages = {90--98},
}
