archives_org_latin_toolkit: Python package on PyPI

A tool to analyze and search across http://www.cs.cmu.edu/~dbamman/latin.html

Detailed description of the archives_org_latin_toolkit Python project

https://coveralls.io/repos/github/PonteIneptique/archives_org_latin_toolkit/badge.svg?branch=master

https://travis-ci.org/PonteIneptique/archives_org_latin_toolkit.svg?branch=master

https://badge.fury.io/py/archives_org_latin_toolkit.svg

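What?

This software is intended for use with the 11K Latin Texts produced by David Bamman (http://www.cs.cmu.edu/~dbamman/latin.html). It supports only the plain-text format and the metadata CSV file from the GitHub repo. It has been tested with Python 3 only. New features and backward-compatibility support are welcome.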

How to install?

Development version:
- Clone the repository: git clone https://github.com/ponteineptique/archives_org_latin_toolkit.git
- Go to the directory: cd archives_org_latin_toolkit
- Install the source with the develop option: python setup.py develop
With pip:
- Install from pip: pip install archives_org_latin_toolkit

Example

The following example should run with the data in tests/test_data. The example can be run with python example.py

    # We import the main classes from the module
    from archives_org_latin_toolkit import Repo, Metadata
    from pprint import pprint

    # We initiate a Metadata object and a Repo object
    metadata = Metadata("./test/test_data/latin_metadata.csv")
    # We want the text to be set in lowercase
    repo = Repo("./test/test_data/archive_org_latin/", metadata=metadata, lowercase=True)

    # We define a list of tokens we want to search for
    tokens = ["ecclesiastico", "ecclesia", "ecclesiis", '"']

    # We instantiate a result storage
    results = []

    # We iterate over the texts containing those tokens.
    # Note that we need to unpack the list
    for text_matching in repo.find(*tokens):
        # For each text, we iterate over the embeddings found in the text.
        # We want 3 words left, 3 words right,
        # and we want to keep the original token (default behaviour)
        for embedding in text_matching.find_embedding(*tokens, window=3, ignore_center=False):
            # We add it to the results
            results.append(embedding)

    # We print the result (list of list of strings)
    pprint(results)
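To make the window and ignore_center options more concrete, here is a minimal standalone sketch of context-window extraction around matching tokens. It is not the toolkit's implementation: the function name find_windows, the whitespace tokenization, the truncation at text boundaries, and the sample sentence are all assumptions made for illustration only.

```python
from typing import Iterator, List, Sequence


def find_windows(
    words: Sequence[str],
    targets: Sequence[str],
    window: int = 3,
    ignore_center: bool = False,
) -> Iterator[List[str]]:
    """Yield the words around each occurrence of a target token.

    Hypothetical helper: it mimics the idea of "N words left, N words right",
    not the actual behaviour of the library's find_embedding method.
    """
    wanted = set(targets)
    for i, word in enumerate(words):
        if word not in wanted:
            continue
        # Slices are clamped at the edges, so windows near the start or
        # end of the text are simply shorter.
        left = list(words[max(0, i - window):i])
        right = list(words[i + 1:i + 1 + window])
        center = [] if ignore_center else [word]
        yield left + center + right


# Made-up sample sentence, lowercased and split on whitespace (assumption).
sample = "In principio erat ecclesia et ecclesia erat apud populum".lower().split()

windows = list(find_windows(sample, ["ecclesia"], window=2))
windows_without_center = list(
    find_windows(sample, ["ecclesia"], window=2, ignore_center=True)
)
```

With ignore_center=True the matched token itself is dropped and only its neighbours are kept, which may be convenient when the contexts are used to compare the surroundings of different tokens.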

archives_org_latin_toolkit 0.0.2
