Read RIS files into dictionaries via a generator, suitable for large files

Detailed description of the RISparser Python project


Usage

>>> import os
>>> from pprint import pprint
>>> from RISparser import readris
>>> filepath = 'tests/example_full.ris'
>>> with open(filepath, 'r') as bibliography_file:
...     entries = readris(bibliography_file)
...     for entry in entries:
...         print(entry['id'])
...         print(entry['first_authors'])
12345
['Marx, Karl', 'Lindgren, Astrid']
12345
['Marxus, Karlus', 'Lindgren, Astrid']
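Since entries are produced by a generator, they can be consumed one at a time instead of being loaded all at once. The sketch below illustrates that behaviour with a stand-in generator; `fake_entries` is hypothetical and only mimics a lazy reader, it is not part of RISparser.

```python
from itertools import islice
from typing import Dict, Iterator


def fake_entries() -> Iterator[Dict[str, str]]:
    """Hypothetical stand-in for a lazy entry reader."""
    for number in range(1, 6):
        yield {"id": str(number)}


entries = fake_entries()

# Take only the first two entries; the rest are not produced yet.
first_two = list(islice(entries, 2))

# A generator is consumed as it is read, so this continues at entry 3.
remaining = [entry["id"] for entry in entries]

print([entry["id"] for entry in first_two])
print(remaining)
```

With a real file, iteration has to finish before the file is closed, which is why the loop in the usage example sits inside the `with` block.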

Example RIS entry

1.
TY  - JOUR
ID  - 12345
T1  - Title of reference
A1  - Marx, Karl
A1  - Lindgren, Astrid
A2  - Glattauer, Daniel
Y1  - 2014//
N2  - BACKGROUND: Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.  RESULTS: Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. CONCLUSIONS: Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium.
KW  - Pippi
KW  - Nordwind
KW  - Piraten
JF  - Lorem
JA  - lorem
VL  - 9
IS  - 3
SP  - e0815
CY  - United States
PB  - Fun Factory
PB  - Fun Factory USA
SN  - 1932-6208
M1  - 1008150341
L2  - http://example.com
ER  -
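To make the structure of such an entry concrete, here is a minimal sketch of how tagged lines could be split into a dictionary. It is not RISparser's implementation: the function name `parse_ris_lines`, the regular expression, and the small `LIST_TAGS` set are assumptions chosen for illustration, and the raw two-character tags are kept as keys.

```python
import re
from typing import Dict, Iterable, Iterator, List, Union

# Hypothetical pattern, not taken from RISparser: a two-character tag,
# two spaces, a hyphen, and an optional value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  -(?: (.*))?$")

# Assumption for this sketch: tags that may repeat and are collected in lists.
LIST_TAGS = {"A1", "A2", "KW"}

Entry = Dict[str, Union[str, List[str]]]


def parse_ris_lines(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one dict per reference, keyed by the raw RIS tag."""
    entry: Entry = {}
    for raw in lines:
        match = TAG_LINE.match(raw.rstrip("\r\n"))
        if match is None:
            continue  # skip blank lines and counters such as "1."
        tag, value = match.group(1), (match.group(2) or "").strip()
        if tag == "ER":
            # End of reference: hand the finished entry to the caller.
            yield entry
            entry = {}
        elif tag in LIST_TAGS:
            entry.setdefault(tag, []).append(value)
        else:
            entry[tag] = value  # in this sketch a repeated tag keeps the last value


sample = """1.
TY  - JOUR
ID  - 12345
A1  - Marx, Karl
A1  - Lindgren, Astrid
KW  - Pippi
ER  -
"""

entries = list(parse_ris_lines(sample.splitlines()))
print(entries[0]["ID"])
print(entries[0]["A1"])
```

The sketch ignores details a real parser has to handle, such as values that continue on the next line.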

Tag-to-key mapping

Most fields contain string values, but some, such as first_authors (A1), are parsed into lists.

Complete list of list-type tags

>>> from RISparser.config import LIST_TYPE_TAGS
>>> pprint(LIST_TYPE_TAGS)
('A1', 'A2', 'A3', 'A4', 'AU', 'KW', 'N1')
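Because a parsed entry mixes plain strings with lists, code that consumes entries often needs to treat both uniformly. The sketch below shows one way to do that; `as_list` is a hypothetical helper, and the `entry` dict is hand-written sample data that only mimics the shape of a parsed entry.

```python
from typing import List, Union


def as_list(value: Union[str, List[str], None]) -> List[str]:
    """Return list-type fields as a list and wrap string fields in one."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


# Hand-written sample shaped like a parsed entry (not real parser output).
entry = {
    "id": "12345",
    "first_authors": ["Marx, Karl", "Lindgren, Astrid"],
    "keywords": ["Pippi", "Nordwind", "Piraten"],
}

authors = "; ".join(as_list(entry.get("first_authors")))
ids = as_list(entry.get("id"))
missing = as_list(entry.get("doi"))
print(authors)
```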

Complete default mapping

>>> from RISparser.config import TAG_KEY_MAPPING
>>> pprint(TAG_KEY_MAPPING)
{'A1': 'first_authors',
 'A2': 'secondary_authors',
 'A3': 'tertiary_authors',
 'A4': 'subsidiary_authors',
 'AB': 'abstract',
 'AD': 'author_address',
 'AN': 'accession_number',
 'AU': 'authors',
 'C1': 'custom1',
 'C2': 'custom2',
 'C3': 'custom3',
 'C4': 'custom4',
 'C5': 'custom5',
 'C6': 'custom6',
 'C7': 'custom7',
 'C8': 'custom8',
 'CA': 'caption',
 'CN': 'call_number',
 'CY': 'place_published',
 'DA': 'date',
 'DB': 'name_of_database',
 'DO': 'doi',
 'DP': 'database_provider',
 'EP': 'end_page',
 'ER': 'end_of_reference',
 'ET': 'edition',
 'ID': 'id',
 'IS': 'number',
 'J2': 'alternate_title1',
 'JA': 'alternate_title2',
 'JF': 'alternate_title3',
 'JO': 'journal_name',
 'KW': 'keywords',
 'L1': 'file_attachments1',
 'L2': 'file_attachments2',
 'L4': 'figure',
 'LA': 'language',
 'LB': 'label',
 'M1': 'note',
 'M3': 'type_of_work',
 'N1': 'notes',
 'N2': 'abstract',
 'NV': 'number_of_Volumes',
 'OP': 'original_publication',
 'PB': 'publisher',
 'PY': 'year',
 'RI': 'reviewed_item',
 'RN': 'research_notes',
 'RP': 'reprint_edition',
 'SE': 'version',
 'SN': 'issn',
 'SP': 'start_page',
 'ST': 'short_title',
 'T1': 'primary_title',
 'T2': 'secondary_title',
 'T3': 'tertiary_title',
 'TA': 'translated_author',
 'TI': 'title',
 'TT': 'translated_title',
 'TY': 'type_of_reference',
 'UK': 'unknown_tag',
 'UR': 'url',
 'VL': 'volume',
 'Y1': 'publication_year',
 'Y2': 'access_date'}

Overriding the key mapping

The parser uses TAG_KEY_MAPPING by default. To override it, call readris() with a custom mapping.

>>> import os
>>> from RISparser import readris, TAG_KEY_MAPPING
>>> from pprint import pprint
>>> filepath = 'tests/example_full.ris'
>>> # Copy the default so the module-level mapping is not modified.
>>> mapping = dict(TAG_KEY_MAPPING)
>>> mapping["SP"] = "pages_this_is_my_fun"
>>> with open(filepath, 'r') as bibliography_file:
...     entries = list(readris(bibliography_file, mapping=mapping))
...     pprint(sorted(entries[0].keys()))
['abstract',
 'alternate_title2',
 'alternate_title3',
 'file_attachments2',
 'first_authors',
 'id',
 'issn',
 'keywords',
 'note',
 'number',
 'pages_this_is_my_fun',
 'place_published',
 'primary_title',
 'publication_year',
 'publisher',
 'secondary_authors',
 'type_of_reference',
 'url',
 'volume']
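Assigning an imported mapping to a new name does not copy it, so changing a key through that name also changes the shared default. A minimal sketch of the difference, using a small stand-in dictionary (`DEFAULT_MAPPING` is hypothetical and holds only three of the tags listed earlier):

```python
# Stand-in for the library's default mapping (hypothetical, truncated).
DEFAULT_MAPPING = {"SP": "start_page", "EP": "end_page", "VL": "volume"}

# Aliasing: both names refer to the same dict, so the default changes too.
alias = DEFAULT_MAPPING
alias["EP"] = "last_page"

# Copying: the custom mapping can be changed without touching the default.
custom = dict(DEFAULT_MAPPING)
custom["SP"] = "pages_this_is_my_fun"

print(DEFAULT_MAPPING["EP"])  # changed through the alias
print(DEFAULT_MAPPING["SP"])  # unaffected by the copy
print(custom["SP"])
```

Working on a copy keeps later calls that rely on the default mapping unaffected by the customization.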

Testing

Run the tests with pytest from the command line:

$ cd <path_to_the_repo>/RISparser
$ py.test
