I'm trying to parse a db file of more than 4 GB with Python.
Sample from the database file:
% Tags relating to '217.89.104.48 - 217.89.104.63'
% RIPE-USER-RESOURCE
inetnum: 194.243.227.240 - 194.243.227.255
netname: PRINCESINDUSTRIEALIMENTARI
remarks: INFRA-AW
descr: PRINCES INDUSTRIE ALIMENTARI
descr: Provider Local Registry
descr: BB IBS
country: IT
admin-c: DUMY-RIPE
tech-c: DUMY-RIPE
status: ASSIGNED PA
notify: order.manager2@telecomitalia.it
mnt-by: INTERB-MNT
changed: unread@ripe.net 20000101
source: RIPE
remarks: ****************************
remarks: * THIS OBJECT IS MODIFIED
remarks: * Please note that all data that is generally regarded as personal
remarks: * data has been removed from this object.
remarks: * To view the original object, please query the RIPE Database at:
remarks: * http://www.ripe.net/whois
remarks: ****************************
% Tags relating to '194.243.227.240 - 194.243.227.255'
% RIPE-USER-RESOURCE
inetnum: 194.16.216.176 - 194.16.216.183
netname: SE-CARLSTEINS
descr: CARLSTEINS TRAFIK AB
org: ORG-CTA17-RIPE
country: SE
admin-c: DUMY-RIPE
tech-c: DUMY-RIPE
status: ASSIGNED PA
notify: mntripe@telia.net
mnt-by: TELIANET-LIR
changed: unread@ripe.net 20000101
source: RIPE
remarks: ****************************
remarks: * THIS OBJECT IS MODIFIED
remarks: * Please note that all data that is generally regarded as personal
remarks: * data has been removed from this object.
remarks: * To view the original object, please query the RIPE Database at:
remarks: * http://www.ripe.net/whois
remarks: ****************************
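I want to parse each block that starts with "% Tags relating to".
From each block I want to extract the inetnum and the first descr.
This is what I have so far (updated):
[code block not preserved]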
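Since the file is over 4 GB, you don't want to read it all at once with f.read().
Instead, use the file object as an iterator: when you iterate over a file, you get one line after another.
A generator built on that iteration should do the job, and it can be consumed in an ordinary for loop.
I must admit I did not use the "% Tags relating to" lines; I assume that all the descr lines of an object are consecutive.
The same approach covers both cases: getting only the first description, or getting the inetnum together with the first descr.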
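As a rough illustration of the generator approach, here is a minimal sketch rather than the code from the original answer. The names iter_inetnum_records and parse_file are made up for this example. It assumes every object starts with an "inetnum:" line, so that line serves as the record boundary and the "% Tags relating to" lines are simply skipped. The latin-1 encoding is also an assumption about the dump.

```python
def iter_inetnum_records(lines):
    """Yield (inetnum, first_descr) for each object in an iterable of lines.

    Assumption: every object starts with an "inetnum:" line, so that line
    is used as the record boundary instead of "% Tags relating to".
    """
    inetnum = None      # address range of the object currently being read
    first_descr = None  # first "descr:" value seen for that object
    for line in lines:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # comment or blank line, no "key: value" structure
        key, value = key.strip(), value.strip()
        if key == "inetnum":
            if inetnum is not None:
                yield inetnum, first_descr  # previous object is complete
            inetnum, first_descr = value, None
        elif key == "descr" and inetnum is not None and first_descr is None:
            first_descr = value  # later descr lines are ignored
    if inetnum is not None:
        yield inetnum, first_descr  # flush the last object


def parse_file(path):
    """Stream a dump from disk; the file object is iterated line by line."""
    # latin-1 is an assumption about the dump's encoding; adjust as needed.
    with open(path, encoding="latin-1") as f:
        yield from iter_inetnum_records(f)
```

It could then be used as "for inetnum, descr in parse_file(path): ...", where path points to the dump. Only one line and one pending record are held in memory at a time, so the 4 GB size is not a problem. An object with no descr line yields None as its description.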