I wrote a simple script that collects a list of titles from a JSON file and writes that list to a text file.
The result looks like this:
Animal geography
Autobiogeography
Chorography
Economic geography
Footloose industry
Geomorphometry
Health geography
Human geography
Military geography
Philosophy of geography
Physical geography
Political geography
Regional geography
Satirical cartography
Settlement geography
Transport geography
Vernacular geography
Visual geography
Category:Cartography
Category:Economic geography
Category:Geodemography
Category:Human geography
Category:Military geography
Category:Physical geography
Category:Political geography
Category:Regional geography
Category:Settlement geography
Category:Topography
Category:Toponymy
Category:Transportation geography
Category:Vernacular geography
Category:Geography by place
Problem:
The problem I am facing now is how to split the text file into two parts.
The first part is a text file containing:
Animal geography
Autobiogeography
Chorography
Economic geography
Footloose industry
Geomorphometry
Health geography
Human geography
Military geography
Philosophy of geography
Physical geography
Political geography
Regional geography
Satirical cartography
Settlement geography
Transport geography
Vernacular geography
Visual geography
and the second is a text file containing the lines that start with the word "Category":
Category:Cartography
Category:Economic geography
Category:Geodemography
Category:Human geography
Category:Military geography
Category:Physical geography
Category:Political geography
Category:Regional geography
Category:Settlement geography
Category:Topography
Category:Toponymy
Category:Transportation geography
Category:Vernacular geography
Category:Geography by place
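I have absolutely no idea how to do this. Please give me some advice.
Sorry that the title is so confusing; I don't know how to explain my problem.
Thank you
Edit
For example, I have extracted all of the titles from this API (https://en.wikipedia.org/w/api.php?action=query&format=json&list=categorymembers&cmtitle=Category%3ABranches%20of%20geography&cmlimit=100):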
{
"batchcomplete":"",
"query":{
"categorymembers":[
{
"pageid":5259784,
"ns":0,
"title":"Animal geography"
},
{
"pageid":8670379,
"ns":0,
"title":"Autobiogeography"
},
{
"pageid":4254743,
"ns":0,
"title":"Chorography"
},
{
"pageid":177512,
"ns":0,
"title":"Economic geography"
},
{
"pageid":7907104,
"ns":0,
"title":"Footloose industry"
},
{
"pageid":5155886,
"ns":0,
"title":"Geomorphometry"
},
{
"pageid":2596739,
"ns":0,
"title":"Health geography"
},
{
"pageid":13372,
"ns":0,
"title":"Human geography"
},
{
"pageid":1794929,
"ns":0,
"title":"Military geography"
},
{
"pageid":5886597,
"ns":0,
"title":"Philosophy of geography"
},
{
"pageid":23263,
"ns":0,
"title":"Physical geography"
},
{
"pageid":1845092,
"ns":0,
"title":"Political geography"
},
{
"pageid":711230,
"ns":0,
"title":"Regional geography"
},
{
"pageid":42099944,
"ns":0,
"title":"Satirical cartography"
},
{
"pageid":33566568,
"ns":0,
"title":"Settlement geography"
},
{
"pageid":9710174,
"ns":0,
"title":"Transport geography"
},
{
"pageid":24644075,
"ns":0,
"title":"Vernacular geography"
},
{
"pageid":5329197,
"ns":0,
"title":"Visual geography"
},
{
"pageid":716309,
"ns":14,
"title":"Category:Cartography"
},
{
"pageid":2021084,
"ns":14,
"title":"Category:Economic geography"
},
{
"pageid":2245786,
"ns":14,
"title":"Category:Geodemography"
},
{
"pageid":1111700,
"ns":14,
"title":"Category:Human geography"
},
{
"pageid":7774333,
"ns":14,
"title":"Category:Military geography"
},
{
"pageid":2153059,
"ns":14,
"title":"Category:Physical geography"
},
{
"pageid":1898464,
"ns":14,
"title":"Category:Political geography"
},
{
"pageid":6645804,
"ns":14,
"title":"Category:Regional geography"
},
{
"pageid":44706236,
"ns":14,
"title":"Category:Settlement geography"
},
{
"pageid":6517504,
"ns":14,
"title":"Category:Topography"
},
{
"pageid":1086902,
"ns":14,
"title":"Category:Toponymy"
},
{
"pageid":41335672,
"ns":14,
"title":"Category:Transportation geography"
},
{
"pageid":24727902,
"ns":14,
"title":"Category:Vernacular geography"
}
]
}
}
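I would really appreciate it if you could point me in the right direction to solve this problem.
Thank you all for your help and guidance.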
To test whether a line in the file starts with "Category:", just do the following:
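A minimal sketch of that check, assuming Python (the thread does not name a language); `is_category` is a hypothetical helper name. `str.startswith` tests only the prefix, whereas `"Category:" in line` would also match the text anywhere in the line.

```python
def is_category(line: str) -> bool:
    """Return True if the line begins with the "Category:" prefix."""
    return line.startswith("Category:")


# Two sample titles taken from the list in the question.
titles = ["Animal geography", "Category:Cartography"]
flags = [is_category(title) for title in titles]
print(flags)  # [False, True]
```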
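Thanks to 李凯因斯基 for getting me to use "in".
It turned out to be less complicated than I thought. Thank you all for your help and guidance.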
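You can try this:
This reads the file file.txt line by line and tests whether each line starts with "Category". If it does, the line is added to the category array; if not, it is added to the data array. After the file has been processed, the program joins all the lines and writes them to category.txt and data.txt.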
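The steps described above can be sketched as follows, assuming Python. The function name `split_titles` and the temporary-directory demo are illustrative assumptions; only the file names file.txt, category.txt and data.txt and the `category`/`data` arrays come from the description.

```python
import tempfile
from pathlib import Path


def split_titles(source, category_out, data_out):
    """Split the lines of `source` into category lines and all other lines."""
    category, data = [], []
    with source.open(encoding="utf-8") as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            if not line:
                continue  # skip blank lines
            if line.startswith("Category"):
                category.append(line)
            else:
                data.append(line)
    # Join the collected lines and write each group to its own file.
    category_out.write_text("".join(l + "\n" for l in category), encoding="utf-8")
    data_out.write_text("".join(l + "\n" for l in data), encoding="utf-8")
    return len(category), len(data)


# Demo on a small sample written to a temporary directory (an assumption,
# so the sketch runs without an existing file.txt).
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    source = tmp_dir / "file.txt"
    source.write_text(
        "Animal geography\nChorography\nCategory:Cartography\nCategory:Toponymy\n",
        encoding="utf-8",
    )
    counts = split_titles(source, tmp_dir / "category.txt", tmp_dir / "data.txt")
    category_text = (tmp_dir / "category.txt").read_text(encoding="utf-8")
    data_text = (tmp_dir / "data.txt").read_text(encoding="utf-8")

print(counts)  # (2, 2)
```

To run it on the real list, call `split_titles(Path("file.txt"), Path("category.txt"), Path("data.txt"))` from the directory that holds file.txt.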
Hope this helps.