How to easily extract the ID from an iTunes URL with Python
iTunes URLs look roughly like this:
http://itunes.apple.com/us/album/break-of-dawn/id472335316?ign-mpt=uo%3D
http://itunes.apple.com/us/app/monopoly-here-now-the-world/id299110947?mt=8
http://itunes.apple.com/es/app/revista-/id397781759?mt=8%3Futm_so%3Dtwitter
http://itunes.apple.com/app/id426698291&mt=8"
http://itunes.apple.com/us/album/respect-the-bull-single/id4899
http://itunes.apple.com/us/album/id6655669
How can I easily extract the id number?
For example:
get_id("http://itunes.apple.com/us/album/brawn/id472335316?ign-mpt=uo")
#returns 472335316
3 Answers
1
Without regular expressions (for no particular reason):
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2

def get_id(url):
    """Extract an integer id from iTunes `url`.

    Raise ValueError for invalid strings.
    """
    parts = urlsplit(url)
    if parts.hostname == 'itunes.apple.com':
        idstr = parts.path.rpartition('/')[2]  # extract 'id123456'
        if idstr.startswith('id'):
            try:
                return int(idstr[2:])
            except ValueError:
                pass
    raise ValueError("Invalid url: %r" % (url,))
Example
print(get_id("http://itunes.apple.com/us/album/brawn/id472335316?ign-mpt=uo"))
# -> 472335316
2
You can use a regular expression, something like "/id(\\d+).*"; the first capture group will then hold the id you want. I believe that in Python you can also write it as the raw string r"/id(\d+).*".
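To make the suggestion above concrete, here is a minimal sketch of how that pattern could be applied with Python's `re` module. The names `escaped`, `raw`, `ID_PATTERN` and `first_group` are only illustrative, and the URL is one of the samples from the question. Note that `re.search` is used rather than `re.match`, since the pattern does not start at the beginning of the URL.

```python
import re

# The two spellings mentioned in the answer denote the same pattern string:
# "\\d" in a normal string literal equals "\d" in a raw string literal.
escaped = "/id(\\d+).*"
raw = r"/id(\d+).*"
assert escaped == raw

ID_PATTERN = re.compile(raw)

url = "http://itunes.apple.com/us/app/monopoly-here-now-the-world/id299110947?mt=8"

# search() scans the whole string; group(1) is the first capture group.
match = ID_PATTERN.search(url)
first_group = match.group(1) if match else None  # digits as a string, or None
print(first_group)  # 299110947
```

The captured value is a string; wrap it in `int()` if a number is needed.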
9
import re

def get_id(toParse):
    return re.search(r'id(\d+)', toParse).group(1)
I'll leave the error handling to you...
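For what it's worth, one possible way to add that error handling is sketched below. It is not part of the answer above: the function name `get_id_checked` is hypothetical, and anchoring the pattern on "/id" (instead of a bare "id") is an assumption made here so that the letters "id" inside another word are less likely to match. It returns an `int` and raises `ValueError` when no id is found.

```python
import re

# Assumption: the id always appears as a path segment starting with "/id".
_ID_RE = re.compile(r"/id(\d+)")

def get_id_checked(url):
    """Return the numeric id from an iTunes-style URL as an int.

    Raise ValueError when no '/id<digits>' segment is present.
    """
    match = _ID_RE.search(url)
    if match is None:
        raise ValueError("No id found in url: %r" % (url,))
    return int(match.group(1))

# Sample URLs taken from the question.
sample_urls = [
    "http://itunes.apple.com/us/album/break-of-dawn/id472335316?ign-mpt=uo%3D",
    "http://itunes.apple.com/es/app/revista-/id397781759?mt=8%3Futm_so%3Dtwitter",
    "http://itunes.apple.com/us/album/id6655669",
]
for sample in sample_urls:
    print(get_id_checked(sample))
```

Unlike the `urlsplit`-based answer, this sketch does not check the hostname, so it accepts any string containing "/id" followed by digits.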