Iterating over URL parameters with the urlparse library
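I need some help with something I am trying to do.
I want to modify the value of each key in a dictionary (in this case a param), but only one parameter at a time, with the new value coming from self.fooz on every iteration of the loop.
For example, given the URL somesite.com?id=6&name=Bill, it would become somesite.com?id=<self.fooz>&name=Bill (looping over every individual fooz), and then somesite.com?id=6&name=<self.fooz> (again looping over every individual fooz).
Finally, generate the full_param_vector and full_param values as described below.
Can anyone help?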
Here is what I have done so far:
- Imported a set of raw paths via self.path_object.
- Parsed each path after the ? to get all of the original parameter key/value pairs (via parse_after).
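For reference, the parsing step described above can be sketched as follows. This is a minimal illustration written for Python 3, where the urlparse module lives in urllib.parse; it uses the example URL from the question and lets urlsplit find the query string instead of slicing at the '?' by hand.

```python
from urllib.parse import parse_qs, urlsplit

# Example URL from the question (no scheme, as written there)
url = "somesite.com?id=6&name=Bill"

# urlsplit separates the query string from the rest of the URL
query = urlsplit(url).query

# parse_qs maps each key to a list of values
params = parse_qs(query)
print(params)
```

Each value is a list because a key may appear more than once in a query string.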
I wrote some pseudocode describing what I am trying to achieve:
if self.path_object is not None:
    dictpath = {}
    for path in self.path_object:
        # path.pathToScan returns a full URL, e.g. somesite.com?id=6&name=Bill
        # parse_after is a dict holding only the parameters, like: {u'id': [u'2'], u'name': [u'Dog']}
        parse_after = urlparse.parse_qs(path.pathToScan[path.pathToScan.find('?') + 1:], keep_blank_values=0, strict_parsing=0)
        # for each param in parse_after:
            # replace that key's value with a value from self.foozs,
            # loop over this single key, inserting one value from self.fooz at a time for all fooz_objects, then continue to the next param and do the same
            # create a full_param_vector var with these new values
            # construct full_path from: <the part of path.pathToScan before '?'> + "?" + full_param_vector
            # add every full_path to a dictionary named dictpath
    # print dictpath
Any help is very welcome. Thanks!
1 Answer
Something like this might do the trick, though I still don't quite understand what your question is.
# Python 2 modules; in Python 3 these functions live in urllib.parse
import urllib
import urlparse
# parse the url into parts
parsed = urlparse.urlparse('http://somesite.com/blog/posting/?id=6&name=Bill')
# and parse the query string into a dictionary
qs = urlparse.parse_qs(parsed.query, keep_blank_values=0, strict_parsing=0)
# this makes a new dictionary, with the same keys, but all values changed to "foobar"
foozsified = { i: 'foobar' for i in qs }
# turn it back into a query string: id=foobar&name=foobar
quoted = urllib.urlencode(foozsified, doseq=True)
# the original parsed result is a named tuple and cannot be changed,
# so make it into a list
parsed = list(parsed)
# replace the element at index 4 (the query string) with our new one
parsed[4] = quoted
# and unparse it into a full url
print(urlparse.urlunparse(parsed))
This code prints
http://somesite.com/blog/posting/?id=foobar&name=foobar
So you can make whatever changes you like to the qs dictionary at this point, and then use urlunparse to turn it back into a full URL.
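If the goal is to change only one parameter at a time, as the question describes, the same approach could be extended roughly like this. It is only a sketch under some assumptions: it targets Python 3 (urllib.parse), the helper name substitute_each_param and the foozs list are hypothetical stand-ins for self.foozs, and keying dictpath by the original URL is a guess, since the question does not say how the dictionary should be keyed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def substitute_each_param(url, foozs):
    """Yield one URL per (parameter, fooz) pair, changing only that parameter."""
    parts = urlsplit(url)
    # parse_qsl returns (key, value) pairs in their original order;
    # unlike the snippets above, blank values are kept so no parameter is dropped
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    for index, (key, _old_value) in enumerate(pairs):
        for fooz in foozs:
            changed = list(pairs)          # copy, so other params stay untouched
            changed[index] = (key, fooz)   # swap in a single fooz value
            full_param_vector = urlencode(changed)
            # SplitResult is a named tuple, so _replace returns a modified copy
            yield urlunsplit(parts._replace(query=full_param_vector))


foozs = ["foo", "bar"]  # hypothetical stand-in for self.foozs
url = "http://somesite.com/?id=6&name=Bill"

dictpath = {}  # assumption: keyed by the original URL
dictpath[url] = list(substitute_each_param(url, foozs))

for full_path in dictpath[url]:
    print(full_path)
```

With the two sample values this yields four URLs: id replaced by each fooz while name stays Bill, then name replaced by each fooz while id stays 6. Using _replace on the split result is an alternative to converting it to a list and assigning by index.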