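How do I make the string match [text()="sTrInG"] case-insensitive in Python (lxml)?
I have this XPath pattern:
tags = doc.xpath('/html/body//a[text() = "' + name.encode('utf8') + '"]/@href')
This pattern returns the link (the href attribute) of every a tag whose text is name.
Is there a way to make this match case-insensitive?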
Edit
When I tried @Shelhamer's solution, I got:
>>> a_tag_list = html_string.xpath('/html/body//a[lower-case(text()) = "' + author_name.lower() + '"]/@href')
File "lxml.etree.pyx", line 1459, in lxml.etree._Element.xpath (src/lxml/lxml.etree.c:40530)
File "xpath.pxi", line 324, in lxml.etree.XPathElementEvaluator.__call__ (src/lxml/lxml.etree.c:113864)
File "xpath.pxi", line 242, in lxml.etree._XPathEvaluatorBase._handle_result (src/lxml/lxml.etree.c:113063)
File "xpath.pxi", line 227, in lxml.etree._XPathEvaluatorBase._raise_eval_error (src/lxml/lxml.etree.c:112894)
lxml.etree.XPathEvalError: Unregistered function
2 Answers
0
No, XPath comparisons are case-sensitive. You can work around this by converting all the text to lowercase before comparing.
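One way to read "convert everything to lowercase" is to do the comparison on the Python side instead of inside the XPath expression. The following is only a minimal sketch under that assumption: the sample document and the helper name hrefs_by_text are made up for illustration, and it checks only the leading text of each a element (a.text), which is slightly narrower than the XPath test text() = "...".

```python
from lxml import html

# Hypothetical sample document, only for illustration.
doc = html.fromstring(
    '<html><body>'
    '<a href="/one">sTrInG</a>'
    '<a href="/two">other</a>'
    '<a href="/three">STRING</a>'
    '</body></html>'
)

def hrefs_by_text(doc, name):
    """Return the href of every <a> whose leading text equals name, ignoring case."""
    wanted = name.lower()
    return [
        a.get("href")
        for a in doc.xpath("/html/body//a[@href]")  # select candidates with XPath
        if (a.text or "").lower() == wanted          # compare in Python, lowercased
    ]

print(hrefs_by_text(doc, "string"))  # ['/one', '/three']
```

Filtering in Python sidesteps XPath's limited string functions and handles non-ASCII letters through str.lower(), at the cost of visiting every candidate a element.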
3
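This can be done with the lower-case() function, lowercasing the search string as well:
tags = doc.xpath('/html/body//a[lower-case(text()) = "' + name.lower().encode('utf8') + '"]/@href')
Note that lower-case() is an XPath 2.0 function, and lxml only implements XPath 1.0, so lxml rejects this expression with the "Unregistered function" error shown in the question's edit.
Here is a useful list of XPath functions for reference: http://www.w3schools.com/xpath/xpath_functions.asp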
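For XPath 1.0 engines such as the one behind lxml, a common workaround is the translate() function, which maps one set of characters onto another. The sketch below assumes ASCII-only link text. The sample document and the helper name hrefs_case_insensitive are hypothetical, and the search string is passed as an XPath variable rather than concatenated into the expression, so quotes in name do not break it.

```python
from lxml import html

UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"

# Hypothetical sample document, only for illustration.
doc = html.fromstring(
    '<html><body>'
    '<a href="/one">sTrInG</a>'
    '<a href="/two">other</a>'
    '<a href="/three">STRING</a>'
    '</body></html>'
)

def hrefs_case_insensitive(doc, name):
    """Return hrefs of <a> elements whose first text node equals name, ignoring ASCII case."""
    return doc.xpath(
        # translate() lowercases the first text node; $upper, $lower and $name
        # are XPath variables supplied through the keyword arguments below.
        "/html/body//a[translate(text(), $upper, $lower) = $name]/@href",
        upper=UPPER,
        lower=LOWER,
        name=name.lower(),
    )

print(hrefs_case_insensitive(doc, "STRING"))  # ['/one', '/three']
```

translate() only maps the characters listed in its second argument, so accented or other non-ASCII letters would need to be added to UPPER and LOWER by hand.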