I have an HTML page with multiple divs, for example:
<div class="article-additional-info">
A peculiar situation arose in the Supreme Court on Tuesday when two lawyers claimed to be the representative of one of the six accused in the December 16 gangrape case who has sought shifting of t...
<a class="more" href="http://www.thehindu.com/news/national/gangrape-case-two-lawyers-claim-to-be-engaged-by-accused/article4332680.ece">
<span class="arrows">»</span>
</a>
</div>
<div class="article-additional-info">
Power consumers in the city will have to brace for spending more on their monthly bills as all three power distribution companies – the Anil Ambani-owned BRPL and BYPL and the Tatas-owned Tata Powe...
<a class="more" href="http://www.thehindu.com/news/cities/Delhi/power-discoms-demand-yet-another-hike-in-charges/article4331482.ece">
<a class="commentsCount" href="http://www.thehindu.com/news/cities/Delhi/power-discoms-demand-yet-another-hike-in-charges/article4331482.ece#comments">
</div>
I need to get the <a href=> value for all the divs with class article-additional-info.
I am new to BeautifulSoup.
So I need the URLs
"http://www.thehindu.com/news/national/gangrape-case-two-lawyers-claim-to-be-engaged-by-accused/article4332680.ece"
"http://www.thehindu.com/news/cities/Delhi/power-discoms-demand-yet-another-hike-in-charges/article4331482.ece"
What is the best way to achieve this?
Prints:
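After going through the documentation, I did it the following way. Thank you all for your answers, I appreciate them.
By your criteria, this returns three URLs (not two). Did you want to filter out the third?
The basic idea is to iterate through the HTML, pull out only the elements with your class, and then iterate through all of the links inside each of those elements, pulling out the actual link.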
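The idea described above could be sketched roughly as follows. This is a minimal sketch rather than the answerer's original code: it assumes BeautifulSoup 4 (the bs4 package) with the built-in html.parser, the function name extract_links is hypothetical, and the sample markup is a trimmed copy of the question's HTML with the summaries shortened and the anchor tags closed.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the question's HTML (summaries shortened, anchors closed).
html = """
<div class="article-additional-info">
  First summary...
  <a class="more" href="http://www.thehindu.com/news/national/gangrape-case-two-lawyers-claim-to-be-engaged-by-accused/article4332680.ece">
    <span class="arrows">»</span>
  </a>
</div>
<div class="article-additional-info">
  Second summary...
  <a class="more" href="http://www.thehindu.com/news/cities/Delhi/power-discoms-demand-yet-another-hike-in-charges/article4331482.ece"></a>
  <a class="commentsCount" href="http://www.thehindu.com/news/cities/Delhi/power-discoms-demand-yet-another-hike-in-charges/article4331482.ece#comments"></a>
</div>
"""


def extract_links(markup):
    """Collect the href of every <a> inside div.article-additional-info."""
    soup = BeautifulSoup(markup, "html.parser")
    links = []
    # Outer loop: only the divs carrying the target class.
    for div in soup.find_all("div", class_="article-additional-info"):
        # Inner loop: every anchor in that div that actually has an href.
        for a in div.find_all("a", href=True):
            links.append(a["href"])
    return links


links = extract_links(html)
for url in links:
    print(url)
```

On this sample the loop collects three links, because the second div also contains the commentsCount anchor.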
This restricts the search to elements with the article-additional-info class, and within those it finds every anchor (a) tag and grabs its corresponding href link.
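If the third URL (the comments link) is not wanted, one possible way to filter it out is to match only anchors with class more. The sketch below assumes BeautifulSoup 4 with CSS selector support via select(); the function name extract_more_links is hypothetical, and the markup is again a shortened copy of the question's HTML with the anchors closed.

```python
from bs4 import BeautifulSoup

# Shortened copy of the question's HTML (anchors closed for clarity).
html = """
<div class="article-additional-info">
  <a class="more" href="http://www.thehindu.com/news/national/gangrape-case-two-lawyers-claim-to-be-engaged-by-accused/article4332680.ece"></a>
</div>
<div class="article-additional-info">
  <a class="more" href="http://www.thehindu.com/news/cities/Delhi/power-discoms-demand-yet-another-hike-in-charges/article4331482.ece"></a>
  <a class="commentsCount" href="http://www.thehindu.com/news/cities/Delhi/power-discoms-demand-yet-another-hike-in-charges/article4331482.ece#comments"></a>
</div>
"""


def extract_more_links(markup):
    """Return hrefs of a.more anchors nested in div.article-additional-info."""
    soup = BeautifulSoup(markup, "html.parser")
    # The selector keeps only <a class="more"> descendants that have an href,
    # so the commentsCount anchor is skipped.
    selector = "div.article-additional-info a.more[href]"
    return [a["href"] for a in soup.select(selector)]


more_links = extract_more_links(html)
for url in more_links:
    print(url)
```

Whether to filter by class or by some other rule (for example, skipping hrefs that contain #comments) depends on how stable the page's class names are.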