Parsing specific data from an HTML table in Python with lxml and XPath

First of all, I am new to both Python and Stack Overflow, so please be kind. Below is the source of the HTML page I want to extract data from. The page is <a href="http://gbgfotboll.se/information/?scr=table&ftid=51168" rel="nofollow noreferrer">http://gbgfotboll.se/information/?scr=table&ftid=51168</a>, and the table is at the bottom of that page.

<pre><code>
<html>
<table class="clCommonGrid" cellspacing="0">
  <thead>
    <tr>
      <td colspan="3">Kommande matcher</td>
    </tr>
    <tr>
      <th style="width:1%;">Tid</th>
      <th style="width:69%;">Match</th>
      <th style="width:30%;">Arena</th>
    </tr>
  </thead>
  <tbody class="clGrid">
    <tr class="clTrOdd">
      <td nowrap="nowrap" class="no-line-through"> 2014-09-26 19:30 </td>
      <td><a href="?scr=result&amp;fmid=2669197">Guldhedens IK - IF Warta</a></td>
      <td><a href="?scr=venue&amp;faid=847">Guldheden Södra 1 Konstgräs</a> </td>
    </tr>
    <tr class="clTrEven">
      <td nowrap="nowrap" class="no-line-through"> 2014-09-26 13:00 </td>
      <td><a href="?scr=result&amp;fmid=2669176">Romelanda UF - IK Virgo</a></td>
      <td><a href="?scr=venue&amp;faid=941">Romevi 1 Gräs</a> </td>
    </tr>
    <tr class="clTrOdd">
      <td nowrap="nowrap" class="no-line-through"> 2014-09-27 13:00 </td>
      <td><a href="?scr=result&amp;fmid=2669167">Kode IF - IK Kongahälla</a></td>
      <td><a href="?scr=venue&amp;faid=912">Kode IP 1 Gräs</a> </td>
    </tr>
    <tr class="clTrEven">
      <td nowrap="nowrap" class="no-line-through"> 2014-09-27 14:00 </td>
      <td><a href="?scr=result&amp;fmid=2669147">Floda BoIF - Partille IF FK </a></td>
      <td><a href="?scr=venue&amp;faid=218">Flodala IP 1</a> </td>
    </tr>
  </tbody>
</table>
</html>
</code></pre>

I need to extract the time (19:30) and the team names (Guldhedens IK - IF Warta) from the first table row, that is, the first and second table cells but not the third. From the second table row I need 13:00 and Romelanda UF - IK Virgo, and so on for every table row.

As you can see, every table row has a date in front of the time, and that is the tricky part. I only want the time and team names from those table rows whose date equals the date on which I run the code.

I have not managed much so far. The only thing I can do is get the time and team name with the following code: ^{pr2}$ This gives me the result ['2014-09-26', '19:30']. After that I do not know how to iterate over the different table rows to find the specific table cells whose date matches the date on which I run the code.

I hope you can answer in as much detail as possible.
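One possible way to approach this is to loop over the rows of the table body, split the text of the first cell into a date and a time, and keep only the rows whose date matches. The sketch below is not the asker's code. It assumes the live page has the same structure as the snippet in the question, it parses an inline copy of that snippet (`SAMPLE`) instead of fetching the page, and the function name `matches_on` is made up for illustration.

```python
from datetime import date

from lxml import html

# Inline copy of the table from the question (assumption: the live page
# has the same structure). Used here so the example runs offline.
SAMPLE = """<html><body>
<table class="clCommonGrid" cellspacing="0">
  <thead>
    <tr><td colspan="3">Kommande matcher</td></tr>
    <tr><th>Tid</th><th>Match</th><th>Arena</th></tr>
  </thead>
  <tbody class="clGrid">
    <tr class="clTrOdd">
      <td nowrap="nowrap" class="no-line-through"> 2014-09-26 19:30 </td>
      <td><a href="?scr=result&amp;fmid=2669197">Guldhedens IK - IF Warta</a></td>
      <td><a href="?scr=venue&amp;faid=847">Guldheden Södra 1 Konstgräs</a> </td>
    </tr>
    <tr class="clTrEven">
      <td nowrap="nowrap" class="no-line-through"> 2014-09-26 13:00 </td>
      <td><a href="?scr=result&amp;fmid=2669176">Romelanda UF - IK Virgo</a></td>
      <td><a href="?scr=venue&amp;faid=941">Romevi 1 Gräs</a> </td>
    </tr>
    <tr class="clTrOdd">
      <td nowrap="nowrap" class="no-line-through"> 2014-09-27 13:00 </td>
      <td><a href="?scr=result&amp;fmid=2669167">Kode IF - IK Kongahälla</a></td>
      <td><a href="?scr=venue&amp;faid=912">Kode IP 1 Gräs</a> </td>
    </tr>
    <tr class="clTrEven">
      <td nowrap="nowrap" class="no-line-through"> 2014-09-27 14:00 </td>
      <td><a href="?scr=result&amp;fmid=2669147">Floda BoIF - Partille IF FK </a></td>
      <td><a href="?scr=venue&amp;faid=218">Flodala IP 1</a> </td>
    </tr>
  </tbody>
</table>
</body></html>"""


def matches_on(page_source, day):
    """Return (time, teams) tuples for rows whose date equals `day`.

    `matches_on` is a hypothetical helper name, not part of lxml.
    """
    tree = html.fromstring(page_source)
    wanted = day.isoformat()  # e.g. '2014-09-26', same format as the page
    results = []
    # Only rows in the table body; the header rows live in <thead>.
    for row in tree.xpath('//table[@class="clCommonGrid"]/tbody/tr'):
        cells = row.xpath('./td')
        if len(cells) < 2:
            continue
        # First cell holds "date time" surrounded by whitespace.
        parts = cells[0].text_content().split()
        if len(parts) != 2:
            continue
        row_date, row_time = parts
        if row_date == wanted:
            # Second cell holds the team names; the third (venue) is ignored.
            results.append((row_time, cells[1].text_content().strip()))
    return results


if __name__ == "__main__":
    for row_time, teams in matches_on(SAMPLE, date(2014, 9, 26)):
        print(row_time, teams)
```

To filter on the day the script runs, pass `date.today()` as the second argument and the downloaded page source as the first. Comparing the date as a string works here only because the page prints dates in the same `YYYY-MM-DD` form that `date.isoformat()` produces.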
