<p>If I understand you correctly, try something like this:</p>
<pre><code>import lxml.html
url = "http://gbgfotboll.se/information/?scr=table&ftid=51168"
html = lxml.html.parse(url)
for i in range(12):
    # Row i+1 of the third table: cell 1 holds the date and time, cell 2 the team link
    xpath1 = ".//*[@id='content-primary']/table[3]/tbody/tr[%d]/td[1]/span/span//text()" % (i + 1)
    xpath2 = ".//*[@id='content-primary']/table[3]/tbody/tr[%d]/td[2]/a/text()" % (i + 1)
    print(html.xpath(xpath1)[1], html.xpath(xpath2)[0])
</code></pre>
<p>I know this is fragile and there are better solutions, but it works. ;)</p>
<p><strong>Edit:</strong><br/>
A better way, using BeautifulSoup:</p>
<p><strong>Edit 2:</strong>
The page isn't responding right now, but this should work:</p>
<pre><code>from bs4 import BeautifulSoup
import requests

respond = requests.get("http://gbgfotboll.se/information/?scr=table&ftid=51168")
soup = BeautifulSoup(respond.text, "html.parser")
l = soup.find_all('table')
t = l[2].find_all('tr')  # rows of the third table on the page
time = ""  # date part of the previously printed row
for i in t:
    try:
        dateTime = i.find('span').get_text()
        teamName = i.find('a').get_text()
        if time == dateTime[:-5]:
            # Same date as the previous row: print only the time (last 5 characters)
            print(dateTime[-5:], teamName)
        else:
            print(dateTime, teamName)
            time = dateTime[:-5]
    except AttributeError:
        # Rows without a span or a link (e.g. header rows) are skipped
        pass
</code></pre>
<p>With lxml:</p>
<pre><code>import lxml.html

url = "http://gbgfotboll.se/information/?scr=table&ftid=51168"
html = lxml.html.parse(url)
dateTemp = ""  # date of the previous row
for i in range(12):
    xpath1 = ".//*[@id='content-primary']/table[3]/tbody/tr[%d]/td[1]/span/span//text()" % (i + 1)
    xpath2 = ".//*[@id='content-primary']/table[3]/tbody/tr[%d]/td[2]/a/text()" % (i + 1)
    time = html.xpath(xpath1)[1]
    date = html.xpath(xpath1)[0]
    teamName = html.xpath(xpath2)[0]
    if date == dateTemp:
        # Same date as the previous row: omit it
        print(time, teamName)
    else:
        print(date, time, teamName)
        dateTemp = date  # remember the date so repeats are suppressed
</code></pre>
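<p>Since the page may be unavailable, the "print the date only when it changes" logic shared by both versions can be tried on its own, without any scraping. This is just a minimal sketch: <code>format_fixtures</code> and the sample rows are made up for illustration and are not taken from the real page.</p>

```python
def format_fixtures(rows):
    """Format (date, time, team) tuples, omitting a date that repeats.

    `rows` is assumed to be in page order, so equal dates are adjacent.
    """
    lines = []
    previous_date = None
    for date, time, team in rows:
        if date == previous_date:
            # Same date as the row before: show only time and team
            lines.append("{} {}".format(time, team))
        else:
            lines.append("{} {} {}".format(date, time, team))
            previous_date = date
    return lines


# Hypothetical sample data standing in for the scraped table rows
sample_rows = [
    ("2014-04-12", "13:00", "Team A"),
    ("2014-04-12", "15:00", "Team B"),
    ("2014-04-13", "11:00", "Team C"),
]

for line in format_fixtures(sample_rows):
    print(line)
```

<p>Keeping the formatting separate from the scraping means you can swap lxml for BeautifulSoup (or the other way around) and only change how the rows are collected.</p>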