How do I get a table out of HTML with Python and display it?

import requests

# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    s.post(LOGINURL, data=login)
    # print
    r = s.get(LOGINURL)
    print(r.url)
    # An authorised request.
    r = s.get(APURL)
    print(r.url)
    # etc...
    s.post(APURL)
    # r = s.post(APURL, data=findaps)
    r = s.get(APURL)
    # print(r.text)

f = open("makethisfile.html", "w")
f.write('\n'.join([
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">',
    '<html>',
    ' <head>',
    ' <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">',
    ' <title>THE TITLE</title>',
    ' <link rel="stylesheet" href="css/displayEventLists.css" type="text/css">',
    r.text  # this just does everything, i need to get the table.
]))
f.close()

1 Answer

User

#1 · Posted on 2024-05-14 09:13:37

Properly parsing the file would be best, but a quick and dirty approach is to use a regular expression.

import re

# Non-greedy (.+?) stops at the first </table>; re.S lets '.' match newlines.
m = re.search("<table.*?>(.+?)</table>", r.text, re.S)
if m:
    print(m.group())
else:
    print("Error: table not found")

As an example of why parsing is better, the regex as written will fail on the following (rather contrived!) example:

<!  <table>  >
blah
blah
<table>
this is the actual
table
</table>
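
For comparison, here is a minimal sketch of the parsing approach, using only the standard library's html.parser. The names TableExtractor and extract_tables are made up for this illustration, and the sketch assumes the markup is reasonably well formed; comments inside a table are dropped. It tracks nesting depth, so a table nested inside another comes back as part of its outer table.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the markup of each top-level <table> element (illustrative helper)."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.tables = []   # finished tables, as markup strings
        self._depth = 0    # current nesting depth of <table> tags
        self._parts = []   # markup pieces of the table being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._depth += 1
        if self._depth:
            self._parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if self._depth:
            self._parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self._depth:
            return
        self._parts.append("</%s>" % tag)
        if tag == "table":
            self._depth -= 1
            if self._depth == 0:
                self.tables.append("".join(self._parts))
                self._parts = []

    def handle_data(self, data):
        if self._depth:
            self._parts.append(data)

    def handle_entityref(self, name):
        if self._depth:
            self._parts.append("&%s;" % name)

    def handle_charref(self, name):
        if self._depth:
            self._parts.append("&#%s;" % name)


def extract_tables(html):
    """Return a list with the markup of every top-level table in html."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables
```

With the question's code, extract_tables(r.text) would give a list whose first item could be written to the file in place of r.text. On the contrived sample above, a parser that follows the HTML rules should treat the leading `<!  <table>` as a bogus comment rather than a table start, though that is worth checking on your own Python version.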

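As written, the regex gets the first table in the file. But you can loop to get the second one, and so on (and, if possible, make the regex specific to the table you want), so that is not a problem.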
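
The looping idea and the "make it specific" idea could look roughly like the sketch below. The helper names find_tables and find_table_by_id are invented for this example, and it assumes the table you want carries an id attribute (the id "events" in the usage note is purely hypothetical). Like any regex approach, it will still trip over nested tables and odd markup.

```python
import re

# Non-greedy match from an opening <table ...> to the nearest </table>.
TABLE_RE = re.compile(r"<table\b[^>]*>.*?</table>", re.S | re.I)


def find_tables(html):
    """Return the markup of every (non-nested) table, in document order."""
    return [m.group() for m in TABLE_RE.finditer(html)]


def find_table_by_id(html, table_id):
    """Return the markup of the table with the given id, or None if absent."""
    pattern = (
        r"<table\b[^>]*\bid=[\"']%s[\"'][^>]*>.*?</table>" % re.escape(table_id)
    )
    m = re.search(pattern, html, re.S | re.I)
    return m.group() if m else None
```

For instance, find_tables(r.text)[1] would be the second table on the page, and find_table_by_id(r.text, "events") would pick one table out by its id.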
