Python web scraping with lxml

Published 2024-05-23 16:11:28

I am trying to scrape the column names (Player, Cost, Sel., Form, Pts) from the following page:

https://fantasy.premierleague.com/a/statistics/total_points

However, I have not been able to do so. Before going any further, let me show you what I have tried.

from lxml import html
import requests

# URL of the statistics page to scrape
url = 'https://fantasy.premierleague.com/a/statistics/total_points'

# Download the page and parse the HTML into an element tree
page = requests.get(url)
tree = html.fromstring(page.content)

# Using the page's CSS classes, select the "Player" column header cell
# (cssselect() requires the separate cssselect package)
Location = tree.cssselect('.ism-thead-bold tr .ism-table--el-stats__name')

When I run this, Location should be a list containing the "Player" header cell. Instead it is an empty list, which means cssselect did not match anything.
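
One way to narrow this down is to check whether the class name appears anywhere in the HTML that requests downloaded. The sketch below is a minimal check, not a confirmed diagnosis: the helper name class_in_markup and both sample strings are hypothetical, and the idea that the table might be filled in by JavaScript after the page loads is an assumption that has not been verified for this site.

```python
def class_in_markup(html_text, class_name):
    """Return True if class_name occurs anywhere in the raw HTML text."""
    return class_name in html_text


# Hypothetical samples (not taken from the real site):
# one where the header cell is present in the markup itself ...
static_sample = (
    '<table><thead class="ism-thead-bold"><tr>'
    '<th class="ism-table--el-stats__name">Player</th>'
    '</tr></thead></table>'
)
# ... and one that is only an empty shell to be filled in by a script.
shell_sample = (
    '<html><body><div id="app"></div>'
    '<script src="app.js"></script></body></html>'
)

found_in_static = class_in_markup(static_sample, 'ism-table--el-stats__name')
found_in_shell = class_in_markup(shell_sample, 'ism-table--el-stats__name')
print(found_in_static, found_in_shell)
```

Against the real response, the same check would be class_in_markup(page.text, 'ism-table--el-stats__name'). If it returns False, no selector can match, because the element is absent from the downloaded document.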

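Although each column name has a different 'th class', I used only one of them (ism-table--el-stats__name) in this attempt, just to keep things simple.

Once this is solved, I would like to use a regex, because each class has a different suffix after the two underscores.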
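
For the regex part, one possible approach is XPath with the EXSLT regular-expression functions that lxml exposes under the namespace http://exslt.org/regular-expressions. The sketch below assumes the header cells are present in the parsed document. The sample markup and the helper name header_texts are hypothetical, and the __cost suffix is invented for illustration; only ism-table--el-stats__name comes from the page described above.

```python
from lxml import html

# Hypothetical markup for illustration; the real page may differ.
SAMPLE = """
<table>
  <thead class="ism-thead-bold">
    <tr>
      <th class="ism-table--el-stats__name">Player</th>
      <th class="ism-table--el-stats__cost">Cost</th>
      <th class="other-class">Ignored</th>
    </tr>
  </thead>
</table>
"""

# Namespace under which lxml provides re:test, re:match and re:replace
EXSLT_RE = "http://exslt.org/regular-expressions"


def header_texts(html_text):
    """Return the text of every th whose class has the shared prefix."""
    tree = html.fromstring(html_text)
    cells = tree.xpath(
        # Match one class token that starts with the prefix and has any suffix
        "//th[re:test(@class, "
        "'(^| )ism-table--el-stats__[A-Za-z0-9_-]+( |$)')]",
        namespaces={"re": EXSLT_RE},
    )
    return [cell.text_content().strip() for cell in cells]


columns = header_texts(SAMPLE)
print(columns)
```

The pattern anchors on a whole class token, so a cell that carries several classes still matches as long as one of them begins with the shared prefix.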

I would greatly appreciate it if anyone could help me with these two tasks!

Thank you.

Tags: https import com tree html page table requests

0 answers

No answers yet.
