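I'm trying to use Python 3.x and pandas to scrape salary data from Basketball Reference. I'm not getting any error messages, but I'm not getting any output either. I want the second and fourth columns of the table: "Player" and the "2019-20" salary. What am I doing wrong?
Here is what I have so far: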
import requests
import pandas as pd
from bs4 import BeautifulSoup

# URL of the page we will be scraping
salaries_url = 'https://www.basketball-reference.com/contracts/players.html'
salaries_response = requests.get(salaries_url)
# This is the HTML from the given URL
page = salaries_response.text
soup = BeautifulSoup(page, 'html.parser')
# Take the player salary data and build a list of lists, where each inner list holds one player's values
salaries = []
for x in soup.find_all('tr')[2:]:
    tds_salaries = x.find_all('td')
    name_s = tds_salaries[0].text
    salary = tds_salaries[2].text
    # salary[1:] drops the first character, presumably the '$' sign
    salaries.append([name_s, salary[1:]])
# Create a salary pandas DataFrame
salaries_df = pd.DataFrame(salaries, columns=['name', 'salary'])
salaries_df.head()
It works fine here. All I changed was the for loop, where I try to skip the table header rows.
Code:
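A minimal sketch of the header-skipping idea, not necessarily the exact code from this answer. It assumes that header rows contain only `<th>` cells (so indexing their empty `<td>` list raises `IndexError`) and that the player name and the 2019-20 salary are the first and third `<td>` cells, as in the question. The helper names `parse_salaries` and `fetch_salaries` are hypothetical.

```python
from bs4 import BeautifulSoup


def parse_salaries(html):
    """Return [name, salary] pairs from every data row in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        tds = tr.find_all("td")
        try:
            # Assumption: header rows have no <td> cells, so this raises IndexError.
            name = tds[0].text
            salary = tds[2].text
        except IndexError:
            continue  # skip header rows, including ones repeated mid-table
        rows.append([name, salary.lstrip("$")])
    return rows


def fetch_salaries(url):
    """Download the page and build a DataFrame (hypothetical helper)."""
    # Imported here so parse_salaries can be used without these packages.
    import pandas as pd
    import requests

    response = requests.get(url)
    response.raise_for_status()
    return pd.DataFrame(parse_salaries(response.text), columns=["name", "salary"])
```

When this runs as a plain script rather than in a notebook, the result has to be printed explicitly, for example `print(fetch_salaries(salaries_url).head())`; a bare `salaries_df.head()` on the last line of a script displays nothing.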
Output: