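I'm learning web scraping in Python and decided to test my skills on the HackerRank Leaderboard page. I wrote the code below, expecting it to run without errors and export my CSV file before I add a country filter to the tester function.
However, the Python console responds with:
AttributeError: 'NoneType' object has no attribute 'find_all'
The error points to line 29 of my code (for i in table.find_all({'class':'ellipsis'}):), so I decided to ask for help here. I suspect there may be more syntax or logic errors, so I'd appreciate expert feedback to clear up my doubts.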
from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy as np
from time import sleep
from random import randint
pd.set_option('display.max_columns', None)
#Declaring a variable for looping over all the pages
pages = np.arange(1, 93, 1)
a = pd.DataFrame()
#loop cycle
for url in pages:
    #get html for each new page
    url ='https://www.hackerrank.com/leaderboard?page='+str(url)
    page = requests.get(url)
    sleep(randint(3,10))
    soup = BeautifulSoup(page.text, 'lxml')
    #get the table
    table = soup.find('header', {'class':'table-header flex'})
    headers = []
    #get the headers of the table and delete the "white space"
    for i in table.find_all({'class':'ellipsis'}):
        title = i.text.strip()
        headers.append(title)
    #set the headers to columns in a new dataframe
    df = pd.DataFrame(columns=headers)
    rows = soup.find('div', {'class':'table-body'})
    #get the rows of the table but omit the first row (which are headers)
    for row in rows.find_all('table-row-wrapper')[1:]:
        data = row.find_all('table-row-column ellipsis')
        row_data = [td.text.strip() for td in data]
        length = len(df)
        df.loc[length] = row_data
    #set the data of the Txn Count column to float
    Txn = df['SCORE'].values
    #combine all the data rows in one single dataframe
    a = a.append(pd.DataFrame(df))
def tester(mejora):
    mejora = mejora[(mejora['SCORE']>2250.0)]
    return mejora.to_csv('new_test_Score_Count.csv')
tester(a)
Does anyone have ideas or suggestions for solving this problem?
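The error indicates that your table element is None. I'm guessing here, but you probably can't get the table from the page with bs4 because it is loaded by JavaScript after the initial page load. I suggest using Selenium instead.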
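To make that concrete, here is a minimal sketch and not a verified solution. It assumes the class names from the question's code (table-header, table-body, table-row-wrapper, table-row-column, ellipsis) really appear in the rendered page, which I have not checked against the live site. The helper names parse_leaderboard and fetch_rendered_html and the SAMPLE_HTML string are hypothetical, made up for illustration. The parsing helper returns None instead of raising when the table is missing, and looks elements up by CSS class (the question's find_all calls pass class names where bs4 expects tag names). The Selenium helper waits for the table body to appear before handing the HTML to bs4.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the rendered page; the real markup may differ.
SAMPLE_HTML = """
<header class="table-header flex">
  <div class="ellipsis"> RANK </div><div class="ellipsis">SCORE</div>
</header>
<div class="table-body">
  <div class="table-row-wrapper">
    <div class="table-row-column ellipsis">1</div>
    <div class="table-row-column ellipsis">2300.5</div>
  </div>
</div>
"""


def parse_leaderboard(html):
    """Return (headers, rows), or None when the table is not in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # select_one returns None when nothing matches, so check before using it.
    header = soup.select_one("header.table-header")
    body = soup.select_one("div.table-body")
    if header is None or body is None:
        return None
    # Class names are assumptions copied from the question's code.
    headers = [el.get_text(strip=True) for el in header.select(".ellipsis")]
    rows = [
        [cell.get_text(strip=True) for cell in row.select(".table-row-column")]
        for row in body.select(".table-row-wrapper")
    ]
    return headers, rows


def fetch_rendered_html(url, wait_seconds=10):
    """Load the page in a real browser so JavaScript-built markup exists."""
    # Imported lazily so parse_leaderboard works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumes Chrome and a matching driver are available
    try:
        driver.get(url)
        # Wait until the (assumed) table body has been added to the DOM.
        WebDriverWait(driver, wait_seconds).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "div.table-body"))
        )
        return driver.page_source
    finally:
        driver.quit()


if __name__ == "__main__":
    print(parse_leaderboard(SAMPLE_HTML))
    print(parse_leaderboard("<html><body></body></html>"))
```

If parse_leaderboard(requests.get(url).text) returns None while parse_leaderboard(fetch_rendered_html(url)) returns data, that would support the guess that the table is built by JavaScript.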