硒和靓汤一起使用

url = "https://scholar.google.com/citations?user=VjJm3zYAAAAJ&hl=en" while True: response = requests.get(url) data = response.text soup = BeautifulSoup(data,'html.parser') research_article = soup.find_all('tr',{'class':'gsc_a_tr'}) for research in research_article: title = research.find('a',{'class':'gsc_a_at'}).text authors = research.find('div',{'class':'gs_gray'}).text print('Title:', title,'\n','\nAuthors:', authors)

driver = webdriver.Chrome(executable_path ="/Applications/chromedriver84") driver.get(url) try: #Wait up to 10s until the element is loaded on the page element = WebDriverWait(driver, 10).until( #Locate element by id EC.presence_of_element_located((By.ID, 'gsc_bpf_more')) ) finally: element.click()

1条回答

网友

1楼 · 发布于 2024-06-17 15:35:07

此脚本将打印页面中的所有标题和作者：

import re
import requests
from bs4 import BeautifulSoup


url = 'https://scholar.google.com/citations?user=VjJm3zYAAAAJ&hl=en'
api_url = 'https://scholar.google.com/citations?user={user}&hl=en&cstart={start}&pagesize={pagesize}'
user_id = re.search(r'user=(.*?)&', url).group(1)

start = 0
while True:
    soup = BeautifulSoup( requests.post(api_url.format(user=user_id, start=start, pagesize=100)).content, 'html.parser' )

    research_article = soup.find_all('tr',{'class':'gsc_a_tr'})

    for i, research in enumerate(research_article, 1):
        title = research.find('a',{'class':'gsc_a_at'})
        authors = research.find('div',{'class':'gs_gray'})

        print('{:04d} {:<80} {}'.format(start+i, title.text, authors.text))

    if len(research_article) != 100:
        break

    start += 100

印刷品：

0001 Hyper-heuristics: A Survey of the State of the Art                               EK Burke, M Hyde, G Kendall, G Ochoa, E Ozcan, R Qu
0002 Hyper-heuristics: An emerging direction in modern search technology              E Burke, G Kendall, J Newall, E Hart, P Ross, S Schulenburg
0003 Search methodologies: introductory tutorials in optimization and decision support techniques E Burke, EK Burke, G Kendall
0004 A tabu-search hyperheuristic for timetabling and rostering                       EK Burke, G Kendall, E Soubeiga
0005 A hyperheuristic approach to scheduling a sales summit                           P Cowling, G Kendall, E Soubeiga
0006 A classification of hyper-heuristic approaches                                   EK Burker, M Hyde, G Kendall, G Ochoa, E Özcan, JR Woodward
0007 Genetic algorithms                                                               K Sastry, D Goldberg, G Kendall

...

0431 Solution Methodologies for generating robust Airline Schedules                   F Bian, E Burke, S Jain, G Kendall, GM Koole, J Mulder, MCE Paelinck, ...
0432 A Triple objective function with a chebychev dynamic point specification approach to optimise the surface mount placement machine M Ayob, G Kendall
0433 A Library of Vehicle Routing Problems                                            T Pigden, G Kendall, SD Ehsan, E Ozcan, R Eglese
0434 This symposium could not have taken place without the help of a great many people and organisations. We would like to thank the IEEE Computational Intelligence Society for … S Louis, G Kendall

相关问题更多 >

编程相关推荐

热门问题

热门文章