Can't tell the difference between two expressions that should work the same way

import time
import requests
from bs4 import BeautifulSoup

links = [
    "https://stackoverflow.com/questions/tagged/web-scraping?sort=newest&page=2",
    "https://stackoverflow.com/questions/tagged/web-scraping?sort=newest&page=3",
    "https://stackoverflow.com/questions/tagged/web-scraping?sort=newest&page=4"
]
counter = 0

def fetch_data(link):
    global counter
    res = requests.get(link)
    soup = BeautifulSoup(res.text, "lxml")
    try:
        title = soup.select_one("p.tcode").text
    except AttributeError:
        title = ""

    if not title:
        while counter <= 3:
            time.sleep(1)
            print("trying {} times".format(counter))
            counter += 1
            return fetch_data(link)  # First fix

    counter = 0  # Second fix
    print("tried with this link:", link)

if __name__ == '__main__':
    for link in links:
        fetch_data(link)

trying 0 times
trying 1 times
trying 2 times
trying 3 times
tried with this link: https://stackoverflow.com/questions/tagged/web-scraping?sort=newest&page=2
trying 0 times
trying 1 times
trying 2 times
trying 3 times
tried with this link: https://stackoverflow.com/questions/tagged/web-scraping?sort=newest&page=3
trying 0 times
trying 1 times
trying 2 times
trying 3 times
tried with this link: https://stackoverflow.com/questions/tagged/web-scraping?sort=newest&page=4

1 Answer

User

#1 · Posted on 2024-04-19 20:45:22

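If the function fails to get a title, the while loop inside it starts a recursive call. When you write return fetch_data(link), it works because, whenever the counter is less than or equal to 3 (while counter<=3), the function exits as soon as the recursive call at the end of the loop body comes back, so it never reaches the following line that resets the counter to 0 (counter=0). Because counter is a global variable and each level of recursion increases it by only 1, the recursion goes at most 4 levels deep: once counter is greater than 3, the function no longer enters the while loop that calls another fetch_data(link).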

fetch_data (counter=0)
   > fetch_data (counter=1)
     > fetch_data (counter=2)
       > fetch_data (counter=3)
         > fetch_data (counter=4) 
        - does not enter the while loop, resets counter, prints URL
        - return to above function
      - return to above function
    - return to above function
  - return to above function
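
The trace above can be reproduced without any network access. The following is only a minimal sketch under stated assumptions: the title lookup is replaced by a constant empty string (as if p.tcode never matched), the sleep and print calls are dropped, and the depth parameter, the max_depth variable and the example.com URL are hypothetical additions used purely to measure how deep the recursion goes.

```python
counter = 0
max_depth = 0  # hypothetical bookkeeping, not part of the original code


def fetch_data(link, depth=0):
    """Network-free stand-in for the original function (return variant)."""
    global counter, max_depth
    max_depth = max(max_depth, depth)

    title = ""  # assumption: the selector never matches, so title is empty

    if not title:
        while counter <= 3:
            counter += 1
            # With return, this frame exits as soon as the inner call finishes,
            # so the loop body runs at most once per frame.
            return fetch_data(link, depth + 1)

    # Only the innermost call (counter == 4) reaches this line.
    counter = 0
    return depth


deepest = fetch_data("https://example.com/")
print(deepest, max_depth, counter)  # prints: 4 4 0
```

The innermost call resets the counter once, and every outer frame returns without touching the loop again, which matches the four "trying" lines per link in the output shown in the question.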

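If you use fetch_data(link) without return, the function still starts a recursive call inside the while loop, but it does not exit right after that call, and the counter still gets reset to 0 by the innermost call. This is dangerous: once the counter reaches 4, the innermost call resets it and returns into the while loop of the previous call, and that loop does not break. Because the counter is now 0, which is <= 3, the loop keeps starting further recursive calls. Eventually this hits the maximum recursion depth and crashes the program.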

fetch_data (counter=0)
   > fetch_data (counter=1)
     > fetch_data (counter=2)
       > fetch_data (counter=3)
         > fetch_data (counter=4) 
        - does not enter the while loop, !!!resets counter!!!, prints URL
        - return to above function
      - does not return further up the call chain
      - since counter = 0, the while loop continues
         > fetch_data (counter=1)
           > fetch_data (counter=2)
             > fetch_data (counter=3)
...
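
The runaway recursion traced above can also be observed in isolation. This is a hedged sketch rather than the original program: the request and parsing are replaced by a constant empty title, the sleep and print calls are removed so it finishes quickly, and the calls counter, the crashed flag and the example.com URL are hypothetical names introduced only for the demonstration.

```python
counter = 0
calls = 0  # hypothetical bookkeeping, not part of the original code


def fetch_data(link):
    """Network-free stand-in for the original function (no-return variant)."""
    global counter, calls
    calls += 1

    title = ""  # assumption: the selector never matches, so title is empty

    if not title:
        while counter <= 3:
            counter += 1
            # No return here: when the inner call comes back it has already
            # reset counter to 0, so this loop runs again and recurses deeper.
            fetch_data(link)

    counter = 0


try:
    fetch_data("https://example.com/")
    crashed = False
except RecursionError:
    crashed = True

print("crashed:", crashed)  # prints: crashed: True
```

Every call that comes back leaves the counter at 0, so no frame ever leaves its while loop and the stack only grows until Python raises RecursionError. One way to sidestep both variants, offered only as a suggestion, is a plain for loop over a local attempt count instead of a global counter combined with recursion.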
