Python web crawler does not print any resu

1 answer

User

#1 · Posted 2024-06-16 11:09:10

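The "video-title" class is on a div tag, not on the link itself, so you have to go to the a tag inside the div. You also need to pass the string "href" as the key: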

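import requests
from bs4 import BeautifulSoup

def stepashka_spider(max_pages):
    page = 1
    while page <= max_pages:
        url = "http://online.stepashka.com/filmy/#/page/" + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        # Name the parser explicitly so BeautifulSoup does not have to guess one.
        soup = BeautifulSoup(plain_text, "html.parser")
        # Each title is a <div class="video-title"> that wraps the <a> link.
        for resoult in soup.find_all("div", {"class": "video-title"}):
            a_tag = resoult.a
            print(a_tag["href"])
        page += 1

stepashka_spider(1)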

Output:

http://online.stepashka.com/filmy/komedii/37878-klub-grust.html
http://online.stepashka.com/filmy/dramy/37875-kadr.html
http://online.stepashka.com/filmy/multfilmy/37874-betmen-protiv-robina.html
http://online.stepashka.com/filmy/fantastika/37263-hrustalnye-cherepa.html
http://online.stepashka.com/filmy/dramy/34369-bozhiy-syn.html
http://online.stepashka.com/filmy/trillery/37873-horoshee-ubiystvo.html
http://online.stepashka.com/filmy/trillery/34983-zateryannaya-reka.html
http://online.stepashka.com/filmy/priklucheniya/37871-totem-volka.html
http://online.stepashka.com/filmy/fantastika/35224-zheleznaya-shvatka.html
http://online.stepashka.com/filmy/dramy/37870-bercy.html
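
To make the div/a structure concrete without depending on the live site, here is a minimal sketch that uses only the standard library's html.parser rather than BeautifulSoup. The sample HTML, the example.com links, and the VideoTitleLinks class are all made up for illustration; they only mirror the structure described above and are not taken from the real page.

```python
from html.parser import HTMLParser

# Made-up sample mirroring the structure described above:
# each title is a <div class="video-title"> wrapping an <a href="...">.
SAMPLE_HTML = """
<div class="video-title"><a href="http://example.com/filmy/a.html">A</a></div>
<div class="other"><a href="http://example.com/ignored.html">skip</a></div>
<div class="video-title"><a href="http://example.com/filmy/b.html">B</a></div>
"""


class VideoTitleLinks(HTMLParser):
    """Collect href values of <a> tags nested in <div class="video-title">."""

    def __init__(self):
        super().__init__()
        self.div_stack = []  # one bool per open <div>: is it a video-title div?
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            self.div_stack.append("video-title" in classes)
        elif tag == "a" and any(self.div_stack):
            # The attribute is looked up by the string key "href".
            href = attrs.get("href")
            if href is not None:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "div" and self.div_stack:
            self.div_stack.pop()


parser = VideoTitleLinks()
parser.feed(SAMPLE_HTML)
for link in parser.links:
    print(link)
```

The link inside the "other" div is skipped because the class sits on the enclosing div, which is the same reason the spider selects the div first and then reads its a tag.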

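You were actually using the wrong URL format (the page number belongs in the path, not after "#"), and we can also use range instead of a while loop with a manual counter: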

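import requests
from bs4 import BeautifulSoup

def stepashka_spider(max_pages):
    for page in range(1, max_pages + 1):
        # The page number goes in the path: /filmy/page/N/
        url = "http://online.stepashka.com/filmy/page/{}/".format(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, "html.parser")
        print("Movies for page {}".format(page))
        # Each title is a <div class="video-title"> that wraps the <a> link.
        for resoult in soup.find_all("div", {"class": "video-title"}):
            a_tag = resoult.a
            print(a_tag["href"])
        print()

stepashka_spider(3)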

Output:

Movies for page 1
http://online.stepashka.com/filmy/dramy/37895-raskop.html
http://online.stepashka.com/filmy/semejnyj/36275-domik-v-serdce.html
http://online.stepashka.com/filmy/dramy/35371-enni.html
http://online.stepashka.com/filmy/trillery/37729-igra-na-vyzhivanie.html
http://online.stepashka.com/filmy/trillery/37893-vosstavshie-mertvecy.html
http://online.stepashka.com/filmy/semejnyj/30104-sedmoy-syn-seventh-son-2013-treyler.html
http://online.stepashka.com/filmy/dramy/37892-sekret-schastya.html
http://online.stepashka.com/filmy/uzhasy/37891-davayte-poohotimsya.html
http://online.stepashka.com/filmy/multfilmy/3404-specagent-archer-archer-archer-2010-2013.html
http://online.stepashka.com/filmy/trillery/37334-posledniy-reys.html

Movies for page 2
http://online.stepashka.com/filmy/komedii/37890-top-5.html
http://online.stepashka.com/filmy/komedii/37889-igra-v-doktora.html
http://online.stepashka.com/filmy/dramy/36651-vrozhdennyy-porok.html
http://online.stepashka.com/filmy/komedii/37786-superforsazh.html
http://online.stepashka.com/filmy/fantastika/35003-voshozhdenie-yupiter.html
http://online.stepashka.com/filmy/sport/37888-ufc-on-fox-15-machida-vs-rockhold.html
http://online.stepashka.com/filmy/semejnyj/37558-prizrak.html
http://online.stepashka.com/filmy/boeviki/36865-mordekay.html
http://online.stepashka.com/filmy/dramy/37884-stanovlenie-legendy.html
http://online.stepashka.com/filmy/trillery/37883-tainstvo.html

Movies for page 3
http://online.stepashka.com/filmy/dramy/37551-nochnoy-beglec.html
http://online.stepashka.com/filmy/dramy/37763-mech-drakona.html
http://online.stepashka.com/filmy/trillery/36471-paren-po-sosedstvu.html
http://online.stepashka.com/filmy/dramy/36652-amerikanskiy-snayper.html
http://online.stepashka.com/filmy/dramy/37555-feniks.html
http://online.stepashka.com/filmy/semejnyj/35156-gnezdo-drakona-vosstanie-chernogo-drakona.html
http://online.stepashka.com/filmy/kriminal/37882-ch-b.html
http://online.stepashka.com/filmy/priklucheniya/37881-admiral-bitva-za-men-ryan.html
http://online.stepashka.com/filmy/trillery/37880-malyshka.html
http://online.stepashka.com/filmy/trillery/36417-poteryannyy-ray.html
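
The difference between the two URL formats can be checked without touching the network. Everything after "#" is a URL fragment, which an HTTP client does not send to the server, so the "#/page/N" form asks for the same address for every page. The sketch below uses the standard library's urllib.parse; the helper names fragment_url, path_url, and server_visible_part are made up for this illustration.

```python
from urllib.parse import urlsplit

BASE = "http://online.stepashka.com/filmy/"


def fragment_url(page):
    # Format used by the first version of the spider.
    return BASE + "#/page/" + str(page)


def path_url(page):
    # Format used by the corrected version.
    return BASE + "page/{}/".format(page)


def server_visible_part(url):
    """Return the URL with its fragment removed, i.e. what gets requested."""
    parts = urlsplit(url)
    return parts._replace(fragment="").geturl()


for page in range(1, 4):
    print(server_visible_part(fragment_url(page)), "|",
          server_visible_part(path_url(page)))
```

With the fragment form, the left-hand column is identical for every page, while the path form produces a distinct address per page.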
