Trying to make a Reddit image scraper using bs4

Posted 2024-05-14 14:52:26


import requests
from bs4 import BeautifulSoup as bs
import os
url = 'https://www.reddit.com/r/memes'
req = requests.get(url)
parser = bs(req.text,'html.parser')
imgs = parser.findAll('img',{"src":True})
rep = 0
print(len(imgs))
for img in imgs:
    src = img['src']
    os.chdir(r'C:\Users\ellio\Desktop\my code\mm\images')
    with open(str(rep)+'.jpg','wb') as file:
        im = requests.get(src)
        if img[alt] == 'Post image':
            rep+=1
            file.write(im.content)
    if rep == 25:
        break

This is supposed to scrape images from r/memes. When I run it, it finishes without doing anything.


Tags: import, src, parser, url, img, get, bs, os
1 Answer
User
#1 · Posted 2024-05-14 14:52:26

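Reddit renders the page with JavaScript, so the post images are not in the HTML that requests downloads. However, you can append .json to a Reddit URL and get a JSON feed: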

import json
import requests


url = "https://old.reddit.com/r/memes.json"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0"
}
data = requests.get(url, headers=headers).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

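for ch in data["data"]["children"]:
    # link and image posts carry the target URL in this field; others return None
    pic_url = ch["data"].get("url_overridden_by_dest")
    if not pic_url:
        continue
    # last path segment, with any query string ("?cid=...") dropped
    file_name = pic_url.split("?")[0].split("/")[-1]
    if "." not in file_name:
        continue
    print("Downloading {}".format(pic_url))
    # download first, so a failed request does not leave an empty file behind
    c = requests.get(pic_url, headers=headers).content
    with open(file_name, "wb") as f_out:
        f_out.write(c)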

This prints the following and downloads the files:

Downloading https://i.redd.it/l34mx68djsr61.png
Downloading https://media4.giphy.com/media/VASgH937CSYF969Q1w/giphy.gif?cid=4d1e4f2965663c56e168c398f4dcd35f7b31f1451e568a06&rid=giphy.gif&ct=g
Downloading https://i.redd.it/oy0xme2ncsr61.jpg
Downloading https://i.redd.it/02rwljynxrr61.jpg
Downloading https://i.redd.it/aj9ste2vmsr61.jpg
Downloading https://i.redd.it/9wc4vm7nisr61.jpg
Downloading https://i.redd.it/mqfrhqnrnsr61.jpg
Downloading https://i.redd.it/hcbirqok3sr61.jpg
Downloading https://i.redd.it/da6dz6m0jsr61.jpg
Downloading https://i.redd.it/e9gtf4z29sr61.jpg
Downloading https://i.redd.it/o24odz06trr61.png
Downloading https://i.redd.it/zna7xkkncsr61.png
Downloading https://i.redd.it/j77ovgrovrr61.png
Downloading https://i.redd.it/9acir5koprr61.png
Downloading https://i.redd.it/m7th84obcsr61.png
Downloading https://i.redd.it/xh512h94zsr61.jpg
Downloading https://i.redd.it/p2zkd1opcsr61.jpg
Downloading https://i.redd.it/pfe0lgl0zrr61.jpg
Downloading https://i.redd.it/pwqg36zcesr61.jpg
Downloading https://i.redd.it/gmfzwxgywrr61.png
Downloading https://i.redd.it/9za9i15ywrr61.jpg
Downloading https://i.redd.it/h844j0658sr61.jpg
Downloading https://i.redd.it/oqeuzso2prr61.gif
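
The question stops after 25 images and names the files 0.jpg, 1.jpg, and so on. A minimal sketch of that selection step on top of the JSON feed could look like the following. The helper name pick_image_urls, the IMAGE_EXTS tuple and the sample dictionary are all assumptions made for illustration; only the data/children layout and the url_overridden_by_dest field come from the answer above. It makes no network calls, so it can be tried offline.

```python
import os
from urllib.parse import urlparse

# Extensions treated as images (an assumption, adjust as needed).
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif")


def pick_image_urls(listing, limit=25):
    """Return up to `limit` (url, file_name) pairs from a listing dict."""
    picked = []
    for child in listing["data"]["children"]:
        pic_url = child["data"].get("url_overridden_by_dest")
        if not pic_url:
            continue  # no outbound URL on this post
        # Read the extension from the URL path only, so a query string
        # such as "?cid=..." never ends up in the file name.
        ext = os.path.splitext(urlparse(pic_url).path)[1].lower()
        if ext not in IMAGE_EXTS:
            continue  # link does not point straight at an image file
        # Number the files 0, 1, 2, ... and keep the real extension.
        picked.append((pic_url, "{}{}".format(len(picked), ext)))
        if len(picked) == limit:
            break
    return picked


# Hand-made sample shaped like the feed (hypothetical, not real Reddit data).
sample = {
    "data": {
        "children": [
            {"data": {"url_overridden_by_dest": "https://i.redd.it/aaa.png"}},
            {"data": {"title": "a text post"}},
            {"data": {"url_overridden_by_dest": "https://example.com/giphy.gif?cid=1"}},
            {"data": {"url_overridden_by_dest": "https://example.com/article"}},
        ]
    }
}

for pic_url, file_name in pick_image_urls(sample):
    print(file_name, pic_url)
```

With the real feed, data from the answer's requests.get(url, headers=headers).json() call would be passed in place of sample, and each returned URL fetched with requests.get as shown there.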
