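Problem with requests when downloading content
I am trying to use Python's requests module to download an mp4 video from a website that requires a paid account to access. Here is my code: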
import requests

link_href = ...  # the mp4 link found on the website
with open('D:/filename', 'wb') as f:
    response = requests.get(link_href)
    f.write(response.content)
I inspected response.content and found that it is the HTML of the website's login page. How can I get the mp4 video?
1 Answer
-1
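It looks like you are trying to download an mp4 video from a site that requires you to log in first. Because your request carries no login information (no session cookie or credentials), the site responds with the HTML of its login page instead of the video. To get the file, you need to authenticate through the site's normal login flow, for example by sending a POST request with your credentials to the login URL, and then reuse that authenticated session for the download.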
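Before changing anything, it can help to confirm that this is really what happened. The sketch below is one possible way to inspect a response. `summarize_response` is a hypothetical helper name, not part of requests, and it only assumes the object exposes the `status_code`, `url`, `history`, and `headers` attributes that a `requests.Response` provides.

```python
def summarize_response(response):
    """Collect the fields that usually reveal a redirect to a login page.

    `response` is expected to behave like a requests.Response.
    """
    content_type = response.headers.get("Content-Type", "")
    return {
        "status_code": response.status_code,
        "final_url": response.url,
        # requests stores each followed redirect in response.history
        "was_redirected": len(response.history) > 0,
        "content_type": content_type,
        # An mp4 is normally served with a video content type, not as HTML
        "looks_like_html": content_type.lower().startswith("text/html"),
    }
```

Calling `summarize_response(requests.get(link_href))` on the original request would probably report an HTML content type, and possibly a final URL that points at the login page.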
Below is a rough outline of how to handle this with the requests module:
import requests
from bs4 import BeautifulSoup

# Replace these placeholders with your actual login details and URLs
login_url = "https://www.example.com/login"
username = "your_username"
password = "your_password"
mp4_link = "https://www.example.com/video.mp4"  # The mp4 link found on the website

# A Session keeps cookies between requests, so the login carries over to the download
with requests.Session() as session:
    # Step 1: Visit the login page
    login_page_response = session.get(login_url)

    # Step 2: Parse the login page to get a CSRF token or other required fields (if necessary)
    soup = BeautifulSoup(login_page_response.content, "html.parser")
    csrf_input = soup.find("input", {"name": "_csrf"})  # Example field name; check the real form
    csrf_token = csrf_input["value"] if csrf_input else None

    # Step 3: Prepare the login data (field names must match the site's login form)
    login_data = {
        "username": username,
        "password": password,
    }
    if csrf_token is not None:
        login_data["_csrf"] = csrf_token

    # Step 4: Send a POST request to the login endpoint with the login data
    login_response = session.post(login_url, data=login_data)

    # Step 5: Verify that the login succeeded by checking the response
    # (e.g., by checking that the redirected URL is the expected page after login).
    # A 200 status alone is not proof: a failed login often returns 200 as well.

    # Step 6: Download the mp4 file
    if login_response.status_code == 200:
        video_response = session.get(mp4_link)
        if video_response.status_code == 200:
            with open('D:/filename.mp4', 'wb') as f:
                f.write(video_response.content)
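Two caveats about Step 6: `video_response.content` holds the whole file in memory, and a 200 status does not guarantee that the body is a video. The sketch below is one way to stream the download in chunks and refuse to save an HTML page. `download_video` and its parameters are hypothetical names used for illustration, and `session` is assumed to be an already logged-in `requests.Session`.

```python
def download_video(session, url, path, chunk_size=1024 * 1024):
    """Stream `url` to `path` using a logged-in session; return bytes written.

    `session` is assumed to behave like an authenticated requests.Session.
    """
    # stream=True defers the body download until iter_content is consumed
    with session.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()  # raises on 4xx/5xx status codes

        content_type = response.headers.get("Content-Type", "")
        if content_type.lower().startswith("text/html"):
            # Most likely the login page again, so do not write it to disk
            raise RuntimeError(f"Expected a video but got {content_type!r}")

        written = 0
        with open(path, "wb") as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)
                written += len(chunk)
        return written
```

Inside the `with requests.Session() as session:` block, this could be called as `download_video(session, mp4_link, 'D:/filename.mp4')` once the login has been confirmed.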
You can adapt your code along these lines. If you still run into errors, let me know.