I want to scrape Twitter, but I get "[]" as the result

Posted 2022-12-05 02:05:57


I wrote this code to get some information from a Twitter page, but the result is empty.

import requests
from bs4 import BeautifulSoup
import csv
from itertools import zip_longest

result = requests.get("https://twitter.com/search?q=BTC&src=typed_query&f=live")
src = result.content

soup = BeautifulSoup(src, "lxml")
#print(soup)

tweets = soup.find_all("a", class_="css-901oao css-16my406 r-1k78y06 r-bcqeeo r-qvutc0")
print(tweets)

1 Answer
Answer 1 · posted 2022-12-05 02:05:57

You can use this code to get the data. The data is also returned in JSON format, so you don't even need BeautifulSoup.

Code

import requests
import json

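# Request headers copied from a browser session. The tokens and cookies
# below are session-specific, so replace them with values from your own session.
headers = {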
    'authority': 'twitter.com',
    'pragma': 'no-cache',
    'cache-control': 'no-cache',
    'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"',
    'x-twitter-client-language': 'en',
    'x-csrf-token': 'bdf13388fcc19da71adc494b1f7f0b67',
    'sec-ch-ua-mobile': '?0',
    'authorization': 'Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36',
    'x-guest-token': '1409186325543092227',
    'x-twitter-active-user': 'yes',
    'accept': '*/*',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-mode': 'cors',
    'sec-fetch-dest': 'empty',
    'referer': 'https://twitter.com/search?q=BTC&src=typed_query&f=live',
    'accept-language': 'en-US,en;q=0.9',
    'cookie': 'personalization_id="v1_22YA+2q/xRm1KgF/0U66Qw=="; guest_id=v1%3A162480151777727574; ct0=bdf13388fcc19da71adc494b1f7f0b67; _sl=1; _twitter_sess=BAh7CSIKZmxhc2hJQzonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNo%250ASGFzaHsABjoKQHVzZWR7ADoPY3JlYXRlZF9hdGwrCDent016AToMY3NyZl9p%250AZCIlMjk3MTY3MTIyZjRlZTYxMDdhMDhmMzgwNmVlNGRlOTg6B2lkIiU1Mjll%250AODBkMjI0YTUwYzVjZGY1NGViNTNiM2JmNWZiNg%253D%253D b7f16af5121c08a00f8216aeff1cd54d00c2f14b; _ga=GA1.2.1976020620.1624801520; _gid=GA1.2.1056783113.1624801520; gt=1409186325543092227',
}

params = (
    ('include_profile_interstitial_type', '1'),
    ('include_blocking', '1'),
    ('include_blocked_by', '1'),
    ('include_followed_by', '1'),
    ('include_want_retweets', '1'),
    ('include_mute_edge', '1'),
    ('include_can_dm', '1'),
    ('include_can_media_tag', '1'),
    ('skip_status', '1'),
    ('cards_platform', 'Web-12'),
    ('include_cards', '1'),
    ('include_ext_alt_text', 'true'),
    ('include_quote_count', 'true'),
    ('include_reply_count', '1'),
    ('tweet_mode', 'extended'),
    ('include_entities', 'true'),
    ('include_user_entities', 'true'),
    ('include_ext_media_color', 'true'),
    ('include_ext_media_availability', 'true'),
    ('send_error_codes', 'true'),
    ('simple_quoted_tweet', 'true'),
    ('q', 'BTC'),
    ('tweet_search_mode', 'live'),
    ('count', '20'),
    ('query_source', 'typed_query'),
    ('pc', '1'),
    ('spelling_corrections', '1'),
    ('ext', 'mediaStats,highlightedLabel'),
)

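# Request the search endpoint, which returns JSON rather than HTML.
response = requests.get('https://twitter.com/i/api/2/search/adaptive.json', headers=headers, params=params)
response.raise_for_status()  # stop on an HTTP error instead of parsing an error body
data = json.loads(response.content)

# Tweets are stored as a dict keyed by tweet ID under globalObjects -> tweets.
tweets = (data.get('globalObjects') or {}).get('tweets') or {}

for tweet in tweets.values():
    print(tweet.get('full_text'))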

Result

@Zenon_Network Should trust this project! The dev always hard work to growup Zenon_Network!! Send it boysss 
@Zenon_Network

#NoM #AlphanetBigBang $ZNN $QSR $PP $BTC
U can def see the binance fud impact but we are recovering fast on news sentiment social sentiment  not so much #BTC #ETH #Crypto 
Alerts before spikes and right as big news drops out of the largest cryto market join us instantly 

$BTC $LTC $ETH $ADA $DOGE $XRP $XLM $BIFI $GAJ $FISH $FOX $WOLF $DB $MONO $KENNY $HH $YLD $GEN $DMT $BTU $SAFU $BULL $PUSSY $SSGT $PCAKE $GFARM $SUPER $PINGU $GFI $POLR $FRAX $EOS 
@TylerDurden Don't you hold BTC???
...
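
The extraction step in the answer's code can be tried without a network request or valid tokens. The following is a minimal sketch, not the answerer's code: the helper name `extract_full_texts` and the contents of `sample_payload` are hypothetical, and the payload models only the keys the answer reads (`globalObjects`, `tweets`, `full_text`).

```python
from typing import Any, Dict, List


def extract_full_texts(payload: Dict[str, Any]) -> List[str]:
    """Collect the 'full_text' of every tweet in a search payload.

    Missing or null 'globalObjects' / 'tweets' keys yield an empty list
    instead of raising AttributeError.
    """
    tweets = (payload.get("globalObjects") or {}).get("tweets") or {}
    return [t["full_text"] for t in tweets.values() if t.get("full_text")]


# Hypothetical payload: tweet IDs and texts are made up for illustration.
sample_payload = {
    "globalObjects": {
        "tweets": {
            "1": {"full_text": "first tweet about $BTC"},
            "2": {"full_text": "second tweet about #BTC"},
            "3": {},  # an entry without text is skipped
        }
    }
}

print(extract_full_texts(sample_payload))
print(extract_full_texts({}))  # a payload with no tweets gives an empty list
```

With a real response, the same helper could be called as `extract_full_texts(json.loads(response.content))`, assuming the endpoint still returns the structure shown in the answer.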