tweet_mode='extended' and text.append(tweet.full_text) not working: results still end in ['…'] and filtering has no effect

Posted 2024-04-16 15:05:17


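I cannot retrieve the full tweet text. Most tweets end with "…". Because the API is limited, I tried to filter the results instead, but without success. I am looking for a different solution: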

    import tweepy #https://github.com/tweepy/tweepy
    import csv
    import pandas as pd
    # Used for progress bar
    import time
    import sys

    #Twitter API credentials
    consumer_key = ""
    consumer_secret = ""
    access_key = ""
    access_secret = ""

    OAUTH_KEYS = {'consumer_key': consumer_key, 'consumer_secret': consumer_secret,
                  'access_token_key': access_key, 'access_token_secret': access_secret}
    auth = tweepy.OAuthHandler(OAUTH_KEYS['consumer_key'], OAUTH_KEYS['consumer_secret'])
    api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)


    search = tweepy.Cursor(api.search, q='#tips -filter:media', 
                   tweet_mode='extended', 
                   include_rts = False,
                   lang="en").items(2000)

    # Create lists for each field desired from the tweets.
    sn = []
    text = []
    timestamp =[]
    for tweet in search:
        # print(tweet.user.screen_name, tweet.created_at, tweet.full_text)
        print(tweet.full_text)
        timestamp.append(tweet.created_at)
        sn.append(tweet.user.screen_name)
        text.append(tweet.full_text)
        print('-------------------------------------------------------------------------------')

    #Convert lists to dataframe
    df = pd.DataFrame()
    df['timestamp'] = timestamp
    df['sn'] = sn
    df['text'] = text

    # Prepare for date filtering. Add an EST time column, since the chat is hosted by people in that time zone.
    df['timestamp'] = pd.to_datetime(df['timestamp'])
    df['EST'] = df['timestamp'] - pd.Timedelta(hours=5) #Convert to EST

    df['EST'] = pd.to_datetime(df['EST'])
    #============================================================================
    # list of timestamp, EST, sn, text  
    col1 = df['timestamp']
    col2 = df['EST']
    col3 = df['sn']
    col4 = df['text']

    # dictionary of columns (named `columns` to avoid shadowing the built-in dict)
    columns = {'TimeStamp': col1, 'EST': col2, 'SN': col3, 'Text': col4}
    data = pd.DataFrame(columns)

    # saving the dataframe 
    data.to_csv('tipstweets.csv') 

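The code prints tweets that end with "…", and I cannot filter those tweets out. What else could I try? I could not find a solution in the git or Twitter documentation. What should I search for?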
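
One likely cause, offered as an assumption rather than a confirmed diagnosis: with `tweet_mode='extended'`, a retweet's own `full_text` ("RT @user: ...") can still be cut off with "…", while the complete text sits on the nested `retweeted_status` object. The sketch below shows a helper that prefers that nested text. The names `get_full_text` and `looks_truncated` are hypothetical, and the `SimpleNamespace` objects are only stand-ins for tweepy `Status` objects so the snippet runs without credentials.

```python
from types import SimpleNamespace

ELLIPSIS = "\u2026"  # the single-character "…" that marks a truncated tweet


def get_full_text(status):
    """Return the most complete text found on a Status-like object.

    Hypothetical helper: it only reads attributes and makes no API calls.
    """
    # For a retweet, the original tweet is nested under retweeted_status.
    original = getattr(status, "retweeted_status", None)
    if original is not None:
        status = original
    # full_text is present only in extended mode; otherwise fall back to text.
    full = getattr(status, "full_text", None)
    if full is not None:
        return full
    return getattr(status, "text", "")


def looks_truncated(text):
    """True if the text ends with the ellipsis character."""
    return text.rstrip().endswith(ELLIPSIS)


# Stand-ins for tweepy Status objects (made-up example data).
retweet = SimpleNamespace(
    full_text="RT @someone: a long tip that gets cut" + ELLIPSIS,
    retweeted_status=SimpleNamespace(full_text="a long tip that gets cut off here"),
)
plain = SimpleNamespace(full_text="a short tip")
legacy = SimpleNamespace(text="compat-mode tweet")

texts = [get_full_text(s) for s in (retweet, plain, legacy)]
still_truncated = [t for t in texts if looks_truncated(t)]
```

In the loop from the question, this would mean `text.append(get_full_text(tweet))` instead of `text.append(tweet.full_text)`. If retweets are not wanted at all, adding the search operator `-filter:retweets` to `q` is another option to try; as far as I know, `include_rts` is a user-timeline parameter and is not applied by the search endpoint.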

