How to create columns based on the same date

                                created_at        date      time  timezone  \
0  2021-06-03 09:01:59 India Standard Time  2021-06-03  09:01:59       530   
1  2021-06-03 09:01:41 India Standard Time  2021-06-03  09:01:41       530   
2  2021-06-03 07:32:58 India Standard Time  2021-06-03  07:32:58       530   
3  2021-06-03 07:31:55 India Standard Time  2021-06-03  07:31:55       530   
4  2021-06-03 06:00:52 India Standard Time  2021-06-03  06:00:52       530   

                                               tweet  
0  "Advertisers offering #cryptocurrency exchange...  
1  Beijing to Disperse $6 Million in Digital Yuan...  
2  “Combating ransomware is a priority for the ad...  
3  Guggenheim has registered a fund with the SEC ...  
4  The most recent 2009 spend moved last year, an...  

from collections import defaultdict

d = defaultdict(list)
i = 1
j = 1
while i < btc_news.shape[0]:
    if btc_news.loc[i, 'date'] == btc_news.loc[i-1, 'date']:
        temp = 'headline' + str(j)
        d[temp].append(btc_news.loc[i-1, 'tweet'])
        j += 1
        i += 1
        continue
    else:
        temp = 'headline' + str(j)
        d[temp].append(btc_news.loc[i-1, 'tweet'])
        d['date'].append(btc_news.loc[i-1, 'date'])
        j = 1
        i += 1

1 answer

User

#1 · Posted on 2024-04-19 13:58:16

This may not be the most efficient solution, but it works.

First, group by date and collect all the tweets for each date into a list:

df2 = df.groupby("date").apply(lambda x: x["tweet"].to_list())
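
To see what this step produces, here is a minimal sketch on a made-up three-row frame (the rows are assumptions for illustration, not the asker's data). It uses the column-selection form groupby("date")["tweet"].apply(list), which should give the same Series of lists as the line above:

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's frame
df = pd.DataFrame({
    "date": ["2021-06-03", "2021-06-03", "2021-06-04"],
    "tweet": ["a", "b", "c"],
})

# One list of tweets per date, indexed by date
df2 = df.groupby("date")["tweet"].apply(list)
print(df2)
```

Each entry of df2 is a plain Python list, so dates with more tweets simply get longer lists.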

Next, split the lists into separate columns:

output = pd.DataFrame(df2.values.tolist()).add_prefix("top_").set_index(df2.index)

The output looks like this:

>>> output.head()
                                                        top_0  ... top_35
date                                                           ...       
2015-07-12  Bitcoin the Next Logical Step in the Rise of U...  ...   None
2015-07-13  BitGive Foundation Announces New Initiatives a...  ...   None
2015-07-14  Keynote 2015: Harnessing the Distributed Ledge...  ...   None
2015-07-15  Patrick Byrne Says Wil                        ...  ...   None
2015-07-16  2015 Q1 Bitcoin Investment Trumps 2014 Numbers...  ...   None
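
For anyone who wants to try the two steps end to end, the following is a small self-contained sketch. The data is invented for illustration, and the first step is written with the column-selection form of groupby rather than the lambda; otherwise it follows the same idea of building one column per list position.

```python
import pandas as pd

# Hypothetical toy data; the tweet strings are placeholders
df = pd.DataFrame({
    "date": ["2021-06-03", "2021-06-03", "2021-06-03", "2021-06-04"],
    "tweet": ["t1", "t2", "t3", "t4"],
})

# Step 1: collect each date's tweets into a list
df2 = df.groupby("date")["tweet"].apply(list)

# Step 2: one column per list position; dates with fewer tweets
# get missing values in the trailing columns
output = (
    pd.DataFrame(df2.values.tolist())
    .add_prefix("top_")
    .set_index(df2.index)
)
print(output)
```

The number of top_ columns equals the largest number of tweets on any single date, which is why most cells in the wide columns of the real output are empty.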
