How to aggregate occurrences in a text file and then plot them in Python

discussion = pd.read_csv('discussion_thread_data.csv')

dt | text
2021-03-19 20:59:49+06 | I only need GME to hit 20 eod to make up
2021-03-19 20:59:51+06 | lads why is my account covered in more red
2021-05-21 15:54:27+06 | Oh my god, we might have 2 green days in a row
2021-05-21 15:56:06+06 | Why are people so hype about a 4% TSLA move

4 answers

User

Answer 1 · edited 2024-06-16 15:06:41

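Use Series.str.extractall to get every matched value, wrapping each ticker in \b...\b so that it only matches at word boundaries: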

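import pandas as pd

top = ['GME', 'MVIS', 'TSLA', 'AMC']

# one alternation of whole-word patterns: \bGME\b|\bMVIS\b|...
pat = '|'.join(r"\b{}\b".format(x) for x in top)
# one row per match, keyed by the timestamp of the comment
df = discussion.set_index('dt')['text'].str.extractall('('+pat+')')[0].reset_index(name='v')
print(df)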
                       dt  match     v
0  2021-03-19 20:59:49+06      0   GME
1  2021-05-21 15:56:06+06      0  TSLA
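
To see what the word boundaries change, here is a small sketch using the standard re module (pandas string methods use the same regex syntax). The sample sentence and the re.escape call are my own additions for illustration and are not part of the question's data:

```python
import re

top = ['GME', 'MVIS', 'TSLA', 'AMC']

# re.escape is an extra precaution (an assumption on my part) in case a
# ticker ever contains a regex metacharacter such as '.'
with_boundaries = '|'.join(r"\b{}\b".format(re.escape(x)) for x in top)
without_boundaries = '|'.join(re.escape(x) for x in top)

# Hypothetical comment: 'AMCX' is not in the list, but it contains 'AMC'
text = "AMCX and TSLA"

loose = re.findall(without_boundaries, text)   # ['AMC', 'TSLA']
strict = re.findall(with_boundaries, text)     # ['TSLA']
print(loose, strict)
```

Without \b the pattern picks up AMC inside a longer word, which would inflate the counts later on.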

For the counts, use crosstab:

df1 = pd.crosstab(df['dt'], df['v'])
print(df1)
v                       GME  TSLA
dt                               
2021-03-19 20:59:49+06    1     0
2021-05-21 15:56:06+06    0     1
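
The table above has one row per timestamp, so with real data most cells would be 0 or 1. If per-day totals are what you want to plot, a possible variation is to group on the date part first. This is only a sketch: it rebuilds the extracted frame by hand and assumes every dt string starts with YYYY-MM-DD, which holds for the sample rows but is not confirmed for the whole file:

```python
import pandas as pd

# Hand-built copy of the extracted matches shown earlier (dt, match, v)
df = pd.DataFrame({
    "dt": ["2021-03-19 20:59:49+06", "2021-05-21 15:56:06+06"],
    "match": [0, 0],
    "v": ["GME", "TSLA"],
})

# Assumption: the first 10 characters of dt are the calendar date
day = df["dt"].str[:10].rename("day")

# Rows are days, columns are tickers, cells are mention counts
daily = pd.crosstab(day, df["v"])
print(daily)
```

daily can then be plotted the same way as df1.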

Finally, plot with DataFrame.plot:

df1.plot()

Edit, based on this solution (one separate figure per ticker):

import matplotlib.pyplot as plt

for col in df1.columns:
    df1[col].plot()
    plt.show()
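
If you would rather have all tickers in a single figure with one panel each, DataFrame.plot also accepts subplots=True. A minimal sketch, with the counts table rebuilt by hand; the output file name is made up, and the Agg backend line is only there so the example runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import pandas as pd

# Hand-built copy of the crosstab result shown earlier
df1 = pd.DataFrame(
    {"GME": [1, 0], "TSLA": [0, 1]},
    index=pd.Index(
        ["2021-03-19 20:59:49+06", "2021-05-21 15:56:06+06"], name="dt"
    ),
)

# One panel per column, sharing the x axis; returns an array of Axes
axes = df1.plot(subplots=True, sharex=True, marker="o")

fig = axes[0].get_figure()
fig.tight_layout()
fig.savefig("ticker_counts.png")  # hypothetical file name
```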
