Plotting multiple columns of a dataset with a scatter plot

0 votes
1 answer
1279 views
Asked 2025-06-18 04:08
import plotly.offline as pyo
import plotly.express as px
import matplotlib.pyplot as pls

pyo.init_notebook_mode()


data = pd.read_csv(r'C:.......Coronovirus Datasets\time_series_covid19_deaths_global.csv')

countries = ['US']
filtered_data = data[data['Country/Region'].isin(countries)]


wanted_values = filtered_data[['Country/Region','1/22/2020','1/23/2020','1/24/2020', '1/25/2020','1/26/2020','1/27/2020','1/28/2020','1/28/2020','1/29/2020',
      '1/30/2020','1/31/2020','2/1/2020','2/2/2020','2/3/2020','2/4/2020','2/5/2020','2/6/2020','2/7/2020','2/8/2020','2/9/2020','2/10/2020',
    '2/11/2020','2/12/2020','2/13/2020','2/14/2020','2/15/2020','2/16/2020','2/17/2020','2/18/2020','2/19/2020','2/20/2020','2/21/2020','2/22/2020','2/23/2020',
    '2/24/2020','2/25/2020','2/26/2020','2/27/2020','2/28/2020','2/29/2020','3/1/2020','3/2/2020','3/3/2020','3/4/2020','3/5/2020','3/6/2020','3/7/2020',
    '3/8/2020','3/9/2020','3/10/2020','3/11/2020','3/12/2020','3/13/2020','3/14/2020','3/15/2020','3/16/2020','3/17/2020','3/18/2020','3/19/2020',
    '3/20/2020','3/21/2020','4/1/2020','4/2/2020','4/3/2020','4/4/2020','4/5/2020','4/6/2020','4/7/2020','4/8/2020','4/9/2020','4/10/2020',
    '4/11/2020','4/12/2020','4/13/2020','4/14/2020','4/15/2020','4/16/2020','4/17/2020','4/18/2020','4/19/2020','4/20/2020','4/21/2020','4/22/2020','4/23/2020',
    '4/24/2020','4/25/2020','4/26/2020','4/27/2020','4/28/2020','4/29/2020','5/1/2020','5/2/2020','5/3/2020','5/4/2020','5/5/2020','5/6/2020','5/7/2020','5/8/2020','5/9/2020']]



fig = px.scatter(wanted_values, x ='Country/Region', y = 'dates' , title = 'Number of Deaths Per Day')
fig.show()


#wanted_values.plot(x="5/9/2020, 5/8/2020", y = 'filtered_data' kind = 'bar')
#pls.show()

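I want to know how to plot all the dates and their corresponding death counts as a scatter plot. I plan to use linear regression to predict the number of deaths starting from January 1. I'm still very new to Python, so I've had a lot of trouble plotting this data.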

You can find the dataset here: https://data.humdata.org/dataset/novel-coronavirus-2019-ncov-cases


1 Answer

0

Your data looks like this:

import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv("time_series_covid19_deaths_global.csv")
data.iloc[:2,:7]

  Province/State Country/Region      Lat     Long  1/22/20  1/23/20  1/24/20
0            NaN    Afghanistan  33.0000  65.0000        0        0        0
1            NaN        Albania  41.1533  20.1683        0        0        0

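First, filter the rows for the country you want, select the date columns by giving a start and an end date (these must match the column names exactly), and then convert the data to long format: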

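# Keep only the rows for the US
data = data[data['Country/Region']=='US']
# Select the date columns by label (both ends inclusive) and melt to long format
data = data.loc[:,'1/22/20':'5/9/20'].melt(var_name="date")
# Parse the date strings into datetime values
data['date'] = pd.to_datetime(data['date'])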

It now looks like this:

        date  value
0 2020-01-22      0
1 2020-01-23      0
2 2020-01-24      0
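
If you want to try the reshaping step without downloading the CSV, here is a minimal sketch on a tiny made-up frame. The column names mirror the dataset's, but the numbers and the names `wide`, `us` and `long_df` are invented purely for illustration.

```python
import pandas as pd

# Hypothetical two-row stand-in for the real CSV (all values are made up).
wide = pd.DataFrame({
    "Province/State": [None, None],
    "Country/Region": ["US", "Albania"],
    "Lat": [40.0, 41.1533],
    "Long": [-100.0, 20.1683],
    "1/22/20": [0, 0],
    "1/23/20": [1, 0],
    "1/24/20": [3, 0],
})

# Keep only the US row, then slice the date columns by label
# (label slices with .loc include both end points).
us = wide[wide["Country/Region"] == "US"]
long_df = us.loc[:, "1/22/20":"1/24/20"].melt(var_name="date")

# Parse the former column labels into timestamps so they sort in time order.
long_df["date"] = pd.to_datetime(long_df["date"], format="%m/%d/%y")

print(long_df)
```

Each date column becomes one row, so a single country row with three date columns yields three rows with a `date` and a `value` column.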

Plotting is then straightforward:

data.plot.scatter(x="date",y="value",rot=45)

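
Since the question mentions linear regression as the next step, here is a rough sketch of fitting a straight line to long-format data with `numpy.polyfit`. It is not part of the answer above: the sample values and the names `long_df`, `days`, `slope`, `intercept` and `predicted` are assumptions for illustration, and a straight line may fit real cumulative death counts poorly.

```python
import numpy as np
import pandas as pd

# Made-up long-format data with the same two columns as the melted frame.
long_df = pd.DataFrame({
    "date": pd.date_range("2020-01-22", periods=5, freq="D"),
    "value": [0, 2, 4, 6, 8],
})

# Regression needs a numeric x, so count whole days since the first date.
days = (long_df["date"] - long_df["date"].min()).dt.days.to_numpy()

# A degree-1 polyfit is an ordinary least-squares straight-line fit.
slope, intercept = np.polyfit(days, long_df["value"].to_numpy(), deg=1)

# Extrapolate to 7 days after the first date.
predicted = slope * 7 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, day 7 -> {predicted:.2f}")
```

Converting dates to a day count keeps the slope interpretable as "change in value per day"; with the real data you would pass the melted frame in place of the made-up one.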
