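Density plot

2 answers

User

Reply 1 · edited 2024-04-26 07:11:50

Hi all, you can try the example below. I have only used random normal values in this example, and obviously you cannot have negative streams in real data. Anyway, disclaimer over, here is the code: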

import random

import matplotlib.pyplot as plt
import pandas as pd

categories = ['classical','hip-hop','indiepop','indierock','jazz'
          ,'metal','pop','rap','rock']

df = pd.DataFrame({'Type':[random.choice(categories) for _ in range(10000)],
              'stream':[random.normalvariate(0,random.randint(0,15)) for _ in 
               range(10000)]})

###split the data into groups based on types
g = df.groupby('Type')



###access the classical group 
classical = g.get_group('classical')
plt.figure(figsize=(15,6))
plt.hist(classical.stream, histtype='stepfilled', bins=50, alpha=0.2,
     label="Classical Streams", color="#D73A30", density=True)
plt.legend(loc="upper left")

###hip hop

hiphop = g.get_group('hip-hop')

plt.hist(hiphop.stream, histtype='stepfilled', bins=50, alpha=0.2,
     label="hiphop Streams", color="#2A3586", density=True)
plt.legend(loc="upper left")

###indie pop
indiepop = g.get_group('indiepop')

plt.hist(indiepop.stream, histtype='stepfilled', bins=50, alpha=0.2,
     label="indie pop streams", color="#5D271B", density=True)
plt.legend(loc="upper left")


#indierock

indierock = g.get_group('indierock')

plt.hist(indierock.stream, histtype='stepfilled', bins=50, alpha=0.2,
     label="indie rock Streams", color="#30A9D7", density=True)
plt.legend(loc="upper left")


##jazz
jazz = g.get_group('jazz')
# use a colour that differs from indie rock so the two histograms can be told apart
plt.hist(jazz.stream, histtype='stepfilled', bins=50, alpha=0.2,
     label="jazz Streams", color="#2E8B57", density=True)
plt.legend(loc="upper left")


####you can add the other genres here if you wish

##modify this to control x-axis, possibly useful for high-variance data
plt.xlim([-20,20])

plt.title('Distribution of Streams by Genre')
plt.xlabel('Count')
plt.ylabel('Density')
plt.show()

If you want a specific colour in the "#000000" format I used in this example, you can search Google for "hex color picker".

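Modify the `alpha` argument if you want to change how opaque the colours appear. You can also play with `bins` in the example I provided, since that lets you improve the look when 50 is too many or too few.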
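To make the effect of the bin count a bit more concrete, here is a minimal pure-Python sketch of what a density-normalised histogram computes. The helper name `density_histogram` is hypothetical (it is not a matplotlib function), and it assumes equal-width bins over a fixed range; as far as I understand it, `density=True` does the same kind of scaling, dividing each count by the total count times the bin width.

```python
import random


def density_histogram(values, bins, lo, hi):
    """Return (edges, densities) for equal-width bins over [lo, hi].

    Hypothetical helper for illustration only; values outside
    [lo, hi] are ignored.
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi falls into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside [lo, hi]")
    # Dividing by total * width makes the bar areas sum to 1.
    densities = [c / (total * width) for c in counts]
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, densities


random.seed(0)
sample = [random.normalvariate(0, 5) for _ in range(10000)]

# Fewer bins give a smoother but coarser picture; more bins give a
# noisier one. The total area stays at 1 either way.
for n_bins in (10, 50, 200):
    edges, dens = density_histogram(sample, n_bins, -20, 20)
    bin_width = edges[1] - edges[0]
    area = sum(d * bin_width for d in dens)
    print(n_bins, round(area, 6), round(max(dens), 4))
```

The printed peak height gets noisier as the bin count grows, which is why it is worth trying a few values of `bins` on your own data.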

I hope this helps. Plotting in matplotlib can be a pain to learn, but it is definitely worth it.

User

Reply 2 · edited 2024-04-26 07:11:50

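To add to @Student240's answer, you can use the seaborn library, which makes it easy to fit a kernel density estimate (KDE). In other words, you get a smooth curve like the one in your question rather than a blocky histogram. This is done with the kdeplot function. A related plot type is distplot, which gives the KDE but also shows the histogram bins; note that distplot is deprecated in recent seaborn releases in favour of histplot and displot.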
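For anyone wondering what a kernel density estimate actually computes, here is a rough pure-Python sketch using a Gaussian kernel. The function names `gaussian_kde` and `scott_bandwidth` are my own for illustration (not seaborn's API), and the bandwidth uses Scott's rule of thumb, which I believe is also what seaborn uses by default; treat that as an assumption.

```python
import math
import random


def scott_bandwidth(sample):
    """Scott's rule of thumb for 1-D data: std * n ** (-1/5)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((s - mean) ** 2 for s in sample) / (n - 1)
    return math.sqrt(var) * n ** (-0.2)


def gaussian_kde(sample, bandwidth):
    """Return a function x -> estimated density at x.

    The estimate is the average of Gaussian bumps of width
    `bandwidth`, one centred on each observation.
    """
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


random.seed(1)
sample = [random.normalvariate(0, 5) for _ in range(500)]
kde = gaussian_kde(sample, scott_bandwidth(sample))

# Evaluate the smooth curve on a grid, as a plotting library would.
step = 0.5
grid = [-30 + step * i for i in range(121)]
curve = [kde(x) for x in grid]

# A density should integrate to roughly 1 over the grid.
area = sum(curve) * step
print(round(area, 3))
```

A larger bandwidth gives a smoother curve and a smaller one follows the data more closely, much like the bin count does for a histogram.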

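Another difference in my answer is that I use the explicit object-oriented approach in matplotlib/seaborn. This means declaring the figure and axes objects up front with plt.subplots(), rather than relying on the implicit approach of calling plt.hist directly. For details, see this really good tutorial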
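As a small sketch of the explicit style on its own (no seaborn), assuming a throwaway random sample rather than the genre data: the figure and axes are created first and every later call is a method on `ax`. The Agg backend line is only there so the snippet runs without a display.

```python
import random

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

random.seed(2)
values = [random.normalvariate(0, 5) for _ in range(1000)]

# Explicit style: create the figure and axes, then call methods on them.
fig, ax = plt.subplots(figsize=(8, 4))
ax.hist(values, bins=20, density=True, alpha=0.4, label="sample")
ax.set_xlabel("Streams")
ax.set_ylabel("Density")
ax.set_title("Explicit axes example")
ax.legend(loc="upper left")
```

Because everything hangs off `ax`, the same object can be passed to seaborn through its `ax=` argument. Call `fig.savefig(...)` or switch to an interactive backend and `plt.show()` to see the result.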

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

## This block of code is copied from Student240's answer:
import random 

categories = ['classical','hip-hop','indiepop','indierock','jazz'
          ,'metal','pop','rap','rock']

# NB I use a slightly different random variable assignment to introduce a bit more variety in my random numbers.
df = pd.DataFrame({'Type':[random.choice(categories) for _ in range(1000)],
              'stream':[random.normalvariate(i,random.randint(0,15)) for i in 
               range(1000)]})


###split the data into groups based on types
g = df.groupby('Type')

## From here things change as I make use of the seaborn library
classical = g.get_group('classical')
hiphop = g.get_group('hip-hop')
indiepop = g.get_group('indiepop')
indierock = g.get_group('indierock')
fig, ax = plt.subplots()

ax = sns.kdeplot(data=classical['stream'], label='classical streams', ax=ax)
ax = sns.kdeplot(data=hiphop['stream'], label='hiphop streams', ax=ax)
ax = sns.kdeplot(data=indiepop['stream'], label='indiepop streams', ax=ax)

# for this final one I use the fill option just to show how it is done
# (older seaborn versions called this parameter shade, which is now deprecated):
ax = sns.kdeplot(data=indierock['stream'], label='indierock streams', ax=ax, fill=True)

ax.set_xlabel('Count')
ax.set_ylabel('Density')
ax.set_title('KDE plot example from seaborn')
ax.legend()
plt.show()
