Seaborn: Barplot with a diverging norm / cmap

Posted 2024-05-19 00:42:08

I have a list of values ranging from -0.15 to 0.08:

array([-9.28024375e-03, -7.74566792e-03,  6.89222284e-02,  1.98236910e-03,
        1.05891798e-02,  7.36737261e-03,  6.25777898e-03, -1.78726642e-02,
       -5.06597295e-03,  5.17623104e-02, -1.13474442e-02,  1.06263056e-02,
        2.09431952e-03, -1.54730073e-02, -1.93402164e-02,  1.04915526e-02,
        2.04725155e-03,  2.65222141e-02,  1.43185909e-02, -3.73984434e-03,
        2.62798866e-02, -2.67092615e-02,  3.48239927e-02,  3.08109938e-03,
       -9.12865632e-03,  2.46767319e-03, -2.36669926e-02,  2.07367834e-02,
        3.06733189e-02, -5.56772675e-03, -2.40482345e-03, -4.24432795e-02,
       -3.79769064e-03,  2.51791666e-02,  2.32164137e-02, -1.74955467e-02,
        7.47313626e-03,  6.86957861e-03,  1.38965986e-02,  7.68997312e-05,
       -4.59857112e-03,  1.37564169e-02, -6.25312715e-03,  1.66797351e-02,
       -7.13480355e-03, -2.38543967e-02,  2.48704615e-02,  2.99393285e-02,
       -1.17281194e-03,  1.78675678e-03,  8.04761250e-03, -1.50505912e-01,
        8.25650062e-02])

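Each of these values corresponds to a specific part of a sentence. I am trying to visualize this in a bar plot. I want all negative values to range from orange -> red, and the positive values from yellow -> green or from blue -> dark blue. The threshold between the two should be 0.00.

I have tried using Matplotlib's DivergingNorm to create this and pass it to my Seaborn plot. Unfortunately, it does not seem to work at all, and I get the following: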

[Image: faulty saliency plot]

To create this plot, I have the following code:

import matplotlib.pyplot as plt
import matplotlib.colors as colors
import seaborn as sns

divnorm = colors.DivergingNorm(vmin=df["saliency"].values.min(), vcenter=0, vmax=df["saliency"].values.max())
div_colors = plt.cm.RdYlGn(divnorm(df["saliency"])) 
ax = sns.barplot(x='saliency', y='tokens', data=df, palette=div_colors, edgecolor='black')

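I am not sure what I am doing wrong, or whether I should be using a different scale, but any help is much appreciated.

Edit: updated with the dataframe values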

            tokens  saliency
0               Go -0.009280
1               to -0.007746
2         Walmart,  0.068922
3              get  0.001982
4            shot.  0.010589
5               Go  0.007367
6               to  0.006258
7                a -0.017873
8           garlic -0.005066
9        festival,  0.051762
10             get -0.011347
11           shot.  0.010626
12              Go  0.002094
13              to -0.015473
14               a -0.019340
15        concert,  0.010492
16             get  0.002047
17           shot.  0.026522
18              Go  0.014319
19              to -0.003740
20         church,  0.026280
21             get -0.026709
22           shot.  0.034824
23              Go  0.003081
24              to -0.009129
25             the  0.002468
26         movies, -0.023667
27             get  0.020737
28           shot.  0.030673
29              Go -0.005568
30              to -0.002405
31           work, -0.042443
32             get -0.003798
33           shot.  0.025179
34              Go  0.023216
35              to -0.017496
36        college,  0.007473
37             get  0.006870
38           shot.  0.013897
39              Go  0.000077
40              to -0.004599
41         school,  0.013756
42             get -0.006253
43           shot.  0.016680
44              Go -0.007135
45              to -0.023854
46             the  0.024870
47            bar,  0.029939
48             get -0.001173
49           shot.  0.001787
50  #ThisIsAmerica  0.008048
51         #ElPaso -0.150506
52         #Dayton  0.082565

1 answer
User
#1 · Posted 2024-05-19 00:42:08

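When a token occurs several times and so has several saliency values, seaborn's barplot takes the mean of those values. What is shown is therefore not the original saliencies but the mean for each token, together with an error bar.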
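The aggregation can be checked without plotting. The sketch below uses only the first twelve rows of the dataframe from the question and assumes pandas is available; `groupby(...).mean()` is used here as a stand-in for barplot's default mean estimator, not as seaborn's actual code path.

```python
import pandas as pd

# First twelve rows of the question's dataframe (values copied from the post).
df = pd.DataFrame({
    "tokens": ["Go", "to", "Walmart,", "get", "shot.", "Go",
               "to", "a", "garlic", "festival,", "get", "shot."],
    "saliency": [-0.009280, -0.007746, 0.068922, 0.001982, 0.010589, 0.007367,
                 0.006258, -0.017873, -0.005066, 0.051762, -0.011347, 0.010626],
})

# One mean per distinct token, in order of first appearance.
means = df.groupby("tokens", sort=False)["saliency"].mean()

print(len(df), "rows ->", len(means), "bars")
print(means)
```

Twelve rows collapse into eight bars here, so a color array computed per row has more entries than there are bars. That mismatch is a plausible reason the palette in the question does not line up with the plotted values.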

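To color the bars, you can first create the barplot and then, in a second pass, loop over the bars that were created and assign each one a color based on its width. (Note that in recent matplotlib versions, DivergingNorm has been renamed to TwoSlopeNorm.)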

import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns

saliency = np.array([-9.28024375e-03, -7.74566792e-03, 6.89222284e-02, 1.98236910e-03, 1.05891798e-02, 7.36737261e-03, 6.25777898e-03, -1.78726642e-02, -5.06597295e-03, 5.17623104e-02, -1.13474442e-02, 1.06263056e-02, 2.09431952e-03, -1.54730073e-02, -1.93402164e-02, 1.04915526e-02, 2.04725155e-03, 2.65222141e-02, 1.43185909e-02, -3.73984434e-03, 2.62798866e-02, -2.67092615e-02, 3.48239927e-02, 3.08109938e-03, -9.12865632e-03, 2.46767319e-03, -2.36669926e-02, 2.07367834e-02, 3.06733189e-02, -5.56772675e-03, -2.40482345e-03, -4.24432795e-02, -3.79769064e-03, 2.51791666e-02, 2.32164137e-02, -1.74955467e-02, 7.47313626e-03, 6.86957861e-03, 1.38965986e-02, 7.68997312e-05, -4.59857112e-03, 1.37564169e-02, -6.25312715e-03, 1.66797351e-02, -7.13480355e-03, -2.38543967e-02, 2.48704615e-02, 2.99393285e-02, -1.17281194e-03, 1.78675678e-03, 8.04761250e-03, -1.50505912e-01, 8.25650062e-02])
tokens = ['Go', 'to', 'Walmart,', 'get', 'shot.', 'Go', 'to', 'a', 'garlic', 'festival,', 'get', 'shot.', 'Go', 'to', 'a', 'concert,', 'get', 'shot.', 'Go', 'to', 'church,', 'get', 'shot.', 'Go', 'to', 'the', 'movies,', 'get', 'shot.', 'Go', 'to', 'work,', 'get', 'shot.', 'Go', 'to', 'college,', 'get', 'shot.', 'Go', 'to', 'school,', 'get', 'shot.', 'Go', 'to', 'the', 'bar,', 'get', 'shot.', '#ThisIsAmerica', '#ElPaso', '#Dayton']

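# barplot draws one bar per distinct token (the mean saliency of that token)
ax = sns.barplot(x=saliency, y=tokens, edgecolor='black')

# second pass: read back the width of each bar and color the bar by it
widths = np.array([bar.get_width() for bar in ax.containers[0]])
# center the norm at 0 so that negative and positive widths use separate halves of the colormap
divnorm = mpl.colors.TwoSlopeNorm(vmin=widths.min(), vcenter=0, vmax=widths.max())
div_colors = plt.cm.RdYlGn(divnorm(widths))
for bar, color in zip(ax.containers[0], div_colors):
    bar.set_facecolor(color)
plt.tight_layout()
plt.show()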

[Image: example plot]
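To see what the centered norm does, here is a minimal sketch that applies it to a few round numbers near the range stated in the question (not the actual bar widths). The `getattr` fallback to `DivergingNorm` is an assumption for older matplotlib installations and can be dropped on current versions.

```python
import numpy as np
from matplotlib import cm
import matplotlib.colors as mcolors

# Use TwoSlopeNorm where available, else the older DivergingNorm name.
NormClass = getattr(mcolors, "TwoSlopeNorm", None) or getattr(mcolors, "DivergingNorm")

# Illustrative bounds close to the range in the question.
norm = NormClass(vmin=-0.15, vcenter=0.0, vmax=0.08)

values = np.array([-0.15, -0.075, 0.0, 0.04, 0.08])
scaled = np.asarray(norm(values))  # approximately [0, 0.25, 0.5, 0.75, 1]

# Each scaled value indexes into the colormap and yields one RGBA row.
rgba = cm.RdYlGn(scaled)
print(scaled)
print(rgba.shape)
```

The negative side and the positive side are each stretched linearly onto half of the colormap, so 0 always lands in the middle even though the data range is not symmetric.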

Now, to get the colors from the question, you can create two linear colormaps, one for the negative values and one for the positive values:

# run this after creating the barplot and before plt.show()
widths = np.array([bar.get_width() for bar in ax.containers[0]])
neg_cmap = mpl.colors.LinearSegmentedColormap.from_list('', ['orange', 'red'])
pos_cmap = mpl.colors.LinearSegmentedColormap.from_list('', ['yellow', 'green'])
min_width, max_width = widths.min(), widths.max()
for bar, w in zip(ax.containers[0], widths):
    # both ratios lie in [0, 1]: 0 at zero width, 1 at the most extreme bar
    bar.set_facecolor(neg_cmap(w / min_width) if w < 0 else pos_cmap(w / max_width))

[Image: using LinearSegmentedColormap]
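The same mapping can be tried without a plot. In the sketch below, `saliency_colors` is a hypothetical helper name, and the guard for a non-positive maximum is an added assumption that the answer's code does not include; the four input values are taken from the question's dataframe.

```python
import numpy as np
from matplotlib.colors import LinearSegmentedColormap

neg_cmap = LinearSegmentedColormap.from_list('', ['orange', 'red'])
pos_cmap = LinearSegmentedColormap.from_list('', ['yellow', 'green'])

def saliency_colors(values):
    """Hypothetical helper: one RGBA tuple per value, split at zero."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    colors = []
    for v in values:
        if v < 0:
            # v < 0 implies lo < 0, so the ratio is in (0, 1]
            colors.append(neg_cmap(v / lo))
        else:
            # assumption: fall back to the start of the colormap if hi <= 0
            colors.append(pos_cmap(v / hi if hi > 0 else 0.0))
    return colors

rgba = saliency_colors([-0.150506, -0.001173, 0.000077, 0.082565])
for color in rgba:
    print(color)
```

The most negative value comes out red and the largest positive value green, while values close to zero stay near orange or yellow depending on their sign.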
