Plotting non-overlapping scatter plot labels with matplotlib

Asked 2025-04-18 18:38

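I have a scatter plot with a lot of points. Each point has an associated string (of varying length) that I would like to use as a label, but not all of the labels fit. So I want to go through the data points in decreasing order of importance and add a label only if it would not overlap an existing label. The strings all have different lengths. One commenter mentioned that the optimal solution could be found by treating this as a knapsack problem. In my case, a greedy algorithm (always label the most important remaining point whose label does not overlap anything) is a good start and may well be enough.

Here is a simple example. Can I get Python to label only the points whose labels would not overlap?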

import matplotlib.pyplot as plt
import numpy as np

npoints = 100
xs = np.random.rand(npoints)
ys = np.random.rand(npoints)

plt.scatter(xs, ys)

labels = iter(dir(np))
for x, y in zip(xs, ys):
    # Ideally I'd condition the next line on whether or not the new label would overlap with an existing one
    plt.annotate(next(labels), xy=(x, y))
plt.show()

2 Answers

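An addendum: to get my code to work, I had to pass an extra renderer=fig.canvas.get_renderer() argument to get_window_extent() instead of relying on the default get_window_extent(renderer=None). I suspect that whether this extra argument is needed depends on the operating system. https://github.com/matplotlib/matplotlib/issues/10874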
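A minimal sketch of the workaround described above, assuming an Agg-based canvas (which provides get_renderer()). The single point and the label text are made up for illustration, and other backends may behave differently.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless Agg backend; drop this line to use your default backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([0.5], [0.5])
label = ax.annotate("example", xy=(0.5, 0.5))  # illustrative label

# Render once so the text has been laid out before it is measured
fig.canvas.draw()

# Pass the renderer explicitly instead of relying on renderer=None
renderer = fig.canvas.get_renderer()
bbox = label.get_window_extent(renderer=renderer)

# Bounding box of the label in display (pixel) coordinates
print(bbox.x0, bbox.y0, bbox.x1, bbox.y1)
```

If the call without a renderer already works on your setup, passing one explicitly should be harmless.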

Answered 2025-04-18 by Python大师

You can draw all of the annotations first, then use a mask array to check whether they overlap, and call set_visible() to hide the ones that do. Here is an example:

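import math
import random
import string

import matplotlib.pyplot as plt
import numpy as np

random.seed(0)
np.random.seed(0)
n = 100
# Random labels of 4 to 10 letters, one per point
labels = ["".join(random.sample(string.ascii_letters, random.randint(4, 10))) for _ in range(n)]
x, y = np.random.randn(2, n)

fig, ax = plt.subplots()

ax.scatter(x, y)

# Draw every annotation first; overlapping ones are hidden afterwards
ann = []
for i in range(n):
    ann.append(ax.annotate(labels[i], xy=(x[i], y[i])))

# One boolean per canvas pixel, indexed as mask[x, y]; True means a label already covers it
mask = np.zeros(fig.canvas.get_width_height(), bool)

# Render once so the text bounding boxes are known
fig.canvas.draw()

for a in ann:
    # On some setups this may need renderer=fig.canvas.get_renderer()
    bbox = a.get_window_extent()
    # Pixel bounds of the label, clamped at 0 so negative indices cannot wrap around
    x0 = max(int(bbox.x0), 0)
    x1 = max(int(math.ceil(bbox.x1)), 0)
    y0 = max(int(bbox.y0), 0)
    y1 = max(int(math.ceil(bbox.y1)), 0)

    s = np.s_[x0:x1+1, y0:y1+1]
    if np.any(mask[s]):
        # Overlaps an earlier label: hide it
        a.set_visible(False)
    else:
        # Free space: keep the label and mark its pixels as occupied
        mask[s] = True

plt.show()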

Output: [image not included]
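The question asks for labels to be considered in decreasing order of importance, which the loop above does not do by itself. The following is a rough sketch of that greedy selection on plain rectangles, not part of the original answer: the function names, the (x0, y0, x1, y1) box format, and the importance scores are all assumptions made for illustration.

```python
def boxes_overlap(a, b):
    """Return True if two (x0, y0, x1, y1) rectangles share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def greedy_label_selection(boxes, importance):
    """Pick labels greedily, most important first, skipping any that overlap a kept one.

    Returns the indices of the kept boxes in the order they were accepted.
    """
    order = sorted(range(len(boxes)), key=lambda i: importance[i], reverse=True)
    kept = []
    for i in order:
        if all(not boxes_overlap(boxes[i], boxes[j]) for j in kept):
            kept.append(i)
    return kept


# Hypothetical label boxes in pixel coordinates and made-up importance scores
boxes = [(0, 0, 10, 5), (8, 2, 20, 7), (30, 0, 40, 5)]
importance = [1.0, 3.0, 2.0]

kept = greedy_label_selection(boxes, importance)
print(kept)  # box 0 overlaps the more important box 1, so it is dropped
```

With the annotations from the answer, the boxes could be built from each bounding box's x0, y0, x1 and y1 after fig.canvas.draw(), and set_visible(False) called on every annotation whose index is not in the returned list. This compares every candidate against every kept box, which is quadratic in the worst case but should be fine for a few hundred labels.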

Answered 2025-04-18 by Python大师
