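Dynamic label showing the number of points in an Altair scatter plot filtered by a selection?
TL;DR: Is there a way to get a dynamic count of the points shown in a filtered chart in Altair?
In Altair's "Filtering by Selection" example, a scatter plot is updated based on the selection made in a linked bar chart.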
import altair as alt
from vega_datasets import data

cars = data.cars()
click = alt.selection_point(encodings=['color'])

scatter = alt.Chart(cars).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
    color='Origin:N'
).transform_filter(
    click
)

hist = alt.Chart(cars).mark_bar().encode(
    x='count()',
    y='Origin',
    color=alt.condition(click, 'Origin', alt.value('lightgray'))
).add_params(
    click
)

scatter & hist
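However, it would be useful to show how many data points remain in the filtered scatter plot. How can this be done?
It is possible to get a static count of the data in the data frame and use it as a label. I tried a different approach, for example: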
import altair as alt
import pandas as pd

# Sample data
data = pd.DataFrame({'values': [1, 2, 3, 4, 5, 1, 2, 3]})
total_count = data['values'].sum()

# Create data frame for total count
count_data = pd.DataFrame({'label': ['Total Count'], 'count': [total_count]})

# Combine data
combined_data = pd.concat([data, count_data])

# Create histogram with text for total count
hist = alt.Chart(combined_data).mark_bar().encode(
    x='values:Q',
    y='count()',
)

text = hist.mark_text(
    color='black',
    baseline='middle'
).encode(
    x=alt.value(200),  # Adjust position as needed
    y=alt.value(40),
    text='count:Q',
)

# Display the chart
hist + text
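This adds a label that does not look very good:
Is there a way to add a dynamic label showing the number of points present in the filtered scatter plot? An external element would also be fine. I am still new to Altair, and despite some searching I have not found a solution.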
1 Answer
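Show a dynamic count of the filtered data in Altair using transform_aggregate
To show how many data points remain on a chart after filtering, combine transform_aggregate, which computes a count, with a text mark.
Add the aggregate transform to the chart you are filtering, for example filtered_plot.transform_aggregate(count='count(*)'). You can then apply a text mark to the result, as you would for any other label.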
Example using the iris sepal data
Note: adding the count text adds an "undefined" entry to the legend, and I have not found a way to remove it.
Code:
import altair as alt
from vega_datasets import data  # Example dataset library

# Sample data - iris sepal length
source = data('iris')

# Create the scatterplot - eg length by width, coloured by species
scatterplot = alt.Chart(source).mark_point().encode(
    x='sepalLength:Q',
    y='sepalWidth:Q',
    color='species:N'
)

# Create a selection to filter points
selection = alt.selection_point(encodings=['color'], resolve='global')

# Bar chart of counts per species; clicking a bar drives the selection
countplot = alt.Chart(source).mark_bar().encode(
    y='species:N',
    x='count()',
    color=alt.condition(selection, 'species:N', alt.ColorValue('gray'))
).add_params(selection)

# Filter scatterplot based on selection
filtered_scatterplot = scatterplot.transform_filter(selection)

# Calculate count of filtered points
filtered_count = filtered_scatterplot.transform_aggregate(
    count='count(*)')

# Display count as text
text = filtered_count.mark_text(
    color='black',
    fontSize=14,
    baseline='middle'
).encode(
    x=alt.value(100),  # Adjust position as needed
    y=alt.value(10),
    text='count:Q',
)

# Display the chart with linked selection and count
filtered_scatterplot & text & countplot