Creating a sparkline bar chart in a pandas DataFrame column

-1 votes
1 answer
74 views
Asked 2025-04-14 16:31

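I want to reproduce the image below in a pandas DataFrame in a Jupyter notebook. The "Orders in 2020" title is not needed. I found this page, https://github.com/crdietrich/sparklines/blob/master/Pandas%20Sparklines%20Demo.ipynb, which appears to add small charts to a DataFrame, but not bar charts. Any suggestions are much appreciated! The code further down displays small charts in a DataFrame column, but what I want is a stacked bar chart built from sample data like the following.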

import pandas as pd

# Example data
data = [[20, 10], [50, 15], [6, 14]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Orchid', 'Rose'])

# Print the DataFrame
print(df)

The output I want should look similar to the image below.

Sparkline Bar Chart Example

The sample code shows small charts in a column, but no stacked bar charts.

import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt
%matplotlib inline

import sparklines

# Create some data

density_func = 78
mean, var, skew, kurt = stats.chi.stats(density_func, moments='mvsk')
x_chi = np.linspace(stats.chi.ppf(0.01, density_func),
                    stats.chi.ppf(0.99, density_func), 100)
y_chi = stats.chi.pdf(x_chi, density_func)

x_expon = np.linspace(stats.expon.ppf(0.01), stats.expon.ppf(0.99), 100)
y_expon = stats.expon.pdf(x_expon)

a_gamma = 1.99
x_gamma = np.linspace(stats.gamma.ppf(0.01, a_gamma),
                      stats.gamma.ppf(0.99, a_gamma), 100)
y_gamma = stats.gamma.pdf(x_gamma, a_gamma)

n = 100

np.random.seed(0)  # keep generated data the same for git commit

data = [np.random.rand(n), 
        np.random.randn(n), 
        np.random.beta(2, 1, size=n), 
        np.random.binomial(3, 0.22, size=n),  # number of trials must be an integer
        np.random.exponential(size=n),
        np.random.geometric(0.5, size=n), 
        np.random.laplace(size=n), 
        y_chi, 
        y_expon, 
        y_gamma]

function = ['rand',
            'randn',
            'beta',
            'binomial',
            'exponential',
            'geometric',
            'laplace',
            'chi',
            'expon',
            'gamma']

df = pd.DataFrame(data)
df['function'] = function

df

# Define range of data to make sparklines

a = df.iloc[:, 0:100]  # .ix was removed in pandas 1.0; select the 100 data columns by position

# Output to new DataFrame of Sparklines

df_out = pd.DataFrame()
df_out['sparkline'] = sparklines.create(data=a)
sparklines.show(df_out[['sparkline']])

# Insert Sparklines into source DataFrame

df['sparkline'] = sparklines.create(data=a)
sparklines.show(df[['function', 'sparkline']])

# Detailed Formatting

df_out = pd.DataFrame()
df_out['sparkline'] = sparklines.create(data=a,
                                        color='#1b470a',
                                        fill_color='#99a894',
                                        fill_alpha=0.2,
                                        point_color='blue',
                                        point_fill='none',
                                        point_marker='*',
                                        point_size=3,
                                        figsize=(6, 0.25))
sparklines.show(df_out[['sparkline']])

# Example Data and Sparklines Layout

df_copy = df[['function', 'sparkline']].copy()

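# .ix was removed in pandas 1.0, so select by position with .iloc.
# Position 99 is the last data point; position 100 is the 'function' column.
df_copy['value'] = df.iloc[:, 99]

df_copy['change'] = df.iloc[:, 98] - df.iloc[:, 99]

df_copy['change_%'] = df_copy.change / df.iloc[:, 99]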

sparklines.show(df_copy)

1 Answer

0

You can do this with the Styler, using a somewhat hacky application of its bar method:

LCOLOR, RCOLOR = "#fb9c04", "#668ed2"

st = (
    df.assign(
        bar=df["Orchid"].fillna(df["Rose"]),
        tmp=df["Orchid"] + df["Rose"],
    ).style
    .set_caption("Orders in 2020") # optional
    .bar(subset=["bar", "tmp"], color=LCOLOR, axis=1)
    .set_table_styles(
        [
            { # optional
                "selector": "caption",
                "props": "font-size: large; font-weight: bold",
            },
            {
                "selector": "td.col2",
                "props": f"background-color: {RCOLOR}; width: 200px",
            },
            {
                "selector": "*",
                "props": "border: 7px solid white; background-color: white",
            },
        ],
    )
    .hide(subset="tmp", axis=1)
    .format("", na_rep="", subset="bar")
    .format("${:,.2f}", na_rep="No Data", subset=["Orchid", "Rose"])
    .format_index(lambda c: c if c != "bar" else "", axis=1)
    .map_index(lambda i: "font-weight: bold")
    .format_index(lambda i: i + 1)
)

Output (in the notebook):

[image]
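To make the "hacky" part a little more concrete, here is a small check of what the two helper columns `bar` and `tmp` hold before any styling is applied. It uses the same input data the answer lists. The `orchid_share` column is my own addition for illustration and is not part of the answer. As far as I understand `Styler.bar`, with `axis=1` and only positive values each bar is drawn relative to the largest value in the row, so the `bar` cell should end up filled to roughly Orchid / (Orchid + Rose). I have not checked how rows with a missing value are drawn.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Orchid": [
            3500.0, 4800.0, 1400.0, None, 7800.0, 6800.0,
            8500.0, 6200.0, 9000.0, 7300.0, 8300.0, 11300.0,
        ],
        "Rose": [
            3200.0, 2500.0, 3700.0, 5600.0, 8000.0, 3800.0,
            None, 7720.0, 8380.0, 9100.0, 9700.0, 10360.0,
        ],
    }
)

# Same helper columns as in the styling chain of the answer
helper = df.assign(
    bar=df["Orchid"].fillna(df["Rose"]),  # Orchid, or Rose where Orchid is missing
    tmp=df["Orchid"] + df["Rose"],        # row total, NaN if either value is missing
)

# Illustration only (not in the answer): Orchid's share of the row total
helper["orchid_share"] = df["Orchid"] / helper["tmp"]

print(helper.round(3))
```

Rows where either value is missing get a NaN total, so those are the rows worth inspecting in the rendered table.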

Input data used (df):

df = pd.DataFrame(
    {
        "Orchid": [
            3500.0, 4800.0, 1400.0, None, 7800.0, 6800.0,
            8500.0, 6200.0, 9000.0, 7300.0, 8300.0, 11300.0,
        ],
        "Rose": [
            3200.0, 2500.0, 3700.0, 5600.0, 8000.0, 3800.0,
            None, 7720.0, 8380.0, 9100.0, 9700.0, 10360.0,
        ],
    }
)
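If the Styler chain feels too fragile, a different route (not the answer's method, just a sketch) is to build the two-segment bar yourself as inline HTML and put it in a regular column. The helper name `stacked_bar_html` and the `width_px` parameter are hypothetical, the colors are reused from the answer, and the data is the small sample from the question. It assumes a missing value should count as zero, which may not be what you want.

```python
import pandas as pd

LCOLOR, RCOLOR = "#fb9c04", "#668ed2"


def stacked_bar_html(left, right, width_px=200):
    """Return an inline-HTML bar split into two colored segments.

    Hypothetical helper: missing values are treated as zero.
    """
    left = 0.0 if pd.isna(left) else float(left)
    right = 0.0 if pd.isna(right) else float(right)
    total = left + right
    if total == 0:
        return ""  # nothing to draw
    left_pct = 100 * left / total
    return (
        f'<div style="display:flex; width:{width_px}px; height:1em">'
        f'<div style="width:{left_pct:.1f}%; background:{LCOLOR}"></div>'
        f'<div style="width:{100 - left_pct:.1f}%; background:{RCOLOR}"></div>'
        "</div>"
    )


df = pd.DataFrame({"Orchid": [20, 50, 6], "Rose": [10, 15, 14]})
df["bar"] = [stacked_bar_html(o, r) for o, r in zip(df["Orchid"], df["Rose"])]

# escape=False keeps the <div> markup intact instead of escaping it
html = df.to_html(escape=False)
```

In a notebook, `from IPython.display import HTML; HTML(html)` should render the table with the bars. The trade-off is that you control the markup completely but lose the Styler's formatting helpers.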
