How to create bar chart with geomean, mean, max and min from data using Plotnine


How can I create a bar chart showing the gmean (geometric mean), mean, max and min of Y for each category in X? For the data below:

X B Y
A1 b1 4
A1 b2 2
A1 b3 3
A1 b4 8
A2 b1 7
A2 c1 10
A2 c2 8
A2 b3 7
A3 b4 10
A3 b5 9
A3 b1 4
A3 b3 1

The chart should look like this: [image]

There is 1 answer

Answer by has2k1 (best answer)

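You need to prepare the data you want to visualise first: calculate the aggregates, then reshape them to long form (one row per X and aggregate) so that the aggregate can be mapped to the fill colour.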

import pandas as pd
from plotnine import ggplot, aes, geom_col
from scipy.stats import gmean
from pandas.api.types import CategoricalDtype

# Original Data
df = pd.DataFrame({
    "X": sorted(("A1", "A2", "A3") * 4),
    "Y": [4, 2, 3, 8, 7, 10, 8, 7, 10, 9, 4, 1]
})

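# Calculate the aggregates: one row per X, one column per statistic
df2 = (df.groupby("X")
 .agg({"Y": [gmean, "mean", "max", "min"]})
 # Move the column levels into the index to get long form
 .unstack()
 .reset_index()
 # The unnamed statistic level becomes "level_1"; the values land in column 0
 .rename(columns={0: "value", "level_1": "agg"})
)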

# Order the aggregates
df2["agg"] = df2["agg"].astype(CategoricalDtype(["gmean", "mean", "max", "min"]))

(ggplot(df2, aes("X", "value", fill="agg"))
 + geom_col(position="dodge")
)
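
The reshaping step above can also be written with named aggregation and melt(), which avoids the generic level_1 and 0 column names. This is only a sketch of an alternative, not the answer's method: the helper geometric_mean and the names wide and long_df are made up for illustration, and the geometric mean is computed with NumPy as exp(mean(log(y))), which assumes every Y is positive.

```python
import numpy as np
import pandas as pd

# Same data as in the answer
df = pd.DataFrame({
    "X": sorted(("A1", "A2", "A3") * 4),
    "Y": [4, 2, 3, 8, 7, 10, 8, 7, 10, 9, 4, 1],
})


def geometric_mean(values):
    """Hypothetical helper: exp(mean(log(values))); assumes all values > 0."""
    return float(np.exp(np.log(values).mean()))


# Named aggregation gives one flat, explicitly named column per statistic
wide = (
    df.groupby("X")["Y"]
    .agg(gmean=geometric_mean, mean="mean", max="max", min="min")
    .reset_index()
)

# melt() stacks the statistic columns into long form:
# one row per (X, agg) pair, with the number in "value"
long_df = wide.melt(id_vars="X", var_name="agg", value_name="value")

# Fix the order of the bars within each group
long_df["agg"] = pd.Categorical(
    long_df["agg"], categories=["gmean", "mean", "max", "min"]
)

print(long_df)
```

Since long_df has the same X, agg and value columns as df2, it should work with the same ggplot(...) + geom_col(position="dodge") call.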
