How to Plot Statistically Significant Letters on Bar Plots Using Tukey Test Results in Python

6 minute read

Published: August 15, 2024

This post shows how to create significance letters from a post hoc test and put them on a bar plot. It is also a good exercise if you want to learn more Python functionality.

Getting started

Have you ever wondered how to add letters on top of a bar plot to represent statistically significant relationships between groups using Python? In this tutorial, I’ll guide you step by step through the process. We’ll keep things simple, using only a few essential libraries like statsmodels, along with Python’s built-in dictionaries and list comprehensions. No need for heavy external libraries—just straightforward Python code!

First we need to generate some random data. For this I am going to use Python's built-in random module together with NumPy.

The code below randomly generates car brand names and some hypothetical prices, and makes sure that some brands are statistically different from each other.

import random

import numpy as np
import pandas as pd

# Define the list of car names and their respective price ranges (mean and std deviation)
car_names = ["toyota", "mercedes", "mazda", "chevy", "ram"]
car_price_stats = {
    "toyota": {"mean": 7000, "std": 500},
    "mercedes": {"mean": 9000, "std": 800},
    "mazda": {"mean": 6500, "std": 400},
    "chevy": {"mean": 7500, "std": 600},
    "ram": {"mean": 8000, "std": 700},
}

# Set the number of rows for the DataFrame
num_rows = 1000

# Generate random data
random_car_names = [random.choice(car_names) for _ in range(num_rows)]
random_avg_prices = [
    int(np.clip(
        np.random.normal(car_price_stats[car]["mean"], car_price_stats[car]["std"]),
        5000,
        10000,
    ))
    for car in random_car_names
]

# Create the DataFrame
data = {"car_name": random_car_names, "avg_price": random_avg_prices}
df = pd.DataFrame(data)

# Display the DataFrame
print(df)

If we call df.head() instead, we see only the first five rows.

Now, we want to determine whether there is a statistically significant difference in price between the cars.

Before doing that, we can get some descriptive statistics using researchpy. It gives a clean overview of what is happening in the data before we jump into any analysis.
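The researchpy call was shown as an image in the original post and is not reproduced here. As a rough stand-in, a pandas-only sketch can produce a similar per-brand table. The small `df` below is made-up data (an assumption) that reuses the tutorial's column names, and `describe()` is a swapped-in technique rather than the researchpy function.

```python
import numpy as np
import pandas as pd

# Small reproducible stand-in for the tutorial's df (values are made up).
rng = np.random.default_rng(0)
brand_means = {"toyota": 7000, "mercedes": 9000, "mazda": 6500}
df = pd.DataFrame({
    "car_name": np.repeat(list(brand_means), 50),
    "avg_price": np.concatenate(
        [rng.normal(mean, 300, 50) for mean in brand_means.values()]
    ),
})

# Count, mean, standard deviation and quartiles of the price for each brand.
summary = df.groupby("car_name")["avg_price"].describe()
print(summary.round(1))
```

Each row of `summary` is one brand, so a quick look at the `mean` and `std` columns already hints at which brands may differ.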


Now we can run an ANOVA to see whether the brands are statistically different. ANOVA tells us that at least one group differs from another, but it does not tell us which one.
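The ANOVA code was also an image in the original post. One possible way to run a one-way ANOVA is `scipy.stats.f_oneway`; this is a sketch on made-up data (an assumption), and it may not be the same call the original notebook used.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Small reproducible stand-in for the tutorial's df (values are made up).
rng = np.random.default_rng(0)
brand_means = {"toyota": 7000, "mercedes": 9000, "mazda": 6500}
df = pd.DataFrame({
    "car_name": np.repeat(list(brand_means), 50),
    "avg_price": np.concatenate(
        [rng.normal(mean, 300, 50) for mean in brand_means.values()]
    ),
})

# One array of prices per brand, then a one-way ANOVA across those arrays.
samples = [prices.to_numpy() for _, prices in df.groupby("car_name")["avg_price"]]
f_stat, p_value = stats.f_oneway(*samples)
print(f"F = {f_stat:.1f}, p = {p_value:.3g}")
```

A p-value below the chosen alpha (0.05 here) says that at least one brand mean differs, which is the cue to continue with a post hoc test.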


From the model's p-value, we see that the brands are statistically different. Now we can go ahead and run a post hoc test to find out which brands differ from each other.
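A minimal sketch of the post hoc step, assuming statsmodels' `pairwise_tukeyhsd` (the tutorial later uses a `tukey_car` object with a `_results_table`, which matches what this function returns). The data are made up for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Small reproducible stand-in for the tutorial's df (values are made up).
rng = np.random.default_rng(0)
brand_means = {"toyota": 7000, "mercedes": 9000, "mazda": 6500}
df = pd.DataFrame({
    "car_name": np.repeat(list(brand_means), 50),
    "avg_price": np.concatenate(
        [rng.normal(mean, 300, 50) for mean in brand_means.values()]
    ),
})

# Tukey HSD compares every pair of brands while controlling the family-wise error rate.
tukey_car = pairwise_tukeyhsd(endog=df["avg_price"], groups=df["car_name"], alpha=0.05)
print(tukey_car.summary())
```

The summary has one row per pair of brands, with the adjusted p-value in the `p-adj` column and a `reject` flag for significance.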


We observe that Chevy and Mazda are not statistically different, while the rest of the car brands show significant differences. Therefore, Chevy and Mazda should share the same letter, distinguishing them from the other car brands.

We can also use the Tukey plot to see the differences.
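The plot itself was an image in the original post. A sketch of one way to draw it, assuming the Tukey result comes from statsmodels and using its `plot_simultaneous` method; the data and axis labels are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import numpy as np
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Small reproducible stand-in for the tutorial's df (values are made up).
rng = np.random.default_rng(0)
brand_means = {"toyota": 7000, "mercedes": 9000, "mazda": 6500}
df = pd.DataFrame({
    "car_name": np.repeat(list(brand_means), 50),
    "avg_price": np.concatenate(
        [rng.normal(mean, 300, 50) for mean in brand_means.values()]
    ),
})

tukey_car = pairwise_tukeyhsd(endog=df["avg_price"], groups=df["car_name"], alpha=0.05)

# One horizontal confidence interval per brand; non-overlapping intervals differ.
fig = tukey_car.plot_simultaneous(xlabel="Average price", ylabel="Car name")
```

In a notebook the figure is displayed automatically; in a script, call `fig.savefig(...)` or `plt.show()` as needed.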


# Convert the Tukey summary table into a DataFrame (the first row holds the column names)
df_tukey_car = pd.DataFrame(
    data=tukey_car._results_table.data[1:],
    columns=tukey_car._results_table.data[0],
)

Now that the Tukey results are in a DataFrame, we can extract the relevant information from it to generate the letters.

import string

import pandas as pd


def letters(df, alpha=0.05):
    """Turn a Tukey results DataFrame into a {group: letters} dictionary."""
    df["p-adj"] = df["p-adj"].astype(float)

    # Creating a list of the different treatment groups from Tukey's
    group1 = set(df.group1.tolist())  # Dropping duplicates by creating a set
    group2 = set(df.group2.tolist())  # Dropping duplicates by creating a set
    groupSet = group1 | group2  # Set operation that creates a union of 2 sets
    groups = list(groupSet)  # removed sorted from here

    # Creating lists of letters that will be assigned to treatment groups
    letters = list(string.ascii_lowercase)[:len(groups)]
    cldgroups = letters

    # The following algorithm is a simplification of the classical cld.
    # Column 1: provisional letter, column 2: letters of non-different groups,
    # column 3: letters of significantly different groups.
    cld = pd.DataFrame(list(zip(groups, letters, cldgroups)))
    cld[3] = ""

    for row in df.itertuples():
        if df["p-adj"][row[0]] > alpha:
            cld.iat[groups.index(df["group1"][row[0]]), 2] += cld.iat[groups.index(df["group2"][row[0]]), 1]
            cld.iat[groups.index(df["group2"][row[0]]), 2] += cld.iat[groups.index(df["group1"][row[0]]), 1]

        if df["p-adj"][row[0]] < alpha:
            cld.iat[groups.index(df["group1"][row[0]]), 3] += cld.iat[groups.index(df["group2"][row[0]]), 1]
            cld.iat[groups.index(df["group2"][row[0]]), 3] += cld.iat[groups.index(df["group1"][row[0]]), 1]

    cld[2] = cld[2].apply(lambda x: "".join(sorted(x)))
    cld[3] = cld[3].apply(lambda x: "".join(sorted(x)))
    cld.rename(columns={0: "groups"}, inplace=True)

    # This part will reassign the final name to the group.
    # For sure there are more elegant ways of doing this.
    cld = cld.sort_values(cld.columns[2], key=lambda x: x.str.len())
    cld["labels"] = ""
    letters = list(string.ascii_lowercase)
    unique = []
    for item in cld[2]:

        # Collect every letter that has already been used in a label
        for fitem in cld["labels"].unique():
            for c in range(0, len(fitem)):
                if not set(unique).issuperset(set(fitem[c])):
                    unique.append(fitem[c])
        g = len(unique)

        for kitem in cld[1]:
            if kitem in item:
                # .loc with a row mask and a column name avoids chained assignment
                if cld["labels"].loc[cld[1] == kitem].iloc[0] == "":
                    cld.loc[cld[1] == kitem, "labels"] += letters[g]

                # Checking if there are forbidden pairings (proposition of solution to the imperfect script)
                if kitem in ' '.join(cld[3][cld["labels"] == letters[g]]):
                    g = len(unique) + 1

                # Checking if columns 1 & 2 of cld share at least 1 letter
                if len(set(cld["labels"].loc[cld[1] == kitem].iloc[0]).intersection(cld.loc[cld[2] == item, "labels"].iloc[0])) <= 0:
                    if letters[g] not in list(cld["labels"].loc[cld[1] == kitem].iloc[0]):
                        cld.loc[cld[1] == kitem, "labels"] += letters[g]
                    if letters[g] not in list(cld["labels"].loc[cld[2] == item].iloc[0]):
                        cld.loc[cld[2] == item, "labels"] += letters[g]

    cld = cld.sort_values("labels")

    cld.drop(columns=[1, 2, 3], inplace=True)
    cld = dict(zip(cld["groups"], cld["labels"]))

    return cld

This function takes the Tukey DataFrame that we generated as input and returns a dictionary whose keys are the car names and whose values are the letters. Store the result as group_labels (for example, group_labels = letters(df_tukey_car)), since the plotting code uses that name.
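A quick way to sanity-check the returned labels is the defining property of a compact letter display: two groups share a letter exactly when they are not significantly different. The sketch below uses a different, brute-force technique (one letter per maximal set of mutually non-different groups) and is not the function above. `clique_letters` and its arguments are hypothetical names, and the single non-significant pair is taken from the Chevy and Mazda result reported earlier.

```python
from itertools import combinations
import string


def clique_letters(groups, not_different):
    """Assign one letter to each maximal set of mutually non-different groups.

    groups: list of group names.
    not_different: set of frozenset pairs that are NOT significantly different.
    """
    def is_clique(members):
        # Every pair inside the candidate set must be non-different.
        return all(frozenset(pair) in not_different
                   for pair in combinations(members, 2))

    # Go from the largest candidate sets to the smallest and keep a set only
    # if it is not already contained in a set we kept earlier.
    maximal = []
    for size in range(len(groups), 0, -1):
        for members in combinations(groups, size):
            if is_clique(members) and not any(set(members) <= m for m in maximal):
                maximal.append(set(members))

    labels = {g: "" for g in groups}
    for letter, members in zip(string.ascii_lowercase, maximal):
        for g in members:
            labels[g] += letter
    return labels


groups = ["chevy", "mazda", "mercedes", "ram", "toyota"]
not_different = {frozenset(("chevy", "mazda"))}
labels = clique_letters(groups, not_different)
print(labels)
```

The brute-force search is only practical for a handful of groups, and the letters follow the order in which sets are found rather than the order of the group means, so compare which groups share letters, not the letters themselves.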


Now that we have generated the letters, we can plot them on a bar chart. Before that, we need the mean and standard error of the price for each brand.
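The code for this step was an image in the original post. A minimal sketch, assuming the column names that the plotting code expects (`car_name`, `mean`, `sem`) and using made-up data:

```python
import numpy as np
import pandas as pd

# Small reproducible stand-in for the tutorial's df (values are made up).
rng = np.random.default_rng(0)
brand_means = {"toyota": 7000, "mercedes": 9000, "mazda": 6500}
df = pd.DataFrame({
    "car_name": np.repeat(list(brand_means), 50),
    "avg_price": np.concatenate(
        [rng.normal(mean, 300, 50) for mean in brand_means.values()]
    ),
})

# Mean and standard error of the mean per brand, with car_name as a regular column.
df_plot_car = (
    df.groupby("car_name")["avg_price"]
    .agg(["mean", "sem"])
    .reset_index()
)
print(df_plot_car)
```

Calling `reset_index()` keeps `car_name` as an ordinary column, which makes it easy to pass to the bar plot and to look up each brand's letter.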


Now that we have the data and the matching letters, we can plot them together using this code block:

import matplotlib.pyplot as plt

plt.figure(figsize=(12, 10), dpi=200)
error = df_plot_car['sem']  # standard error of the mean for each bar
custom_letters = group_labels

# Create the bar plot
bars = plt.bar(df_plot_car['car_name'], df_plot_car['mean'], yerr=error, capsize=5)

# Add annotations above bars
for bar, car_name in zip(bars, df_plot_car['car_name']):
    height = bar.get_height()
    plt.annotate(
        custom_letters[car_name],
        xy=(bar.get_x() + bar.get_width() / 2, height + 0.8),
        xytext=(0, 5),  # 5 points vertical offset
        textcoords="offset points",
        ha='center',
        va='bottom',
    )

# Set x-ticks with rotation
plt.xticks(
    ticks=range(len(df_plot_car['car_name'])),
    labels=df_plot_car['car_name'],
    rotation=45,
    ha='right',
)

# Add labels and title with larger font sizes and spacing
plt.xlabel('Car Names', fontsize=14)
plt.ylabel('Average Price in USD', fontsize=14)
plt.title('Average Price By Different Car Names', fontsize=16, pad=20)  # Increased title font size and added padding

# Adjust layout for better spacing
plt.tight_layout()
plt.show()

That completes the task. There are other ways to achieve this, such as the compact letter display implementations in other libraries. However, I find this approach the most reliable because it avoids compatibility issues between libraries.

Thanks for reading until the end. Here is a Google Colab notebook with all the code for this tutorial:

https://colab.research.google.com/drive/15agod0EqGeH3v9uVVIn8fq_CnhdH6huf?usp=sharing
