Matplotlib Box Plots
Box plots, also known as box-and-whisker plots, are a graphical representation of statistical data based on the minimum, first quartile, median, third quartile, and maximum. They are useful for highlighting the central tendency, dispersion, and skewness of the data, as well as identifying outliers. In this article, we will explore how to create box plots using Matplotlib, a comprehensive library for creating static, animated, and interactive visualizations in Python.
Introduction to Box Plots
A box plot displays the five-number summary of a set of data: the minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. These components are crucial for understanding the distribution of data. The box represents the interquartile range (IQR), which is the distance between the first and third quartiles. The line inside the box shows the median of the data. In Matplotlib, each whisker by default extends from the box to the most extreme data point that lies within 1.5 times the IQR of the box edge, and points beyond the whiskers are drawn individually as outliers (Matplotlib calls them fliers). The whisker ends therefore coincide with the minimum and maximum only when the data contains no outliers.
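To make these quantities concrete, the following sketch computes them with NumPy for a small hand-picked sample. The sample values and variable names are illustrative assumptions, and the fences use the 1.5 times IQR rule that Matplotlib applies by default.

```python
import numpy as np

# Small fixed sample (illustrative) so the results can be checked by hand.
data = np.array([2, 4, 4, 5, 7, 8, 9, 11, 12])

# Five-number summary.
minimum, maximum = data.min(), data.max()
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Default whisker rule (whis=1.5): whiskers stop at the most extreme
# data points that still fall inside these fences.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]

print("min, Q1, median, Q3, max:", minimum, q1, median, q3, maximum)
print("IQR:", iqr)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

For this sample the quartiles are 4, 7, and 9, so the IQR is 5 and no point falls outside the fences.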
Creating a Basic Box Plot
Let’s start with a basic example: a single box plot of 100 values drawn from a standard normal distribution.
import matplotlib.pyplot as plt
import numpy as np
data = np.random.normal(loc=0, scale=1, size=100)
plt.boxplot(data)
plt.title("Basic Box Plot - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-1.png)
Customizing Box Plots
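Matplotlib allows for extensive customization of box plots. You can change the properties of the boxes, whiskers, caps, medians, and fliers (outliers) by passing a dictionary of style settings to the boxprops, whiskerprops, capprops, medianprops, and flierprops parameters.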
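Besides the keyword arguments, boxplot also returns a dictionary that maps each component name to the list of artists drawn for it, so styling can be applied after the plot exists. The sketch below is one way to inspect and use that dictionary; the variable name parts and the seeded sample data are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)          # seeded so the example is reproducible
data = rng.normal(size=(50, 2))         # two columns -> two boxes

fig, ax = plt.subplots()
parts = ax.boxplot(data)

# Each key ('boxes', 'whiskers', 'caps', 'medians', 'fliers', 'means')
# maps to a list of artists.
for name, artists in parts.items():
    print(name, len(artists))

# Restyle the median lines after the plot has been drawn.
for median_line in parts["medians"]:
    median_line.set(color="red", linewidth=2)

ax.set_title("Styling via the returned dictionary")
plt.show()
```

Each box has two whiskers and two caps, which is why those lists are twice as long as the list of boxes.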
Changing Box Properties
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(10, 2) * 100
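# patch_artist=True draws filled boxes; use edgecolor rather than color,
# because color would override the fill as well as the outline
plt.boxplot(data, patch_artist=True, boxprops=dict(facecolor="cyan", edgecolor="blue"))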
plt.title("Custom Box Properties - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-2.png)
Modifying Whisker Properties
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(10, 3) * 100
plt.boxplot(data, whiskerprops=dict(color="green", linewidth=2))
plt.title("Custom Whisker Properties - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-3.png)
Adjusting Median Properties
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(10, 4) * 100
plt.boxplot(data, medianprops=dict(color="red", linewidth=3))
plt.title("Custom Median Properties - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-4.png)
Customizing Flier (Outlier) Properties
import matplotlib.pyplot as plt
import numpy as np
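data = np.random.rand(10, 5) * 100
data[0] = 300  # one extreme value per column, so every box has a flier to style
# Flier markers take their colors from markerfacecolor and markeredgecolor
plt.boxplot(data, flierprops=dict(marker='o', markerfacecolor='yellow', markeredgecolor='black', markersize=12))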
plt.title("Custom Flier Properties - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-5.png)
Horizontal Box Plots
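Box plots can be oriented horizontally by setting the vert parameter to False. Newer Matplotlib releases (3.10 onward) also accept orientation="horizontal", which is intended to replace vert.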
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(10, 2) * 100
plt.boxplot(data, vert=False)
plt.title("Horizontal Box Plot - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-6.png)
Multiple Box Plots
You can display multiple box plots side-by-side to compare different datasets. Pass a list of arrays, and Matplotlib draws one box per array.
import matplotlib.pyplot as plt
import numpy as np
data1 = np.random.normal(0, 1, 100)
data2 = np.random.normal(1, 1.5, 100)
data3 = np.random.normal(2, 2, 100)
data = [data1, data2, data3]
plt.boxplot(data)
plt.title("Multiple Box Plots - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-7.png)
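By default the boxes are labelled 1, 2, 3 along the axis. A possible refinement, sketched below, names each box and marks its mean with showmeans=True. The group names are hypothetical, and the tick labels are set through the Axes object rather than a boxplot argument so that the code does not depend on a particular Matplotlib version.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)  # seeded so the figure is reproducible
data = [
    rng.normal(0, 1, 100),
    rng.normal(1, 1.5, 100),
    rng.normal(2, 2, 100),
]
names = ["Group A", "Group B", "Group C"]  # hypothetical labels

fig, ax = plt.subplots()
parts = ax.boxplot(data, showmeans=True)  # adds a marker at each mean

# Boxes sit at positions 1..n by default, so label those positions.
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(names)
ax.set_ylabel("Value")
ax.set_title("Labelled box plots with means")
plt.show()
```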
Box Plots with Custom Fill Colors
You can customize the fill colors of box plots to enhance their visual appeal. Filled boxes require patch_artist=True; the fill color is then set with facecolor in boxprops.
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(10, 3) * 100
plt.boxplot(data, patch_artist=True, boxprops=dict(facecolor="lightgreen"))
plt.title("Box Plots with Custom Fill Colors - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-8.png)
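The boxprops approach gives every box the same fill. To give each box its own color, one option is to loop over the box artists in the dictionary that boxplot returns, as in this sketch. The color list and variable names are illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
data = rng.random((10, 3)) * 100
colors = ["lightgreen", "lightblue", "lightsalmon"]  # one color per box

fig, ax = plt.subplots()
# patch_artist=True makes each box a fillable patch instead of a line.
parts = ax.boxplot(data, patch_artist=True)

for box, color in zip(parts["boxes"], colors):
    box.set_facecolor(color)

ax.set_title("One fill color per box")
plt.show()
```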
Box Plots with Notches
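Adding notches to a box plot provides a visual indication of the confidence interval around the median: when the notches of two boxes do not overlap, this suggests that their medians differ. With small samples, such as the 10 points per box used here, the notch can extend past the quartiles and give the box a folded appearance.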
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(10, 2) * 100
plt.boxplot(data, notch=True)
plt.title("Box Plots with Notches - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-9.png)
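To see the numbers behind a notch, matplotlib.cbook.boxplot_stats returns the statistics that boxplot draws, including the notch limits cilo and cihi. The sketch below compares them with the approximation median ± 1.57 × IQR / √n, which, as far as I can tell, is what Matplotlib uses when no bootstrap is requested; treat that formula as an assumption to verify against your installed version.

```python
import numpy as np
from matplotlib import cbook

rng = np.random.default_rng(2)
data = rng.normal(loc=50, scale=10, size=200)

# boxplot_stats returns one dictionary of statistics per dataset.
stats = cbook.boxplot_stats(data)[0]

# Assumed default notch half-width: 1.57 * IQR / sqrt(n).
half_width = 1.57 * stats["iqr"] / np.sqrt(len(data))

print("median:", stats["med"])
print("notch from boxplot_stats:", stats["cilo"], stats["cihi"])
print("notch from formula:", stats["med"] - half_width, stats["med"] + half_width)
```

Because the half-width shrinks with the square root of the sample size, larger samples produce narrower notches.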
Box Plots with Custom Outlier Symbols
You can customize the appearance of outliers using the flierprops parameter.
import matplotlib.pyplot as plt
import numpy as np
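data = np.random.rand(10, 4) * 100
data[0] = 300  # one extreme value per column, so every box has a flier to style
# An 'x' marker has no fill, so its color is set with markeredgecolor
plt.boxplot(data, flierprops=dict(marker='x', markeredgecolor='purple', markersize=8))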
plt.title("Box Plots with Custom Outlier Symbols - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-10.png)
Box Plots Without Outliers
It’s possible to create box plots that do not display outliers by setting the showfliers parameter to False.
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(10, 3) * 100
plt.boxplot(data, showfliers=False)
plt.title("Box Plots Without Outliers - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-11.png)
Grouped Box Plots
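Grouped box plots can be created to compare distributions across different categories. The positions parameter sets where each box is drawn along the axis, so leaving a gap between positions separates the groups visually. In the example below, the first two boxes sit next to each other and the third stands apart.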
import matplotlib.pyplot as plt
import numpy as np
data1 = np.random.normal(0, 1, 100)
data2 = np.random.normal(1, 1.5, 100)
data3 = np.random.normal(2, 2, 100)
data = [data1, data2, data3]
positions = [1, 2, 4]
plt.boxplot(data, positions=positions)
plt.title("Grouped Box Plots - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-12.png)
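A slightly fuller version of the same idea is sketched below: two groups of two boxes each, with a single tick label centred under each group. The group names, the sample data, and the 0.6 box width are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(7)
# Two hypothetical groups, each measured under two conditions.
group_a = [rng.normal(0, 1, 100), rng.normal(0.5, 1, 100)]
group_b = [rng.normal(2, 1.5, 100), rng.normal(3, 1.5, 100)]

fig, ax = plt.subplots()
# Positions 1-2 form the first group, 4-5 the second; 3 is left empty.
parts = ax.boxplot(group_a + group_b, positions=[1, 2, 4, 5], widths=0.6)

# One tick per group, centred between that group's two boxes.
ax.set_xticks([1.5, 4.5])
ax.set_xticklabels(["Group A", "Group B"])
ax.set_title("Two groups of two boxes")
plt.show()
```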
Box Plots with Custom Whisker Length
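The length of the whiskers can be customized by setting the whis parameter. A single number sets how far a whisker may reach beyond the box as a multiple of the IQR (the default is 1.5). The value 0.75 used below shortens the whiskers, so more points are drawn as outliers.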
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(10, 2) * 100
plt.boxplot(data, whis=0.75)
plt.title("Box Plots with Custom Whisker Length - how2matplotlib.com")
plt.show()
Output:
![Matplotlib Box Plots](https://apidemos.geek-docs.com/matplotlib/2024/07/18/20240622001114-13.png)
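The whis parameter also accepts a pair of percentiles instead of a multiplier. The sketch below draws the same data three ways: with the default rule, with whiskers at the 5th and 95th percentiles, and with whis=(0, 100), which extends the whiskers to the minimum and maximum so that no fliers remain. The sample data, figure size, and variable names are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(50, 15, 200)

settings = [
    (1.5, "whis=1.5 (default)"),
    ((5, 95), "whis=(5, 95)"),
    ((0, 100), "whis=(0, 100)"),
]

fig, axes = plt.subplots(1, 3, sharey=True, figsize=(9, 3))
results = []
for ax, (whis, label) in zip(axes, settings):
    # Keep the returned artists so the whisker ends can be inspected.
    results.append(ax.boxplot(data, whis=whis))
    ax.set_title(label)

plt.show()
```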
Conclusion
Box plots are a powerful tool for statistical analysis, providing a compact representation of data distributions. With Matplotlib, you can create, customize, and compare box plots with ease. By adjusting properties such as color, width, orientation, and outlier symbols, you can tailor your plots to your specific needs, making your data analysis both effective and visually appealing. Whether you’re exploring a single dataset or comparing multiple groups, box plots can provide valuable insights into your data’s structure and outliers.