Code
import numpy as np
import matplotlib.pyplot as plt
Exercise 06 due Thursday.
April 1, 2025
April 1, 2025
This page provides a very basic introduction to Python and the matplotlib
plotting library.
Python is an awesome language for data science and data visualization. It is more popular than R, and it is widely used in scientific research and in industry.
I find that it has a very readable syntax, meaning that it’s relatively easy to see what well-written Python code is doing.
We start by importing the numpy
and pyplot
libraries and giving them convenient short names for future reference.
From https://matplotlib.org/stable/gallery/statistics/hist.html#sphx-glr-gallery-statistics-hist-py
Load components from matplotlib
.
Generate data and render it.
N_points = 100000
n_bins = 20
# Generate two normal distributions
dist1 = rng.standard_normal(N_points)
dist2 = 0.4 * rng.standard_normal(N_points) + 5
fig, axs = plt.subplots(1, 2, sharey=True, tight_layout=True)
# We can set the number of bins with the *bins* keyword argument.
axs[0].hist(dist1, bins=n_bins)
axs[1].hist(dist2, bins=n_bins)
plt.show()
Violin plots are another way to depict the distribution of a single continuous variable.
The following code is copied verbatim from the following site:
https://matplotlib.org/stable/gallery/statistics/violinplot.html
# fake data
fs = 10 # fontsize
pos = [1, 2, 4, 5, 7, 8]
data = [np.random.normal(0, std, size=100) for std in pos]
# Create a plot with 2 rows and 6 columns
fig, axs = plt.subplots(nrows=2, ncols=6, figsize=(10, 4))
axs[0, 0].violinplot(data, pos, points=20, widths=0.3,
showmeans=True, showextrema=True, showmedians=True)
axs[0, 0].set_title('Custom violin 1', fontsize=fs)
axs[0, 1].violinplot(data, pos, points=40, widths=0.5,
showmeans=True, showextrema=True, showmedians=True,
bw_method='silverman')
axs[0, 1].set_title('Custom violin 2', fontsize=fs)
axs[0, 2].violinplot(data, pos, points=60, widths=0.7, showmeans=True,
showextrema=True, showmedians=True, bw_method=0.5)
axs[0, 2].set_title('Custom violin 3', fontsize=fs)
axs[0, 3].violinplot(data, pos, points=60, widths=0.7, showmeans=True,
showextrema=True, showmedians=True, bw_method=0.5,
quantiles=[[0.1], [], [], [0.175, 0.954], [0.75], [0.25]])
axs[0, 3].set_title('Custom violin 4', fontsize=fs)
axs[0, 4].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5)
axs[0, 4].set_title('Custom violin 5', fontsize=fs)
axs[0, 5].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='low')
axs[0, 5].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='high')
axs[0, 5].set_title('Custom violin 6', fontsize=fs)
axs[1, 0].violinplot(data, pos, points=80, vert=False, widths=0.7,
showmeans=True, showextrema=True, showmedians=True)
axs[1, 0].set_title('Custom violin 7', fontsize=fs)
axs[1, 1].violinplot(data, pos, points=100, vert=False, widths=0.9,
showmeans=True, showextrema=True, showmedians=True,
bw_method='silverman')
axs[1, 1].set_title('Custom violin 8', fontsize=fs)
axs[1, 2].violinplot(data, pos, points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
bw_method=0.5)
axs[1, 2].set_title('Custom violin 9', fontsize=fs)
axs[1, 3].violinplot(data, pos, points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[[0.1], [], [], [0.175, 0.954], [0.75], [0.25]],
bw_method=0.5)
axs[1, 3].set_title('Custom violin 10', fontsize=fs)
axs[1, 4].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5)
axs[1, 4].set_title('Custom violin 11', fontsize=fs)
axs[1, 5].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='low')
axs[1, 5].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='high')
axs[1, 5].set_title('Custom violin 12', fontsize=fs)
for ax in axs.flat:
ax.set_yticklabels([])
fig.suptitle("Violin Plotting Examples")
fig.subplots_adjust(hspace=0.4)
plt.show()
plt.style.use('_mpl-gallery')
# make data:
np.random.seed(10)
D = np.random.normal((3, 5, 4), (1.25, 1.00, 1.25), (100, 3))
# plot
fig, ax = plt.subplots()
VP = ax.boxplot(D, positions=[2, 4, 6], widths=1.5, patch_artist=True,
showmeans=False, showfliers=False,
medianprops={"color": "white", "linewidth": 0.5},
boxprops={"facecolor": "C0", "edgecolor": "white",
"linewidth": 0.5},
whiskerprops={"color": "C0", "linewidth": 1.5},
capprops={"color": "C0", "linewidth": 1.5})
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Source: https://matplotlib.org/stable/plot_types/basic/bar.html#sphx-glr-plot-types-basic-bar-py
The following is copied verbatim from the Quarto website:
https://quarto.org/docs/get-started/hello/vscode.html
For a demonstration of a line plot on a polar axis, see Figure 5.
---
title: "Plotting in Python"
format: html
---
## About {-}
This page provides a very basic introduction to Python and the `matplotlib` plotting library.
## Why Python? {-}
Python is an awesome language for data science and data visualization.
It is more popular than R, and it is widely used in scientific research and in industry.
I find that it has a very readable syntax, meaning that it's relatively easy to see what well-written Python code is doing.
## Setup {-}
We start by importing the `numpy` and `pyplot` libraries and giving them convenient short names for future reference.
```{python}
import numpy as np
import matplotlib.pyplot as plt
```
## Plotting one variable {-}
### Continuous {-}
#### Histograms {-}
From <https://matplotlib.org/stable/gallery/statistics/hist.html#sphx-glr-gallery-statistics-hist-py>
Load components from `matplotlib`.
```{python}
from matplotlib import colors
from matplotlib.ticker import PercentFormatter
# Create a random number generator with a fixed seed for reproducibility
rng = np.random.default_rng(19680801)
```
Generate data and render it.
```{python}
#| label: fig-dual-histograms-matplotlib
#| fig-cap: "Two histograms with 100K points each."
N_points = 100000
n_bins = 20
# Generate two normal distributions
dist1 = rng.standard_normal(N_points)
dist2 = 0.4 * rng.standard_normal(N_points) + 5
fig, axs = plt.subplots(1, 2, sharey=True, tight_layout=True)
# We can set the number of bins with the *bins* keyword argument.
axs[0].hist(dist1, bins=n_bins)
axs[1].hist(dist2, bins=n_bins)
plt.show()
```
#### Violin {-}
Violin plots are another way to depict the distribution of a single continuous variable.
The following code is copied verbatim from the following site:
<https://matplotlib.org/stable/gallery/statistics/violinplot.html>
```{python}
#| label: fig-multiple-violin-plots
#| fig-cap: "Multiple violin plots with different parameters."
# fake data
fs = 10 # fontsize
pos = [1, 2, 4, 5, 7, 8]
data = [np.random.normal(0, std, size=100) for std in pos]
# Create a plot with 2 rows and 6 columns
fig, axs = plt.subplots(nrows=2, ncols=6, figsize=(10, 4))
axs[0, 0].violinplot(data, pos, points=20, widths=0.3,
showmeans=True, showextrema=True, showmedians=True)
axs[0, 0].set_title('Custom violin 1', fontsize=fs)
axs[0, 1].violinplot(data, pos, points=40, widths=0.5,
showmeans=True, showextrema=True, showmedians=True,
bw_method='silverman')
axs[0, 1].set_title('Custom violin 2', fontsize=fs)
axs[0, 2].violinplot(data, pos, points=60, widths=0.7, showmeans=True,
showextrema=True, showmedians=True, bw_method=0.5)
axs[0, 2].set_title('Custom violin 3', fontsize=fs)
axs[0, 3].violinplot(data, pos, points=60, widths=0.7, showmeans=True,
showextrema=True, showmedians=True, bw_method=0.5,
quantiles=[[0.1], [], [], [0.175, 0.954], [0.75], [0.25]])
axs[0, 3].set_title('Custom violin 4', fontsize=fs)
axs[0, 4].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5)
axs[0, 4].set_title('Custom violin 5', fontsize=fs)
axs[0, 5].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='low')
axs[0, 5].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='high')
axs[0, 5].set_title('Custom violin 6', fontsize=fs)
axs[1, 0].violinplot(data, pos, points=80, vert=False, widths=0.7,
showmeans=True, showextrema=True, showmedians=True)
axs[1, 0].set_title('Custom violin 7', fontsize=fs)
axs[1, 1].violinplot(data, pos, points=100, vert=False, widths=0.9,
showmeans=True, showextrema=True, showmedians=True,
bw_method='silverman')
axs[1, 1].set_title('Custom violin 8', fontsize=fs)
axs[1, 2].violinplot(data, pos, points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
bw_method=0.5)
axs[1, 2].set_title('Custom violin 9', fontsize=fs)
axs[1, 3].violinplot(data, pos, points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[[0.1], [], [], [0.175, 0.954], [0.75], [0.25]],
bw_method=0.5)
axs[1, 3].set_title('Custom violin 10', fontsize=fs)
axs[1, 4].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5)
axs[1, 4].set_title('Custom violin 11', fontsize=fs)
axs[1, 5].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='low')
axs[1, 5].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='high')
axs[1, 5].set_title('Custom violin 12', fontsize=fs)
for ax in axs.flat:
ax.set_yticklabels([])
fig.suptitle("Violin Plotting Examples")
fig.subplots_adjust(hspace=0.4)
plt.show()
```
#### Boxplot {-}
From <https://matplotlib.org/stable/plot_types/stats/boxplot_plot.html#sphx-glr-plot-types-stats-boxplot-plot-py>.
```{python}
#| label: fig-boxplot-example
#| fig-cap: "Example of several boxplots with whiskers."
plt.style.use('_mpl-gallery')
# make data:
np.random.seed(10)
D = np.random.normal((3, 5, 4), (1.25, 1.00, 1.25), (100, 3))
# plot
fig, ax = plt.subplots()
VP = ax.boxplot(D, positions=[2, 4, 6], widths=1.5, patch_artist=True,
showmeans=False, showfliers=False,
medianprops={"color": "white", "linewidth": 0.5},
boxprops={"facecolor": "C0", "edgecolor": "white",
"linewidth": 0.5},
whiskerprops={"color": "C0", "linewidth": 1.5},
capprops={"color": "C0", "linewidth": 1.5})
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
```
### Discrete/nominal {-}
#### Bar chart {-}
Source: <https://matplotlib.org/stable/plot_types/basic/bar.html#sphx-glr-plot-types-basic-bar-py>
```{python}
#| label: fig-bar-plot-example
#| fig-cap: "Example of a bar plot."
plt.style.use('_mpl-gallery')
# make data:
x = 0.5 + np.arange(8)
y = [4.8, 5.5, 3.5, 4.6, 6.5, 6.6, 2.6, 3.0]
# plot
fig, ax = plt.subplots()
ax.bar(x, y, width=1, edgecolor="white", linewidth=0.7)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
```
## Plotting two variables {-}
### Scatter plots {-}
## Other kinds of plots {-}
The following is copied verbatim from the Quarto website:
<https://quarto.org/docs/get-started/hello/vscode.html>
For a demonstration of a line plot on a polar axis, see @fig-polar.
```{python}
#| label: fig-polar
#| fig-cap: "A line plot on a polar axis"
r = np.arange(0, 2, 0.01)
theta = 2 * np.pi * r
fig, ax = plt.subplots(
subplot_kw = {'projection': 'polar'}
)
ax.plot(theta, r)
ax.set_rticks([0.5, 1, 1.5, 2])
ax.grid(True)
plt.show()
```