Code
import numpy as np
import matplotlib.pyplot as plt
We’re moving to 009 Sparks!
January 18, 2025
November 22, 2024
Figures don’t seem to be cross-referencing across documents. Let’s see if they work within documents.
Does the link to Figure 1 work?
How about Figure 2?
Apparently so.
Okay, so do the ggplot2 figures work in this document? Specifically, does the reference to ?@fig-dual-boxplots-diff-ggplot2 work?
No. So, the issue is cross-document figure references.
This page provides a very basic introduction to Python and the matplotlib
plotting library.
Python is an awesome language for data science and data visualization. It is more popular than R, and it is widely used in scientific research and in industry.
I find that it has a very readable syntax, meaning that it’s relatively easy to see what well-written Python code is doing.
We start by importing the numpy
and pyplot
libraries and giving them convenient short names for future reference.
From https://matplotlib.org/stable/gallery/statistics/hist.html#sphx-glr-gallery-statistics-hist-py
Load components from matplotlib
.
Generate data and render it.
N_points = 100000
n_bins = 20
# Generate two normal distributions
dist1 = rng.standard_normal(N_points)
dist2 = 0.4 * rng.standard_normal(N_points) + 5
fig, axs = plt.subplots(1, 2, sharey=True, tight_layout=True)
# We can set the number of bins with the *bins* keyword argument.
axs[0].hist(dist1, bins=n_bins)
axs[1].hist(dist2, bins=n_bins)
plt.show()
Violin plots are another way to depict the distribution of a single continuous variable.
The following code is copied verbatim from the following site:
https://matplotlib.org/stable/gallery/statistics/violinplot.html
# fake data
fs = 10 # fontsize
pos = [1, 2, 4, 5, 7, 8]
data = [np.random.normal(0, std, size=100) for std in pos]
# Create a plot with 2 rows and 6 columns
fig, axs = plt.subplots(nrows=2, ncols=6, figsize=(10, 4))
axs[0, 0].violinplot(data, pos, points=20, widths=0.3,
showmeans=True, showextrema=True, showmedians=True)
axs[0, 0].set_title('Custom violin 1', fontsize=fs)
axs[0, 1].violinplot(data, pos, points=40, widths=0.5,
showmeans=True, showextrema=True, showmedians=True,
bw_method='silverman')
axs[0, 1].set_title('Custom violin 2', fontsize=fs)
axs[0, 2].violinplot(data, pos, points=60, widths=0.7, showmeans=True,
showextrema=True, showmedians=True, bw_method=0.5)
axs[0, 2].set_title('Custom violin 3', fontsize=fs)
axs[0, 3].violinplot(data, pos, points=60, widths=0.7, showmeans=True,
showextrema=True, showmedians=True, bw_method=0.5,
quantiles=[[0.1], [], [], [0.175, 0.954], [0.75], [0.25]])
axs[0, 3].set_title('Custom violin 4', fontsize=fs)
axs[0, 4].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5)
axs[0, 4].set_title('Custom violin 5', fontsize=fs)
axs[0, 5].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='low')
axs[0, 5].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='high')
axs[0, 5].set_title('Custom violin 6', fontsize=fs)
axs[1, 0].violinplot(data, pos, points=80, vert=False, widths=0.7,
showmeans=True, showextrema=True, showmedians=True)
axs[1, 0].set_title('Custom violin 7', fontsize=fs)
axs[1, 1].violinplot(data, pos, points=100, vert=False, widths=0.9,
showmeans=True, showextrema=True, showmedians=True,
bw_method='silverman')
axs[1, 1].set_title('Custom violin 8', fontsize=fs)
axs[1, 2].violinplot(data, pos, points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
bw_method=0.5)
axs[1, 2].set_title('Custom violin 9', fontsize=fs)
axs[1, 3].violinplot(data, pos, points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[[0.1], [], [], [0.175, 0.954], [0.75], [0.25]],
bw_method=0.5)
axs[1, 3].set_title('Custom violin 10', fontsize=fs)
axs[1, 4].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5)
axs[1, 4].set_title('Custom violin 11', fontsize=fs)
axs[1, 5].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='low')
axs[1, 5].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='high')
axs[1, 5].set_title('Custom violin 12', fontsize=fs)
for ax in axs.flat:
ax.set_yticklabels([])
fig.suptitle("Violin Plotting Examples")
fig.subplots_adjust(hspace=0.4)
plt.show()
plt.style.use('_mpl-gallery')
# make data:
np.random.seed(10)
D = np.random.normal((3, 5, 4), (1.25, 1.00, 1.25), (100, 3))
# plot
fig, ax = plt.subplots()
VP = ax.boxplot(D, positions=[2, 4, 6], widths=1.5, patch_artist=True,
showmeans=False, showfliers=False,
medianprops={"color": "white", "linewidth": 0.5},
boxprops={"facecolor": "C0", "edgecolor": "white",
"linewidth": 0.5},
whiskerprops={"color": "C0", "linewidth": 1.5},
capprops={"color": "C0", "linewidth": 1.5})
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Source: https://matplotlib.org/stable/plot_types/basic/bar.html#sphx-glr-plot-types-basic-bar-py
The following is copied verbatim from the Quarto website:
https://quarto.org/docs/get-started/hello/vscode.html
For a demonstration of a line plot on a polar axis, see Figure 5.
# Plotting in Python
::: {.callout-note}
Figures don't seem to be cross-referencing *across* documents.
Let's see if they work *within* documents.
Does the link to @fig-dual-histograms-matplotlib work?
How about @fig-multiple-violin-plots?
Apparently so.
Okay, so do the ggplot2 figures work in this document?
Specifically, does the reference to @fig-dual-boxplots-diff-ggplot2 work?
No. So, the issue is cross-document figure references.
:::
## About {-}
This page provides a very basic introduction to Python and the `matplotlib` plotting library.
## Why Python? {-}
Python is an awesome language for data science and data visualization.
It is more popular than R, and it is widely used in scientific research and in industry.
I find that it has a very readable syntax, meaning that it's relatively easy to see what well-written Python code is doing.
## Setup {-}
We start by importing the `numpy` and `pyplot` libraries and giving them convenient short names for future reference.
```{python}
import numpy as np
import matplotlib.pyplot as plt
```
## Plotting one variable {-}
### Continuous {-}
#### Histograms {-}
From <https://matplotlib.org/stable/gallery/statistics/hist.html#sphx-glr-gallery-statistics-hist-py>
Load components from `matplotlib`.
```{python}
from matplotlib import colors
from matplotlib.ticker import PercentFormatter
# Create a random number generator with a fixed seed for reproducibility
rng = np.random.default_rng(19680801)
```
Generate data and render it.
```{python}
#| label: fig-dual-histograms-matplotlib
#| fig-cap: "Two histograms with 100K points each."
N_points = 100000
n_bins = 20
# Generate two normal distributions
dist1 = rng.standard_normal(N_points)
dist2 = 0.4 * rng.standard_normal(N_points) + 5
fig, axs = plt.subplots(1, 2, sharey=True, tight_layout=True)
# We can set the number of bins with the *bins* keyword argument.
axs[0].hist(dist1, bins=n_bins)
axs[1].hist(dist2, bins=n_bins)
plt.show()
```
#### Violin {-}
Violin plots are another way to depict the distribution of a single continuous variable.
The following code is copied verbatim from the following site:
<https://matplotlib.org/stable/gallery/statistics/violinplot.html>
```{python}
#| label: fig-multiple-violin-plots
#| fig-cap: "Multiple violin plots with different parameters."
# fake data
fs = 10 # fontsize
pos = [1, 2, 4, 5, 7, 8]
data = [np.random.normal(0, std, size=100) for std in pos]
# Create a plot with 2 rows and 6 columns
fig, axs = plt.subplots(nrows=2, ncols=6, figsize=(10, 4))
axs[0, 0].violinplot(data, pos, points=20, widths=0.3,
showmeans=True, showextrema=True, showmedians=True)
axs[0, 0].set_title('Custom violin 1', fontsize=fs)
axs[0, 1].violinplot(data, pos, points=40, widths=0.5,
showmeans=True, showextrema=True, showmedians=True,
bw_method='silverman')
axs[0, 1].set_title('Custom violin 2', fontsize=fs)
axs[0, 2].violinplot(data, pos, points=60, widths=0.7, showmeans=True,
showextrema=True, showmedians=True, bw_method=0.5)
axs[0, 2].set_title('Custom violin 3', fontsize=fs)
axs[0, 3].violinplot(data, pos, points=60, widths=0.7, showmeans=True,
showextrema=True, showmedians=True, bw_method=0.5,
quantiles=[[0.1], [], [], [0.175, 0.954], [0.75], [0.25]])
axs[0, 3].set_title('Custom violin 4', fontsize=fs)
axs[0, 4].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5)
axs[0, 4].set_title('Custom violin 5', fontsize=fs)
axs[0, 5].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='low')
axs[0, 5].violinplot(data[-1:], pos[-1:], points=60, widths=0.7,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='high')
axs[0, 5].set_title('Custom violin 6', fontsize=fs)
axs[1, 0].violinplot(data, pos, points=80, vert=False, widths=0.7,
showmeans=True, showextrema=True, showmedians=True)
axs[1, 0].set_title('Custom violin 7', fontsize=fs)
axs[1, 1].violinplot(data, pos, points=100, vert=False, widths=0.9,
showmeans=True, showextrema=True, showmedians=True,
bw_method='silverman')
axs[1, 1].set_title('Custom violin 8', fontsize=fs)
axs[1, 2].violinplot(data, pos, points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
bw_method=0.5)
axs[1, 2].set_title('Custom violin 9', fontsize=fs)
axs[1, 3].violinplot(data, pos, points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[[0.1], [], [], [0.175, 0.954], [0.75], [0.25]],
bw_method=0.5)
axs[1, 3].set_title('Custom violin 10', fontsize=fs)
axs[1, 4].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5)
axs[1, 4].set_title('Custom violin 11', fontsize=fs)
axs[1, 5].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='low')
axs[1, 5].violinplot(data[-1:], pos[-1:], points=200, vert=False, widths=1.1,
showmeans=True, showextrema=True, showmedians=True,
quantiles=[0.05, 0.1, 0.8, 0.9], bw_method=0.5, side='high')
axs[1, 5].set_title('Custom violin 12', fontsize=fs)
for ax in axs.flat:
ax.set_yticklabels([])
fig.suptitle("Violin Plotting Examples")
fig.subplots_adjust(hspace=0.4)
plt.show()
```
#### Boxplot {-}
From <https://matplotlib.org/stable/plot_types/stats/boxplot_plot.html#sphx-glr-plot-types-stats-boxplot-plot-py>.
```{python}
#| label: fig-boxplot-example
#| fig-cap: "Example of several boxplots with whiskers."
plt.style.use('_mpl-gallery')
# make data:
np.random.seed(10)
D = np.random.normal((3, 5, 4), (1.25, 1.00, 1.25), (100, 3))
# plot
fig, ax = plt.subplots()
VP = ax.boxplot(D, positions=[2, 4, 6], widths=1.5, patch_artist=True,
showmeans=False, showfliers=False,
medianprops={"color": "white", "linewidth": 0.5},
boxprops={"facecolor": "C0", "edgecolor": "white",
"linewidth": 0.5},
whiskerprops={"color": "C0", "linewidth": 1.5},
capprops={"color": "C0", "linewidth": 1.5})
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
```
### Discrete/nominal {-}
#### Bar chart {-}
Source: <https://matplotlib.org/stable/plot_types/basic/bar.html#sphx-glr-plot-types-basic-bar-py>
```{python}
#| label: fig-bar-plot-example
#| fig-cap: "Example of a bar plot."
plt.style.use('_mpl-gallery')
# make data:
x = 0.5 + np.arange(8)
y = [4.8, 5.5, 3.5, 4.6, 6.5, 6.6, 2.6, 3.0]
# plot
fig, ax = plt.subplots()
ax.bar(x, y, width=1, edgecolor="white", linewidth=0.7)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
```
## Plotting two variables {-}
### Scatter plots {-}
## Other kinds of plots {-}
The following is copied verbatim from the Quarto website:
<https://quarto.org/docs/get-started/hello/vscode.html>
For a demonstration of a line plot on a polar axis, see @fig-polar.
```{python}
#| label: fig-polar
#| fig-cap: "A line plot on a polar axis"
r = np.arange(0, 2, 0.01)
theta = 2 * np.pi * r
fig, ax = plt.subplots(
subplot_kw = {'projection': 'polar'}
)
ax.plot(theta, r)
ax.set_rticks([0.5, 1, 1.5, 2])
ax.grid(True)
plt.show()
```