Commit a08e5ad

Add notebook on Kullback-Leibler divergence
1 parent 547ac37 commit a08e5ad

Lines changed: 193 additions & 0 deletions
@@ -0,0 +1,193 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "code",
5+
"execution_count": 1,
6+
"id": "5d8abf15-5fe0-49cc-ad5c-c688cbde3304",
7+
"metadata": {},
8+
"outputs": [],
9+
"source": [
10+
"import matplotlib.pyplot as plt\n",
11+
"%matplotlib inline\n",
12+
"import numpy as np\n",
13+
"import scipy.stats"
14+
]
15+
},
16+
{
17+
"cell_type": "markdown",
18+
"id": "660d443e-26c2-4406-a035-d9dd599f79b0",
19+
"metadata": {},
20+
"source": [
21+
"# Kullback-Leibler divergence"
22+
]
23+
},
24+
{
25+
"cell_type": "markdown",
26+
"id": "6b1ad233-a4a9-4ef0-abcd-1d20be1feb34",
27+
"metadata": {},
28+
"source": [
29+
"Define the Kullback-Leibler divergence as:\n",
30+
"$$\n",
31+
"\\textrm{KL}(p, q) = \\sum_{i} p_i \\log \\frac{p_i}{q_i}\n",
32+
"$$"
33+
]
34+
},
35+
{
36+
"cell_type": "code",
37+
"execution_count": 2,
38+
"id": "fe17c58f-6154-41e7-a83b-87ecf7a8ed00",
39+
"metadata": {},
40+
"outputs": [],
41+
"source": [
42+
"def kullback_leibler_div(p, q):\n",
43+
" return np.sum(p*np.log(p/q))"
44+
]
45+
},
46+
{
47+
"cell_type": "markdown",
48+
"id": "f79fdff3-8608-4233-9021-cfa174c48398",
49+
"metadata": {},
50+
"source": [
51+
"# Example"
52+
]
53+
},
54+
{
55+
"cell_type": "markdown",
56+
"id": "2d361c22-645c-4a55-9fc5-f45edf5463cb",
57+
"metadata": {},
58+
"source": [
59+
"Consider a normal distribution with mean $\\mu = 0.0$ and standard deviation $\\sigma = 1.0$. The distribution is computed on the interval $[-3, 3]$."
60+
]
61+
},
62+
{
63+
"cell_type": "code",
64+
"execution_count": 3,
65+
"id": "782dee3c-5265-48ab-ab4c-1b92f334ade5",
66+
"metadata": {},
67+
"outputs": [],
68+
"source": [
69+
"μ, σ = 0.0, 1.0"
70+
]
71+
},
72+
{
73+
"cell_type": "code",
74+
"execution_count": 4,
75+
"id": "77c3a026-a09a-4837-8633-a7f6796d1a23",
76+
"metadata": {},
77+
"outputs": [],
78+
"source": [
79+
"x = np.linspace(-3.0, 3.0, 601)"
80+
]
81+
},
82+
{
83+
"cell_type": "code",
84+
"execution_count": 5,
85+
"id": "69f724b8-008c-40b9-9d54-7d10109a4030",
86+
"metadata": {},
87+
"outputs": [],
88+
"source": [
89+
"p = scipy.stats.norm(loc=μ, scale=σ).pdf(x)"
90+
]
91+
},
92+
{
93+
"cell_type": "markdown",
94+
"id": "98d280f3-b05c-41e5-b51d-2cbb79b3a7d9",
95+
"metadata": {},
96+
"source": [
97+
"Next, we consider distribution with the same mean value $\\mu = 0.0$, but various different values for the standard deviation."
98+
]
99+
},
100+
{
101+
"cell_type": "code",
102+
"execution_count": 6,
103+
"id": "a028fe7b-9657-437f-8675-4f2aa7118d13",
104+
"metadata": {},
105+
"outputs": [],
106+
"source": [
107+
"σs = np.arange(0.1, 2.1, 0.01)"
108+
]
109+
},
110+
{
111+
"cell_type": "code",
112+
"execution_count": 7,
113+
"id": "e6fc4509-a9a9-47b0-a7a0-9916edc3509b",
114+
"metadata": {},
115+
"outputs": [],
116+
"source": [
117+
"qs = [scipy.stats.norm(loc=μ, scale=σ).pdf(x) for σ in σs]"
118+
]
119+
},
120+
{
121+
"cell_type": "markdown",
122+
"id": "55fd805b-0265-4e7b-9396-36aabd0d2b84",
123+
"metadata": {},
124+
"source": [
125+
"Compute the divergence between the distributions $q$ and $p$."
126+
]
127+
},
128+
{
129+
"cell_type": "code",
130+
"execution_count": 8,
131+
"id": "0c374926-759f-4893-bbde-41a4830aabde",
132+
"metadata": {},
133+
"outputs": [],
134+
"source": [
135+
"divs = [kullback_leibler_div(p, q) for q in qs]"
136+
]
137+
},
138+
{
139+
"cell_type": "code",
140+
"execution_count": 9,
141+
"id": "9bcca801-a677-4ee3-a8ea-a4d89057032f",
142+
"metadata": {},
143+
"outputs": [
144+
{
145+
"data": {
146+
"image/png": "\n",
147+
"text/plain": [
148+
"<Figure size 432x288 with 1 Axes>"
149+
]
150+
},
151+
"metadata": {
152+
"needs_background": "light"
153+
},
154+
"output_type": "display_data"
155+
}
156+
],
157+
"source": [
158+
"plt.semilogy(σs, divs);\n",
159+
"plt.xlabel(r'$\\sigma$');\n",
160+
"plt.ylabel(r'$KL(p, q)$');"
161+
]
162+
},
163+
{
164+
"cell_type": "markdown",
165+
"id": "92d8aece-f1e3-447a-a387-52a4d2d1658e",
166+
"metadata": {},
167+
"source": [
168+
"For $\\sigma = 1.0$ the distributions $p$ and $q$ are identical, and the Kullback-Leibler divergence is 0."
169+
]
170+
}
171+
],
172+
"metadata": {
173+
"kernelspec": {
174+
"display_name": "Python 3 (ipykernel)",
175+
"language": "python",
176+
"name": "python3"
177+
},
178+
"language_info": {
179+
"codemirror_mode": {
180+
"name": "ipython",
181+
"version": 3
182+
},
183+
"file_extension": ".py",
184+
"mimetype": "text/x-python",
185+
"name": "python",
186+
"nbconvert_exporter": "python",
187+
"pygments_lexer": "ipython3",
188+
"version": "3.9.7"
189+
}
190+
},
191+
"nbformat": 4,
192+
"nbformat_minor": 5
193+
}
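
A note on `kullback_leibler_div` as used in the notebook: it sums over sampled density values, which do not sum to 1, so the result depends on the grid spacing. The following is a minimal sketch, assuming one wants the divergence between properly normalized discrete distributions. The helper name `kl_divergence_discrete` is hypothetical and not part of this commit. `scipy.stats.entropy` is used only as a cross-check: when given two arguments it normalizes both and computes the same sum.

```python
import numpy as np
import scipy.stats


def kl_divergence_discrete(p, q):
    """KL(p, q) = sum_i p_i log(p_i / q_i), after normalizing p and q.

    Hypothetical helper for illustration; assumes q_i > 0 wherever p_i > 0.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Turn sampled density values into discrete probabilities.
    p = p / p.sum()
    q = q / q.sum()
    # Terms with p_i = 0 contribute 0 by convention, so skip them.
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))


# Same grid and reference distribution as in the notebook.
x = np.linspace(-3.0, 3.0, 601)
p = scipy.stats.norm(loc=0.0, scale=1.0).pdf(x)
q = scipy.stats.norm(loc=0.0, scale=2.0).pdf(x)

kl_manual = kl_divergence_discrete(p, q)
kl_scipy = scipy.stats.entropy(p, q)  # normalizes p and q internally
kl_same = kl_divergence_discrete(p, p)

print(kl_manual, kl_scipy, kl_same)
```

With normalization, the value no longer scales with the number of grid points, and the divergence of a distribution from itself is still 0.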
