KL divergence between two univariate Gaussians

Question

I need to determine the KL-divergence between two Gaussians. I am comparing my results to these, but I can't reproduce their result. My result is obviously wrong, because the KL is not 0 for KL(p, p).

I wonder where I am making a mistake, and would appreciate it if anyone can spot it.

Let $p(x) = N(\mu_1, \sigma_1^2)$ and $q(x) = N(\mu_2, \sigma_2^2)$. From Bishop's PRML I know that

$KL(p, q) = - \int p(x) \log q(x) dx + \int p(x) \log p(x) dx$

where the integration is over the whole real line, and that

$\int p(x) \log p(x) dx = -\frac{1}{2} (1 + \log 2 \pi \sigma_1^2),$

so I restrict myself to $-\int p(x) \log q(x) dx$, which I can write out as

$-\int p(x) \log \frac{1}{(2 \pi \sigma_2^2)^{(1/2)}} e^{-\frac{(x-\mu_2)^2}{2 \sigma_2^2}} dx,$

which can be separated into

$\frac{1}{2} \log (2 \pi \sigma_2^2) - \int p(x) \log e^{-\frac{(x-\mu_2)^2}{2 \sigma_2^2}} dx.$

Taking the log I get

$\frac{1}{2} \log (2 \pi \sigma_2^2) - \int p(x) \bigg(-\frac{(x-\mu_2)^2}{2 \sigma_2^2} \bigg) dx,$

where I separate the sums and get $\sigma_2^2$ out of the integral.

$\frac{1}{2} \log (2 \pi \sigma^2_2) + \frac{\int p(x) x^2 dx - \int p(x) 2x\mu_2 dx + \int p(x) \mu_2^2 dx}{2 \sigma_2^2}$

Letting $\langle \rangle$ denote the expectation operator under $p$ , I can rewrite this as

$\frac{1}{2} \log (2 \pi \sigma_2^2) + \frac{\langle x^2 \rangle - 2 \langle x \rangle \mu_2 + \mu_2^2}{2 \sigma_2^2}.$

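We know that $\operatorname{var}(x) = \langle x^2 \rangle - \langle x \rangle ^2$, and that under $p$ we have $\langle x \rangle = \mu_1$ and $\operatorname{var}(x) = \sigma_1^2$. Thus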

$\langle x^2 \rangle = \sigma_1^2 + \mu_1^2$

and therefore

$\frac{1}{2} \log (2 \pi \sigma_2^2) + \frac{\sigma_1^2 + \mu_1^2 - 2 \mu_1 \mu_2 + \mu_2^2}{2 \sigma_2^2},$

which I can put as

$\frac{1}{2} \log (2 \pi \sigma_2^2) + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2 \sigma_2^2}.$
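The expression above for $-\int p(x) \log q(x) dx$ can be sanity-checked numerically. What follows is only a rough sketch, not part of the original derivation: the function names, the trapezoid rule, the integration window of 12 standard deviations around $\mu_1$, and the test parameters are all illustrative assumptions.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x (sigma is the standard deviation)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2) at x, computed without exponentiating."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi * sigma * sigma) - 0.5 * z * z


def cross_entropy_closed_form(mu1, sigma1, mu2, sigma2):
    """1/2 log(2 pi s2^2) + (s1^2 + (m1 - m2)^2) / (2 s2^2)."""
    return (0.5 * math.log(2.0 * math.pi * sigma2 ** 2)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2.0 * sigma2 ** 2))


def cross_entropy_numeric(mu1, sigma1, mu2, sigma2, half_width=12.0, n=20000):
    """Trapezoid-rule estimate of -int p(x) log q(x) dx.

    The window mu1 +/- half_width * sigma1 is an assumed truncation of the
    real line; the mass of p outside it is negligible.
    """
    lo = mu1 - half_width * sigma1
    hi = mu1 + half_width * sigma1
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # endpoints count half
        total += weight * normal_pdf(x, mu1, sigma1) * (-normal_logpdf(x, mu2, sigma2))
    return total * h


if __name__ == "__main__":
    params = (0.0, 1.0, 1.0, 2.0)  # mu1, sigma1, mu2, sigma2 (arbitrary)
    print(cross_entropy_closed_form(*params))
    print(cross_entropy_numeric(*params))
```

The two printed values should agree to many decimal places, which suggests this intermediate term is not where the mistake lies.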

Putting everything together, I get to

$\begin{align*} KL(p, q) &= - \int p(x) \log q(x) dx + \int p(x) \log p(x) dx \\ &= \frac{1}{2} \log (2 \pi \sigma_2^2) + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2 \sigma_2^2} - \frac{1}{2} (1 + \log 2 \pi \sigma_1^2) \\ &= \log \frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2 \sigma_2^2}. \end{align*}$

This is wrong, since it equals $\frac{1}{2}$ for two identical Gaussians.

Can anyone spot my error?

Update

Thanks to mpiktas for clearing things up. The correct answer is:

$KL(p, q) = \log \frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2 \sigma_2^2} - \frac{1}{2}$

Sorry for posting the incorrect answer in the first place. I just looked at $x-\mu_1$ and immediately thought that the integral is zero. The fact that it was squared completely slipped my mind :) — mpiktas, Feb 21 '11 at 12:02

I have just seen in a research paper that the KLD should be $KL(p, q) = \frac{1}{2} \left( (\mu_1-\mu_2)^2 + \sigma_1^2 + \sigma_2^2 \right) \left( \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} \right) - 2$ — skyde, Aug 1 '13 at 14:26

I think there is a typo in your question, since I cannot validate it and it also seems that you used the correct version later in your question: $\int p(x) \log p(x) dx = \frac{1}{2} (1 + \log 2 \pi \sigma_1^2)$. I think it should be (note the minus): $\int p(x) \log p(x) dx = -\frac{1}{2} (1 + \log 2 \pi \sigma_1^2)$. I tried to edit your question and got banned for it, so maybe do it yourself. — y-spreen, Jan 25 '18 at 13:49

mpiktas · Accepted Answer · 2016-07-21 05:42:54Z

OK, my bad. The error is in the last equation:

$\begin{align} KL(p, q) &= - \int p(x) \log q(x) dx + \int p(x) \log p(x) dx\\\\ &=\frac{1}{2} \log (2 \pi \sigma_2^2) + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2 \sigma_2^2} - \frac{1}{2} (1 + \log 2 \pi \sigma_1^2)\\\\ &= \log \frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2 \sigma_2^2} - \frac{1}{2} \end{align}$

Note the $-\frac{1}{2}$, which was missing from the last line in the question. With it, the last line becomes zero when $\mu_1=\mu_2$ and $\sigma_1=\sigma_2$.
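As a quick illustration of the effect of the $-\frac{1}{2}$ term, here is a minimal Python sketch of the closed form. The function names `kl_gauss` and `kl_gauss_missing_half` and the example parameters are assumptions made for this illustration, and the `sigma` arguments are standard deviations, not variances.

```python
import math


def kl_gauss(mu1, sigma1, mu2, sigma2):
    """KL(p, q) for p = N(mu1, sigma1^2) and q = N(mu2, sigma2^2)."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2.0 * sigma2 ** 2)
            - 0.5)


def kl_gauss_missing_half(mu1, sigma1, mu2, sigma2):
    """The same expression without the -1/2, as in the question's last line."""
    return kl_gauss(mu1, sigma1, mu2, sigma2) + 0.5


if __name__ == "__main__":
    # Identical Gaussians: the corrected formula vanishes, the other does not.
    print(kl_gauss(0.3, 1.7, 0.3, 1.7))
    print(kl_gauss_missing_half(0.3, 1.7, 0.3, 1.7))
    # KL is not symmetric in its arguments.
    print(kl_gauss(0.0, 1.0, 1.0, 2.0), kl_gauss(1.0, 2.0, 0.0, 1.0))
```

For identical parameters the first function returns zero (up to rounding) while the second returns one half, which is exactly the discrepancy noticed in the question.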

@mpiktas I meant the question, really: bayerj is a well-published researcher and I'm an undergrad. Nice to see that even the smart guys fall back on asking on the internet sometimes :) — N. McA., Apr 5 '16 at 10:19

Taylor · Accepted Answer · 2019-05-12 20:23:59Z

I did not look at your calculation, but here is mine in detail. Suppose $p$ is the density of a normal random variable with mean $\mu_1$ and variance $\sigma^2_1$, and that $q$ is the density of a normal random variable with mean $\mu_2$ and variance $\sigma^2_2$. The Kullback-Leibler distance from $q$ to $p$ is:

$\int \left[\log( p(x)) - \log( q(x)) \right] p(x) dx$

$=\int \left[ -\frac{1}{2} \log(2\pi) - \log(\sigma_1) - \frac{1}{2} \left(\frac{x-\mu_1}{\sigma_1}\right)^2 + \frac{1}{2}\log(2\pi) + \log(\sigma_2) + \frac{1}{2} \left(\frac{x-\mu_2}{\sigma_2}\right)^2 \right]$ $\times \frac{1}{\sqrt{2\pi}\sigma_1} \exp\left[-\frac{1}{2}\left(\frac{x-\mu_1}{\sigma_1}\right)^2\right] dx$

$=\int \left\{\log\left(\frac{\sigma_2}{\sigma_1}\right) + \frac{1}{2} \left[ \left(\frac{x-\mu_2}{\sigma_2}\right)^2 - \left(\frac{x-\mu_1}{\sigma_1}\right)^2 \right] \right\}$ $\times \frac{1}{\sqrt{2\pi}\sigma_1} \exp\left[-\frac{1}{2}\left(\frac{x-\mu_1}{\sigma_1}\right)^2\right] dx$

$=E_{1} \left\{\log\left(\frac{\sigma_2}{\sigma_1}\right) + \frac{1}{2} \left[ \left(\frac{x-\mu_2}{\sigma_2}\right)^2 - \left(\frac{x-\mu_1}{\sigma_1}\right)^2 \right]\right\}$

$=\log\left(\frac{\sigma_2}{\sigma_1}\right) + \frac{1}{2\sigma_2^2} E_1 \left\{(X-\mu_2)^2\right\} - \frac{1}{2\sigma_1^2} E_1 \left\{(X-\mu_1)^2\right\}$

$=\log\left(\frac{\sigma_2}{\sigma_1}\right) + \frac{1}{2\sigma_2^2} E_1 \left\{(X-\mu_2)^2\right\} - \frac{1}{2}$

(Now note that $(X - \mu_2)^2 = (X-\mu_1+\mu_1-\mu_2)^2 = (X-\mu_1)^2 + 2(X-\mu_1)(\mu_1-\mu_2) + (\mu_1-\mu_2)^2$ )

$=\log\left(\frac{\sigma_2}{\sigma_1}\right) + \frac{1}{2\sigma_2^2} \left[E_1\left\{(X-\mu_1)^2\right\} + 2(\mu_1-\mu_2)E_1\left\{X-\mu_1\right\} + (\mu_1-\mu_2)^2\right] - \frac{1}{2}$

$=\log\left(\frac{\sigma_2}{\sigma_1}\right) + \frac{\sigma_1^2 + (\mu_1-\mu_2)^2}{2\sigma_2^2} - \frac{1}{2}$
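Since the derivation writes the divergence as an expectation $E_1$ under $p$, one way to check the final line is a Monte Carlo average of $\log p(X) - \log q(X)$ over draws $X \sim p$. The sketch below is only an illustration under assumptions: the function names, sample size, seed, and parameter values are arbitrary choices, and the estimate carries sampling noise.

```python
import math
import random


def kl_closed_form(mu1, sigma1, mu2, sigma2):
    """Final line of the derivation (sigma1, sigma2 are standard deviations)."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2.0 * sigma2 ** 2)
            - 0.5)


def log_ratio(x, mu1, sigma1, mu2, sigma2):
    """log p(x) - log q(x); the log(2 pi) terms cancel, as in the derivation."""
    return (math.log(sigma2 / sigma1)
            + 0.5 * ((x - mu2) / sigma2) ** 2
            - 0.5 * ((x - mu1) / sigma1) ** 2)


def kl_monte_carlo(mu1, sigma1, mu2, sigma2, n=200_000, seed=0):
    """Average of log p(X) - log q(X) over n draws X ~ N(mu1, sigma1^2)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu1, sigma1)
        total += log_ratio(x, mu1, sigma1, mu2, sigma2)
    return total / n


if __name__ == "__main__":
    params = (0.0, 1.0, 1.0, 2.0)  # mu1, sigma1, mu2, sigma2 (arbitrary)
    print("closed form :", kl_closed_form(*params))
    print("monte carlo :", kl_monte_carlo(*params))
```

The two numbers should agree to roughly two decimal places at this sample size; a larger `n` tightens the agreement at the usual $1/\sqrt{n}$ rate.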
