I need to determine the KL-divergence between two Gaussians. I am comparing my results to these, but I can't reproduce their result. My result is obviously wrong, because the KL is not 0 for KL(p, p).

I wonder where I am making a mistake, and I would be grateful if anyone could spot it.

Let $p(x)=N(\mu_1,\sigma_1)$ and $q(x)=N(\mu_2,\sigma_2)$. From Bishop's PRML I know that

$$KL(p,q)=-\int p(x)\log q(x)\,dx+\int p(x)\log p(x)\,dx$$

where the integration is over the whole real line, and that

$$\int p(x)\log p(x)\,dx=-\frac{1}{2}\left(1+\log 2\pi\sigma_1^2\right),$$
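As a sanity check on this identity, here is a minimal numerical sketch. It assumes $\sigma_1$ denotes the standard deviation (so the variance is $\sigma_1^2$); the helper names `gaussian_pdf` and `integrate` and the parameter values are only illustrative.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma (assumed convention)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def integrate(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

mu1, sigma1 = 0.7, 1.3  # arbitrary illustrative values

def integrand(x):
    p = gaussian_pdf(x, mu1, sigma1)
    return p * math.log(p)

# Truncate the real line to +/- 12 standard deviations; the tails are negligible.
numeric = integrate(integrand, mu1 - 12 * sigma1, mu1 + 12 * sigma1)
closed_form = -0.5 * (1 + math.log(2 * math.pi * sigma1 ** 2))

print(numeric, closed_form)  # the two values should agree to many digits
```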

so I restrict myself to $-\int p(x)\log q(x)\,dx$, which I can write out as

$$-\int p(x)\log\frac{1}{(2\pi\sigma_2^2)^{1/2}}e^{-\frac{(x-\mu_2)^2}{2\sigma_2^2}}\,dx,$$

which can be separated into

$$\frac{1}{2}\log(2\pi\sigma_2^2)-\int p(x)\log e^{-\frac{(x-\mu_2)^2}{2\sigma_2^2}}\,dx.$$

Taking the log I get

$$\frac{1}{2}\log(2\pi\sigma_2^2)-\int p(x)\left(-\frac{(x-\mu_2)^2}{2\sigma_2^2}\right)dx,$$

where I expand the square, separate the sums, and take $2\sigma_2^2$ out of the integral:

$$\frac{1}{2}\log(2\pi\sigma_2^2)+\frac{\int p(x)x^2\,dx-\int p(x)2x\mu_2\,dx+\int p(x)\mu_2^2\,dx}{2\sigma_2^2}$$

Letting $\langle\cdot\rangle$ denote the expectation operator under $p$, I can rewrite this as

$$\frac{1}{2}\log(2\pi\sigma_2^2)+\frac{\langle x^2\rangle-2\langle x\rangle\mu_2+\mu_2^2}{2\sigma_2^2}.$$

We know that $\operatorname{var}(x)=\langle x^2\rangle-\langle x\rangle^2$. Under $p$ we have $\langle x\rangle=\mu_1$ and $\operatorname{var}(x)=\sigma_1^2$, thus

$$\langle x^2\rangle=\sigma_1^2+\mu_1^2$$

and therefore

$$\frac{1}{2}\log(2\pi\sigma_2^2)+\frac{\sigma_1^2+\mu_1^2-2\mu_1\mu_2+\mu_2^2}{2\sigma_2^2},$$

which I can put as

$$\frac{1}{2}\log(2\pi\sigma_2^2)+\frac{\sigma_1^2+(\mu_1-\mu_2)^2}{2\sigma_2^2}.$$
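This closed form for $-\int p(x)\log q(x)\,dx$ can be checked numerically. The sketch below is only illustrative: it assumes $\sigma_1,\sigma_2$ are standard deviations, and the function names and parameter values are made up for the example.

```python
import math

def log_gaussian_pdf(x, mu, sigma):
    """Log-density of a Gaussian with mean mu and standard deviation sigma (assumed convention)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def cross_term_numeric(mu1, sigma1, mu2, sigma2, n=100_000, width=12.0):
    """Midpoint-rule approximation of -integral p(x) log q(x) dx, truncated to mu1 +/- width*sigma1."""
    lo, hi = mu1 - width * sigma1, mu1 + width * sigma1
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        p = math.exp(log_gaussian_pdf(x, mu1, sigma1))
        total -= p * log_gaussian_pdf(x, mu2, sigma2)
    return h * total

def cross_term_closed(mu1, sigma1, mu2, sigma2):
    """The closed form derived above for -integral p(x) log q(x) dx."""
    return 0.5 * math.log(2 * math.pi * sigma2 ** 2) + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)

params = (0.7, 1.3, -0.4, 2.1)  # mu1, sigma1, mu2, sigma2: arbitrary illustrative values
numeric = cross_term_numeric(*params)
closed = cross_term_closed(*params)
print(numeric, closed)  # the two values should agree closely
```

If the two numbers agree, this part of the derivation is sound, which narrows any remaining discrepancy down to how it is combined with the $\int p(x)\log p(x)\,dx$ term.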

Putting everything together, I get to

$$\begin{aligned}KL(p,q)&=-\int p(x)\log q(x)\,dx+\int p(x)\log p(x)\,dx\\[4pt]&=\frac{1}{2}\log(2\pi\sigma_2^2)+\frac{\sigma_1^2+(\mu_1-\mu_2)^2}{2\sigma_2^2}-\frac{1}{2}\left(1+\log 2\pi\sigma_1^2\right)\\[4pt]&=\log\frac{\sigma_2}{\sigma_1}+\frac{\sigma_1^2+(\mu_1-\mu_2)^2}{2\sigma_2^2}.\end{aligned}$$

This is wrong, since it equals $\frac{1}{2}$, not $0$, for two identical Gaussians.

Can anyone spot my error?

**Update**

Thanks to mpiktas for clearing things up. The correct answer is:

$$KL(p,q)=\log\frac{\sigma_2}{\sigma_1}+\frac{\sigma_1^2+(\mu_1-\mu_2)^2}{2\sigma_2^2}-\frac{1}{2}$$
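The $-\frac{1}{2}$ is the constant part of $-\frac{1}{2}(1+\log 2\pi\sigma_1^2)$, which has to survive when the two logarithms are merged into $\log\frac{\sigma_2}{\sigma_1}$. A small sketch to confirm the corrected formula, again assuming $\sigma_1,\sigma_2$ are standard deviations (the function names and test values are hypothetical):

```python
import math

def kl_gaussians(mu1, sigma1, mu2, sigma2):
    """Closed-form KL(p, q) for p = N(mu1, sigma1^2), q = N(mu2, sigma2^2)."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)
            - 0.5)

def log_pdf(x, mu, sigma):
    """Log-density of a Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def kl_numeric(mu1, sigma1, mu2, sigma2, n=100_000, width=12.0):
    """Midpoint-rule approximation of integral p(x) (log p(x) - log q(x)) dx."""
    lo, hi = mu1 - width * sigma1, mu1 + width * sigma1
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        lp = log_pdf(x, mu1, sigma1)
        total += math.exp(lp) * (lp - log_pdf(x, mu2, sigma2))
    return h * total

kl_same = kl_gaussians(0.7, 1.3, 0.7, 1.3)   # identical Gaussians
kl_diff = kl_gaussians(0.7, 1.3, -0.4, 2.1)  # arbitrary illustrative values
kl_diff_numeric = kl_numeric(0.7, 1.3, -0.4, 2.1)

print(kl_same)                   # should be 0 up to rounding
print(kl_diff, kl_diff_numeric)  # closed form vs. numerical integration
```

Dropping the `- 0.5` term reproduces the nonzero value for identical Gaussians that prompted the question.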