Great Article! Thanks!
You seem to have a typo in your $JS_{\pi}$ definition: the first term should be multiplied by $(1 - \pi)$ and the second by $\pi$, rather than the other way around.
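For concreteness, here is one common way to write the $\pi$-weighted JS divergence with the weights ordered as suggested above; the mixture $M$ and this exact convention are my assumptions, since the post's original definition is not quoted here:

$$JS_{\pi}(P \,\|\, Q) = (1 - \pi)\,\mathrm{KL}(P \,\|\, M) + \pi\,\mathrm{KL}(Q \,\|\, M), \qquad M = (1 - \pi) P + \pi Q$$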
tom white — If you want affine space: Gamma.
https://arxiv.org/abs/1710....
Mahdi Kalayeh — I am very happy that somebody had the courage to talk about this. Thank you, Ali. As a computer vision researcher, I can tell you that the problem is even worse in CV. People spend months getting the best out of their own proposed methods but run the baseline models very carelessly, with default parameters. They then claim that their proposed "novelty" yielded better performance, when it was really a matter of engineering and under-reporting. Yet if you are honest and say so in your submission, your paper gets killed by the reviewers for "not beating SOTA".
Gaurav Pandey — Thanks for this wonderful post. I had been meaning to read this paper for some time. This article clearly summarizes the most interesting aspect of the paper, namely the computation of the Fisher information for rectified networks. The bound on the generalisation error in terms of the Fisher information metric can be read from the paper directly.
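As a rough illustration of what computing the Fisher information for a rectified network involves, here is a minimal sketch of a Monte Carlo estimate of the diagonal Fisher information for a small ReLU classifier. The architecture, data, and single-sample estimator are my own illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch: Monte Carlo estimate of the diagonal of the Fisher information
# for a small ReLU (rectified) classifier. Architecture, data, and estimator are
# illustrative assumptions, not the paper's exact construction.
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def init_params(key, sizes=(4, 8, 3)):
    keys = jax.random.split(key, len(sizes) - 1)
    return [(0.1 * jax.random.normal(k, (m, n)), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def logits(params, x):
    h = x
    for W, b in params[:-1]:
        h = jax.nn.relu(h @ W + b)          # rectified hidden layers
    W, b = params[-1]
    return h @ W + b

def log_prob(params, x, y):
    # log p(y | x; theta) for a categorical output
    return jax.nn.log_softmax(logits(params, x))[y]

def fisher_diag(params, xs, key):
    """diag F ~ (1/N) sum_x E_{y ~ p(y|x)} [ (d/dtheta log p(y|x))^2 ],
    with one sampled label per input (single-sample Monte Carlo)."""
    flat, _ = ravel_pytree(params)
    acc = jnp.zeros_like(flat)
    for x in xs:
        key, sub = jax.random.split(key)
        y = jax.random.categorical(sub, logits(params, x))  # label drawn from the model
        g = jax.grad(log_prob)(params, x, y)                # score w.r.t. parameters
        acc = acc + ravel_pytree(g)[0] ** 2
    return acc / len(xs)

key = jax.random.PRNGKey(0)
params = init_params(key)
xs = jax.random.normal(key, (16, 4))        # toy inputs
print(fisher_diag(params, xs, key)[:5])
```

Note that the labels are sampled from the model's own predictive distribution (the "true" Fisher) rather than taken from data (the empirical Fisher); the two are often conflated but are not the same quantity.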
azb — Machine learning is just following vector fields, i.e. Maths 101, and Bayesian statistics is just calculus.