Mutual Information and Representation Learning
February 01, 2020
Preliminaries
Data Processing Inequality
Mutual Information Maximization
Often this summation is intractable because of the cardinality of the two variables' supports.
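To make the cost of the exact computation concrete, here is a minimal sketch that evaluates mutual information by direct summation over a small discrete joint distribution. It assumes the summation in question is the usual discrete definition, I(X; Y) = sum over (x, y) of p(x, y) log [p(x, y) / (p(x) p(y))]; the names `mutual_information`, `joint`, `independent`, and `copy` are illustrative and not from the original post.

```python
import math


def mutual_information(joint):
    """Exact mutual information (in nats) of a discrete joint distribution.

    Assumption: joint[i][j] holds p(x=i, y=j) and all entries sum to 1.
    """
    # Marginals: row sums give p(x), column sums give p(y).
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]

    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            # Terms with p(x, y) = 0 contribute nothing (0 * log 0 := 0).
            if pxy > 0.0:
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi


# Two independent fair coins: knowing x says nothing about y.
independent = [[0.25, 0.25], [0.25, 0.25]]
# y is an exact copy of a fair coin x: knowing x determines y.
copy = [[0.5, 0.0], [0.0, 0.5]]

print(mutual_information(independent))
print(mutual_information(copy), math.log(2))
```

The double loop visits one term per (x, y) pair, so its cost grows with the product of the two cardinalities. That is fine for a 2 x 2 table but hopeless for images or continuous representations, which is what motivates bounds and estimators in place of the exact sum.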
Deriving the Contrastive Predictive Coding loss
We want to model
and we set it to
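As a sketch of where this derivation typically ends up, assuming the standard InfoNCE form of the CPC loss: each context c_i is scored against N candidates x_1, ..., x_N, the matching candidate x_i is treated as the correct class, and the loss is the resulting softmax cross-entropy. The function name `info_nce_loss` and the `scores` layout are assumptions made for illustration; in practice the scores would come from a learned critic rather than being written out by hand.

```python
import math


def info_nce_loss(scores):
    """Average InfoNCE loss (in nats) for a square matrix of critic scores.

    Assumption: scores[i][j] = log f(x_j, c_i), the critic's log-score for
    candidate j under context i, with the positive pair on the diagonal.
    """
    n = len(scores)
    total = 0.0
    for i, row in enumerate(scores):
        # Numerically stable log-sum-exp over all candidates for context i.
        m = max(row)
        log_denominator = m + math.log(sum(math.exp(s - m) for s in row))
        # Cross-entropy of picking the positive among the n candidates.
        total += log_denominator - row[i]
    return total / n


# An uninformative critic scores every pair equally: loss is log(n).
uninformative = [[0.0, 0.0], [0.0, 0.0]]
# A confident, correct critic: loss approaches 0.
confident = [[50.0, 0.0], [0.0, 50.0]]

for name, scores in [("uninformative", uninformative), ("confident", confident)]:
    loss = info_nce_loss(scores)
    # Under this form, log(n) - loss is the lower bound on mutual information.
    print(name, loss, math.log(len(scores)) - loss)
```

Under this form of the objective, the quantity log(N) - loss is a lower bound on the mutual information between the context and the target, so the bound can never exceed log(N). Tightening it beyond that requires more negatives per positive.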
Mutual Information Minimization
- On mutual information maximization for representation learning
Tschannen, M., Djolonga, J., Rubenstein, P.K., Gelly, S. and Lucic, M., 2019. arXiv preprint arXiv:1907.13625.
- Mutual information neural estimation
Belghazi, M.I., Baratin, A., Rajeshwar, S., Ozair, S., Bengio, Y., Courville, A. and Hjelm, D., 2018. International Conference on Machine Learning, pp. 531--540.