Divyendra Mikkilineni
"Just jotting down my technical understandings"
Feb 9 2018

Mutual Learning in Neural Nets and Hint-based Pretraining for Improved Accuracy

The Knowledge Distillation framework is a cool, easy-to-understand and yet effective transfer-learning technique which has also been used as a model-compression strategy. In this blog, we will talk about a variant of knowledge distillation called ‘Mutual Learning’ and run an experiment adopting an idea to enhance this technique for better generalisation capabilities.
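As a rough sketch of the core idea (not necessarily the exact setup used in the post): in mutual learning, two peer networks are trained together, and each one's loss combines the usual cross-entropy with a KL term that nudges its predictions towards its peer's. Assuming PyTorch and two peer classifiers producing logits, it might look like this; the helper name `mutual_learning_losses` is purely illustrative.

```python
import torch
import torch.nn.functional as F

def mutual_learning_losses(logits_a, logits_b, targets):
    """Illustrative mutual-learning step: each peer minimises its own
    cross-entropy plus a KL term pulling it towards the other's predictions."""
    # Standard supervised losses for each peer.
    ce_a = F.cross_entropy(logits_a, targets)
    ce_b = F.cross_entropy(logits_b, targets)

    # Mimicry terms: KL(peer || self). detach() stops gradients from
    # flowing into the network acting as the "teacher" for that term.
    kl_a = F.kl_div(F.log_softmax(logits_a, dim=1),
                    F.softmax(logits_b.detach(), dim=1),
                    reduction="batchmean")
    kl_b = F.kl_div(F.log_softmax(logits_b, dim=1),
                    F.softmax(logits_a.detach(), dim=1),
                    reduction="batchmean")

    return ce_a + kl_a, ce_b + kl_b
```

Detaching the peer's logits means each KL term only trains the "student" side of that pair, so the two networks can be stepped with separate optimisers.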

Jan 16 2018

Difficulties of Language Modelling and its approximations

A language model is a parameterised probability function $P(x;\theta)$ which takes a sentence $x$ as input and evaluates how probable the given sentence is. Given a large corpus of English text, our goal is to learn a good probabilistic model, i.e. learn the parameters $\theta$ such that the model outputs a high probability for a well-formed sentence and a low probability for a grammatically incorrect one.
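Concretely, this probability is usually factorised with the chain rule, so the model only needs to score one word at a time given its history:

$$P(x;\theta) = \prod_{t=1}^{T} P(x_t \mid x_1, \dots, x_{t-1}; \theta)$$

Normalising each of these conditionals requires a sum over the entire vocabulary, which already hints at why learning $\theta$ is hard.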

This blog quickly goes over the probabilistic formulation of a language model, the difficulties in learning such a model, and how we can confront them with approximations that have theoretical guarantees.