How We Model Language: From Word Embeddings to RNN to LSTM
Date:
How can we model language? From word vectors to RNN to LSTM. What is the fitting ability of RNN? What are its limitations? What motivated the creation of LSTM and GRU? PyTorch implementation of RNN, GRU, and LSTM. What else can be improved?
