TL;DR Transformer: High-Level Look Let's begin by looking at the model as a single black box. In a machine translation application, it would take a sentence in one language and output its translation in another.
2020-08-23
Core Idea The main assumption in sequence modelling networks such as RNNs, LSTMs and GRUs is that the current state holds information for the whole input seen so far. Hence the final state of an RNN, after reading the whole input sequence, should contain complete information about that sequence.
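This folding of a sequence into a single state can be sketched with a minimal vanilla RNN cell; the dimensions and random weights below are illustrative assumptions, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration.
input_dim, hidden_dim, seq_len = 4, 8, 5

# Random (untrained) weights of a single-layer vanilla RNN.
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

def rnn_encode(inputs):
    """Fold a sequence into a single hidden state, step by step."""
    h = np.zeros(hidden_dim)
    for x in inputs:
        # Each step mixes the previous state with the current input,
        # so h acts as a running summary of everything read so far.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h

sequence = rng.normal(size=(seq_len, input_dim))
final_state = rnn_encode(sequence)
print(final_state.shape)  # the whole sequence compressed into one vector
```

Whatever the sequence length, the summary is a single fixed-size vector, which is exactly the bottleneck that attention-based models later relax.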
2020-08-16
Language Modeling A language model computes the probability of a word sequence by factoring it with the chain rule:

$$
\begin{aligned}
P(W) &= P(W_1 W_2 \dots W_n) \\
&= P(W_1)\, P(W_2 \mid W_1)\, P(W_3 \mid W_1 W_2) \cdots P(W_n \mid W_{1 \dots n-1})
\end{aligned}
$$

Each conditional probability is typically produced by a softmax layer over the vocabulary.
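The chain rule above can be sketched in code. This is a toy model, not a real LM: the vocabulary, the random weight matrix standing in for a network's output layer, and the bigram-style conditioning (each word conditioned only on the previous one, rather than the full history) are all simplifying assumptions:

```python
import numpy as np

# Toy vocabulary for illustration only.
vocab = ["<s>", "the", "cat", "sat", "</s>"]
V = len(vocab)
rng = np.random.default_rng(0)

# Hypothetical logits: row i gives the scores for the next word
# after word i, standing in for a real model's output layer.
W = rng.normal(size=(V, V))

def softmax(z):
    """Softmax layer: turn logits into a probability distribution."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sequence_log_prob(words):
    """Chain rule: log P(W) = sum_i log P(w_i | history)."""
    log_p = 0.0
    prev = vocab.index("<s>")
    for w in words:
        probs = softmax(W[prev])   # distribution over the vocabulary
        i = vocab.index(w)
        log_p += np.log(probs[i])  # multiplying probabilities = adding logs
        prev = i
    return log_p

lp = sequence_log_prob(["the", "cat", "sat", "</s>"])
print(lp)  # log-probability; always negative for probabilities < 1
```

Working in log space avoids the numerical underflow that multiplying many small probabilities would cause.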
2020-08-16