adv-nlp-7
SRL and Deep Learning
Assignments 1 & 2
explain why not take weighted averages too seriously (majority baseline already scores …)
Sequence classification is classifying a sequence of labels for an input sequence.
For assignment 1, we’re doing token classification for a sequence. Even when using features that encode context, the logistic regression approach is still fundamentally not a sequence labeling approach. In the logistic regression approach, the actual classification can be done in any random order.
With the BERT approach, we
Problem with RNN:
- Vanishing Gradient Problem: every token inference only has ‘access’ to the previous token. So the impact of a token’s processing on subsequent tokens fades with longer distance. Token dependencies don’t fade monotonically. A token may have a higher dependency on a token 5 tokens ago than on the previous token.