week-2

What we want to do

Study how meaning representations evolve during training.

Can you have models that represent different meanings based on different

2 experiments:

Training two models from scratch and comparing representations
- Use two different datasets and compare
- Use one dataset with systematic perturbations (e.g. change all occurrences of “laptop” to “phone”)
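Sketch of the perturbation step, assuming the dataset is a list of plain-text lines. The helper name `perturb_corpus` and the toy corpus are made up for illustration. Matching is whole-word and case-insensitive, so "laptops" is left alone and "Laptop" becomes lowercase "phone"; whether plurals and casing should be handled is still an open choice.

```python
import re

def perturb_corpus(lines, source="laptop", target="phone"):
    """Replace every whole-word occurrence of `source` with `target`."""
    # \b restricts the match to whole words, so "laptops" is NOT replaced.
    # IGNORECASE also catches "Laptop"; the replacement is always lowercase.
    pattern = re.compile(rf"\b{re.escape(source)}\b", flags=re.IGNORECASE)
    return [pattern.sub(target, line) for line in lines]

# Toy stand-in for the real training data (hypothetical).
corpus = ["My laptop is old.", "Two laptops and a Laptop."]
perturbed = perturb_corpus(corpus)
```

Training one model on `corpus` and one on `perturbed` would then give the two models to compare.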
Training a model and investigating changes over epochs (and layers)
- Visualize the vocabulary space over time
- Investigate layer 0 (static embeddings) over epochs
- Investigate layer 3 (contextualized ‘embeddings’) over epochs
- Compare/visualize how layer 0 and layer 3 change over epochs. Are static embeddings learned ‘sooner’?
  - Measure distance between embeddings
- Research questions:
  - Compare learning curves of function words vs. content words
  - Compare learning curves of trivial (frequent) vs. non-trivial (less frequent) words
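
Sketch of one way to measure the distance between embeddings over epochs, assuming one embedding matrix is saved per epoch and stacked into an array of shape (epochs, vocab, dim). Cosine distance to the final epoch is only one possible metric, and using "distance to the final embedding" as a learning curve is an assumption. The token-id groups and the random toy checkpoints below are hypothetical placeholders.

```python
import numpy as np

def cosine_distance_to_final(checkpoints):
    """Cosine distance of each token's vector at each epoch to its final vector.

    checkpoints: array of shape (epochs, vocab, dim).
    Returns an array of shape (epochs, vocab).
    """
    def unit(x):
        # Normalize to unit length; the clip avoids division by zero.
        norms = np.linalg.norm(x, axis=-1, keepdims=True)
        return x / np.clip(norms, 1e-12, None)

    final = unit(checkpoints[-1])                        # (vocab, dim)
    sims = np.sum(unit(checkpoints) * final, axis=-1)    # (epochs, vocab)
    return 1.0 - sims

def group_curve(distances, token_ids):
    """Average the per-token curves over one group of words."""
    return distances[:, token_ids].mean(axis=1)

# Toy stand-in for saved layer-0 embeddings: 5 epochs, 6 tokens, 8 dims,
# linearly interpolated from a random start to a random end state.
rng = np.random.default_rng(0)
init, final = rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
checkpoints = np.stack([(1 - t) * init + t * final for t in np.linspace(0, 1, 5)])

distances = cosine_distance_to_final(checkpoints)
function_ids = [0, 1, 2]   # hypothetical ids of function words
content_ids = [3, 4, 5]    # hypothetical ids of content words
function_curve = group_curve(distances, function_ids)
content_curve = group_curve(distances, content_ids)
```

The same functions could be applied to layer 3 if its representations are collected on a fixed set of probe sentences at every epoch. A curve that reaches zero earlier would suggest that group (or layer) settles sooner.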