probing-llms-day2

Probing classifier (also known as a ‘probe’): a small supervised model trained on top of frozen LM representations to predict linguistic properties (e.g. part-of-speech tags).
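
A minimal sketch of the probe idea, under loud assumptions: the "frozen representations" here are synthetic 8-dimensional vectors rather than real LM activations, the property is a made-up binary label that happens to be linearly encoded in dimension 0, and the probe is a plain logistic regression trained by gradient descent. The names `make_example`, `train_probe`, and `accuracy` are hypothetical helpers, not part of any library. Only the probe's weights are updated; the vectors themselves are never changed, which is what "frozen" means.

```python
import math
import random

rng = random.Random(0)
DIM = 8  # size of the hypothetical frozen representation


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_example():
    # Assumption: stands in for "run the LM, take a hidden vector".
    # The binary property is encoded as a shift along dimension 0.
    label = rng.randint(0, 1)
    vec = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
    vec[0] += 2.0 if label else -2.0
    return vec, label


def train_probe(data, epochs=200, lr=0.5):
    # Logistic-regression probe: only weights and bias are learned.
    weights, bias = [0.0] * DIM, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * DIM, 0.0
        for x, y in data:
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = sigmoid(z) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
    return weights, bias


def accuracy(weights, bias, data):
    hits = 0
    for x, y in data:
        pred = 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0
        hits += pred == y
    return hits / len(data)


train = [make_example() for _ in range(200)]
test = [make_example() for _ in range(100)]

weights, bias = train_probe(train)
train_acc = accuracy(weights, bias, train)
test_acc = accuracy(weights, bias, test)
print(f"probe accuracy: train={train_acc:.2f} test={test_acc:.2f}")
```

With real models, the vectors would come from a chosen layer of the LM with gradients disabled; the probe code itself would stay the same.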

BERTology: the study of how models like BERT work internally (the term is also applied to GPT models)

Pitfalls:

Probing performance may say more about the probe than the representation
- Include controls to test whether the probe is memorizing the task rather than reading the property off the representation
Performance depends not just on the model architecture, but also on the dataset the model is trained on
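
One way to act on the "include controls" point is a control task: give every word type a random label, train the same probe on it, and compare. A rough sketch follows, with everything in it assumed for illustration: the word-type vectors are synthetic, the real property is artificially encoded in dimension 0, and `sample_tokens`, `train_probe`, and `accuracy` are hypothetical helpers. The gap between real-task and control-task accuracy (often called selectivity) gives a hint of how much the score reflects the representation rather than the probe's capacity to memorize.

```python
import math
import random

rng = random.Random(0)
DIM, NUM_TYPES = 8, 200  # assumed sizes, chosen only for the demo


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_probe(xs, ys, epochs=150, lr=0.5):
    # Same logistic-regression probe for both the real and control task.
    w, b = [0.0] * DIM, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * DIM, 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(xs) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(xs)
    return w, b


def accuracy(w, b, xs, ys):
    hits = 0
    for x, y in zip(xs, ys):
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        hits += pred == y
    return hits / len(ys)


# Each synthetic word type gets a fixed vector, a real label encoded in
# dimension 0, and a control label drawn at random (unrelated to the vector).
types = []
for _ in range(NUM_TYPES):
    real, control = rng.randint(0, 1), rng.randint(0, 1)
    vec = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
    vec[0] += 2.0 if real else -2.0
    types.append((vec, real, control))


def sample_tokens(per_type):
    # Tokens are noisy copies of their word type's vector.
    xs, real_ys, ctrl_ys = [], [], []
    for vec, real, control in types:
        for _ in range(per_type):
            xs.append([v + rng.gauss(0.0, 0.1) for v in vec])
            real_ys.append(real)
            ctrl_ys.append(control)
    return xs, real_ys, ctrl_ys


train_x, train_real, train_ctrl = sample_tokens(3)
test_x, test_real, test_ctrl = sample_tokens(2)

real_w, real_b = train_probe(train_x, train_real)
ctrl_w, ctrl_b = train_probe(train_x, train_ctrl)
real_acc = accuracy(real_w, real_b, test_x, test_real)
control_acc = accuracy(ctrl_w, ctrl_b, test_x, test_ctrl)
selectivity = real_acc - control_acc
print(f"real={real_acc:.2f} control={control_acc:.2f} selectivity={selectivity:.2f}")
```

A more expressive probe (e.g. a large MLP) could score well on the control labels too, which would shrink the gap and signal that its real-task accuracy should be read with caution.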