📋 NLP Reading List

This list tracks papers I've enjoyed reading on various topics in NLP, including:

Tokenization: how do we represent natural text as an input to neural networks?

Tokenizer ▁behavior ▁that ▁keeps ▁me ▁up ▁at ▁night

Language Modeling: how do we learn a probability distribution over the next token given a context?

P(next token | pineapple on pizza is)

controversial: 0.32
illegal: 0.45
fine: 0.08
delicious: 0.15

Decoding: how do we choose tokens from among the most likely outputs of the predicted distribution?
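To make the decoding question a bit more concrete, here is a minimal sketch that reuses the toy pineapple-on-pizza distribution from above. The function names (greedy_decode, sample_decode, top_k_filter) are my own illustrative choices, not taken from any particular paper or library, and real decoders work on model logits over a full vocabulary rather than a four-entry dictionary.

```python
import random

# Toy next-token distribution copied from the example above.
next_token_probs = {
    "controversial": 0.32,
    "illegal": 0.45,
    "fine": 0.08,
    "delicious": 0.15,
}


def greedy_decode(probs):
    """Pick the single most likely token."""
    return max(probs, key=probs.get)


def sample_decode(probs, temperature=1.0, rng=None):
    """Sample one token after rescaling the distribution by a temperature.

    Raising p to the power 1/T is equivalent to dividing log-probabilities
    by T: T < 1 sharpens the distribution, T > 1 flattens it.
    """
    rng = rng or random.Random()
    tokens = list(probs)
    weights = [probs[tok] ** (1.0 / temperature) for tok in tokens]
    # random.choices normalizes the weights internally.
    return rng.choices(tokens, weights=weights, k=1)[0]


def top_k_filter(probs, k):
    """Keep only the k most likely tokens and renormalize to sum to 1."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {tok: p / total for tok, p in top}


if __name__ == "__main__":
    print("greedy:", greedy_decode(next_token_probs))
    print("sampled:", sample_decode(next_token_probs, temperature=0.8))
    print("top-2:", top_k_filter(next_token_probs, k=2))
```

Greedy decoding always returns the same token for a given distribution, while sampling trades that determinism for variety. Combining top_k_filter with sample_decode is one common way to sample while excluding the unlikely tail.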

If you have a paper recommendation, I’d love to hear from you — feel free to send me an email!