Sander Land
Sander is a Machine Learning engineer at Cohere, working on post-training, reward modelling, and model evaluation. Originally from Groningen, he now lives in Denmark.
Sessions
09-20
11:05
35min
Terrible tokenizer troubles in large language models
Sander Land
Huge amounts of resources are being spent training large language models in an end-to-end fashion. But did you know that at the bottom of all these models sits an important but often neglected component that converts text to numeric inputs? Because of weaknesses in this ‘tokenizer’ component, some inputs cannot be understood by language models, causing wild hallucinations, or worse.
This talk will cover some of our recent research on finding which text causes problems for a specific model, and show you how to break even the most advanced models.
Rembrandt