Vincent D. Warmerdam
Vincent is a senior data professional and recovering consultant who has worked as an engineer, researcher, team lead, and educator. He is especially interested in understanding algorithmic systems so that failure can be prevented. As such, he prefers simpler solutions that scale and worries more about data quality than about the number of tensors thrown at a problem. He's also well known for creating calmcode as well as roughly a dozen open-source packages.
He's currently employed at probabl, where he works with scikit-learn core maintainers to improve the ecosystem of tooling.
Sessions
This is the story of a fun idea that turned into a huge benchmark before it turned into a rabbit hole.
I was trying to figure out reasonable default parameters for some of the components in the skrub library. To do that, I was looking for datasets with a permissive license that I could use for benchmarking. This is how I stumbled on some old Kaggle competitions that still had their datasets publicly available. So I should just run a simple benchmark, right?
That's where it all started. There were many lessons, and they will all be shared.