I spent 10 years as an astrophysics researcher analysing high-energy data from space telescopes in the search for new objects in the universe and a better understanding of what we already knew to be out there. In 2015 I transitioned to data science, joining a smart-cities startup called HAL24K. Over the next 8 years, I built data science solutions that enabled city governments and suppliers to derive actionable intelligence from their data to make cities more efficient, better informed, and better at using resources. During that time I built and led a team of 10 data scientists and helped the company spin out four new companies. In 2022, I joined ComplyAdvantage as a Senior Data Scientist working to combat financial crime and fraud.
I have been an active member of the PyData community since 2015 and founded PyData Southampton in 2023. I am also a long-time supporter of DataKind UK in their mission to bring pro-bono data science support to charities and NGOs in the third sector.
- Mastering Data Flow: Empower Your Projects with Prefect's Pipeline Magic
Alex is a seasoned Computer Vision and MLOps Engineer with over nine years of experience shaping the future of AI-driven software development. Currently, Alex leads the ML/DevOps team at Zencoder, where he leverages his extensive background in Software Engineering, ML and DevOps to deliver high-quality machine learning solutions. His work spans complex data pipelines, cloud infrastructure management (GCP, Kubernetes), and advanced ML/DevOps pipelines, ensuring scalability and efficiency. Before Zencoder, Alex played pivotal roles in numerous projects, including leading teams at Sanas, ivi and MTS AI. His technical expertise in machine learning, data science, and bioinformatics has led to impactful solutions across industries, ranging from bioinformatics at the University of Massachusetts to video analysis at ivi.ru and MTS AI. Alex has a proven track record of managing complex infrastructure that scales to hundreds of GPUs, enabling effective and easy use of cloud infrastructure for data scientists while driving down costs through cloud consolidation efforts and boosting productivity through the deployment of sophisticated AI models. In addition to his technical contributions, Alex has been instrumental in mentoring teams and fostering a culture of innovation and collaboration. His deep understanding of AI systems, from developing recommendation engines to cutting-edge computer vision algorithms to voice and NLP, positions him as a thought leader in the AI and ML space. Whether it’s speaking on the latest advancements in MLOps, sharing insights on AI-driven automation, or discussing the future of AI in the enterprise, Alex brings a wealth of knowledge, practical experience, and a passion for pushing the boundaries of what’s possible with AI.
- Going Beyond Copilot with AI Agents
Ana is a Data Scientist at the Amsterdam-based fintech Mollie. She works with ML and Gen AI to create solutions for automating and optimizing customer monitoring and payment processes. Ana enjoys exploring novel data approaches and practically implementing them to solve business challenges.
- Boost Your LLM: Building LLM Agents with LangChain
Anastasiia holds a master's degree in Applied Mathematics and has utilised her skills in a number of data projects in the energy domain. She joined CorrelAid NL after contributing to multiple projects as a data-for-good volunteer. At CorrelAid, her goals include bringing more #data4good projects on board, streamlining technical processes and continuing our meet-up series that got her excited to join us in the first place! In her work as a data scientist at Jedlix, she is focused on productionizing data products, as she believes that data brings most value when it improves and automates decision making continuously. Anastasiia takes inspiration from doughnut economics and hopes that her work with CorrelAid contributes to a more sustainable and just world!
- Data Science for Social Good: Making Impact in Resource-Constrained Environments
Andrei Stoian obtained an Engineering degree in Computer Systems from the Politehnica University of Bucharest, then a PhD in Machine Learning for image and video analysis at the Conservatoire National des Arts et Metiers in Paris, France, in 2015. Since 2021 he has been working for Zama on building Concrete-ML, a machine learning toolkit to perform model inference on encrypted data. His research interests are centred on adapting deep learning models to FHE computation, especially in the areas of image classification and recognition. He has published more than 20 papers on various machine learning topics and holds several patents.
- Open-source Machine Learning on Encrypted Data
Ankur Ankan is a postdoctoral researcher at Radboud University in the Netherlands, where his research focuses on causal inference. His main interest lies in developing practical methods for causal inference, along with developing software tools for it. He also started and maintains the Python package pgmpy, which offers tools for probabilistic and causal inference in graphical models.
- Introduction to Causal Inference using pgmpy
Arda is a data scientist, comedy enthusiast and a self-proclaimed comfort-zone escaper. Originally from Turkey, he graduated from Koc University, Istanbul with a BSc in Electrical Engineering and a minor in Psychology. Later on he specialized in Signal Processing during his master's studies at Delft University of Technology in the Netherlands before delving into his professional career in data science.
- Changing Hats: Techie vs Comic
She writes backend logic and ML models for a living. She is also passionate about health informatics, open source and tech communities.
- Synthetic Data for Localized Solutions
My first introduction to code was Fortran 90 during my (Hydro)Geology studies. It was not love at first sight. This changed when I picked up R and Python to automate some processes in my day-to-day job, still working in the Geology field. I noticed that the little coding projects I gave myself were what I found most enjoyable, and as such, decided to jump aboard the Data train, back in 2015. Being a Data Scientist back then was already the sexiest job, and I wanted a piece of that. I joined the NPO (Dutch public broadcaster), and have since joined Nicolab, where I currently work as a Data/Software Engineer on developing solutions to assist in the acute stroke space.
Also: come and tap me on the shoulder if you wanna talk Mountainbiking and/or Padel :-) <3.
- From mocking to rocking your tests with testcontainers
Cheuk has been a Data Scientist at various companies, work which demands strong numerical and programming skills, especially in Python. To follow her passion for the tech community, Cheuk has been a Developer Advocate for 3 years. Cheuk also contributes to multiple Open Source libraries like Hypothesis, Pytest, Pandas, Polars, PyO3, Jupyter Notebook and Django. Cheuk is now a consultant and trainer at CMD Limes.
Besides her work, Cheuk enjoys talking about Python on personal streaming platforms and podcasts. Cheuk has also been a speaker at universities and various conferences, and organises tech events. Conferences that Cheuk has organised include EuroPython, PyData London and Pyjamas Conf. Believing in Tech Diversity and Inclusion, Cheuk co-founded Humble Data workshops and helps organise mentored sprints for underrepresented groups.
Cheuk also loves serving the community that she is in. In 2021 and 2022 Cheuk served as a board member of the EuroPython Society. Cheuk is currently a Python Software Foundation fellow (since 2021) and director (since 2023).
- Writing Python modules in Rust - PyO3 101
- Counting down for CRA - updates and expectations
Senior Research Fellow at University of Southampton
Research Software Engineer for the Legacy Survey of Space and Time, UK
- Mastering Data Flow: Empower Your Projects with Prefect's Pipeline Magic
CorrelAid Netherlands is a non-partisan, non-profit network of data scientists who want to use their skills to advance the social good.
- Data for Social Good
Danial is a data scientist & analytics translator with a PhD in applied mathematics (systems & control). In his career, he has experienced different sectors, i.e. manufacturing, cybersecurity, healthcare, and finance. Danial's interest lies in the area of predictive & causal modelling theory and its application. In his current adventure at ABN AMRO bank, he contributes to e-commerce data-driven solutions to improve clients' experience, engagement, and satisfaction.
- Causal Effect Estimation in Practice: Lessons Learned from E-commerce & Banking
Medior Data Engineer at AXI & Python enthusiast with a passion for making interesting data topics more broadly accessible.
- Show off your python code to even a complete newbie, using shiny for python
Fabien Vauchelles is an Anti-Ban Expert. With over a decade of experience in Web Scraping, Fabien's passion for code and technology helps him to bypass protections. He is the creator of Scrapoxy, a mature free and open-source proxy waterfall tailored for the Web Scraping industry.
He has had the opportunity to share his insights at many events including Devoxx conferences, Voxxed Days, API Days, PyCon, PyData and others.
- Master Advanced Web Scraping Techniques in Python
Farzam Fanitabasi is working as the "Chapter Lead Data Science: LLMOps" at ING Netherlands. He has been working in the NLP domain for the past 4 years, and in the LLM domain for the last 3 years. He has a PhD from ETH Zurich (2018-2021) where he worked on Deep Learning + NLP. Prior to joining ING, Farzam worked as a Postdoc (2021) on applied NLP in the communication science domain (VU Amsterdam) and later as a senior data scientist and tech lead, where he led a technical team in designing and developing large-scale NLP and LLM data products.
- Productionizing Generative AI at ING: Navigating Risk, Compliance, and Defense Mechanism
I am a machine learning scientist at Booking.com working on personalized discounts under budget constraints.
I have a PhD in Computer Science from the Delft University of Technology. During my PhD, I interned as an applied scientist at Amazon Alexa Shopping, where I worked on finding proxies for what customers find relevant when comparing products during their search shopping journey in order to empower Amazon recommendation systems. Before that I obtained a BSc and MSc in Computer Science from the Federal University of Minas Gerais, visited research labs at NYU and the University of Quebec, and worked as a software engineer intern in a news recommendation system start up.
- Uplift Modeling for Marketing Personalization in Practice
Françoise is leading the Data team at Local Logic, a Montreal-based location-intelligence startup, where she builds data products and tells data stories. Before joining Local Logic, she did a PhD in experimental physics, a postdoctoral fellowship in machine learning, organized PyLadies Montreal and had a prolific 7 years working at Shopify. Although she loves the switch to data science, she secretly misses shooting lasers at stuff.
- How to measure a city
Gautier Chenard is the Data Director at Malt, a leading European freelancing platform, currently based in Amsterdam. In his role, he focuses on utilizing analytics to drive impactful, data-informed decisions. Prior to joining Malt, Gautier gained extensive data leadership experience at fast-growing companies like Ankorstore and Shadow. He started his career in consulting and he holds a master’s degree from ESSEC Business School in Paris.
- The Impact of Generative AI on Data Analytics: Panel Discussion
I am an Assistant Professor of Marketing at the Rotterdam School of Management (Erasmus University Rotterdam). My research focuses on (differential) privacy and marketing analytics.
- Private targeting strategies
Hanna is an experienced Data Scientist in the Generative AI team of Adyen, building a self-hosted LLM platform based on open source tools, and using the platform to streamline the work of customer service agents within Adyen. Originally from the Netherlands, she is now based in Madrid. She loves traveling, nature and CrossFit.
- The Impact of Generative AI on Data Analytics: Panel Discussion
Hans is a senior data scientist. He currently works at Deltares, the Dutch research institute for water and subsurface. He is passionate about actively involving stakeholders in ML model development and creating data-driven models for earth sciences with sufficient extrapolation skill.
- SHAP beyond the standard graphics: co-design of ML-models in earth sciences
I am a Senior Machine Learning Scientist in the Pricing Department at Booking.com. I specialize in promotion personalization using causal inference and uplift modeling techniques.
I obtained a PhD in Computer Science from Leiden University, where I worked on statistically robust learning for interpretable machine learning models using information theory in collaboration with GE Aviation. I have developed diverse industry experience through positions at Huawei AI Research Labs in Ireland and Silo AI in Helsinki.
- Uplift Modeling for Marketing Personalization in Practice
Ines Montani is a developer specializing in tools for AI and NLP technology. She’s the co-founder and CEO of Explosion and a core developer of spaCy, a popular open-source library for Natural Language Processing in Python, and Prodigy, a modern annotation tool for creating training data for machine learning models.
- Keynote - Applied NLP in the age of Generative AI
Senior Data Scientist at The HEINEKEN Company.
I am passionate about Data Science and have been working professionally in this area since 2013. My biggest professional interests are statistical learning and causal inference. I am a statistician by education and most of my professional experience is in implementing machine learning based solutions to optimize business processes.
- From Predictions to Action: Fusing Machine Learning and Mixed Integer Linear Programming
James serves as lead instructor for Don't Use This Code. Don't Use This Code provides consulting, coaching, and training services to a number of clients in the financial services and tech industry, helping them develop greater expertise in the use of Python for data analysis, computational simulation, and automation.
If you are interested in developing your staff's expertise in Python, Pandas, or with data analysis and software development in general, please reach out to us at [email protected]
- How Dimensional is a `pandas.DataFrame`, anyway?
Javier is a Research Engineer at Hopsworks where he actively contributes to advancing the Hopsworks Feature Store platform. He is currently pursuing his Ph.D. at KTH Royal Institute of Technology in Sweden with a primary focus on large-scale machine learning systems.
- Build a personalized Commute virtual assistant in Python with Hopsworks and LLM Function Calling
Jeroen Janssens, PhD, is a data science consultant and certified instructor. His expertise lies in visualizing data, implementing machine learning models, and building solutions using Python, R, JavaScript, and Bash. He’s passionate about helping and teaching others to do such things.
Jeroen works as a Senior Machine Learning Engineer at Xomnia in Amsterdam. Previously, he was an assistant professor at Jheronimus Academy of Data Science and a data scientist at Elsevier in Amsterdam and several startups in New York City.
He is the author of Data Science at the Command Line (O’Reilly, 2021) and Python Polars: The Definitive Guide (with Thijs Nieuwdorp; to be published by O’Reilly in January 2025). Jeroen holds a PhD in machine learning from Tilburg University and an MSc in artificial intelligence from Maastricht University.
Website: https://jeroenjanssens.com
- How I hacked UMAP and won at a plotting contest
Juan is a Mathematician (Ph.D. Humboldt Universität zu Berlin) and data scientist. He is interested in interdisciplinary applications of mathematical methods, in particular time series analysis, Bayesian methods, and causal inference.
- Time Series forecasting with NumPyro
Laura is a senior machine learning scientist at Adyen - currently working in the authentication domain. She has a background in cognitive science and completed her PhD in the field of affective computing at LMU Munich.
- Je ne regrette rien - Teaching Machine Learning Models Regret Avoidance
Laura is a very technical designer™️. She recently joined Pydantic as Lead Design Engineer. Her side projects include Debias AI (debias.ai), Sweet Summer Child Score (summerchild.dev), Ethics Litmus Tests (ethical-litmus.site), fairXiv (fairxiv.org), and the Melbourne Fair ML reading group (groups.io/g/fair-ml). Laura is passionate about feminism, digital rights and designing for privacy. She speaks, writes and runs workshops at the intersection of design and technology.
- Sweet Summer Child Score
As an econometrician at the Erasmus School of Economics and later a data scientist at Bocconi, my journey in uncertainty quantification began during my internship at Quantmetry, where I contributed to the development of MAPIE. I implemented conformalized quantile regression and became a core developer of the library, collaborating closely with Thibault Cordier, Vincent Blot, and Candice Moyet.
Currently, at Capgemini Invent's R&I lab, we continue to advance MAPIE while also exploring cutting-edge topics such as hallucinations in large language models (LLMs) and combining knowledge graphs with LLMs.
- Boosting AI Reliability: Uncertainty Quantification with MAPIE
As a PhD researcher at the University of Amsterdam, I focus on multimodal machine learning, merging images, text, and data for meaningful tasks. I am also the author of "De AI Revolutie," exploring the societal impacts of artificial intelligence.
I recently launched The AI Factory, a venture dedicated to creating a lasting and positive impact with AI. Through AI product development and strategic consultancy, The AI Factory aims to help organizations apply, scale, and control AI effectively and responsibly.
I partner with clients across various industries to deliver rapid results with AI systems.
Interested in learning more about The AI Factory? Or making a lasting impact with AI? Feel free to reach out; I'd love to chat about the possibilities!
- Jounai.nl: Playing with New Tech to Reinvent the News
Data and machine learning enthusiast with a soft spot for open-source software.
Driven by curiosity, eager self-learner, Kaggle Notebook Expert, Datathons enjoyer, PyData volunteer, currently engaged in Women in AI Mentorship, exploring and contributing to Open Source Land and building my Machine Learning portfolio.
- Alice in the Open Source Land
- Open Source Sprint: Narwhals
Meet Mahmoud, a seasoned data professional with a decade of experience across continents and industries, from banking to telecommunications. Starting as an ETL developer, he evolved into a lead data architect at ABN AMRO, overseeing data lineage and management. Now, as a Senior Data Manager at Booking.com, Mahmoud drives innovative data management practices in the realm of public cloud architecture, shaping the future of Booking.com.
- The Impact of Generative AI on Data Analytics: Panel Discussion
Data Platform Engineer @ Adyen
- From Data Pipelines to a Data Platform: Embracing Monorepo Architecture
Marc Palyart is the Director of Data Science at Malt, the freelancer marketplace, where he leads the search and matching team. With over a decade of data-wizardry under his belt, he's ventured into the depths of academia and scaled the heights of industry where he's had the pleasure of collaborating with some truly remarkable people.
- Retrieve me if you can: SLM-powered retrieval to scale freelancers matching at Malt
Marco is a core dev of pandas and Polars and works at Quansight Labs as Senior Software Engineer. He also consults and trains clients professionally on Polars. He has also written the first Polars Plugins Tutorial and has taught Polars Plugins to clients.
He has a background in Mathematics and holds an MSc from the University of Oxford, and was one of the prize winners in the M6 Forecasting Competition (2nd place overall Q1).
- How you (yes, you!) can write a Polars Plugin
- Understanding Polars Expressions when you're used to pandas
- Open Source Sprint: Narwhals
I am a Data Scientist specialised in both traditional machine learning and generative AI techniques. At Mollie, I am currently contributing to the development of MollieGPT, the company's chatbot. My background as a researcher in physics fuelled my love for unraveling hidden details in data and has equipped me with the ability to approach complex problems with a scientific mindset.
- Boost Your LLM: Building LLM Agents with LangChain
Martin is an experienced computer science educator and open source software developer.
Martin creates educational content for Neo4j and supports developers in using graph technology to understand their data.
As a child he wanted to be either a Computer Scientist, Astronaut or Snowboard Instructor.
- GenAI Beyond Chat with RAG, Knowledge Graphs and Python
- IOI 2000 / Jugend forscht (https://t.ly/DmvHG)
- Electrical Engineering at KIT, Stanford University, and IMEC/KU-Leuven
-- Research: Mapping software descriptions on hardware targets + Transactional Memory
- QuantCo – Optimizing high stakes business decisions by data analytics
https://www.linkedin.com/in/mtrautmann
- pydiverse pipedag - A library for data pipeline orchestration optimizing high development iteration speed
Marzieh Fadaee is a senior research scientist at Cohere For AI, a non-profit research lab that seeks to solve complex machine learning problems and create more points of entry into machine learning research. Marzieh is broadly interested in all aspects of natural language understanding, particularly in multilingual learning, data-conscious learning, robust and scalable models, compositionality, and interpretability. Previously she was the NLP/ML research lead at Zeta Alpha Vector, working on smarter ways to discover and organize knowledge. She did her PhD at the University of Amsterdam, working on developing models to understand and utilize interesting phenomena in the data.
- Keynote - The Art of Language: Mastering Multilingual Challenges in LLMs
I am a Machine Learning Engineer at Booking.com. In my role, I focus on creating personalised discounts for customers, carefully balancing budget constraints with advanced machine learning techniques.
I hold a Master's degree in Data and Machine Learning from Politecnico di Milano in Italy, which has provided me with a strong academic foundation in Computer Science that I love to apply in the ML realm. At Booking.com, I support scientists by developing and maintaining tools that streamline the experimentation, deployment, and monitoring of machine learning models, ensuring these processes are robust, efficient and effective.
Before joining Booking.com, I gained valuable experience at DAZN, where I contributed to developing a user-facing content recommendation engine aimed at enhancing user experience through personalised suggestions. I also have past experience as an ML consultant, where I advised various clients on integrating machine learning solutions to improve their business operations.
- Uplift Modeling for Marketing Personalization in Practice
Matthieu Caneill holds a PhD in computer science, and is now working as a data & software engineer, empowering dozens of dbt users through his passion for well crafted data & software.
- dbt-score: a linter for your dbt model metadata
Mehrzad received his Master's in Artificial Intelligence back in the time when AI was not that common in the Netherlands. As part of the first groups of students who did a full AI study in the Netherlands, he joined the early adopters of AI and took the challenge of bringing practical AI to the endeavors of everyday life. He has currently taken the challenge of bringing the power of collaboration to the scientific world by building digital solutions based on AI and Data Science. Mehrzad joined CorrelAid with the goal of making this blue planet a better place for everyone, helping NGOs harness the power of Data and Data Science. With a strong passion for collaboration and people, Mehrzad's everyday reminder is: Live, Love, Explore, and challenge your excuses every day!
Contact: [email protected]
- Data Science for Social Good: Making Impact in Resource-Constrained Environments
Merve is a machine learning advocate engineer at Hugging Face. She's making zero-shot computer vision and multimodality more accessible by building, documenting, and sharing about them.
- Keynote - Open-source Multimodal AI in the Wild
I am a hybrid machine learning engineer, data scientist and cloud infrastructure engineer whose professional track record covers many aspects of applied data science, MLOps and (big) data engineering. I am passionate about developing scalable, production-ready, and efficient ML applications / infrastructure.
I relate to the everyday struggles that data scientists and machine learning engineers encounter in their workflows, whether that be reproducible experimentation, feature engineering, model training, or inference. Being able to come up with solutions for impediments in these areas and enabling data science teams to be as productive as possible is what drives me.
- Polishing Python: Preventing Performance Corrosion with Rust
He is a data scientist at KNAB with over 6 years of experience in the financial services industry. Prior to this, he worked at ING as a data scientist across various domains, including commercial, people analytics, and financial crime and fraud.
He holds a PhD in computational physics and is passionate about conducting research in Trustworthy AI and Generative AI topics.
- Uncertainty quantification: How much can you trust your machine learning model?
Myrthe has been working as an AI Engineer and Data Engineer for various clients over the past couple of years. Currently, she is a Data Engineer at Digital Power.
Inspired by everything related to Python, Natural Language Processing, and Machine Learning, Myrthe is excited to present at (and attend) PyData Amsterdam this year!
- Prompt hacking for Generative AI
I am a Senior Machine Learning Scientist at Booking.com based at the global headquarters in Amsterdam. My machine learning interests lie primarily in probabilistic/Bayesian modelling and currently I work on applying these ideas to attribution modelling. When not crunching statistics or building ML models, you can find me coasting on one of my bicycles or gazing at landscapes through a train window.
- From language to marketing: RNNs for data-driven multi-touch attribution at Booking.com
I’m a Data Scientist with 5 years of experience at Booking.com. My academic background is in Applied Mathematics. Prior to joining Booking.com, I worked in the telecommunications industry for 3 years.
Outside of work, I’m passionate about traveling and exploring new cultures with my family. Additionally, I enjoy playing tennis with my son – it's our favourite way to stay active and have fun together.
I’m excited to be a part of the PyData community and look forward to connecting with fellow data enthusiasts!
- From language to marketing: RNNs for data-driven multi-touch attribution at Booking.com
Natasha is currently the Head of Modelling at ING Global Risk. She is also an Expert in Residence at Kickstart.ai focusing on creating frameworks and technical implementations for Ethical & Responsible AI as well as the Education lead at Women in AI (NGO).
She holds a PhD in Computer Science specializing in AI and during her career, she has been the Head of Data science at dotModus (Google cloud scale up), the Head of the Centre of Excellence for Robotics in South Africa and previously spent many years working as a data scientist and software engineer.
Natasha is passionate about STEM education especially for females and disadvantaged communities and is a strong advocate for the use of ethical & responsible AI.
- AI: Ethical & Responsible by Design
Niek is a founder of Roseman Labs, a company created to commercially realise privacy-preserving and/or confidential computations by means of secure multiparty computation (MPC).
Niek enjoys working on topics in computer science, mathematics and electrical engineering that combine theoretical (mathematical) foundations with practical relevance and feasibility. His work includes further developing such theoretical foundations, as well as bringing such foundations into practice, by means of, for example, writing a software implementation.
In his research at TU Eindhoven, he focused on MPC including its applications to privacy-preserving data mining & machine learning. At ABN AMRO, he has worked on machine learning techniques for detecting transaction fraud. At EPFL Switzerland, he has worked on security and control aspects of smart electricity grids. Niek holds a PhD in Mathematics ('12) from Leiden University, on the topic of quantum cryptography; this research was carried out in CWI's cryptology group. He obtained his MSc (cum laude) and BSc degree in Electrical Engineering from University of Twente. His master's thesis was in Communication and Information Theory. His research interests include cryptology, privacy-preserving machine learning, information theory, Bayesian statistics, and signal processing.
- Roseman Labs - Python-powered encrypted AI
Philip is the director of Blair Software, an Amsterdam-based AI consultancy specializing in NLP software. Originally from the United States, he spent nearly a decade doing applied research and development on NLP systems, and now conducts a mixture of AI software development, AI corporate advisory work, and advising software companies on the impacts of transatlantic AI regulation for companies in Europe and the United States.
- Who's Who and Where's What: Dealing With Names and Addresses Around the World
I'm a lead data scientist in Marketing Science at Meta (Facebook). In this role, I lead data science projects by ensuring scalability and implementation of best practices. I have experience in data pipelines and warehousing, machine-learning visibility for MLOps, stakeholder management, mentoring, and causal inference.
Among other tools and libraries, I have expertise in Airflow, XGBoost, CausalPy, PyMC, STAN, Looker, Deep Learning, and Synthetic Control (for pseudo-experiments).
- Almost Perfect: A Benchmark on Algorithms for Quasi-Experiments
Richie is a technical economist and a member of the Microsoft NL Security team, currently focusing on AI security. Prior to joining Microsoft, as a data scientist, he focused on A/B testing, causal inference, econometrics and consulting.
- LLM Security 101 - An Introduction to AI Red Teaming
Ritchie Vink is the author of the Polars DataFrame library and Co-founder/CEO of the Polars company. He originally has a background in Civil Engineering, but he soon made the switch to Data/Software development. He worked as a Machine Learning Engineer and a Software Engineer for 5 years before devoting all of his time to the Polars project. Those years were filled with side projects to feed his curiosity. His last 2 years have been spent full time on Polars.
- Polars 1.0 and beyond
As a final-year PhD student at the University of Amsterdam, Rob Romijnders specializes in privacy-aware machine learning. Two of his research projects have been awarded oral talks at academic conferences such as AAAI and ICLR. Having worked at a startup and two AI companies, Rob is excited to be back at PyData and make academic topics applicable to the PyData community!
- Differential Privacy Made Practical
I write and speak about the learnings and challenges I face in the data world, from the perspective of having worked in various data roles (Data Science, ML & Data Engineering, Tech Lead).
Currently I'm working at Achmea as a Machine Learning Engineer, where I'm helping build an MLOps platform from the ground up.
Before this, I built the data platform from scratch at Solvimon, a startup tackling the entire billing ecosystem by building a modern and flexible platform.
In my previous role at Adyen, I led the initiative that resulted in their first end-to-end machine learning solution, which boosted payment conversion rates and generated hundreds of millions of euros in additional revenue for leading global merchants (e.g. Spotify, Microsoft, Meta).
- Building a Data Platform from scratch
Sander is a Machine Learning engineer at Cohere, working on post-training, reward modelling, and model evaluation. Originally from Groningen, he now lives in Denmark.
- Terrible tokenizer troubles in large language models
Sander has been working in Data & AI for 7+ years, focussing on the engineering side of things. He currently develops the AI platform at Schiphol.
Inspired by Mickey Beurskens' talk 'Breaking Large Language Models' at PyData Eindhoven in 2023, he developed a workshop to give people hands-on experience with prompt engineering and hacking. He's super excited to bring this knowledge back to PyData again!
Sander is a frequent visitor of PyData conferences and meetups, and helped organise the conference in 2018 and 2019.
- Prompt hacking for Generative AI
CTF-Player, Hacker, Pentester, Python Enthusiast
- The Odyssey of Hacking LLMs: Insights from Two Shipmates sailing in the LLM CTF @ SaTML 2024
Sarah Diot-Girard has been working on Machine Learning since 2012, and she enjoys using data science tools to find solutions to practical problems. She is particularly interested in the issues, both technical and ethical, that arise from applying ML in real life. She has given talks at international conferences on data privacy, algorithmic fairness, and software engineering best practices applied to data science. Since 2023, she has been employed by Owkin as a maintainer of the Federated Learning platform Substra.
- Debugging as an experimental science
Data Scientist at the HEINEKEN Company with a proven track record of building optimization models combined with machine learning in production.
- From Predictions to Action: Fusing Machine Learning and Mixed Integer Linear Programming
Stephan de Goede is a senior data scientist at Rabobank. He is a strategic thinker with experience in developing and implementing GenAI solutions aligned with business objectives in regulated environments.
- The Impact of Generative AI on Data Analytics: Panel Discussion
Thibault Cordier is a Data and Research Scientist at Capgemini Invent, where he is a member of the Lab Invent team in France and serves as the technical leader of the MAPIE project.
Prior to joining the research team at Capgemini Invent, he earned his PhD in Computer Science in 2023 at Avignon University.
Up to now, his research has focused on distribution-free inference and conformal prediction, with applications in computer vision, natural language processing, and time series analysis.
- Boosting AI Reliability: Uncertainty Quantification with MAPIE
Thomas Crul is a matchmaking researcher and data analyst for the dating app Breeze. Breeze aims to take ‘online dating offline’ by bypassing the messaging stage and directly arranging real-life dates for its users after a match. Crul specializes in developing recommendation algorithms responsible for connecting potential partners. He is interested in investigating and addressing biases within these technical systems.
- Algorithmic bias is everywhere (especially at Breeze) - what can we do about it?
Meet Thomas, a passionate advocate for science, particularly in the realm of applied mathematics. Following his doctoral studies, he embarked on a journey into the world of embedded programming, where his affinity for DevOps took root. His enduring passion for crunching numbers ultimately led him to the fascinating field of artificial intelligence, where he's now an acknowledged MLOps and NLP expert, seamlessly integrating machine learning into operations.
Thomas has an impressive track record as a leader, having overseen two publicly funded open-source research programs in the field of AI, in collaboration with the German Aerospace Center. Today, he is at the forefront of AI-driven cybersecurity research at Smart Cyber Security GmbH and working on his low-budget bark beetle detection drone project – a testament to his enduring fascination with embedded systems.
- The Odyssey of Hacking LLMs: Insights from Two Shipmates sailing in the LLM CTF @ SaTML 2024
Thomas is an engineering enthusiast with expertise in designing self-powered AI systems, driven by a passion for innovative product design. His research has focused on multimodal fusion and deep learning. With roots in Greece and now calling the Netherlands home, Thomas works as a senior data scientist at ASML, where he develops AI-driven, data-analytic solutions for diagnostics. Outside of work, he enjoys spending time immersed in nature.
- The Impact of Generative AI on Data Analytics: Panel Discussion
Responsible for Posit in the Benelux
- Show off your python code to even a complete newbie, using shiny for python
I'm a data-driven problem solver with diverse interests spanning physics, venture capital, machine learning, and DIY robotics. My PhD research focused on investigating the evolution and success factors of complex social systems, particularly in the startup ecosystem.
Currently, I lead the data science team for consumer and affluent customers at ABN AMRO. By applying advanced data science techniques to challenges like churn prediction, propensity modeling, and recommender systems, I'm helping shape the bank of tomorrow.
My work, which I'll be presenting at PyData Amsterdam, has garnered attention in Wired UK and was published in Nature Scientific Reports. It explores the emergence of large-scale order in complex systems, offering insights applicable across various domains. Links below:
- https://www.wired.com/story/how-to-grow-startup/
- https://www.nature.com/articles/s41598-019-57209-w
- Hunting unicorns with Network analysis
Vasiliy Kaminskiy is the Data Science Team Lead at JetBrains with over four years of experience managing a team that delivers impactful data science solutions. He focuses on innovation, automation, and maintenance of high-quality research output. His diverse skill set includes knowledge of different programming languages (Python, SQL, and Kotlin), databases, and data science libraries to prepare, analyze, and visualize data, presenting key insights to stakeholders. Vasiliy’s experience encompasses survey analysis to extract popularity trends, using text clustering methods to extract key points, and integrating ML models into decision-making processes to impact revenue generation, customer targeting, and product development.
- How Research Teams Can Deliver Higher-Quality Insights Faster
Vincent is a senior data professional, and recovering consultant, who has worked as an engineer, researcher, team lead, and educator. He is especially interested in understanding algorithmic systems so that failures can be prevented. As such, he prefers simpler solutions that scale and worries more about data quality than the number of tensors thrown at a problem. He's also well known for creating calmcode as well as a dozen or so open-source packages.
He's currently employed at probabl where he works together with scikit-learn core maintainers to improve the ecosystem of tooling.
- Run a benchmark they said. It will be fun they said.
As a senior machine learning scientist at Adyen, my current focus lies in the development of models and explainability tools to monitor and comprehend the intricacies of payment processes. My primary areas of interest encompass Bayesian probabilistic modeling, machine learning explainability, and causality. In my past roles as a freelance professional, employed data scientist, and university researcher, I have engaged with a wide array of applied modeling challenges, such as churn modeling, matching engines, anomaly detection, recommender systems, and modeling of human behavior under risk.
- Drift Detection on Irregular Time Series with Multiple Non-Uniform Seasonal Patterns Using MIST and DTW algorithms
With a robust career spanning ten years, Vladimir brings expertise in infrastructure maintenance, operations, and server-side development. His experience includes five years in DevOps and four years in Python development.
- The Impact of Generative AI on Data Analytics: Panel Discussion
With a PhD in computer science, I completed my thesis in machine learning applied to the fashion industry in partnership with Lectra. I'm currently in charge of advanced matching topics requiring R&D at Malt, the freelance marketplace.
- Retrieve me if you can: SLM-powered retrieval to scale freelancers matching at Malt
Wojtek Kuberski is an AI professional and entrepreneur with a master's in AI from KU Leuven. He co-founded NannyML, an open-source Python library for ML monitoring and Post-Deployment Data Science. As CTO of NannyML, he leads the research and product teams, contributing to novel algorithms in model monitoring.
- The ML Monitoring Flow for Models Deployed to Production
I'm a data scientist with a background in physics, holding both a bachelor's and a master's degree in the field. After completing my professional doctorate in data science at Eindhoven University of Technology, I gained extensive industry experience working with diverse clients across various sectors, primarily in the NLP and LLM domains. Outside of work, I'm an avid runner and enjoy playing volleyball and indoor football. In my spare time, I love reading, building with Legos, and engaging in board games.
- Productionizing Generative AI at ING: Navigating Risk, Compliance, and Defense Mechanism