Martin Trautmann
- IOI 2000 / Jugend forscht (https://t.ly/DmvHG)
- Electrical Engineering at KIT, Stanford University, and IMEC/KU-Leuven
-- Research: Mapping software descriptions onto hardware targets and Transactional Memory
- QuantCo: Optimizing high-stakes business decisions with data analytics
Sessions
09-20
14:10
35min
pydiverse pipedag - A library for data pipeline orchestration optimizing high development iteration speed
Martin Trautmann
This talk presents github.com/pydiverse/pydiverse.pipedag, a data pipeline orchestration library for rapid iterative development with automatic cache invalidation. It lets users focus on their actual tasks: writing analytics and data transformation code in pandas, polars, sqlalchemy, ibis, and the like.
The talk is meant for people working with data pipelines at any level from beginner to advanced. It teaches best practices for handling code versioning vs. data versioning, working with small data samples during development, and gradually improving the coding style of an existing pipeline.
Escher