Reusable data transformation functions and utilities for ETL pipelines
Project framework for large-scale data processing with Apache Spark and Python
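
As a rough illustration of what "reusable data transformation functions" for an ETL pipeline might look like, here is a minimal sketch in plain Python. Everything in it (`strip_strings`, `drop_missing`, `compose`, and the sample rows) is hypothetical and chosen for illustration only; it is not taken from the project itself. It works on lists of dictionaries so that it runs without a Spark installation.

```python
from functools import reduce
from typing import Callable, Dict, List

# Hypothetical types for this sketch: a record is a plain dict,
# and a transform maps a list of records to a new list of records.
Record = Dict[str, object]
Transform = Callable[[List[Record]], List[Record]]


def strip_strings(records: List[Record]) -> List[Record]:
    """Return copies of the records with whitespace trimmed from string values."""
    return [
        {key: value.strip() if isinstance(value, str) else value
         for key, value in record.items()}
        for record in records
    ]


def drop_missing(field: str) -> Transform:
    """Build a transform that drops records whose `field` is None or empty."""
    def _apply(records: List[Record]) -> List[Record]:
        return [r for r in records if r.get(field) not in (None, "")]
    return _apply


def compose(*steps: Transform) -> Transform:
    """Chain transforms left to right into a single reusable pipeline step."""
    def _pipeline(records: List[Record]) -> List[Record]:
        return reduce(lambda acc, step: step(acc), steps, records)
    return _pipeline


# Assemble a small cleaning pipeline from the reusable pieces.
clean = compose(strip_strings, drop_missing("id"))

rows = [
    {"id": " a1 ", "name": " Ada "},
    {"id": "  ", "name": "Bob"},
    {"id": None, "name": "Eve"},
]
result = clean(rows)
print(result)
```

Each transform returns new records instead of mutating its input, which keeps the steps easy to test in isolation and to reorder. In a Spark-based project the same composition pattern would presumably be applied to DataFrames in place of lists of dictionaries.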