My personal project for the Data Engineering Zoomcamp
Updated Jun 3, 2024 - Python
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.
An orchestration platform for the development, production, and observation of data assets.
Apache Superset is a Data Visualization and Data Exploration Platform
Workflow Engine for Kubernetes
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
A Python- and FastAPI-based asynchronous REST API for accessing St. Louis parcel data stored in the Regional Entity Database (REDB)
A Python automated ELT pipeline that routinely aggregates 20+ million rows of parcel data from numerous local government departments for the Regional Entity Database.
Scripts to extract, transform, and load Los Angeles Yelp data from the Yelp Fusion API and Kaggle.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
The Open Source Feature Store for Machine Learning
CSVs sliced, diced & analyzed.
Distributed DataFrame for Python designed for the cloud, powered by Rust
🐚 Python-powered, cross-platform, Unix-gazing shell.
Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines
Clean APIs for data cleaning. A Python implementation of the R package janitor
Business intelligence as code: build fast, interactive data visualizations in pure SQL and markdown
SageWorks: An easy-to-use Python API for creating and deploying AWS SageMaker models
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.