Performance Observability for Apache Spark
Updated Apr 6, 2025 - TypeScript
Jayvee is a domain-specific language and runtime for automated processing of data pipelines
🔍 Data pipeline for crawling PDFs from the Web and transforming their contents into structured data using AWS Textract. Built with AWS CDK + TypeScript
⚡️ Next-generation data transformation framework for TypeScript that puts developer experience first
📺 Instill Console for 🔮 Instill Core: https://github.com/instill-ai/instill-core
Never sift through endless dbt™ logs again. dbt Command Center is a free, open-source, local web application that provides a user-friendly interface to monitor and manage dbt runs.
Splicing: Gen-AI Copilot for Data Engineering
Aqueduct Core is responsible for the core functionality of Aqueduct, an experiment management system.
Sync your team's data to your LLM applications in real-time
Watchmen Platform is a low-code data platform for data pipelines, metadata management, analysis, indicator objective analysis, and quality management
An extensible pipelining tool to build data pipelines from your bank account to any destination.
A Next.js app that analyzes your WhatsApp chats and gives useful, quirky insights
Create database-agnostic aggregations based on data pipelines
Real-time data processing architecture using Apache Kafka, Flink, and Kubernetes. This project demonstrates how to build a scalable and resilient pipeline for streaming data, performing ETL with Flink, and storing the processed data in a Data Warehouse for analysis.