Skip to content

Examples

This section contains complete, runnable examples demonstrating common QuickETL patterns and use cases. Each example includes:

  • Complete YAML configuration
  • Sample data
  • Expected output
  • Step-by-step explanation

Getting Started Examples

Basic Pipeline

A minimal pipeline that reads a CSV, applies filters, and writes to Parquet. Perfect for understanding QuickETL fundamentals.

name: basic_example
source:
  type: file
  path: data/input.csv
  format: csv
transforms:
  - op: filter
    predicate: amount > 0
sink:
  type: file
  path: output/results.parquet
  format: parquet

Data Processing Examples

Multi-Source Join

Combine data from multiple sources (orders + customers + products) into a single enriched dataset.

transforms:
  - op: join
    right:
      type: file
      path: data/customers.csv
      format: csv
    on: [customer_id]
    how: left

Aggregation Pipeline

Compute metrics, summaries, and roll-ups from transactional data.

transforms:
  - op: aggregate
    group_by: [region, category]
    aggregations:
      total_revenue: sum(amount)
      order_count: count(*)
      avg_order_value: avg(amount)

Cloud & Production Examples

Cloud ETL

End-to-end pipeline reading from S3, transforming with Spark, and loading to a data warehouse.

engine: spark
source:
  type: file
  path: s3://bucket/raw/*.parquet
  format: parquet
sink:
  type: database
  connection: snowflake
  table: analytics.fact_sales

Airflow DAG

Complete Airflow DAG with QuickETL tasks, error handling, and monitoring.

@quicketl_task(config="pipelines/daily.yml")
def process_daily(**context):
    return {"DATE": context["ds"]}

Example Categories

Category Examples Description
Basic Basic Pipeline Core concepts
Joins Multi-Source Join Combining data
Analytics Aggregation Metrics and summaries
Cloud Cloud ETL Production cloud pipelines
Orchestration Airflow DAG Workflow integration

Running Examples

Setup

# Clone or create project
quicketl init examples
cd examples

# Install dependencies
pip install quicketl[duckdb]

Run Any Example

# Validate first
quicketl validate pipelines/example.yml

# Run
quicketl run pipelines/example.yml

With Variables

quicketl run pipelines/example.yml --var DATE=2025-01-15

Sample Data

All examples use sample data available in the docs/assets/data/ directory:

File Description Rows
sales.csv Transaction records 12
customers.csv Customer master data 5
products.csv Product catalog 10

Contributing Examples

Have a useful pattern? Contribute an example:

  1. Create a new markdown file in docs/examples/
  2. Include complete, runnable YAML
  3. Add sample input/output data
  4. Document each step

See Contributing Guide for details.