etl-designer

Design robust ETL/ELT pipelines for data processing.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "etl-designer" with this command: npx skills add armanzeroeight/fastagent-plugins/armanzeroeight-fastagent-plugins-etl-designer

ETL Designer

Design robust ETL/ELT pipelines for data processing.

Quick Start

Use Airflow for orchestration, implement idempotent operations, add error handling, monitor pipeline health.

Instructions

Airflow DAG Structure

from airflow import DAG from airflow.operators.python import PythonOperator from datetime import datetime, timedelta

default_args = { 'owner': 'data-team', 'retries': 3, 'retry_delay': timedelta(minutes=5), 'email_on_failure': True, 'email': ['alerts@company.com'] }

with DAG( 'etl_pipeline', default_args=default_args, schedule_interval='0 2 * * *', # Daily at 2 AM start_date=datetime(2024, 1, 1), catchup=False ) as dag:

extract = PythonOperator(
    task_id='extract_data',
    python_callable=extract_from_source
)

transform = PythonOperator(
    task_id='transform_data',
    python_callable=transform_data
)

load = PythonOperator(
    task_id='load_to_warehouse',
    python_callable=load_to_warehouse
)

extract >> transform >> load

Incremental Processing

def extract_incremental(last_run_date): query = f""" SELECT * FROM source_table WHERE updated_at > '{last_run_date}' """ return pd.read_sql(query, conn)

Error Handling

def safe_transform(data): try: transformed = transform_data(data) return transformed except Exception as e: logger.error(f"Transform failed: {e}") send_alert(f"Pipeline failed: {e}") raise

Best Practices

  • Make operations idempotent

  • Use incremental processing

  • Implement proper error handling

  • Add monitoring and alerts

  • Use data quality checks

  • Document pipeline logic

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Automation

gcp-cost-optimizer

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

schema-designer

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

aws-cost-optimizer

No summary provided by upstream source.

Repository SourceNeeds Review