# Dagster Integrations Index
Navigate 82+ Dagster integrations organized by Dagster's official taxonomy. Find AI/ML tools, ETL platforms, data storage, compute services, BI tools, and monitoring integrations.
## When to Use This Skill vs. Others

| If User Says... | Use This Skill/Command | Why |
| --- | --- | --- |
| "which integration for X" | /dagster-integrations | Need to discover appropriate integration |
| "does dagster support X" | /dagster-integrations | Check integration availability |
| "snowflake vs bigquery" | /dagster-integrations | Compare integrations in same category |
| "best practices for X" | /dagster-conventions | Implementation patterns needed |
| "implement X integration" | /dg:prototype | Ready to build with specific integration |
| "how do I use dbt" | /dagster-conventions (dbt section) | dbt-specific implementation patterns |
| "make this code better" | /dignified-python | Python code review needed |
| "create new project" | /dg:create-project | Project initialization needed |
## Quick Reference by Category

| Category | Count | Common Tools | Reference |
| --- | --- | --- | --- |
| AI & ML | 6 | OpenAI, Anthropic, MLflow, W&B | references/ai.md |
| ETL/ELT | 9 | dbt, Fivetran, Airbyte, PySpark | references/etl.md |
| Storage | 35+ | Snowflake, BigQuery, Postgres, DuckDB | references/storage.md |
| Compute | 15+ | AWS, Databricks, Spark, Docker, K8s | references/compute.md |
| BI & Visualization | 7 | Looker, Tableau, PowerBI, Sigma | references/bi.md |
| Monitoring | 3 | Datadog, Prometheus, Papertrail | references/monitoring.md |
| Alerting | 6 | Slack, PagerDuty, MS Teams, Twilio | references/alerting.md |
| Testing | 2 | Great Expectations, Pandera | references/testing.md |
| Other | 2+ | Pandas, Polars | references/other.md |
## Category Taxonomy

This index aligns with Dagster's official documentation taxonomy from tags.yml:

- ai: Artificial intelligence and machine learning integrations (LLM APIs, experiment tracking)
- etl: Extract, transform, and load tools, including data replication and transformation frameworks
- storage: Databases, data warehouses, object storage, and table formats
- compute: Cloud platforms, container orchestration, and distributed processing frameworks
- bi: Business intelligence and visualization platforms
- monitoring: Observability platforms and metrics systems for tracking performance
- alerting: Notification and incident management systems for pipeline alerts
- testing: Data quality validation and testing frameworks
- other: Miscellaneous integrations, including DataFrame libraries

Note: Support levels (dagster-supported, community-supported) are shown inline in each integration entry.

Last verified: 2026-01-27
## Finding the Right Integration

I need to...

**Load data from external sources**

- SaaS applications → ETL (Fivetran, Airbyte)
- Files/databases → ETL (dlt, Sling, Meltano)
- Cloud storage → Storage (S3, GCS, Azure Blob)

**Transform data**

- SQL transformations → ETL (dbt)
- Distributed transformations → ETL (PySpark)
- DataFrame operations → Other (Pandas, Polars)
- Large-scale processing → Compute (Spark, Dask, Ray)

**Store data**

- Cloud data warehouse → Storage (Snowflake, BigQuery, Redshift)
- Relational database → Storage (Postgres, MySQL)
- File/object storage → Storage (S3, GCS, Azure, LakeFS)
- Analytics database → Storage (DuckDB)
- Vector embeddings → Storage (Weaviate, Chroma, Qdrant)

**Validate data quality**

- Schema validation → Testing (Pandera)
- Quality checks → Testing (Great Expectations)

**Run ML workloads**

- LLM integration → AI (OpenAI, Anthropic, Gemini)
- Experiment tracking → AI (MLflow, W&B)
- Distributed training → Compute (Ray, Spark)

**Execute computation**

- Cloud compute → Compute (AWS, Azure, GCP, Databricks)
- Containers → Compute (Docker, Kubernetes)
- Distributed processing → Compute (Spark, Dask, Ray)

**Monitor pipelines**

- Team notifications → Alerting (Slack, MS Teams, PagerDuty)
- Metrics tracking → Monitoring (Datadog, Prometheus)
- Log aggregation → Monitoring (Papertrail)

**Visualize data**

- BI dashboards → BI (Looker, Tableau, PowerBI)
- Analytics platform → BI (Sigma, Hex, Evidence)
## Integration Categories

### AI & ML

Artificial intelligence and machine learning platforms, including LLM APIs and experiment tracking.

Key integrations:

- OpenAI - GPT models and embeddings API
- Anthropic - Claude AI models
- Gemini - Google's multimodal AI
- MLflow - Experiment tracking and model registry
- Weights & Biases - ML experiment tracking
- NotDiamond - LLM routing and optimization

See references/ai.md for all AI/ML integrations.
### ETL/ELT

Extract, transform, and load tools for data ingestion, transformation, and replication.

Key integrations:

- dbt - SQL-based transformation with automatic dependencies
- Fivetran - Automated SaaS data ingestion (component-based)
- Airbyte - Open-source ELT platform
- dlt - Python-based data loading (component-based)
- Sling - High-performance data replication (component-based)
- PySpark - Distributed data transformation
- Meltano - ELT for the modern data stack

See references/etl.md for all ETL/ELT integrations.
### Storage

Data warehouses, databases, object storage, vector databases, and table formats.

Key integrations:

- Snowflake - Cloud data warehouse with IO managers
- BigQuery - Google's serverless data warehouse
- DuckDB - In-process SQL analytics
- Postgres - Open-source relational database
- Weaviate - Vector database for AI search
- Delta Lake - ACID transactions for data lakes
- DataHub - Metadata catalog and lineage

See references/storage.md for all storage integrations.
### Compute

Cloud platforms, container orchestration, and distributed processing frameworks.

Key integrations:

- AWS - Cloud compute services (Glue, EMR, Lambda)
- Databricks - Unified analytics platform
- GCP - Google Cloud compute (Dataproc, Cloud Run)
- Spark - Distributed data processing engine
- Dask - Parallel computing framework
- Docker - Container execution with Pipes
- Kubernetes - Cloud-native orchestration
- Ray - Distributed computing for ML

See references/compute.md for all compute integrations.
### BI & Visualization

Business intelligence and visualization platforms for analytics and reporting.

Key integrations:

- Looker - Google's BI platform
- Tableau - Interactive dashboards
- PowerBI - Microsoft's BI tool
- Sigma - Cloud analytics platform
- Hex - Collaborative notebooks
- Evidence - Markdown-based BI
- Cube - Semantic layer platform

See references/bi.md for all BI integrations.
### Monitoring

Observability platforms and metrics systems for tracking pipeline performance.

Key integrations:

- Datadog - Comprehensive observability platform
- Prometheus - Time-series metrics collection
- Papertrail - Centralized log management

See references/monitoring.md for all monitoring integrations.
### Alerting

Notification and incident management systems for pipeline alerts.

Key integrations:

- Slack - Team messaging and alerts
- PagerDuty - Incident management for on-call
- MS Teams - Microsoft Teams notifications
- Twilio - SMS and voice notifications
- Apprise - Universal notification platform
- DingTalk - Team communication for Asian markets

See references/alerting.md for all alerting integrations.
### Testing

Data quality validation and testing frameworks for ensuring data reliability.

Key integrations:

- Great Expectations - Data validation with expectations
- Pandera - Statistical data validation for DataFrames

See references/testing.md for all testing integrations.
### Other

Miscellaneous integrations, including DataFrame libraries and utility tools.

Key integrations:

- Pandas - In-memory DataFrame library
- Polars - Fast DataFrame library with columnar storage

See references/other.md for other integrations.
## References

Integration details are organized in the following files:

- AI & ML: references/ai.md (AI and ML platforms, LLM APIs, experiment tracking)
- ETL/ELT: references/etl.md (data ingestion, transformation, and replication tools)
- Storage: references/storage.md (warehouses, databases, object storage, vector DBs)
- Compute: references/compute.md (cloud platforms, containers, distributed processing)
- BI & Visualization: references/bi.md (business intelligence and analytics platforms)
- Monitoring: references/monitoring.md (observability and metrics systems)
- Alerting: references/alerting.md (notifications and incident management)
- Testing: references/testing.md (data quality and validation frameworks)
- Other: references/other.md (DataFrame libraries and miscellaneous tools)
## Using Integrations

Most Dagster integrations follow a common pattern:

1. Install the package:

   ```bash
   pip install dagster-<integration>
   ```

2. Import and configure a resource:

   ```python
   import dagster as dg
   from dagster_<integration> import <Integration>Resource

   resource = <Integration>Resource(config_param=dg.EnvVar("ENV_VAR"))
   ```

3. Use it in your assets:

   ```python
   @dg.asset
   def my_asset(integration: <Integration>Resource):
       # Use the integration
       pass
   ```

For component-based integrations (dbt, Fivetran, dlt, Sling), see the specific reference files for scaffolding and configuration patterns.