
Dagster Integrations Index

Safety Notice

This listing is imported from the skills.sh public index metadata. Review the upstream SKILL.md and any repository scripts before running them.


To install the "integrations-index" skill, run:

npx skills add dagster-io/dagster-claude-plugins/dagster-io-dagster-claude-plugins-integrations-index

Dagster Integrations Index

Navigate 82+ Dagster integrations organized by Dagster's official taxonomy. Find AI/ML tools, ETL platforms, data storage, compute services, BI tools, and monitoring integrations.

When to Use This Skill vs. Others

If the user says... → use this skill/command (why):

  • "which integration for X" → /dagster-integrations (discover the appropriate integration)

  • "does dagster support X" → /dagster-integrations (check integration availability)

  • "snowflake vs bigquery" → /dagster-integrations (compare integrations in the same category)

  • "best practices for X" → /dagster-conventions (implementation patterns needed)

  • "implement X integration" → /dg:prototype (ready to build with a specific integration)

  • "how do I use dbt" → /dagster-conventions, dbt section (dbt-specific implementation patterns)

  • "make this code better" → /dignified-python (Python code review needed)

  • "create new project" → /dg:create-project (project initialization needed)

Quick Reference by Category

Category (count): common tools → reference file

  • AI & ML (6): OpenAI, Anthropic, MLflow, W&B → references/ai.md

  • ETL/ELT (9): dbt, Fivetran, Airbyte, PySpark → references/etl.md

  • Storage (35+): Snowflake, BigQuery, Postgres, DuckDB → references/storage.md

  • Compute (15+): AWS, Databricks, Spark, Docker, K8s → references/compute.md

  • BI & Visualization (7): Looker, Tableau, PowerBI, Sigma → references/bi.md

  • Monitoring (3): Datadog, Prometheus, Papertrail → references/monitoring.md

  • Alerting (6): Slack, PagerDuty, MS Teams, Twilio → references/alerting.md

  • Testing (2): Great Expectations, Pandera → references/testing.md

  • Other (2+): Pandas, Polars → references/other.md
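For scripting against this index, the quick-reference table can also be expressed as data. This is a hypothetical helper, not part of any Dagster package; the file paths are those listed in the table above.

```python
# Reference file per category, as listed in the quick-reference table above.
CATEGORY_REFERENCE_FILES = {
    "ai": "references/ai.md",
    "etl": "references/etl.md",
    "storage": "references/storage.md",
    "compute": "references/compute.md",
    "bi": "references/bi.md",
    "monitoring": "references/monitoring.md",
    "alerting": "references/alerting.md",
    "testing": "references/testing.md",
    "other": "references/other.md",
}

def reference_for(category: str) -> str:
    """Look up the reference file for a category, case-insensitively."""
    return CATEGORY_REFERENCE_FILES[category.lower()]
```
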

Category Taxonomy

This index aligns with Dagster's official documentation taxonomy from tags.yml:

  • ai: Artificial intelligence and machine learning integrations (LLM APIs, experiment tracking)

  • etl: Extract, transform, and load tools including data replication and transformation frameworks

  • storage: Databases, data warehouses, object storage, and table formats

  • compute: Cloud platforms, container orchestration, and distributed processing frameworks

  • bi: Business intelligence and visualization platforms

  • monitoring: Observability platforms and metrics systems for tracking performance

  • alerting: Notification and incident management systems for pipeline alerts

  • testing: Data quality validation and testing frameworks

  • other: Miscellaneous integrations including DataFrame libraries

Note: Support levels (dagster-supported, community-supported) are shown inline in each integration entry.

Last verified: 2026-01-27

Finding the Right Integration

I need to...

Load data from external sources

  • SaaS applications → ETL (Fivetran, Airbyte)

  • Files/databases → ETL (dlt, Sling, Meltano)

  • Cloud storage → Storage (S3, GCS, Azure Blob)

Transform data

  • SQL transformations → ETL (dbt)

  • Distributed transformations → ETL (PySpark)

  • DataFrame operations → Other (Pandas, Polars)

  • Large-scale processing → Compute (Spark, Dask, Ray)

Store data

  • Cloud data warehouse → Storage (Snowflake, BigQuery, Redshift)

  • Relational database → Storage (Postgres, MySQL)

  • File/object storage → Storage (S3, GCS, Azure, LakeFS)

  • Analytics database → Storage (DuckDB)

  • Vector embeddings → Storage (Weaviate, Chroma, Qdrant)

Validate data quality

  • Schema validation → Testing (Pandera)

  • Quality checks → Testing (Great Expectations)

Run ML workloads

  • LLM integration → AI (OpenAI, Anthropic, Gemini)

  • Experiment tracking → AI (MLflow, W&B)

  • Distributed training → Compute (Ray, Spark)

Execute computation

  • Cloud compute → Compute (AWS, Azure, GCP, Databricks)

  • Containers → Compute (Docker, Kubernetes)

  • Distributed processing → Compute (Spark, Dask, Ray)

Monitor pipelines

  • Team notifications → Alerting (Slack, MS Teams, PagerDuty)

  • Metrics tracking → Monitoring (Datadog, Prometheus)

  • Log aggregation → Monitoring (Papertrail)

Visualize data

  • BI dashboards → BI (Looker, Tableau, PowerBI)

  • Analytics platform → BI (Sigma, Hex, Evidence)
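The decision tree above can be sketched as a plain lookup table. The helper below is hypothetical (it exists only to illustrate the routing, and covers a representative subset of the needs listed above):

```python
# Hypothetical need-to-category routing, mirroring the decision tree above.
NEED_TO_CATEGORY = {
    "load from saas apps": ("ETL", ["Fivetran", "Airbyte"]),
    "sql transformations": ("ETL", ["dbt"]),
    "dataframe operations": ("Other", ["Pandas", "Polars"]),
    "cloud data warehouse": ("Storage", ["Snowflake", "BigQuery", "Redshift"]),
    "vector embeddings": ("Storage", ["Weaviate", "Chroma", "Qdrant"]),
    "schema validation": ("Testing", ["Pandera"]),
    "llm integration": ("AI", ["OpenAI", "Anthropic", "Gemini"]),
    "team notifications": ("Alerting", ["Slack", "MS Teams", "PagerDuty"]),
    "bi dashboards": ("BI", ["Looker", "Tableau", "PowerBI"]),
}

def suggest(need: str) -> str:
    """Return the category and example tools for a stated need."""
    category, tools = NEED_TO_CATEGORY[need.lower()]
    return f"{category}: {', '.join(tools)}"
```
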

Integration Categories

AI & ML

Artificial intelligence and machine learning platforms, including LLM APIs and experiment tracking.

Key integrations:

  • OpenAI - GPT models and embeddings API

  • Anthropic - Claude AI models

  • Gemini - Google's multimodal AI

  • MLflow - Experiment tracking and model registry

  • Weights & Biases - ML experiment tracking

  • NotDiamond - LLM routing and optimization

See references/ai.md for all AI/ML integrations.

ETL/ELT

Extract, transform, and load tools for data ingestion, transformation, and replication.

Key integrations:

  • dbt - SQL-based transformation with automatic dependencies

  • Fivetran - Automated SaaS data ingestion (component-based)

  • Airbyte - Open-source ELT platform

  • dlt - Python-based data loading (component-based)

  • Sling - High-performance data replication (component-based)

  • PySpark - Distributed data transformation

  • Meltano - ELT for the modern data stack

See references/etl.md for all ETL/ELT integrations.

Storage

Data warehouses, databases, object storage, vector databases, and table formats.

Key integrations:

  • Snowflake - Cloud data warehouse with IO managers

  • BigQuery - Google's serverless data warehouse

  • DuckDB - In-process SQL analytics

  • Postgres - Open-source relational database

  • Weaviate - Vector database for AI search

  • Delta Lake - ACID transactions for data lakes

  • DataHub - Metadata catalog and lineage

See references/storage.md for all storage integrations.

Compute

Cloud platforms, container orchestration, and distributed processing frameworks.

Key integrations:

  • AWS - Cloud compute services (Glue, EMR, Lambda)

  • Databricks - Unified analytics platform

  • GCP - Google Cloud compute (Dataproc, Cloud Run)

  • Spark - Distributed data processing engine

  • Dask - Parallel computing framework

  • Docker - Container execution with Pipes

  • Kubernetes - Cloud-native orchestration

  • Ray - Distributed computing for ML

See references/compute.md for all compute integrations.

BI & Visualization

Business intelligence and visualization platforms for analytics and reporting.

Key integrations:

  • Looker - Google's BI platform

  • Tableau - Interactive dashboards

  • PowerBI - Microsoft's BI tool

  • Sigma - Cloud analytics platform

  • Hex - Collaborative notebooks

  • Evidence - Markdown-based BI

  • Cube - Semantic layer platform

See references/bi.md for all BI integrations.

Monitoring

Observability platforms and metrics systems for tracking pipeline performance.

Key integrations:

  • Datadog - Comprehensive observability platform

  • Prometheus - Time-series metrics collection

  • Papertrail - Centralized log management

See references/monitoring.md for all monitoring integrations.

Alerting

Notification and incident management systems for pipeline alerts.

Key integrations:

  • Slack - Team messaging and alerts

  • PagerDuty - Incident management for on-call

  • MS Teams - Microsoft Teams notifications

  • Twilio - SMS and voice notifications

  • Apprise - Universal notification platform

  • DingTalk - Team communication for Asian markets

See references/alerting.md for all alerting integrations.

Testing

Data quality validation and testing frameworks for ensuring data reliability.

Key integrations:

  • Great Expectations - Data validation with expectations

  • Pandera - Statistical data validation for DataFrames

See references/testing.md for all testing integrations.

Other

Miscellaneous integrations including DataFrame libraries and utility tools.

Key integrations:

  • Pandas - In-memory DataFrame library

  • Polars - Fast DataFrame library with columnar storage

See references/other.md for other integrations.

References

Integration details are organized in the following files:

  • AI & ML: references/ai.md

  • AI and ML platforms, LLM APIs, experiment tracking

  • ETL/ELT: references/etl.md

  • Data ingestion, transformation, and replication tools

  • Storage: references/storage.md

  • Warehouses, databases, object storage, vector DBs

  • Compute: references/compute.md

  • Cloud platforms, containers, distributed processing

  • BI & Visualization: references/bi.md

  • Business intelligence and analytics platforms

  • Monitoring: references/monitoring.md

  • Observability and metrics systems

  • Alerting: references/alerting.md

  • Notifications and incident management

  • Testing: references/testing.md

  • Data quality and validation frameworks

  • Other: references/other.md

  • DataFrame libraries and miscellaneous tools

Using Integrations

Most Dagster integrations follow a common pattern:

Install the package:

pip install dagster-<integration>

Import and configure a resource:

import dagster as dg
from dagster_<integration> import <Integration>Resource

resource = <Integration>Resource(
    config_param=dg.EnvVar("ENV_VAR"),
)

Use in your assets:

@dg.asset
def my_asset(integration: <Integration>Resource):
    # Use the integration
    pass

For component-based integrations (dbt, Fivetran, dlt, Sling), see the specific reference files for scaffolding and configuration patterns.
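The key idea in the pattern above is that EnvVar defers reading configuration until the resource is used, so secrets stay out of code. A minimal pure-Python sketch of that behavior (EnvVar and FakeWarehouseResource here are stand-ins for illustration, not the real classes from dagster or dagster_<integration>):

```python
import os

class EnvVar:
    """Stand-in for dg.EnvVar: defers reading an environment variable
    until the resource actually needs it."""
    def __init__(self, name: str):
        self.name = name

    def resolve(self) -> str:
        return os.environ[self.name]

class FakeWarehouseResource:
    """Hypothetical resource that resolves EnvVar-style config lazily."""
    def __init__(self, account: EnvVar):
        self.account = account

    def connect(self) -> str:
        # A real resource would open a client connection here; we just
        # return the resolved config value to show the deferral.
        return f"connected to {self.account.resolve()}"

os.environ["WAREHOUSE_ACCOUNT"] = "acme-prod"  # normally set outside the code
resource = FakeWarehouseResource(account=EnvVar("WAREHOUSE_ACCOUNT"))
print(resource.connect())
```

Because the variable is read at call time rather than at construction time, the same resource definition works across environments that set `WAREHOUSE_ACCOUNT` differently.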

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

  • dagster-expert

  • dagster-best-practices

  • dagster-conventions

Coding

  • dignified-python

No summaries were provided by the upstream source; each entry is repository-sourced and flagged "Needs Review".