dbt Model Development
Read before you write. Build after you write. Verify your output.
Critical Rules
-
ALWAYS run dbt build after creating/modifying models - compile is NOT enough
-
ALWAYS verify output after build using dbt show
-
don't assume success
-
If build fails 3+ times, stop and reassess your entire approach
Workflow
- Understand the Task Requirements
-
What columns are needed? List them explicitly.
-
What is the grain of the table (one row per what)?
-
What calculations or aggregations are required?
- Discover Project Conventions
cat dbt_project.yml find models/ -name "*.sql" | head -20
Read 2-3 existing models to learn naming, config, and SQL patterns.
- Find Similar Models
Find models with similar purpose
find models/ -name "agg.sql" -o -name "fct_.sql" | head -5
Learn from existing models: join types, aggregation patterns, NULL handling.
- Check Upstream Data
Preview upstream data if needed
dbt show --select <upstream_model> --limit 10
- Write the Model
Follow discovered conventions. Match the required columns exactly.
- Compile (Syntax Check)
dbt compile --select <model_name>
- BUILD - MANDATORY
This step is REQUIRED. Do NOT skip it.
dbt build --select <model_name>
If build fails:
-
Read the error carefully
-
Fix the specific issue
-
Run build again
-
If fails 3+ times, step back and reassess approach
- Verify Output (CRITICAL)
Build success does NOT mean correct output.
Check the table was created and preview data
dbt show --select <model_name> --limit 10
Verify:
-
Column names match requirements exactly
-
Row count is reasonable
-
Data values look correct
-
No unexpected NULLs
- Verify Calculations Against Sample Data
For models with calculations, verify correctness manually:
Pick a specific row and verify calculation by hand
dbt show --inline " select * from {{ ref('model_name') }} where <primary_key> = '<known_value>' " --limit 1
Cross-check aggregations
dbt show --inline " select count(*), sum(<column>) from {{ ref('model_name') }} "
For example, if calculating total_revenue = quantity * price :
-
Pick one row from output
-
Look up the source quantity and price
-
Manually calculate: does it match?
- Re-review Against Requirements
Before declaring done, re-read the original request:
-
Did you implement what was asked, not what you assumed?
-
Are column names exactly as specified?
-
Is the calculation logic correct per the requirements?
-
Does the grain (one row per what?) match what was requested?
Anti-Patterns
-
Declaring done after compile without running build
-
Not verifying output data after build
-
Getting stuck in compile/build error loops
-
Assuming table exists just because model file exists
-
Writing SQL without checking existing model patterns first