
Databricks Debug Bundle

Safety Notice

This listing is imported from the skills.sh public index metadata. Review the upstream SKILL.md and repository scripts before running.


Install the "databricks-debug-bundle" skill with:

npx skills add jeremylongshore/claude-code-plugins-plus-skills/jeremylongshore-claude-code-plugins-plus-skills-databricks-debug-bundle


Overview

Collect all necessary diagnostic information for Databricks support tickets.

Prerequisites

  • Databricks CLI installed and configured

  • Access to cluster logs (admin or cluster owner)

  • Permission to access job run details
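A quick preflight check helps confirm these before generating a bundle. This is a minimal sketch assuming token-based auth via environment variables; adapt it to your auth setup:

# Preflight: verify CLI, auth, and workspace reachability
databricks --version
echo "${DATABRICKS_HOST:?DATABRICKS_HOST is not set}"
databricks current-user me   # fails fast if the host or token is wrong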

Instructions

Step 1: Create Debug Bundle Script

#!/bin/bash
# databricks-debug-bundle.sh

set -e

BUNDLE_DIR="databricks-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BUNDLE_DIR"

echo "=== Databricks Debug Bundle ===" > "$BUNDLE_DIR/summary.txt"
echo "Generated: $(date)" >> "$BUNDLE_DIR/summary.txt"
echo "Workspace: ${DATABRICKS_HOST}" >> "$BUNDLE_DIR/summary.txt"
echo "" >> "$BUNDLE_DIR/summary.txt"

Step 2: Collect Environment Info

set -euo pipefail

# Environment info
echo "--- Environment ---" >> "$BUNDLE_DIR/summary.txt"
echo "CLI Version: $(databricks --version)" >> "$BUNDLE_DIR/summary.txt"
echo "Python: $(python --version 2>&1)" >> "$BUNDLE_DIR/summary.txt"
echo "Databricks SDK: $(pip show databricks-sdk 2>/dev/null | grep Version)" >> "$BUNDLE_DIR/summary.txt"
echo "DATABRICKS_HOST: ${DATABRICKS_HOST}" >> "$BUNDLE_DIR/summary.txt"
echo "DATABRICKS_TOKEN: ${DATABRICKS_TOKEN:+[SET]}" >> "$BUNDLE_DIR/summary.txt"
echo "" >> "$BUNDLE_DIR/summary.txt"

# Workspace info
echo "--- Workspace Info ---" >> "$BUNDLE_DIR/summary.txt"
databricks current-user me >> "$BUNDLE_DIR/summary.txt" 2>&1 || echo "Failed to get user info"
echo "" >> "$BUNDLE_DIR/summary.txt"

Step 3: Collect Cluster Information

# Cluster details (if cluster_id provided)
CLUSTER_ID="${1:-}"
if [ -n "$CLUSTER_ID" ]; then
  echo "--- Cluster Info: $CLUSTER_ID ---" >> "$BUNDLE_DIR/summary.txt"
  databricks clusters get --cluster-id "$CLUSTER_ID" > "$BUNDLE_DIR/cluster_info.json" 2>&1

  # Extract key info
  jq -r '{
      state: .state,
      spark_version: .spark_version,
      node_type_id: .node_type_id,
      num_workers: .num_workers,
      autotermination_minutes: .autotermination_minutes
  }' "$BUNDLE_DIR/cluster_info.json" >> "$BUNDLE_DIR/summary.txt"

  # Get cluster events
  echo "--- Recent Cluster Events ---" >> "$BUNDLE_DIR/summary.txt"
  databricks clusters events --cluster-id "$CLUSTER_ID" --limit 20 > "$BUNDLE_DIR/cluster_events.json" 2>&1
  jq -r '.events[] | "\(.timestamp): \(.type) - \(.details)"' "$BUNDLE_DIR/cluster_events.json" >> "$BUNDLE_DIR/summary.txt" 2>/dev/null
fi
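When triaging the captured events, a filter like the one below narrows the output to likely failures. It assumes the .events[].type shape produced by the command above; the TERMINAT/ERROR pattern is a heuristic, not an exhaustive list of failure event types:

# Show only termination/error-style events from the captured file
jq '.events[] | select(.type | test("TERMINAT|ERROR"; "i"))' "$BUNDLE_DIR/cluster_events.json"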

Step 4: Collect Job Run Information

# Job run details (if run_id provided)
RUN_ID="${2:-}"
if [ -n "$RUN_ID" ]; then
  echo "--- Job Run Info: $RUN_ID ---" >> "$BUNDLE_DIR/summary.txt"
  databricks runs get --run-id "$RUN_ID" > "$BUNDLE_DIR/run_info.json" 2>&1

  # Extract run state
  jq -r '{
      state: .state.life_cycle_state,
      result: .state.result_state,
      message: .state.state_message,
      start_time: .start_time,
      end_time: .end_time
  }' "$BUNDLE_DIR/run_info.json" >> "$BUNDLE_DIR/summary.txt"

  # Get run output
  echo "--- Run Output ---" >> "$BUNDLE_DIR/summary.txt"
  databricks runs get-output --run-id "$RUN_ID" > "$BUNDLE_DIR/run_output.json" 2>&1
  jq -r '.error // "No error"' "$BUNDLE_DIR/run_output.json" >> "$BUNDLE_DIR/summary.txt"

  # Task-level details
  jq -r '.tasks[] | "Task \(.task_key): \(.state.result_state)"' "$BUNDLE_DIR/run_info.json" >> "$BUNDLE_DIR/summary.txt" 2>/dev/null
fi
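For multi-task jobs, it is often useful to surface only the failing tasks from run_info.json. A sketch under the same .tasks[].state shape used above (the state_message field may be absent on some runs, hence the fallback):

# List tasks whose result_state is not SUCCESS
jq -r '.tasks[]? | select(.state.result_state != "SUCCESS")
       | "FAILED task \(.task_key): \(.state.state_message // "no message")"' "$BUNDLE_DIR/run_info.json"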

Step 5: Collect Spark Logs

# Spark driver logs (requires cluster_id)
if [ -n "$CLUSTER_ID" ]; then
  echo "--- Spark Driver Logs (truncated to 50,000 characters) ---" > "$BUNDLE_DIR/driver_logs.txt"

  # Get logs via the SDK; truncate to the first 50,000 characters
  python3 << EOF >> "$BUNDLE_DIR/driver_logs.txt" 2>&1
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
try:
    logs = w.clusters.get_cluster_driver_logs(cluster_id="$CLUSTER_ID")
    print(logs.log_content[:50000] if logs.log_content else "No logs available")
except Exception as e:
    print(f"Error fetching logs: {e}")
EOF
fi
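If get_cluster_driver_logs is not available in your SDK version, clusters configured with log delivery write driver logs to DBFS, and those can be copied with the CLI instead. The dbfs:/cluster-logs prefix below is an assumption; use whatever destination the cluster's log_conf actually specifies:

# Fallback: copy delivered driver logs from DBFS (path depends on the cluster's log_conf)
databricks fs cp --recursive "dbfs:/cluster-logs/$CLUSTER_ID/driver/" "$BUNDLE_DIR/driver_logs_dbfs/"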

Step 6: Collect Delta Table Info

# Delta table diagnostics (if table provided)
TABLE_NAME="${3:-}"
if [ -n "$TABLE_NAME" ]; then
  echo "--- Delta Table Info: $TABLE_NAME ---" >> "$BUNDLE_DIR/summary.txt"

  python3 << EOF >> "$BUNDLE_DIR/delta_info.txt" 2>&1
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()
table_name = "$TABLE_NAME"  # substituted by the shell before Python runs

# Table history
print("=== Table History ===")
history_df = spark.sql(f"DESCRIBE HISTORY {table_name} LIMIT 20")
history_df.show(truncate=False)

# Table details
print("\n=== Table Details ===")
spark.sql(f"DESCRIBE DETAIL {table_name}").show(truncate=False)

# Schema
print("\n=== Schema ===")
spark.sql(f"DESCRIBE {table_name}").show(truncate=False)
EOF
fi

Step 7: Package Bundle

set -euo pipefail

# Create config snapshot (redacted)
echo "--- Config (redacted) ---" >> "$BUNDLE_DIR/summary.txt"
cat ~/.databrickscfg 2>/dev/null | sed 's/token = .*/token = REDACTED/' >> "$BUNDLE_DIR/config-redacted.txt" || true

# Network connectivity test
echo "--- Network Test ---" >> "$BUNDLE_DIR/summary.txt"
echo -n "API Health: " >> "$BUNDLE_DIR/summary.txt"
curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
  "${DATABRICKS_HOST}/api/2.0/clusters/list" >> "$BUNDLE_DIR/summary.txt"
echo "" >> "$BUNDLE_DIR/summary.txt"

# Package everything
tar -czf "$BUNDLE_DIR.tar.gz" "$BUNDLE_DIR"
rm -rf "$BUNDLE_DIR"

echo "Bundle created: $BUNDLE_DIR.tar.gz"
echo ""
echo "Contents:"
echo "  - summary.txt: Environment and error summary"
echo "  - cluster_info.json: Cluster configuration"
echo "  - cluster_events.json: Recent cluster events"
echo "  - run_info.json: Job run details"
echo "  - run_output.json: Task outputs and errors"
echo "  - driver_logs.txt: Spark driver logs"
echo "  - delta_info.txt: Delta table diagnostics"
echo "  - config-redacted.txt: CLI configuration (secrets removed)"

Output

  • databricks-debug-YYYYMMDD-HHMMSS.tar.gz archive containing:

  • summary.txt: Environment and error summary

  • cluster_info.json: Cluster configuration

  • cluster_events.json: Recent cluster events

  • run_info.json: Job run details

  • run_output.json: Task outputs and errors

  • driver_logs.txt: Spark driver logs

  • delta_info.txt: Delta table diagnostics

  • config-redacted.txt: CLI configuration (secrets removed)

Collection Checklist

Item                  Purpose                    Included

Environment versions  Compatibility check        Yes
Cluster config        Hardware/software setup    Yes
Cluster events        State changes, errors      Yes
Job run details       Task failures, timing      Yes
Spark logs            Stack traces, exceptions   Yes
Delta table info      Schema, history            Optional


Sensitive Data Handling

ALWAYS REDACT:

  • API tokens and secrets

  • Personal access tokens

  • Connection strings

  • PII in logs

Safe to Include:

  • Error messages

  • Stack traces (check for PII)

  • Cluster IDs, job IDs

  • Configuration (without secrets)
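An automated pass helps catch leftovers before upload, but it does not replace a manual review. The patterns below are illustrative, not exhaustive (Databricks personal access tokens typically start with dapi); run this against the bundle directory before packaging:

# Scan the bundle directory for likely secrets
grep -rEin 'dapi[0-9a-f]{16,}|secret|password' "$BUNDLE_DIR" && echo "Review the matches above before submitting!"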

Usage

Basic bundle (environment only)

./databricks-debug-bundle.sh

With cluster diagnostics

./databricks-debug-bundle.sh cluster-12345-abcde

With job run diagnostics

./databricks-debug-bundle.sh cluster-12345-abcde 67890

Full diagnostics with Delta table

./databricks-debug-bundle.sh cluster-12345-abcde 67890 catalog.schema.table

Submit to Support

  • Create bundle: bash databricks-debug-bundle.sh [cluster-id] [run-id]

  • Review for sensitive data

  • Open a support ticket at Databricks Support

  • Attach bundle to ticket

Resources

  • Databricks Support

  • Community Forum

  • Status Page

Next Steps

For rate limit issues, see databricks-rate-limits.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals (all in the Coding category; no summaries provided by the upstream source):

  • backtesting-trading-strategies

  • svg-icon-generator

  • performance-lighthouse-runner