Databricks Data

# Databricks

Overview

Databricks is a data and AI company that pioneered the "lakehouse" architecture, which unifies data warehouse and data lake capabilities. Founded by the original creators of Apache Spark at UC Berkeley, the company has grown from an open-source data processing project into a data and AI platform valued at $43B (2023) that competes directly with Snowflake.

Timeline

  • 2013: Founded by Matei Zaharia (Spark creator) and six co-founders from UC Berkeley
  • 2014: Apache Spark, open-sourced in 2010 and donated to the Apache Software Foundation in 2013, becomes a top-level Apache project and a leading distributed processing engine
  • 2019: Raises $400M at a $6.2B valuation
  • 2019: Delta Lake open-sourced, adding ACID transactions to data lakes
  • 2020: Popularizes the "lakehouse" concept, challenging the traditional data warehouse model
  • 2021: Acquires 8080 Labs (creators of bamboolib, a low-code UI for pandas)
  • 2023: Lakehouse AI features launch, embedding ML directly into data workflows
  • 2023: Raises $500M at a $43B valuation, ahead of an anticipated IPO
  • 2024: DBRX open LLM released; Databricks reports it outperforming GPT-3.5 on standard benchmarks
  • 2024: Annualized revenue surpasses $2B

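Business Model

Consumption-based pricing plus enterprise subscriptions:

  • Platform usage: billed by processing capacity consumed, measured in DBUs (Databricks Units), similar to the AWS consumption model
  • Enterprise tier: security management, collaboration features, and SLA guarantees, with annual contracts from $50K to $10M+
  • AI features: MosaicML and Lakehouse AI, with dedicated pricing tiers for AI training and inference
  • Marketplace: Databricks Marketplace, which takes a commission on data product transactions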
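
The consumption model above amounts to simple arithmetic: DBUs consumed multiplied by a per-DBU rate. The sketch below illustrates that calculation only. The `Workload` type, the DBU burn rates, and the per-DBU prices are all hypothetical placeholders, not published Databricks rates, and real bills depend on tier, cloud, and workload type.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Workload:
    """One billable workload. All figures are illustrative assumptions."""
    name: str
    dbu_per_hour: float  # DBUs the compute consumes per hour (hypothetical)
    hours: float         # how long the compute ran
    rate_per_dbu: float  # USD per DBU (hypothetical; varies by tier and cloud)


def workload_cost(w: Workload) -> float:
    """Cost of one workload: DBUs consumed times the per-DBU rate."""
    dbus_consumed = w.dbu_per_hour * w.hours
    return dbus_consumed * w.rate_per_dbu


def total_bill(workloads: Iterable[Workload]) -> float:
    """Sum the platform usage charge across all workloads."""
    return sum(workload_cost(w) for w in workloads)


if __name__ == "__main__":
    jobs = [
        Workload("nightly-etl", dbu_per_hour=10.0, hours=100.0, rate_per_dbu=0.25),
        Workload("ad-hoc-sql", dbu_per_hour=4.0, hours=50.0, rate_per_dbu=0.5),
    ]
    for job in jobs:
        print(f"{job.name}: ${workload_cost(job):.2f}")
    print(f"total: ${total_bill(jobs):.2f}")
```

Because the charge scales with both runtime and cluster size, shortening a job or shrinking its cluster lowers the bill proportionally, which is the main difference from a flat subscription.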

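Moat Analysis

  • Open-source foundation: Spark is the de facto standard and the successor to the Hadoop ecosystem, with very high developer mindshare
  • Lakehouse innovation: combines the best of data lakes (low-cost storage) and data warehouses (high-performance queries)
  • AI-native: an end-to-end platform from data to ML training to inference, more AI-focused than Snowflake
  • Cloud-neutral: runs on AWS, Azure, and GCP, so customers are not locked into a single cloud vendor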

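Key Figures

  • ARR: $2B+ (2024)
  • Valuation: $43B (2023 funding round)
  • Customers: 10,000+ enterprises
  • Fortune 500 adoption: 50%+
  • Developer community: Spark has one of the world's largest open-source data communities
  • Total funding: $3.9B+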

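Fun Fact

Founder Matei Zaharia created Apache Spark while he was a PhD student at UC Berkeley. The Spark research papers went on to become some of the most cited in distributed computing, and Databricks grew directly out of that work.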
