neo4j-cypher

Install skill "neo4j-cypher" with this command: npx skills add michaelkeevildown/claude-agents-skills/michaelkeevildown-claude-agents-skills-neo4j-cypher

Neo4j Cypher

When to Use

Use this skill when writing, reviewing, or debugging Cypher queries for Neo4j. Covers query patterns, performance optimization, fraud-detection domain queries, and Neo4j 5+ syntax.

Core Query Patterns

MATCH and Filtering

// Basic node match with property filter
MATCH (c:Customer {customerId: $customerId})
RETURN c

// Relationship traversal
MATCH (c:Customer)-[:HAS_ACCOUNT]->(a:Account)
WHERE a.status = 'active'
RETURN c, a

// Variable-length paths
MATCH path = (a:Account)-[:TRANSACTION*1..5]->(b:Account)
RETURN path

// Multiple relationship types
MATCH (c:Customer)-[:HAS_EMAIL|HAS_PHONE|HAS_SSN]->(pii)
RETURN c, pii

OPTIONAL MATCH

Use when the related data may not exist. Without OPTIONAL, rows with no match are dropped entirely.

MATCH (c:Customer {customerId: $customerId})
OPTIONAL MATCH (c)-[:HAS_EMAIL]->(e:Email)
OPTIONAL MATCH (c)-[:HAS_PHONE]->(p:Phone)
RETURN c, collect(DISTINCT e) AS emails, collect(DISTINCT p) AS phones
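The DISTINCT calls in the query above compensate for row multiplication: each email row is paired with every phone row before aggregation. One possible alternative, sketched here with the same labels and relationship types, is to aggregate after each OPTIONAL MATCH so that the row count returns to one per customer before the next expansion. Treat it as an illustration rather than the skill author's recommended form.

```cypher
// Collect emails first, collapsing back to one row per customer
MATCH (c:Customer {customerId: $customerId})
OPTIONAL MATCH (c)-[:HAS_EMAIL]->(e:Email)
WITH c, collect(e) AS emails
// The second expansion now starts from a single row, so no cross product
OPTIONAL MATCH (c)-[:HAS_PHONE]->(p:Phone)
RETURN c, emails, collect(p) AS phones
```

collect() skips null values, so a customer without emails or phones still yields empty lists instead of lists containing null.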

WITH Chaining

Use WITH to pipe results between query stages, filter intermediate results, and control cardinality.

MATCH (c:Customer)-[:HAS_ACCOUNT]->(a:Account)
MATCH (a)-[:PERFORM]->(tx:Transaction)
WITH c, a, count(tx) AS txCount, sum(tx.amount) AS totalAmount
WHERE txCount > 10
RETURN c.customerId, a.accountNumber, txCount, totalAmount
ORDER BY totalAmount DESC

Aggregation

// Group and aggregate
MATCH (a:Account)-[:PERFORM]->(tx:Transaction)
WITH a, count(tx) AS txCount, sum(tx.amount) AS total, avg(tx.amount) AS avgAmount
RETURN a.accountNumber, txCount, total, avgAmount
ORDER BY total DESC
LIMIT 20

// collect() for lists: use DISTINCT to avoid duplicates from cartesian products
MATCH (c:Customer)-[:HAS_ACCOUNT]->(a:Account)
OPTIONAL MATCH (a)-[:PERFORM]->(tx:Transaction)
RETURN c.customerId,
       collect(DISTINCT a.accountNumber) AS accounts,
       count(DISTINCT tx) AS transactionCount

UNWIND

Expand a list into rows. Useful for parameterized batch operations.

UNWIND $accountNumbers AS accNum
MATCH (a:Account {accountNumber: accNum})
RETURN a
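The lookup above covers the read side. For the batch writes that the description mentions, a minimal sketch might look like the following. The `$rows` parameter and its map keys (`accountNumber`, `status`) are assumptions made for illustration, not names taken from the skill.

```cypher
// Assumed parameter shape (hypothetical):
//   $rows = [{accountNumber: 'A-1', status: 'active'}, {accountNumber: 'A-2', status: 'frozen'}]
UNWIND $rows AS row
// Merge on the key only, then set the remaining properties
MERGE (a:Account {accountNumber: row.accountNumber})
SET a.status = row.status
```

Sending one list parameter per batch keeps the query text constant, so the cached plan can be reused across batches.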

CASE Expressions

MATCH (tx:Transaction)
RETURN tx.amount,
  CASE
    WHEN tx.amount > 10000 THEN 'high'
    WHEN tx.amount > 1000 THEN 'medium'
    ELSE 'low'
  END AS riskTier

Subqueries (CALL {})

MATCH (c:Customer)
CALL (c) {
  MATCH (c)-[:HAS_ACCOUNT]->(a:Account)-[:PERFORM]->(tx:Transaction)
  RETURN sum(tx.amount) AS totalSpend
}
RETURN c.customerId, totalSpend
ORDER BY totalSpend DESC
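The `CALL (c) { ... }` form imports variables through a scope clause, which to my knowledge is only available from Neo4j 5.23 onward. On earlier 5.x releases the same query would typically be written with an importing WITH as the first clause of the subquery. The sketch below assumes the same graph model as the example above.

```cypher
MATCH (c:Customer)
CALL {
  // Importing WITH: brings c from the outer query into the subquery
  WITH c
  MATCH (c)-[:HAS_ACCOUNT]->(:Account)-[:PERFORM]->(tx:Transaction)
  RETURN sum(tx.amount) AS totalSpend
}
RETURN c.customerId, totalSpend
ORDER BY totalSpend DESC
```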

Write Operations

CREATE and MERGE

// CREATE always creates a new node
CREATE (c:Customer {customerId: $id, firstName: $first, lastName: $last})

// MERGE finds or creates; always specify the minimal unique key
MERGE (e:Email {address: $email})
ON CREATE SET e.createdAt = datetime()
ON MATCH SET e.lastSeen = datetime()

// MERGE relationship
MATCH (c:Customer {customerId: $customerId})
MATCH (e:Email {address: $email})
MERGE (c)-[:HAS_EMAIL]->(e)
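MERGE on a key only prevents duplicates reliably when the key is actually unique. One way to enforce that, sketched here under the assumption that `address` is meant to identify an Email node, is a uniqueness constraint. The constraint name is made up for this example.

```cypher
// Hypothetical constraint name; Neo4j 5 syntax
CREATE CONSTRAINT email_address_unique IF NOT EXISTS
FOR (e:Email) REQUIRE e.address IS UNIQUE
```

A uniqueness constraint is backed by its own index, so creating it may fail if a range index already exists on the same label and property; that index would need to be dropped first.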

SET and REMOVE

MATCH (a:Account {accountNumber: $accNum})
SET a.status = 'frozen', a.frozenAt = datetime(), a:Frozen
REMOVE a:Active

DELETE

// Delete node and all its relationships
MATCH (n:TempNode {id: $id})
DETACH DELETE n

// Delete specific relationship
MATCH (c:Customer)-[r:HAS_EMAIL]->(e:Email {address: $email})
DELETE r

Performance

Index Usage

Create indexes on the properties you use for lookups in MATCH/WHERE. Without one, the planner falls back to scanning every node with that label.

// Property index (most common)
CREATE INDEX customer_id FOR (c:Customer) ON (c.customerId)

// Composite index
CREATE INDEX account_lookup FOR (a:Account) ON (a.accountNumber, a.status)

// Text index for CONTAINS/STARTS WITH
CREATE TEXT INDEX email_text FOR (e:Email) ON (e.address)

// Verify indexes
SHOW INDEXES

EXPLAIN and PROFILE

// EXPLAIN: shows plan without executing
EXPLAIN
MATCH (c:Customer {customerId: '123'})-[:HAS_ACCOUNT]->(a:Account)
RETURN c, a

// PROFILE: executes and shows actual rows/db hits per operator
PROFILE
MATCH (c:Customer {customerId: '123'})-[:HAS_ACCOUNT]->(a:Account)
RETURN c, a

Look for:

  • NodeByLabelScan → missing index, add one

  • CartesianProduct → unconnected MATCH clauses, connect them or use WITH

  • Eager → query plan can't stream, may cause memory issues on large datasets

  • High db hits relative to result rows → inefficient traversal

Parameterized Queries

Always use parameters ($param) instead of string interpolation. This enables query plan caching and prevents injection.

// Good
MATCH (c:Customer {customerId: $customerId})
RETURN c

// Bad: no plan cache, injection risk
MATCH (c:Customer {customerId: '${userInput}'})
RETURN c

Avoiding Cartesian Products

// BAD: two unconnected MATCH clauses = cartesian product
MATCH (a:Account)
MATCH (b:Bank)
RETURN a, b // rows = |accounts| × |banks|

// GOOD: connect through relationships
MATCH (a:Account)-[:PERFORM]->(tx:Transaction)-[:BENEFITS_TO]->(b:Bank)
RETURN a, tx, b

// GOOD: if truly independent, use UNION or separate queries
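The UNION option has no example of its own, so here is one way it could look. Both branches must return the same column names; the `kind` and `identifier` aliases and the `b.name` property are assumptions made for this sketch.

```cypher
// Rows = |accounts| + |banks| instead of |accounts| × |banks|
MATCH (a:Account)
RETURN 'Account' AS kind, a.accountNumber AS identifier
UNION ALL
MATCH (b:Bank)
RETURN 'Bank' AS kind, b.name AS identifier // b.name is a hypothetical property
```

UNION ALL keeps duplicate rows and avoids the deduplication step that plain UNION performs.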

Limit Early, Filter Early

// Push WHERE as early as possible
MATCH (c:Customer)
WHERE c.nationality = $country // filter before traversal
MATCH (c)-[:HAS_ACCOUNT]->(a:Account)-[:PERFORM]->(tx:Transaction)
WHERE tx.amount > $threshold
RETURN c, a, tx
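The query above shows the filtering half of this advice. For the limiting half, a sketch along these lines caps the number of starting nodes before the traversal fans out. The limit of 100 and the ordering key are arbitrary choices for illustration.

```cypher
MATCH (c:Customer)
WHERE c.nationality = $country
// Cap the starting set before expanding to accounts
WITH c
ORDER BY c.customerId
LIMIT 100
MATCH (c)-[:HAS_ACCOUNT]->(a:Account)
RETURN c.customerId, collect(a.accountNumber) AS accounts
```

Customers without any account drop out at the second MATCH, so this can return fewer than 100 rows.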

Fraud-Domain Patterns

Shared PII Detection (Synthetic Identity)

MATCH (c1:Customer)-[:HAS_EMAIL|HAS_PHONE|HAS_SSN]->(pii)<-[:HAS_EMAIL|HAS_PHONE|HAS_SSN]-(c2:Customer)
WHERE c1 <> c2
WITH c1, c2, collect(pii) AS sharedPII, count(pii) AS sharedCount
WHERE sharedCount >= 2
RETURN c1.customerId, c2.customerId, sharedCount,
       [p IN sharedPII | labels(p)[0]] AS sharedTypes
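Because the pattern is symmetric, `c1 <> c2` returns every pair twice, once as (A, B) and once as (B, A). If one row per pair is enough, a possible variant orders the pair by a comparable key. This sketch assumes `customerId` is unique per customer and simplifies the return columns.

```cypher
MATCH (c1:Customer)-[:HAS_EMAIL|HAS_PHONE|HAS_SSN]->(pii)<-[:HAS_EMAIL|HAS_PHONE|HAS_SSN]-(c2:Customer)
// Strict ordering keeps only one orientation of each pair
WHERE c1.customerId < c2.customerId
WITH c1, c2, count(DISTINCT pii) AS sharedCount
WHERE sharedCount >= 2
RETURN c1.customerId, c2.customerId, sharedCount
ORDER BY sharedCount DESC
```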

Transaction Ring Detection

// Circular fund flow (Neo4j 5.9+ quantified path patterns; apoc.coll.toSet requires the APOC plugin)
MATCH path = (a:Account)-[:PERFORM]->(first_tx)
  ((tx_i)-[:BENEFITS_TO]->(a_i)-[:PERFORM]->(tx_j)
    WHERE tx_i.date < tx_j.date)*
  (last_tx)-[:BENEFITS_TO]->(a)
WHERE size(apoc.coll.toSet([a] + a_i)) = size([a] + a_i)
RETURN path

Fund Flow / Money Trail

// Trace where money went from a specific account
MATCH (source:Account {accountNumber: $accNum})
MATCH path = (source)-[:PERFORM]->(tx:Transaction)-[:BENEFITS_TO]->(dest)
RETURN dest, tx.amount, tx.date, labels(dest)[0] AS destType
ORDER BY tx.date DESC

Network Expansion (1-hop, 2-hop)

// All entities within 2 hops of a customer
MATCH path = (c:Customer {customerId: $customerId})-[*1..2]-(connected)
RETURN path

Community Detection (with GDS)

// Project graph
CALL gds.graph.project(
  'fraud-network',
  'Customer',
  {LINKED: {type: 'LINKED', orientation: 'UNDIRECTED'}}
)

// Run Weakly Connected Components
CALL gds.wcc.stream('fraud-network')
YIELD nodeId, componentId
WITH componentId, collect(gds.util.asNode(nodeId).customerId) AS members
WHERE size(members) > 1
RETURN componentId, members, size(members) AS clusterSize
ORDER BY clusterSize DESC

// Clean up projection
CALL gds.graph.drop('fraud-network')

Centrality (PageRank / Betweenness)

// Requires the 'fraud-network' projection to exist, so run this before gds.graph.drop
CALL gds.pageRank.stream('fraud-network')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).customerId AS customerId, score
ORDER BY score DESC
LIMIT 20

Neo4j 5+ Features

Element IDs (replaces internal integer IDs)

// Neo4j 5+: use elementId() instead of id()
MATCH (n:Customer)
WHERE elementId(n) = $elementId
RETURN n

// In Bloom scene actions
MATCH (n)
WHERE elementId(n) IN $nodes
RETURN n

Quantified Path Patterns (5.9+)

// Match paths of variable length with inline predicates
MATCH path = (a:Account)
  (()-[:PERFORM]->(tx:Transaction)-[:BENEFITS_TO]->()
    WHERE tx.amount > 1000){2,5}
  (b:Account)
RETURN path

Temporal Types

// datetime(), date(), time(), duration()
CREATE (tx:Transaction {
  timestamp: datetime(),
  settlementDate: date('2024-03-15'),
  processingTime: duration('PT2H30M')
})

// Filtering by date range
MATCH (tx:Transaction)
WHERE tx.timestamp >= datetime($startDate)
  AND tx.timestamp <= datetime($endDate)
RETURN tx

// Date arithmetic
MATCH (tx:Transaction)
WHERE tx.timestamp >= datetime() - duration({days: 30})
RETURN tx
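Durations can also be computed between two temporal values rather than only subtracted from one. As a small sketch, assuming `tx.timestamp` holds a datetime as in the examples above, the age of each recent transaction in whole days could be derived like this (`ageInDays` is just an illustrative alias):

```cypher
MATCH (tx:Transaction)
WHERE tx.timestamp >= datetime() - duration({days: 30})
// duration.inDays returns a duration; .days extracts the whole-day component
RETURN tx.amount, duration.inDays(tx.timestamp, datetime()).days AS ageInDays
ORDER BY ageInDays
```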

COUNT {} and EXISTS {} Subqueries

// Count subquery (Neo4j 5+)
MATCH (c:Customer)
WHERE COUNT {
  (c)-[:HAS_ACCOUNT]->(a:Account)-[:PERFORM]->(tx:Transaction)
  WHERE tx.amount > 10000
} > 5
RETURN c

// EXISTS subquery
MATCH (c:Customer)
WHERE EXISTS {
  (c)-[:HAS_EMAIL]->(e:Email)<-[:HAS_EMAIL]-(other:Customer)
  WHERE c <> other
}
RETURN c

Anti-Patterns

  1. Collecting without DISTINCT

Multiple OPTIONAL MATCH clauses multiply rows against each other, so the collected lists end up with duplicates:

// BAD: emails × phones duplicates
MATCH (c:Customer)
OPTIONAL MATCH (c)-[:HAS_EMAIL]->(e:Email)
OPTIONAL MATCH (c)-[:HAS_PHONE]->(p:Phone)
RETURN c, collect(e) AS emails, collect(p) AS phones

// GOOD: use DISTINCT (same MATCH clauses as above)
RETURN c, collect(DISTINCT e) AS emails, collect(DISTINCT p) AS phones

  2. MERGE on Too Many Properties

// BAD: if any property differs, creates duplicate
MERGE (c:Customer {customerId: $id, firstName: $first, lastName: $last})

// GOOD: merge on unique key, set other props
MERGE (c:Customer {customerId: $id})
ON CREATE SET c.firstName = $first, c.lastName = $last

  3. Unbounded Variable-Length Paths

// BAD: can explode on connected graphs
MATCH path = (a)-[*]->(b)
RETURN path

// GOOD: always bound the length
MATCH path = (a)-[*1..5]->(b)
RETURN path

  4. Using Labels in WHERE Instead of MATCH

// BAD: scans all nodes then filters
MATCH (n)
WHERE 'Customer' IN labels(n)
RETURN n

// GOOD: label in MATCH uses label index
MATCH (n:Customer)
RETURN n

  5. String Concatenation for Dynamic Queries

// BAD: no plan caching, injection risk
"MATCH (n {id: '" + userId + "'}) RETURN n"

// GOOD: use parameters
MATCH (n {id: $userId})
RETURN n

  6. Loading Too Much Data

// BAD: returns everything
MATCH (n)
RETURN n

// GOOD: limit and paginate
MATCH (n:Customer)
RETURN n
ORDER BY n.customerId
SKIP $offset
LIMIT $pageSize
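SKIP still walks past every skipped row, so deep pages get progressively slower. A commonly used alternative is keyset pagination, sketched below on the assumption that `customerId` is unique and sortable. `$lastSeenId` is a hypothetical parameter holding the last `customerId` of the previous page.

```cypher
MATCH (n:Customer)
// Resume after the last row of the previous page instead of skipping rows
WHERE n.customerId > $lastSeenId
RETURN n
ORDER BY n.customerId
LIMIT $pageSize
```

With an index on `customerId`, the planner may be able to satisfy both the range predicate and the ordering from the index.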
