Unit of Work Implementer

When to Use

A Unit of Work (UoW) is the coordinator object that tracks every entity touched during a business operation — newly created, loaded and modified, or deleted — and then flushes all changes to the database together in the correct order inside a single system transaction.

Use this skill when:

You have a Data Mapper layer and need change tracking discipline across a business operation
Your code makes too many database round trips (for example, one UPDATE per field change instead of one per entity at commit)
You are scoping an ORM session (Hibernate Session, EF DbContext, SQLAlchemy Session) and want to understand the underlying contract
You are implementing a custom Data Mapper layer without a framework and need to track dirty objects
You are wiring optimistic locking and need a single commit point to run version-conditioned updates

Prerequisite: Data Mapper must be the chosen data-source pattern. If it has not been selected, invoke data-source-pattern-selector first, or ask the user to confirm their persistence approach before proceeding. UoW adds a coordination layer that Active Record codebases do not need.

Context & Input Gathering

Required

Data-source pattern confirmation: Verify Data Mapper is in use or being introduced. If Active Record: stop, explain UoW is not applicable, offer data-source-pattern-selector.
Language and framework: Which language and ORM (if any)? This determines whether UoW is already built-in (Hibernate, EF Core, SQLAlchemy) or must be hand-rolled.
Entity classes: What are the domain objects (e.g., Order, LineItem, Product)?
FK dependency graph: Which entities reference which? This determines INSERT and DELETE ordering.
Transaction boundary: Where does one business operation begin and end (per-HTTP-request, per-command, per-service call)?

Helpful

Existing mapper or repository classes — their find() / insert() / update() / delete() methods will be called by the UoW on commit.
Any existing dirty-tracking or session management code.
Whether Optimistic Offline Lock (version columns) or Lazy Load (proxy collections) is in use — both require UoW integration.

Defaults if not specified

Unknown ORM → ask before generating stack-specific code; provide pseudocode in the interim.
Unknown FK graph → assume single-level parent/child; flag ordering analysis as required.
Unknown transaction boundary → default to per-request scope; warn about cross-request sharing.

Process

Step 1 — Confirm Data Mapper prerequisite.

WHY: Unit of Work is designed to work with Data Mapper's separation of domain objects from SQL. Active Record embeds persistence in the entity itself; adding a UoW coordinator duplicates responsibility and creates confusion about who owns the save call. Confirming the prerequisite prevents a misapplication that will complicate the codebase.

Data Mapper confirmed → continue.
Active Record found → stop. Explain that AR handles persistence per-object; suggest data-source-pattern-selector if the team wants to reassess.
Unknown → invoke data-source-pattern-selector or ask user directly.

Step 2 — Choose the registration strategy.

WHY: The UoW must know which objects have changed. There are three strategies, each with a different trade-off between transparency and coupling. Choosing the wrong one for the stack and team leads to missed registrations (caller registration) or domain-layer coupling (object registration).

Evaluate each option:

| Strategy | How it works | Best for | Risk |
| --- | --- | --- | --- |
| Caller registration | Application code calls `uow.registerDirty(entity)` explicitly | Simple custom layers, greenfield | Easy to forget; silent data loss |
| Object registration | Entity setters call `UoW.getCurrent().registerDirty(this)` | Custom frameworks; Java/C# domain objects | Couples domain to UoW; requires access to current UoW |
| UoW-controlled (copy-on-load) | UoW registers clean objects on load; detects changes at commit via snapshot comparison | ORM-provided (Hibernate, EF, SQLAlchemy) | Higher memory overhead; infrastructure-heavy |

Decision:

Using Hibernate / EF Core / SQLAlchemy → use the built-in Session/DbContext (UoW-controlled). The skill maps your usage to the UoW contract (see Step 9).
Custom framework → prefer object registration; use caller registration only for simple scripts.
Testing-heavy codebase → consider a no-op UoW for unit tests (does not write to DB on commit).
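The UoW-controlled (copy-on-load) strategy can be illustrated with a minimal sketch. It is not how any particular ORM implements change tracking; `SnapshotTracker`, `register_clean`, and `dirty_entities` are hypothetical names chosen for this example, and the `Order` fields are assumptions.

```python
import copy
from dataclasses import dataclass


@dataclass
class Order:
    id: int
    status: str
    total: float


class SnapshotTracker:
    """Hypothetical helper: remembers each entity's field values at load time."""

    def __init__(self):
        # id(entity) -> (entity, copy of its fields taken at load)
        self._snapshots = {}

    def register_clean(self, entity):
        # Deep-copy the attribute dict so later mutations do not alter the snapshot.
        self._snapshots[id(entity)] = (entity, copy.deepcopy(vars(entity)))

    def dirty_entities(self):
        # An entity is dirty when its current fields differ from its snapshot.
        return [e for e, snap in self._snapshots.values() if vars(e) != snap]


tracker = SnapshotTracker()
a = Order(1, "open", 10.0)
b = Order(2, "open", 20.0)
tracker.register_clean(a)
tracker.register_clean(b)

b.status = "shipped"  # no explicit registerDirty call anywhere
changed = tracker.dirty_entities()  # contains only b
```

The snapshot copy is where the "higher memory overhead" in the table comes from: every loaded entity is held twice until the UoW is cleared.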

Step 3 — Design the UoW API.

WHY: The interface is the contract between domain code and the persistence coordinator. Keeping it minimal and explicit prevents the UoW from becoming a god object.

Minimum API:

registerNew(entity)    — entity will be INSERTed on commit
registerDirty(entity)  — entity will be UPDATEd on commit
registerClean(entity)  — entity is known, no action on commit; populates Identity Map
registerRemoved(entity) — entity will be DELETEd on commit
commit()               — flush all changes in order, then DB COMMIT
rollback()             — discard change sets; no DB writes
clear()                — reset UoW state (call after commit or on request teardown)

Invariant assertions (enforce at registration time):

registerNew: entity must have a non-null ID; must not be in dirty or removed list.
registerDirty: must not be in removed list; no-op if already in new list.
registerRemoved: if the entity is in the new list, remove it from there and stop (no DB write is needed); otherwise remove it from the dirty list, if present, and add it to the removed list.
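The registration invariants above can be sketched as follows. This is a minimal illustration under stated assumptions: the snake_case method names, the `_require` helper, and the `Entity` class are hypothetical, and the Identity Map and commit() are left out to keep the focus on the checks.

```python
class Entity:
    """Hypothetical domain object; only the id matters for this sketch."""

    def __init__(self, id):
        self.id = id


class UnitOfWork:
    """Change lists only; the Identity Map and commit() are omitted for brevity."""

    def __init__(self):
        self.new_objects = []
        self.dirty_objects = []
        self.removed_objects = []

    @staticmethod
    def _require(condition, message):
        if not condition:
            raise ValueError(message)

    def register_new(self, entity):
        self._require(entity.id is not None, "ID must be assigned before registerNew")
        self._require(entity not in self.dirty_objects, "entity is already dirty")
        self._require(entity not in self.removed_objects, "entity is already removed")
        if entity not in self.new_objects:
            self.new_objects.append(entity)

    def register_dirty(self, entity):
        self._require(entity not in self.removed_objects, "entity is already removed")
        # No-op for new objects: the pending INSERT will carry the current state.
        if entity in self.new_objects or entity in self.dirty_objects:
            return
        self.dirty_objects.append(entity)

    def register_removed(self, entity):
        if entity in self.new_objects:
            # Never reached the database, so forgetting it is enough.
            self.new_objects.remove(entity)
            return
        if entity in self.dirty_objects:
            self.dirty_objects.remove(entity)
        if entity not in self.removed_objects:
            self.removed_objects.append(entity)


uow = UnitOfWork()

draft = Entity(1)            # created and deleted within the same operation
uow.register_new(draft)
uow.register_dirty(draft)    # ignored: draft is still in the new list
uow.register_removed(draft)  # cancels the INSERT; no DELETE is queued

loaded = Entity(2)           # stands in for an entity loaded from the database
uow.register_dirty(loaded)
uow.register_removed(loaded) # drops the UPDATE and queues a DELETE
```

After this sequence only `loaded` remains, in the removed list. Raising an exception rather than using a bare `assert` is a design choice here, since Python strips `assert` statements under `-O`.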

Step 4 — Implement four-state tracking.

WHY: The four states map directly to the four SQL operations. Tracking state precisely prevents redundant SQL (e.g., updating an entity that was just inserted) and missed SQL (e.g., forgetting to delete an entity that was removed mid-operation).

Internal storage: three change lists plus the Identity Map (clean objects are tracked only in the Identity Map):

newObjects: List<DomainObject>      → INSERT on commit
dirtyObjects: List<DomainObject>    → UPDATE on commit
removedObjects: List<DomainObject>  → DELETE on commit
identityMap: Map<(Class, Id), DomainObject>  → first-level cache

State transition rules:

Load from DB → registerClean → add to identityMap.
Mutate a clean object → registerDirty → move to dirtyObjects.
Create new → registerNew → add to newObjects AND identityMap.
Delete → registerRemoved → move to removedObjects; remove from dirtyObjects.
Delete a new object (not yet in DB) → remove from newObjects; no DB action needed.

For detailed per-transition code examples see references/entity-state-transitions.md.

Step 5 — Implement the ordered commit procedure.

WHY: Database referential integrity requires that parent rows exist before child rows are inserted, and child rows are deleted before parent rows. Committing in arbitrary order produces FK violation errors. The UoW is the natural place to enforce this ordering because it holds the full change set.

Commit sequence:

1. INSERT newObjects in FK dependency order (parents before children). Use a topological sort of the FK graph for complex schemas; use explicit ordering for small schemas.
2. UPDATE dirtyObjects (order within this set is usually safe; touch each exactly once).
3. DELETE removedObjects in reverse FK dependency order (children before parents).
4. DB COMMIT: issue COMMIT on the system transaction.
5. Clear UoW state: discard all lists and Identity Map entries, or discard the UoW entirely.

For the topological sort algorithm and ordering metadata approach see references/commit-ordering.md.
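As a rough sketch of the commit sequence, the ordering can be computed with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The `FK_GRAPH` metadata, the `(table, primary_key)` tuples, and the `plan_commit` function are assumptions made for this example; a real UoW would call its mappers instead of returning a plan.

```python
from graphlib import TopologicalSorter

# Hypothetical FK metadata: each table maps to the set of tables it references.
FK_GRAPH = {
    "Order": set(),
    "Product": set(),
    "LineItem": {"Order", "Product"},
}


def insert_order(fk_graph):
    # TopologicalSorter expects node -> predecessors, so referenced (parent)
    # tables come out before the tables that reference them.
    return list(TopologicalSorter(fk_graph).static_order())


def plan_commit(new, dirty, removed, fk_graph):
    """Return the ordered list of operations a commit would perform."""
    rank = {table: i for i, table in enumerate(insert_order(fk_graph))}
    by_rank = lambda change: rank[change[0]]

    steps = [("INSERT", t, pk) for t, pk in sorted(new, key=by_rank)]
    steps += [("UPDATE", t, pk) for t, pk in dirty]
    # Deletes run in reverse dependency order: children before parents.
    steps += [("DELETE", t, pk) for t, pk in sorted(removed, key=by_rank, reverse=True)]
    steps.append(("COMMIT", None, None))
    return steps


plan = plan_commit(
    new=[("LineItem", 10), ("Order", 1)],      # registered child-first on purpose
    dirty=[("Product", 5)],
    removed=[("Order", 2), ("LineItem", 20)],  # registered parent-first on purpose
    fk_graph=FK_GRAPH,
)
```

In this plan the Order insert precedes the LineItem insert and the LineItem delete precedes the Order delete, regardless of registration order. `static_order()` raises `graphlib.CycleError` if the FK graph contains a cycle, which would need separate handling (for example, deferred constraints or nullable FKs).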

Step 6 — Wire Identity Map integration.

WHY: Without an Identity Map, loading the same entity twice produces two separate in-memory objects for the same database row. Updating both produces conflicting writes and undefined behavior. The Identity Map, co-located in the UoW, prevents this by ensuring every load returns the same instance.

Implementation:

Key: (entityClass, primaryKey) tuple.
On find(class, id): check Identity Map first. If found → return cached instance. If not → load from DB, call registerClean, add to map, return.
On registerNew: add to map immediately (the new ID must be assigned before registration).
On registerRemoved: remove from map.
The Identity Map also serves as a performance cache (avoids redundant DB reads), but its primary purpose is identity consistency, not performance.
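The find() behavior described above might look like the following sketch. `OrderMapper`, its `load` method, and the in-memory `rows` dictionary standing in for the database are all hypothetical.

```python
class Order:
    def __init__(self, id, status):
        self.id = id
        self.status = status


class OrderMapper:
    """Hypothetical mapper; the 'rows' dict stands in for the database."""

    def __init__(self, rows):
        self.rows = rows
        self.loads = 0  # counts simulated database reads

    def load(self, id):
        self.loads += 1
        return Order(id, self.rows[id])


class UnitOfWork:
    def __init__(self, mappers):
        self.mappers = mappers   # entity class -> mapper
        self.identity_map = {}   # (entity class, primary key) -> instance

    def register_clean(self, entity):
        self.identity_map[(type(entity), entity.id)] = entity

    def find(self, cls, id):
        cached = self.identity_map.get((cls, id))
        if cached is not None:
            return cached                    # same instance on every lookup
        entity = self.mappers[cls].load(id)  # cache miss: read through the mapper
        self.register_clean(entity)
        return entity


mapper = OrderMapper({7: "open"})
uow = UnitOfWork({Order: mapper})

first = uow.find(Order, 7)
second = uow.find(Order, 7)  # served from the Identity Map
first.status = "paid"        # visible through 'second' as well
```

Because both lookups return the same object, a change made through one reference cannot be overwritten by a stale copy held through the other.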

Step 7 — Establish lifecycle management.

WHY: A UoW that spans multiple requests accumulates stale data, grows without bound, and causes race conditions when shared across threads. The lifecycle must be bounded.

Standard lifecycles:

Per-request (most common for web apps): create UoW at request start, commit (or rollback on error) at request end, discard. Never share a UoW across threads.
Per-business-operation: create UoW at the start of a command/service call, commit at end. Useful for non-HTTP contexts (CLI, batch).
Explicit begin/end: for long business transactions that span multiple system transactions, pair with Optimistic Offline Lock patterns (UoW is recreated per system transaction; lock ensures consistency across them).

Anti-pattern: never share a UoW across requests or threads. A shared UoW accumulates dirty objects from multiple users, produces incorrect commits, and leaks memory.

Step 8 — Integrate with collaborators.

WHY: UoW is rarely used in isolation. Two patterns depend on UoW for correct behavior; wiring them explicitly prevents integration bugs.

Optimistic Offline Lock integration:

Each entity tracked by UoW has a version field (integer or timestamp).
On updateDirty, the UPDATE SQL becomes: UPDATE ... SET ..., version = version+1 WHERE id=? AND version=?
If rowsAffected == 0 → collision detected → raise ConcurrencyException, roll back transaction.
The UoW is the correct place to run this check (it owns all UPDATE calls). See optimistic-offline-lock-implementer for the full version-management workflow.
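The version-conditioned UPDATE can be tried out against an in-memory SQLite database. This is a sketch under assumptions: the `orders` table, its columns, and the `update_with_version` function are invented for the example, and an integer version column is assumed.

```python
import sqlite3


class ConcurrencyException(Exception):
    """Raised when the row was changed by someone else since it was loaded."""


def update_with_version(conn, order_id, new_status, expected_version):
    cur = conn.execute(
        "UPDATE orders SET status = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_status, order_id, expected_version),
    )
    if cur.rowcount == 0:
        # No row matched: the version moved on (or the row no longer exists).
        conn.rollback()
        raise ConcurrencyException(f"orders.id={order_id} was modified concurrently")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, version INTEGER)")
conn.execute("INSERT INTO orders VALUES (1, 'open', 1)")
conn.commit()

# First writer loaded version 1 and commits successfully.
update_with_version(conn, 1, "paid", expected_version=1)
conn.commit()

# Second writer also loaded version 1, which is now stale.
try:
    update_with_version(conn, 1, "cancelled", expected_version=1)
    collided = False
except ConcurrencyException:
    collided = True

row = conn.execute("SELECT status, version FROM orders WHERE id = 1").fetchone()
```

The first update wins and bumps the version to 2; the second matches zero rows and is rejected instead of silently overwriting the first writer's change.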

Lazy Load integration:

Lazy proxy collections are populated on first access via a callback into the current UoW/session.
The Identity Map ensures that the populated entity is the same instance that UoW is already tracking — preventing a duplicate-entity trap where a loaded proxy yields a different object than the one already in the dirty list.
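The duplicate-entity trap can be made concrete with a small sketch. `LazyList`, `Session`, `find_line_item`, and `lazy_line_items` are hypothetical names, and the `rows` dictionary stands in for the database; real ORM proxies are considerably more involved.

```python
class LineItem:
    def __init__(self, id, qty):
        self.id = id
        self.qty = qty


class LazyList:
    """Hypothetical lazy collection: runs its loader once, on first access."""

    def __init__(self, loader):
        self._loader = loader
        self._items = None

    def _load(self):
        if self._items is None:
            self._items = self._loader()
        return self._items

    def __iter__(self):
        return iter(self._load())

    def __len__(self):
        return len(self._load())


class Session:
    """Stand-in for the current UoW; 'rows' maps line item id -> quantity."""

    def __init__(self, rows):
        self.rows = rows
        self.identity_map = {}

    def find_line_item(self, id):
        key = (LineItem, id)
        if key not in self.identity_map:
            self.identity_map[key] = LineItem(id, self.rows[id])
        return self.identity_map[key]

    def lazy_line_items(self, ids):
        # The loader goes through find_line_item, and so through the Identity Map.
        return LazyList(lambda: [self.find_line_item(i) for i in ids])


session = Session({1: 2, 2: 5})

tracked = session.find_line_item(1)
tracked.qty = 3                         # modified before the collection is touched

items = session.lazy_line_items([1, 2])
via_proxy = list(items)[0]              # first access triggers the load
```

Because the lazy loader resolves entities through the Identity Map, `via_proxy` is the already-modified `tracked` object rather than a fresh copy with the old quantity.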

Step 9 — Map to your stack's native UoW.

WHY: Most modern stacks include a built-in Unit of Work. Using it directly is far preferable to hand-rolling; the skill's value is understanding the contract so you configure and scope the built-in correctly.

| Stack | UoW Object | Registration strategy | Commit call |
| --- | --- | --- | --- |
| Hibernate (Java) | `Session` | UoW-controlled (snapshot) | `session.flush()` + `tx.commit()` |
| Spring Data JPA | `EntityManager` via `@Transactional` | UoW-controlled | transaction commit |
| EF Core (.NET) | `DbContext` | UoW-controlled (change tracker) | `dbContext.SaveChanges()` |
| SQLAlchemy (Python) | `Session` | UoW-controlled + explicit `session.add()` | `session.commit()` |
| TypeORM (TS/JS) | `EntityManager` / `QueryRunner` | Explicit `save()` / `remove()` calls (no automatic dirty tracking) | `queryRunner.commitTransaction()` |
| Django ORM | No first-class UoW | Per-save explicit | `transaction.atomic()` wrapper |

For Django: use transaction.atomic() to group saves into one transaction, but note that there is no central dirty tracker; bulk_update / bulk_create provide only partial batching.

For stack-specific scoping patterns (request-scoped DbContext in ASP.NET, scoped Session in FastAPI, EntityManager lifecycle in Jakarta EE) see references/stack-native-uow-guide.md.

Inputs

Confirmed data-source pattern: Data Mapper
Entity class list and FK dependency graph
Language and ORM framework (or "none — hand-rolling")
Transaction boundary convention (per-request / per-command / explicit)
Whether Optimistic Offline Lock and/or Lazy Load are in scope

Outputs

UoW Implementation Artifact (written to the codebase or returned inline):

## Unit of Work Implementation Record

### Registration Strategy
[Caller | Object | UoW-controlled] — [rationale]

### API
registerNew(entity) / registerDirty(entity) / registerClean(entity) / registerRemoved(entity)
commit() / rollback() / clear()

### State Tracking
- newObjects: [List<Entity>]
- dirtyObjects: [List<Entity>]
- removedObjects: [List<Entity>]
- identityMap: Map<(Class, Id), Entity>

### Commit Sequence
1. INSERT newObjects in order: [entity order based on FK graph]
2. UPDATE dirtyObjects
3. DELETE removedObjects in reverse order: [reverse FK order]
4. DB COMMIT
5. Clear UoW state

### Lifecycle
[Per-request | Per-command] — [where UoW is created and where it is discarded]

### Stack-Native Equivalent
[If using Hibernate/EF/SQLAlchemy: the built-in Session/DbContext IS the UoW.
Map register/commit calls to the framework's API.]

### Integration Notes
- Optimistic Offline Lock: [version column present / not applicable]
- Lazy Load: [proxy collections wired through session / not applicable]

### Anti-Patterns to Watch
- [ ] Cross-request UoW sharing
- [ ] Missing registerDirty calls (caller registration risk)
- [ ] FK ordering violations on commit
- [ ] UoW not cleared between requests → memory leak + stale data

Key Principles

1. UoW is the database change controller — not individual domain objects. Without a UoW, each domain object decides when to write to the database. This produces excessive round trips, inconsistent ordering, and no natural rollback point. The UoW centralizes that control: domain code mutates objects freely; the UoW decides when and in what order those mutations reach the database.

2. The four states (new, dirty, clean, removed) map exactly to the four SQL operations. Every entity in a business operation is in exactly one of these states. Understanding the state machine prevents double-writes, missed writes, and cascade ordering errors. The UoW enforces the state machine at registration time via assertions.

3. INSERT/DELETE order is determined by FK dependencies, not by the order changes were made. If LineItem references Order, then Order must be inserted before LineItem, and LineItem must be deleted before Order. The UoW must encode or compute this graph. Ignoring it works until it doesn't — a single FK violation on commit surfaces the missing ordering logic.

4. Identity Map is not optional — it is required for correctness. Loading the same row twice into two objects is a correctness bug, not a performance issue. The Identity Map prevents this by making the UoW the single source of truth for in-memory entity identity. Performance caching is a beneficial side-effect, not the purpose.

5. UoW lifecycle must be bounded to one business operation. A UoW that outlives its business operation accumulates stale state and grows unboundedly. Cross-request sharing is especially dangerous in web apps because it causes different users' changes to be committed together. Enforce a clear begin/end boundary and discard the UoW after commit.

6. On modern stacks, use the built-in Session/DbContext — understand its contract, don't fight it. Hibernate Session, EF Core DbContext, and SQLAlchemy Session implement the full UoW + Identity Map contract. The skill's purpose is to understand what they do (so you scope, flush, and clear them correctly) — not to replace them with a hand-rolled alternative.

Examples

Scenario A: Java e-commerce — custom Data Mapper, hand-rolled UoW

Trigger: "We have a Java e-commerce service with Order, LineItem, and Product. We're using hand-rolled Data Mappers (no ORM). After a business operation touches 12 objects, we're making 12 separate UPDATE calls. How do we introduce a Unit of Work?"

Process:

Confirm Data Mapper in place. FK graph: LineItem references Order and Product.
Registration strategy: object registration — setters on Order and LineItem call UoW.getCurrent().registerDirty(this).
UoW API: registerNew / registerDirty / registerClean / registerRemoved / commit().
Identity Map keyed by (Class, Long id); populated on OrderMapper.find(id).
Commit sequence: INSERT Order → INSERT LineItem (Product pre-exists) → UPDATE dirty Orders → UPDATE dirty LineItems → DELETE removed LineItems → DELETE removed Orders → COMMIT.
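Lifecycle: per-HTTP-request via servlet filter. Call UnitOfWork.newCurrent() on request start; on request end call UnitOfWork.getCurrent().commit() if the request succeeded or rollback() if it failed, and call setCurrent(null) in a finally block so the UoW is always discarded.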

Output: Hand-rolled UnitOfWork class with three lists (new/dirty/removed), ThreadLocal storage for current UoW, DomainObject base class with markDirty() / markNew() / markRemoved(), and per-request lifecycle managed by a servlet filter. See references/entity-state-transitions.md for full Java sketch.

Scenario B: Python + SQLAlchemy — scoping the built-in Session

Trigger: "We use SQLAlchemy with a FastAPI app. We're seeing stale data and occasional DetachedInstanceError. How should we scope the Session?"

Process:

Data Mapper confirmed: SQLAlchemy ORM's mapped classes + Session is the UoW.
Registration strategy: UoW-controlled — SQLAlchemy tracks changes automatically; session.add(entity) registers new objects.
Problem diagnosis: Session is likely being shared across requests (application-scoped singleton) rather than per-request.
Fix: use a dependency-injected Session per FastAPI request via Depends(get_db), where get_db yields a session and closes it after the request.
Commit sequence: handled by session.commit() — SQLAlchemy resolves INSERT ordering via mapper relationships; session.flush() pushes SQL without committing for mid-operation ID resolution.
Lazy Load: SQLAlchemy lazy proxies use the session for population; closed or detached sessions trigger DetachedInstanceError. Fix: load eagerly for data needed after session close, or keep session open for the request lifetime.

Output: get_db generator dependency, per-request session scope, session.add for new entities, session.delete for removed, session.commit() at end of each request handler (or in a middleware). Anti-pattern warning: never use a module-level Session instance.

Scenario C: .NET + EF Core — DbContext per-request scoping

Trigger: "We have an ASP.NET Core app with EF Core. We're trying to understand when to call SaveChanges and how to avoid detached entity errors."

Process:

Data Mapper confirmed: EF Core DbContext is the UoW; entities tracked by the change tracker.
Registration strategy: UoW-controlled — EF Core detects changes on tracked entities automatically.
DbContext is registered as Scoped in ASP.NET Core DI → one instance per HTTP request. This is correct.
Commit: await dbContext.SaveChangesAsync() (or call the synchronous dbContext.SaveChanges()) at the end of the service method (or in a controller action). Avoid calling it multiple times per request unless intentional.
Optimistic Offline Lock: add a [Timestamp] or [ConcurrencyCheck] property (or configure IsConcurrencyToken() in the fluent API); EF Core adds the original token value to the UPDATE's WHERE clause automatically and throws DbUpdateConcurrencyException on collision.
Anti-pattern: passing a DbContext from a scoped service into a singleton service → context outlives the request, accumulates stale data.

Output: Confirm Scoped lifetime, single SaveChanges() call per business operation, [Timestamp] or [ConcurrencyCheck] on entities needing optimistic locking, and warning against singleton-scoped DbContext.

References

references/entity-state-transitions.md — Full state machine with Java pseudocode for four entity states and registration assertions
references/commit-ordering.md — Topological sort algorithm for FK-ordered INSERT/DELETE + ordering for Order/LineItem/Product example
references/stack-native-uow-guide.md — Per-stack session scoping patterns (FastAPI, ASP.NET Core, Spring Boot, Jakarta EE, TypeORM)
references/identity-map-implementation.md — Key design choices (explicit vs generic map, one map per class vs per session, inheritance handling)

Related patterns triggered by this skill's output:

If Optimistic Offline Lock needed → optimistic-offline-lock-implementer
If Lazy Load proxies in scope → lazy-load-strategy-implementer
If data-source pattern not yet chosen → data-source-pattern-selector

License

This skill is licensed under CC-BY-SA-4.0. Source: BookForge — Patterns of Enterprise Application Architecture by Martin Fowler et al.

Related BookForge Skills

Install related skills from ClawHub:

clawhub install bookforge-data-source-pattern-selector
clawhub install bookforge-lazy-load-strategy-implementer
clawhub install bookforge-optimistic-offline-lock-implementer

Or install the full book set from GitHub: bookforge-skills