Neuropsychological Battery Selector
Purpose
Selecting a neuropsychological test battery is a clinical judgment task, not a checklist exercise. A competent programmer without clinical neuropsychology training will get this wrong because:
- Not all "memory tests" test the same construct. The CVLT-II/III assesses list learning with encoding strategies; Logical Memory tests narrative recall; the BVMT-R tests visual-spatial memory. Each is sensitive to different lesion profiles (Lezak et al., 2012, Ch. 11).
- Test selection must match the referral question. A dementia screen requires different instruments than a TBI return-to-work evaluation or a pre-surgical epilepsy workup.
- Normative data are not interchangeable. Age, education, cultural background, and premorbid ability all determine which norms to apply and whether a given score is actually impaired (Mitrushina et al., 2005).
- Redundant tests waste time and fatigue patients. Over-testing degrades performance and inflates apparent impairment, particularly in older adults and those with attentional deficits (Strauss et al., 2006).
When to Use This Skill
Use this skill when you need to:
- Select neuropsychological tests matched to a suspected cognitive deficit profile
- Assemble a battery for a specific referral question (dementia differential, TBI, pre-surgical, forensic)
- Advise on which cognitive domains to assess given a neurological condition
- Evaluate whether a proposed battery has adequate domain coverage or problematic redundancy
- Choose between brief screening vs. comprehensive evaluation
Do NOT use this skill for:
- Interpreting test scores (that requires a different skill)
- Diagnosing neurological conditions from test results alone
- Administering tests (this requires licensed clinical training)
Research Planning Protocol
Before executing the domain-specific steps below, you MUST:
- State the research question -- What cognitive domain(s) are being assessed and why?
- Justify the method choice -- Why neuropsychological testing (not neuroimaging, behavioral paradigm)? What alternatives were considered?
- Declare expected outcomes -- What deficit pattern would support the clinical/research hypothesis?
- Note assumptions and limitations -- What does this battery assume about the patient? Where could it mislead?
- Present the plan to the user and WAIT for confirmation before proceeding.
For detailed methodology guidance, see the research-literacy skill.
⚠️ Verification Notice
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
Step 1: Clarify the Referral Question
The referral question determines everything. Map it to one of these categories:
| Referral Type | Primary Goal | Typical Battery Length |
|---|---|---|
| Dementia differential diagnosis | Distinguish AD vs. FTD vs. VaD vs. DLB | 3--4 hours |
| Mild cognitive impairment screening | Detect early decline, track progression | 1.5--2 hours |
| TBI evaluation (acute/subacute) | Document deficits, guide rehabilitation | 2--3 hours |
| TBI evaluation (chronic/forensic) | Quantify residual deficits, effort testing | 4--6 hours |
| Pre-surgical epilepsy workup | Lateralize/localize function, predict risk | 3--5 hours |
| Psychiatric differential | Distinguish cognitive vs. psychiatric etiology | 2--3 hours |
| Return-to-work/fitness-for-duty | Functional capacity in specific domains | 2--4 hours |
(Lezak et al., 2012, Ch. 5; Sweet et al., 2011 -- 78% of neuropsychologists use a flexible battery approach)
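The referral table above can be held as a small lookup so that a planned battery length is checked against the typical range. This is a minimal sketch, not a clinical tool: the dictionary keys, the `ReferralProfile` type, and the `fits_time_budget` helper are hypothetical names introduced here for illustration; only the goals and hour ranges come from the table.

```python
from typing import NamedTuple


class ReferralProfile(NamedTuple):
    primary_goal: str
    hours_min: float
    hours_max: float


# Rows transcribed from the referral table; the short keys are hypothetical labels.
REFERRAL_PROFILES = {
    "dementia_differential": ReferralProfile("Distinguish AD vs. FTD vs. VaD vs. DLB", 3, 4),
    "mci_screening": ReferralProfile("Detect early decline, track progression", 1.5, 2),
    "tbi_acute": ReferralProfile("Document deficits, guide rehabilitation", 2, 3),
    "tbi_forensic": ReferralProfile("Quantify residual deficits, effort testing", 4, 6),
    "presurgical_epilepsy": ReferralProfile("Lateralize/localize function, predict risk", 3, 5),
    "psychiatric_differential": ReferralProfile("Distinguish cognitive vs. psychiatric etiology", 2, 3),
    "return_to_work": ReferralProfile("Functional capacity in specific domains", 2, 4),
}


def fits_time_budget(referral: str, planned_minutes: float) -> bool:
    """Return True if the planned length lies inside the typical range for the referral."""
    profile = REFERRAL_PROFILES[referral]
    return profile.hours_min * 60 <= planned_minutes <= profile.hours_max * 60
```

A plan that falls outside the range is not necessarily wrong; the ranges are described as typical, so a mismatch is a prompt to revisit the referral question rather than a hard failure.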
Step 2: Identify Target Cognitive Domains
Based on the referral question and suspected condition, select domains to assess. Every battery MUST cover at least attention/processing speed, memory, and executive function. Add domains based on the clinical picture.
Cognitive Domain Framework
Attention / Processing Speed
- WAIS-IV Processing Speed Index (Coding, Symbol Search): ~15 min (Wechsler, 2008)
- Trail Making Test Part A: ~3 min (Reitan, 1958; deficient if >78 sec, ages 25--54)
- Continuous Performance Test (CPT-3): ~14 min (Conners, 2014)
- WAIS-IV Digit Span (Forward): ~5 min (Wechsler, 2008)
Executive Function
- Wisconsin Card Sorting Test (WCST-64): ~15 min (Heaton et al., 1993)
- Trail Making Test Part B: ~5 min (Reitan, 1958; deficient if >273 sec, ages 25--54)
- Stroop Color-Word Test: ~5 min (Golden, 1978)
- Tower of London/D-KEFS Tower: ~15 min (Shallice, 1982; Delis et al., 2001)
- Verbal Fluency -- FAS: ~5 min (Benton et al., 1994; mean ~36--44 words total for ages 25--54, education 12+ years)
- Verbal Fluency -- Animals: ~2 min (Strauss et al., 2006; mean ~20--24 animals for ages 25--54)
Memory
- WMS-IV (Logical Memory I & II, Verbal Paired Associates I & II): ~30--45 min including delay (Wechsler, 2009)
- CVLT-II/CVLT-3 (California Verbal Learning Test): ~30 min (Delis et al., 2000/2017)
- RAVLT (Rey Auditory Verbal Learning Test): ~15 min (Rey, 1964; Schmidt, 1996)
- BVMT-R (Brief Visuospatial Memory Test--Revised): ~25 min including delay (Benedict, 1997)
- Logical Memory (WMS-IV): immediate and delayed recall; sensitivity 90--95% for MCI when combined with CVLT (Rabin et al., 2009)
Language
- Boston Naming Test (BNT-60): ~15--20 min (Kaplan et al., 1983)
- Token Test (short form): ~10 min (De Renzi & Vignolo, 1962)
- Controlled Oral Word Association (COWA/FAS): ~5 min (listed above under executive; also indexes language)
- Western Aphasia Battery--Revised (WAB-R): ~30--60 min (Kertesz, 2007; use for suspected aphasia)
Visuospatial / Visuoconstructional
- Rey Complex Figure Test -- Copy: ~5--10 min (Osterrieth, 1944; Meyers & Meyers, 1995)
- WAIS-IV Block Design: ~10 min (Wechsler, 2008)
- Judgment of Line Orientation (JLO): ~15 min (Benton et al., 1994)
- Hooper Visual Organization Test (VOT): ~15 min (Hooper, 1983)
Motor Function
- Grooved Pegboard: ~5 min per hand (Klove, 1963; Ruff & Parker, 1993)
- Finger Tapping Test: ~10 min (Halstead, 1947; Reitan & Wolfson, 1993)
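The requirement that every battery cover attention/processing speed, memory, and executive function can be checked mechanically once the framework is written down as data. The sketch below assumes hypothetical short test labels and domain keys chosen for illustration; the domain membership follows the lists above, including FAS counting toward both executive function and language.

```python
# Domain membership follows the framework above; the short labels are
# hypothetical identifiers introduced for this sketch.
DOMAIN_TESTS = {
    "attention_speed": {"WAIS-IV Coding", "WAIS-IV Symbol Search", "TMT-A", "CPT-3", "Digit Span"},
    "executive": {"WCST-64", "TMT-B", "Stroop", "Tower", "FAS", "Animals"},
    "memory": {"WMS-IV LM", "WMS-IV VPA", "CVLT", "RAVLT", "BVMT-R"},
    "language": {"BNT", "Token Test", "FAS", "WAB-R"},
    "visuospatial": {"RCFT Copy", "Block Design", "JLO", "Hooper VOT"},
    "motor": {"Grooved Pegboard", "Finger Tapping"},
}

# The three domains the text says every battery MUST cover.
MANDATORY = {"attention_speed", "executive", "memory"}


def domains_covered(battery):
    """Return the set of domains with at least one test in the battery."""
    chosen = set(battery)
    return {domain for domain, tests in DOMAIN_TESTS.items() if tests & chosen}


def missing_mandatory(battery):
    """Return the mandatory domains the battery does not cover, sorted by name."""
    return sorted(MANDATORY - domains_covered(battery))
```

For example, `missing_mandatory(["TMT-A", "FAS", "BNT"])` reports that memory is uncovered, while adding a verbal learning test clears the list.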
Step 3: Assemble the Battery
Core Battery (~2--3 hours)
Every evaluation should include these unless contraindicated:
| Domain | Recommended Core Test(s) | Time |
|---|---|---|
| Premorbid estimate | TOPF or WTAR | ~10 min |
| Attention / Processing Speed | TMT-A + WAIS-IV Coding + Digit Span | ~20 min |
| Executive Function | TMT-B + Verbal Fluency (FAS + Animals) + Stroop | ~15 min |
| Verbal Memory | CVLT-II/III or RAVLT | ~30 min |
| Visual Memory | BVMT-R or RCFT recall | ~25 min |
| Language | BNT (30- or 60-item) | ~15 min |
| Visuospatial | RCFT Copy or Block Design | ~10 min |
| Motor | Grooved Pegboard (bilateral) | ~10 min |
| Effort/Validity | TOMM Trial 1 or embedded measures | ~10 min |
| Total | | ~145 min |
Extended Battery (~4--6 hours)
Add these for complex referrals (forensic, dementia differential, pre-surgical):
| Domain | Additional Tests | Time |
|---|---|---|
| Intelligence estimate | WAIS-IV (4 index scores) | ~70 min |
| Memory (expanded) | WMS-IV (full battery) | ~75 min |
| Executive (expanded) | WCST-64 + Tower | ~30 min |
| Language (expanded) | Token Test + WAB-R | ~40 min |
| Visuospatial (expanded) | JLO + Hooper VOT | ~30 min |
| Effort/Validity (expanded) | TOMM (full) + WMT or MSVT | ~30 min |
| Added time | | ~275 min |
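The time budgets in the core and extended tables can be totaled programmatically, which makes it easy to see what a given mix of extended rows costs. This is an illustrative sketch: the per-row minutes are the approximate figures from the tables, and `battery_minutes` is a hypothetical helper name.

```python
# Approximate per-domain minutes transcribed from the core battery table.
CORE_MINUTES = {
    "Premorbid estimate": 10,
    "Attention / Processing Speed": 20,
    "Executive Function": 15,
    "Verbal Memory": 30,
    "Visual Memory": 25,
    "Language": 15,
    "Visuospatial": 10,
    "Motor": 10,
    "Effort/Validity": 10,
}

# Approximate added minutes transcribed from the extended battery table.
EXTENDED_MINUTES = {
    "Intelligence estimate": 70,
    "Memory (expanded)": 75,
    "Executive (expanded)": 30,
    "Language (expanded)": 40,
    "Visuospatial (expanded)": 30,
    "Effort/Validity (expanded)": 30,
}


def battery_minutes(extended_domains=()):
    """Core time plus the added time for the selected extended rows."""
    added = sum(EXTENDED_MINUTES[name] for name in extended_domains)
    return sum(CORE_MINUTES.values()) + added
```

With no extensions this gives 145 minutes, matching the core total. Selecting every extended row gives 145 + 275 = 420 minutes, about 7 hours, which is longer than the 4--6 hours in the section heading; that suggests the extended rows are meant to be chosen per referral rather than administered in full, although the tables do not say so explicitly.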
Assembly Rules
- One verbal learning test: Choose CVLT-II/III OR RAVLT, not both. They measure overlapping constructs (Strauss et al., 2006--778).
- One copy figure: RCFT copy OR Block Design for visuoconstruction screening. Use both only if visuospatial function is the primary question.
- Delay intervals: Schedule verbal memory delay recall (~20--30 min after learning) during non-memory tasks. Same for visual memory delay.
- Fatigue management: Place demanding tests (WCST, CVLT) early. Place motor tests as breaks. Offer rest periods every 60--90 min (Lezak et al., 2012, Ch. 6).
- At least one validity measure: Mandatory. Use TOMM Trial 1 (sensitivity 83%, specificity 93% at cutoff <=40; Denning, 2012) as a minimum. For forensic cases, use two or more PVTs from different modalities (Sweet et al., 2011).
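Three of the rules above (one verbal learning test, one copy figure, at least one validity measure) lend themselves to an automated check of a proposed battery. The following is a rough sketch under stated assumptions: the test labels and the `check_assembly_rules` function are hypothetical, each PVT is listed under a single label, and the "different modalities" requirement for forensic cases is not modeled, only the count.

```python
# Hypothetical labels; one label per instrument.
CVLT_FORMS = {"CVLT-II", "CVLT-3"}
COPY_FIGURES = {"RCFT Copy", "Block Design"}
PVTS = {"TOMM", "WMT", "MSVT", "Reliable Digit Span", "CVLT-II Forced Choice"}


def check_assembly_rules(battery, forensic=False, visuospatial_primary=False):
    """Return a list of rule violations; an empty list means no violation found."""
    tests = set(battery)
    problems = []

    # Rule: choose CVLT-II/III OR RAVLT, not both.
    if "RAVLT" in tests and tests & CVLT_FORMS:
        problems.append("redundant verbal learning tests: CVLT and RAVLT")

    # Rule: both copy figures only when visuospatial function is the primary question.
    if COPY_FIGURES <= tests and not visuospatial_primary:
        problems.append("both RCFT Copy and Block Design without a visuospatial primary question")

    # Rule: at least one validity measure; two or more for forensic cases.
    pvt_count = len(tests & PVTS)
    if pvt_count == 0:
        problems.append("no validity measure")
    elif forensic and pvt_count < 2:
        problems.append("forensic case with fewer than two PVTs")

    return problems
```

The delay-interval and fatigue rules concern ordering rather than membership, so they would need a schedule representation and are left out of this check.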
Step 4: Select Appropriate Norms
Normative Data Decision Tree
- Age: Always match. Most tests provide age-stratified norms.
- Education: Use education-corrected norms when available (e.g., Heaton et al., 2004 norms for TMT, WCST, verbal fluency).
- Premorbid IQ: For patients with estimated IQ far from average, IQ-adjusted norms improve accuracy over education alone. MOANS norms found BNT, Token Test, and JLO correlate more strongly with IQ (r = .47--.61) than with education (r = .24--.31) (Steinberg et al., 2005).
- Cultural/linguistic background: US-normed tests may overestimate impairment in non-English speakers or culturally diverse populations (Lucas et al., 2005; Pena-Casanova et al., 2009). Use population-specific norms when available (e.g., NP-NUMBRS for Spanish speakers).
- Sex: Match when norms are available. Grooved Pegboard shows significant sex differences: women faster than men (Ruff & Parker, 1993). Finger Tapping: men faster, especially in older groups.
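The decision tree above can be read as an ordered series of checks against what a given norm set offers. The sketch below is one possible encoding, not a validated procedure: `norm_adjustments` and its argument names are hypothetical, and the `iq_margin` default of 15 points is an assumption made here, because the text says only "far from average" without giving a threshold.

```python
def norm_adjustments(available, estimated_iq=None, iq_margin=15):
    """List the adjustments to apply, in decision-tree order.

    available: adjustments the chosen norm set supports, for example
        {"age", "education", "iq", "population", "sex"}.
    estimated_iq: premorbid IQ estimate, or None if unknown.
    iq_margin: ASSUMED distance from 100 that counts as "far from average".
    """
    chosen = ["age"]  # age is always matched

    if "education" in available:
        chosen.append("education")

    far_from_average = estimated_iq is not None and abs(estimated_iq - 100) >= iq_margin
    if far_from_average and "iq" in available:
        chosen.append("iq")

    if "population" in available:  # population-specific norms
        chosen.append("population")

    if "sex" in available:
        chosen.append("sex")

    return chosen
```

For instance, a norm set offering age, education, and sex stratification yields `["age", "education", "sex"]`, and IQ adjustment is added only when the estimate departs from 100 by at least the chosen margin.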
Premorbid Estimation
- TOPF (Test of Premorbid Functioning): 70 irregular words, co-normed with WAIS-IV/WMS-IV, IQ range 53--141 (Pearson, 2009). Preferred for current use.
- WTAR (Wechsler Test of Adult Reading): predecessor to TOPF, co-normed with WAIS-III/WMS-III (Wechsler, 2001). Acceptable if TOPF unavailable.
- Caution: Both underestimate premorbid IQ in high-functioning individuals and overestimate in low-functioning individuals (Bright & van der Linde, 2020). Supplement with demographic-based estimates.
Step 5: Address Common Pitfalls
Practice Effects in Serial Assessment
- Practice effects average d = 0.24--0.28 on composite scores at 6--12 month retest intervals (Calamia et al., 2012).
- No consensus on minimum retest interval; effects persist for 2+ years on some measures (Heilbronner et al., 2010).
- Tests most susceptible: PASAT, Stroop interference, verbal fluency, TMT-B (Beglinger et al., 2005).
- Tests least susceptible: Digit Span, Letter-Number Sequencing (Beglinger et al., 2005).
- Mitigation: Use alternate forms (CVLT-II has alternate form; RAVLT has multiple lists). Apply reliable change indices (RCIs) or standardized regression-based norms to interpret change (Chelune et al., 1993).
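To make the reliable change index mentioned above concrete, here is a minimal sketch of a practice-adjusted RCI. It uses the common form in which the standard error of the difference is derived from the baseline standard deviation and the test-retest reliability; whether this matches the exact variant a given manual or paper uses is an assumption, and the function and parameter names are hypothetical.

```python
import math


def reliable_change_index(baseline, retest, sd_baseline, test_retest_r, mean_practice_gain=0.0):
    """Practice-adjusted reliable change index.

    sem     = sd_baseline * sqrt(1 - r)        (standard error of measurement)
    se_diff = sqrt(2) * sem                    (standard error of the difference)
    rci     = ((retest - baseline) - mean_practice_gain) / se_diff
    """
    sem = sd_baseline * math.sqrt(1.0 - test_retest_r)
    se_diff = math.sqrt(2.0) * sem
    return ((retest - baseline) - mean_practice_gain) / se_diff


# Hypothetical values for illustration only, not published norms.
rci = reliable_change_index(baseline=50, retest=44, sd_baseline=10,
                            test_retest_r=0.80, mean_practice_gain=2)
```

With these illustrative numbers the observed drop of 6 points, plus the 2 points of expected practice gain that did not appear, divided by a standard error of about 6.32, gives an RCI of roughly -1.26. The reliability, standard deviation, and practice gain must come from an appropriate normative or retest sample for the specific test.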
Ceiling and Floor Effects
- Ceiling effects: TMT-A and simple attention tests may miss mild deficits in high-functioning individuals. Add more demanding measures (e.g., PASAT, D-KEFS verbal fluency switching) (Strauss et al., 2006).
- Floor effects: WCST and complex tests may be too difficult for moderate-to-severe dementia. Substitute with simpler tasks (e.g., clock drawing, category fluency) (Lezak et al., 2012, Ch. 18).
Ecological Validity
- Neuropsychological tests have modest correlations (r = .3--.5) with real-world functioning (Chaytor & Schmitter-Edgecombe, 2003).
- Supplement with functional measures (e.g., Independent Living Scales, IADL checklists) when the referral question concerns everyday competence.
- Executive function tests have particularly limited ecological validity; consider adding the Behavioral Assessment of the Dysexecutive Syndrome (BADS) or real-world task simulations (Wilson et al., 1996).
Symptom Validity Testing
- TOMM standard cutoff (<45 Trial 2): specificity .96--1.00 but sensitivity only .15--.50 (Tombaugh, 1996). Use Trial 1 cutoff <=40 for better sensitivity (.83) at .93 specificity (Denning, 2012).
- WMT (Word Memory Test): more sensitive than TOMM but higher false-positive rate in genuine MCI/dementia -- 67% of MCI patients classified as "poor effort" at standard cutoffs (Green, 2003). Use hard-easy comparison scores instead (sensitivity/specificity ~95%).
- Embedded PVTs: Reliable Digit Span (RDS <= 7 as cutoff; Greiffenstein et al., 1994), CVLT-II Forced Choice <=15 (Delis et al., 2000). Use multiple embedded measures to supplement standalone PVTs.
- Rule: In forensic and disability evaluations, include at least two standalone PVTs and two embedded PVTs (Larrabee, 2012).
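The sensitivity and specificity figures above translate into predictive values only once a base rate of invalid performance is assumed, which is part of why cutoff choice depends on the setting. The sketch below applies Bayes' rule; the function name is hypothetical, and the base rate of 0.30 is an illustrative assumption, not a figure from the sources cited here.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    false_neg = (1.0 - sensitivity) * base_rate
    true_neg = specificity * (1.0 - base_rate)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# TOMM Trial 1 figures quoted above (.83 sensitivity, .93 specificity);
# the 0.30 base rate is an ASSUMED value for illustration.
ppv, npv = predictive_values(sensitivity=0.83, specificity=0.93, base_rate=0.30)
```

Under that assumed base rate the positive predictive value is about .84 and the negative predictive value about .93. Lower base rates pull the positive predictive value down, which is one argument for requiring multiple PVT failures before concluding that performance is invalid.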
Step 6: Condition-Specific Battery Recommendations
For expected deficit profiles by condition, see references/deficit-profiles.md. Below are summary battery modifications:
| Condition | Add to Core | Remove/De-emphasize | Key Rationale |
|---|---|---|---|
| Alzheimer's (suspected) | WMS-IV full, CVLT-3 intrusion analysis, BNT-60 | May shorten executive battery | Memory encoding/storage is primary deficit (Weintraub et al., 2012) |
| FTD (behavioral variant) | WCST, D-KEFS, social cognition measures, personality inventory | De-emphasize visuospatial | Executive/behavioral profile dominates (Rascovsky et al., 2011) |
| Vascular dementia | Processing speed emphasis (Coding, Symbol Search), TMT-A/B, verbal fluency | May abbreviate language | Processing speed and executive function most affected (Sachdev et al., 2014) |
| TBI (moderate-severe) | CPT-3, PASAT, verbal fluency, motor tests bilateral | None -- broad battery needed | Diffuse deficits: attention, speed, memory, executive (Rabinowitz & Levin, 2014) |
| Temporal lobe epilepsy | Verbal/visual memory (laterality-specific), BNT, verbal fluency | May abbreviate motor | Memory lateralization critical for surgical planning (Jones-Gotman et al., 2010) |
| Parkinson's disease | Verbal fluency (semantic + phonemic), JLO, clock drawing, Grooved Pegboard | None | Dual-syndrome: frontostriatal vs. posterior cortical (Kehagia et al., 2013) |
| Multiple sclerosis | SDMT, PASAT, CVLT-II/III, BVMT-R | May abbreviate language | BICAMS recommended minimum battery (Langdon et al., 2012): SDMT + CVLT-II + BVMT-R |
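The "Add to Core" column above can be applied to a core test list as a simple merge that avoids listing a test twice. This is a sketch for illustration: the condition keys, the subset of rows shown, and the `tailor_battery` helper are hypothetical, and the "Remove/De-emphasize" column is deliberately not automated because it calls for clinical judgment.

```python
# A subset of the "Add to Core" column; keys are hypothetical short labels.
CONDITION_ADDITIONS = {
    "alzheimers": ["WMS-IV full", "CVLT-3 intrusion analysis", "BNT-60"],
    "tbi_moderate_severe": ["CPT-3", "PASAT", "verbal fluency", "motor tests bilateral"],
    "parkinsons": ["Verbal fluency (semantic + phonemic)", "JLO", "clock drawing", "Grooved Pegboard"],
    "multiple_sclerosis": ["SDMT", "PASAT", "CVLT-II/III", "BVMT-R"],
}


def tailor_battery(core, condition):
    """Return the core tests followed by condition-specific additions, without duplicates."""
    tailored = list(core)
    for test in CONDITION_ADDITIONS[condition]:
        if test not in tailored:
            tailored.append(test)
    return tailored
```

For example, a core list that already contains BVMT-R gains only SDMT, PASAT, and CVLT-II/III for multiple sclerosis. Matching is by exact label, so a real implementation would need a consistent naming scheme across the core and condition tables.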
Quick Reference: Test-to-Domain Mapping
For a comprehensive catalog of tests with administration times, normative samples, and sensitivity data, see references/test-catalog.md.
Key References
- Benton, A. L., Sivan, A. B., Hamsher, K., Varney, N. R., & Spreen, O. (1994). Contributions to Neuropsychological Assessment (2nd ed.). Oxford University Press.
- Delis, D. C., Kaplan, E., & Kramer, J. H. (2001). Delis-Kaplan Executive Function System. Pearson.
- Heaton, R. K., Miller, S. W., Taylor, M. J., & Grant, I. (2004). Revised Comprehensive Norms for an Expanded Halstead-Reitan Battery. PAR.
- Heilbronner, R. L., Sweet, J. J., Attix, D. K., Krull, K. R., Henry, G. K., & Hart, R. P. (2010). Official position of the AACN on serial neuropsychological assessments. The Clinical Neuropsychologist, 24, 1267--1278.
- Lezak, M. D., Howieson, D. B., Bigler, E. D., & Tranel, D. (2012). Neuropsychological Assessment (5th ed.). Oxford University Press.
- Mitrushina, M., Boone, K. B., Razani, J., & D'Elia, L. F. (2005). Handbook of Normative Data for Neuropsychological Assessment (2nd ed.). Oxford University Press.
- Strauss, E., Sherman, E. M. S., & Spreen, O. (2006). A Compendium of Neuropsychological Tests (3rd ed.). Oxford University Press.
- Sweet, J. J., Nelson, N. W., & Moberg, P. J. (2011). The TCN/AACN 2010 "salary survey." The Clinical Neuropsychologist, 25, 218--245.