Reading Time Analysis

Guides analysis of eye-tracking reading measures including first fixation, gaze duration, regression path, and total reading time

Purpose

This skill encodes expert methodological knowledge for analyzing eye-tracking data from reading experiments. A competent programmer without psycholinguistics training would likely compute a single "reading time" per word, missing the critical insight that different eye-tracking measures tap different stages of language processing. Choosing the wrong measure for your research question -- or failing to account for spillover effects, skipping patterns, and the distinction between first-pass and second-pass reading -- leads to misattribution of cognitive processes.

When to Use

Use this skill when:

  • Analyzing eye-movement data from reading experiments (sentence or passage reading)
  • Selecting which eye-tracking measures to report for a given linguistic manipulation
  • Defining regions of interest and handling spillover effects
  • Setting up statistical models for eye-tracking reading data
  • Cleaning and filtering fixation data for reading analyses

Do not use this skill when:

  • Analyzing self-paced reading data (see self-paced-reading-designer for that paradigm)
  • Analyzing eye movements in visual search or scene viewing (different fixation patterns)
  • Working with eye-tracking data from non-reading tasks (e.g., visual world paradigm)

Research Planning Protocol

Before executing the domain-specific steps below, you MUST:

  1. State the research question -- What specific question is this analysis/paradigm addressing?
  2. Justify the method choice -- Why is this approach appropriate? What alternatives were considered?
  3. Declare expected outcomes -- What results would support vs. refute the hypothesis?
  4. Note assumptions and limitations -- What does this method assume? Where could it mislead?
  5. Present the plan to the user and WAIT for confirmation before proceeding.

For detailed methodology guidance, see the research-literacy skill.

⚠️ Verification Notice

This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.

Eye-Tracking Reading Measures Hierarchy

Measure Definitions and Cognitive Interpretations

The following measures are ordered from earliest to latest processing stages. This hierarchy reflects the temporal unfolding of language comprehension during reading (Rayner, 1998, 2009; Clifton et al., 2007).

First-Pass Measures (Before Leaving the Region)

| Measure | Definition | Cognitive Process | When to Use |
| --- | --- | --- | --- |
| First Fixation Duration (FFD) | Duration of the first fixation on a word during first pass | Early lexical access; initial contact with the word (Rayner, 1998) | When testing early word recognition effects (frequency, predictability) |
| Single Fixation Duration (SFD) | Duration of the only fixation on a word, when exactly one first-pass fixation occurs | Cleaner measure of early lexical processing than FFD (Rayner, 2009) | When most words receive one fixation; avoids refixation confounds |
| Gaze Duration (GD) | Sum of all first-pass fixation durations on a word (before eyes leave the word in either direction) | Lexical processing / word identification (Rayner, 1998, 2009) | Default first-pass measure for most word-level analyses |

Late Measures (After Leaving the Region)

| Measure | Definition | Cognitive Process | When to Use |
| --- | --- | --- | --- |
| Go-Past Time (GPT) / Regression Path Duration | Time from first fixation on the word until first fixation to the right of the word (includes any regressions out and back) | Integration difficulty; signals reanalysis of prior material (Clifton et al., 2007) | When testing syntactic garden-path effects, semantic anomalies, discourse integration |
| Total Reading Time (TRT) | Sum of all fixation durations on a word (first pass + regressions back) | Overall processing difficulty (Rayner, 1998) | When interested in total processing cost regardless of time course |
| Regression Probability (Reg-out) | Binary: did the reader make a regression from this region? | Reanalysis / comprehension difficulty (Clifton et al., 2007) | When interested in whether (not how long) reanalysis occurred |
| Regression-in Probability | Binary: did the reader regress back to this region from downstream? | Downstream difficulty triggers revisitation (Rayner & Pollatsek, 1989) | When testing whether a region is revisited after later processing fails |

Decision Tree: Which Measure for Which Question?

What stage of processing is your manipulation expected to affect?
|
+-- EARLY LEXICAL (word frequency, orthographic regularity, predictability)
|   |
|   +-- Use GAZE DURATION as primary measure (Rayner, 1998, 2009)
|   +-- Report FIRST FIXATION DURATION as supplementary
|   +-- Report SINGLE FIXATION DURATION if high proportion of
|       single-fixation cases (Rayner, 2009)
|
+-- LATE LEXICAL / POST-LEXICAL (semantic plausibility, thematic fit)
|   |
|   +-- Use GAZE DURATION for early effects
|   +-- Use GO-PAST TIME for integration effects (Clifton et al., 2007)
|   +-- Use TOTAL READING TIME for overall effects
|
+-- SYNTACTIC (garden-path, structural ambiguity, reanalysis)
|   |
|   +-- Use GO-PAST TIME as primary measure (Clifton et al., 2007)
|   +-- Use REGRESSION PROBABILITY as complementary binary measure
|   +-- Effects often appear in the SPILLOVER REGION (1-2 words
|       post-critical; Rayner & Pollatsek, 1989)
|
+-- DISCOURSE / PRAGMATIC (reference resolution, inference, coherence)
|   |
|   +-- Use GO-PAST TIME and TOTAL READING TIME
|   +-- Effects are typically late and may span multiple words
|   +-- Consider REGRESSION-IN probability for earlier regions
|
+-- EXPLORATORY / UNKNOWN TIMING
    |
    +-- Report ALL major measures: FFD, GD, GPT, TRT, Reg-out
    +-- Let the pattern across measures inform process interpretation

First-Pass vs. Second-Pass Distinction

| Category | Definition | Includes |
| --- | --- | --- |
| First pass | All fixations from first entering a region until first leaving it (in either direction) | FFD, SFD, GD |
| Second pass | All fixations on a region after first leaving it | Re-reading time (TRT minus first-pass time) |

Why this matters: First-pass measures reflect initial processing; second-pass measures reflect recovery from processing difficulty encountered downstream. Conflating them obscures when processing difficulty arose.
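The measure definitions and the first-pass / second-pass split above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes fixations have already been assigned to word indices that increase from left to right, and the function name `reading_measures`, the tuple layout, and the example trial are all hypothetical. Undefined measures are returned as `None` rather than zero.

```python
def reading_measures(fixations, target):
    """Compute word-level reading measures for one trial.

    fixations: list of (word_index, duration_ms) tuples in temporal order
               (assumed layout; word indices increase left to right).
    target:    index of the word (region) of interest.
    """
    words = [w for w, _ in fixations]
    durs = [d for _, d in fixations]
    m = dict.fromkeys(
        ["FFD", "SFD", "GD", "GPT", "TRT", "rereading", "reg_out", "reg_in"])

    on_target = [d for w, d in fixations if w == target]
    if not on_target:
        return m  # never fixated: every measure stays missing (None)
    m["TRT"] = sum(on_target)
    # Regression-in: the target is entered from a word further to the right.
    m["reg_in"] = any(words[i] == target and words[i - 1] > target
                      for i in range(1, len(words)))

    # A first pass exists only if the target is fixated before any later word.
    first = next(i for i, w in enumerate(words) if w >= target)
    if words[first] != target:
        return m  # first-pass skip: first-pass measures stay None

    end = first
    while end < len(words) and words[end] == target:
        end += 1  # index just after the first-pass run of fixations
    m["FFD"] = durs[first]
    m["GD"] = sum(durs[first:end])
    m["SFD"] = durs[first] if end - first == 1 else None
    m["rereading"] = m["TRT"] - m["GD"]  # second-pass time
    # Regression-out: the first pass ends with a saccade to an earlier word.
    m["reg_out"] = end < len(words) and words[end] < target
    # Go-past time: everything from first entry until a later word is fixated.
    past = next((i for i in range(first, len(words)) if words[i] > target),
                len(words))
    m["GPT"] = sum(durs[first:past])
    return m


# Hypothetical trial: word 3 is read, regressed from, re-read, then revisited.
trial = [(1, 210), (2, 180), (3, 250), (3, 120), (2, 200),
         (3, 190), (4, 230), (3, 150), (5, 200)]
measures = reading_measures(trial, target=3)
```

For this trial, gaze duration covers only the first two fixations on word 3, while go-past time also includes the regression to word 2 and the return fixation, which is why the two measures diverge when reanalysis occurs.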

Region of Interest (ROI) Definition

Word-Level ROIs

  • The most common unit of analysis is the single word (Rayner, 1998)
  • For multi-word critical regions, report analyses at both word level and region level

Multi-Word ROIs

  • Sometimes necessary for syntactic manipulations where the critical structure spans multiple words
  • Define ROIs a priori based on linguistic structure, not post-hoc based on where effects appear
  • Report the number of characters and words in each ROI

Spillover Effects

Spillover is the delayed manifestation of a processing effect on fixations one or more words downstream of the critical word (Rayner & Pollatsek, 1989).

  • Typical spillover range: 1-2 words after the critical word (Rayner, 1998)
  • Always analyze the spillover region (word n+1, sometimes n+2) in addition to the critical word
  • Spillover is most common for first-pass measures (GD, FFD)
  • Pre-target region (word n-1) should also be checked to verify no confounding baseline differences
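As a small illustration of the region bookkeeping described above, the sketch below derives the pre-target, critical, and spillover word indices for one sentence. The function name `analysis_regions`, the zero-based indexing, and the example sentence are assumptions made for this example.

```python
def analysis_regions(critical, n_words, n_spillover=2):
    """Return word indices for the pre-target, critical, and spillover regions.

    critical:    zero-based index of the critical word (word n).
    n_words:     number of words in the sentence.
    n_spillover: how many downstream words to analyze (1-2 is typical).
    Regions that would fall outside the sentence are dropped.
    """
    regions = {"pre_target": critical - 1, "critical": critical}
    for k in range(1, n_spillover + 1):
        regions[f"spillover_{k}"] = critical + k
    return {name: idx for name, idx in regions.items() if 0 <= idx < n_words}


# Hypothetical item with the critical word "raced" at index 2.
sentence = "The horse raced past the barn fell".split()
regions = analysis_regions(critical=2, n_words=len(sentence))
labelled = {name: sentence[idx] for name, idx in regions.items()}
```

Fixing these indices before looking at the data keeps the spillover analysis a priori rather than post hoc.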

Parafoveal Preview Effects

  • Words are partially processed before they are directly fixated -- the parafoveal preview benefit (Rayner, 1975; Rayner, 2009)
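  • Letter-identity information useful for word identification is obtained up to approximately 7-8 characters to the right of fixation in English, while the full perceptual span extends about 14-15 characters to the right (McConkie & Rayner, 1975; Rayner, 1998)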
  • This means effects of word n's properties can appear on the last fixation of word n-1 (parafoveal-on-foveal effects; Drieghe et al., 2008)

Data Cleaning

Fixation Duration Cutoffs

| Criterion | Value | Rationale | Citation |
| --- | --- | --- | --- |
| Short fixation merge | < 80 ms within 1 character of another fixation: merge with nearest fixation | Too brief for meaningful processing; likely corrective saccade | Rayner & Pollatsek, 1989 |
| Short fixation exclude | < 80 ms (not adjacent to another fixation): exclude | Not informative for reading | Rayner & Pollatsek, 1989 |
| Long fixation exclude | > 800 ms: exclude | Likely track loss, inattention, or blink artifact | Rayner & Pollatsek, 1989 |
| Alternative long cutoff | > 1000 ms or > 1200 ms | Used in some labs; report which cutoff and justify | |

Note: Some researchers use 50 ms as the lower bound and 1000-1200 ms as the upper bound. The critical requirement is to report your exact cutoffs and the percentage of data excluded.
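The cutoff rules above can be sketched as a single cleaning pass. This is one possible reading of the criteria, under stated assumptions: "adjacent" is taken to mean the temporally preceding or following fixation, positions are measured in characters, and long fixations are judged on their raw durations before any merging. The function name `clean_fixations` and the example values are hypothetical.

```python
def clean_fixations(fixations, short_ms=80, long_ms=800, merge_chars=1.0):
    """Apply short/long fixation cutoffs to one trial.

    fixations: list of (char_position, duration_ms) in temporal order
               (assumed layout).
    Returns the cleaned fixations and a count of what was done, so the
    exact cutoffs and the amount of affected data can be reported.
    """
    positions = [x for x, _ in fixations]
    durations = [d for _, d in fixations]
    n = len(fixations)
    normal = [short_ms <= d <= long_ms for d in durations]
    extra = [0] * n  # duration merged into each retained fixation
    counts = {"merged": 0, "short_excluded": 0, "long_excluded": 0}

    for i, d in enumerate(durations):
        if d > long_ms:
            counts["long_excluded"] += 1
        elif d < short_ms:
            # Candidate merge targets: temporal neighbours that are retained
            # and lie within merge_chars characters of the short fixation.
            near = [j for j in (i - 1, i + 1)
                    if 0 <= j < n and normal[j]
                    and abs(positions[j] - positions[i]) <= merge_chars]
            if near:
                nearest = min(near, key=lambda j: abs(positions[j] - positions[i]))
                extra[nearest] += d
                counts["merged"] += 1
            else:
                counts["short_excluded"] += 1

    cleaned = [(positions[i], durations[i] + extra[i])
               for i in range(n) if normal[i]]
    return cleaned, counts


# Hypothetical trial: one mergeable short fixation, one isolated short
# fixation, and one overly long fixation.
raw = [(10.0, 220), (10.8, 60), (18.0, 45), (25.0, 240), (33.0, 950)]
cleaned, counts = clean_fixations(raw)
pct_removed = 100 * (counts["short_excluded"] + counts["long_excluded"]) / len(raw)
```

Keeping the thresholds as parameters makes it easy to rerun the pipeline with the alternative 50 ms or 1000-1200 ms bounds and report which was used.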

Trial-Level Exclusions

| Criterion | Action | Rationale |
| --- | --- | --- |
| Track loss | Exclude trial | Unreliable position data |
| Blinks on critical region | Exclude trial | Missing fixation data on the ROI |
| First-pass skip of critical word | Exclude from first-pass measures (FFD, SFD, GD); include in TRT | Word was not fixated during first pass |
| Comprehension accuracy | Exclude participants below 80% on comprehension questions | Ensures reading for comprehension (Rayner et al., 2006) |

Skipping Rate Considerations

  • Short, high-frequency, and predictable words are skipped 10-30% of the time (Rayner, 1998, 2009)
  • Content words are fixated ~85% of the time (skipped ~15%); function words are fixated only ~35% of the time (skipped ~65%) (Rayner, 1998, 2009)
  • If skipping rates differ across conditions, this is informative -- report it
  • For first-pass measures, words that are skipped contribute no data, not zero reading time
  • Do not substitute zero for skipped words -- this conflates fast processing with no fixation
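A short sketch of why skipped words must stay missing: the example below summarizes gaze duration by condition twice, once with skips treated as missing and once with zeros substituted. The data, condition labels, and the function name `summarise_by_condition` are invented for illustration.

```python
from statistics import mean

# Hypothetical trials; gaze_duration is None when the word was skipped.
trials = [
    {"condition": "high_freq", "gaze_duration": 210},
    {"condition": "high_freq", "gaze_duration": None},
    {"condition": "high_freq", "gaze_duration": 230},
    {"condition": "high_freq", "gaze_duration": None},
    {"condition": "low_freq", "gaze_duration": 260},
    {"condition": "low_freq", "gaze_duration": 280},
    {"condition": "low_freq", "gaze_duration": None},
    {"condition": "low_freq", "gaze_duration": 300},
]


def summarise_by_condition(trials):
    """Report skip rate and mean gaze duration per condition."""
    summary = {}
    for cond in sorted({t["condition"] for t in trials}):
        values = [t["gaze_duration"] for t in trials if t["condition"] == cond]
        fixated = [v for v in values if v is not None]
        summary[cond] = {
            "skip_rate": (len(values) - len(fixated)) / len(values),
            # Correct: skipped words contribute no data.
            "mean_gd": mean(fixated) if fixated else None,
            # Shown only for contrast: zero-filling mixes skipping into the mean.
            "mean_gd_zero_filled": mean(0 if v is None else v for v in values),
        }
    return summary


summary = summarise_by_condition(trials)
```

In this toy data the true difference in mean gaze duration is 60 ms (220 vs. 280), but zero-filling inflates it to 100 ms (110 vs. 210) because the conditions also differ in skipping rate.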

Statistical Modeling

Linear Mixed-Effects Models (LMMs)

Eye-tracking reading data should be analyzed with LMMs with crossed random effects for subjects and items (Baayen et al., 2008):

# R formula (lme4 syntax):
gaze_duration ~ condition + (1 + condition | subject) + (1 + condition | item)

Why crossed random effects: Reading experiments typically use a Latin square design in which each subject sees every item in exactly one condition, with items rotated across conditions between subjects. Both subjects and items are random samples, and both contribute variance (Clark, 1973; Baayen et al., 2008).

Random Effects Structure

| Approach | Specification | When to Use | Citation |
| --- | --- | --- | --- |
| Maximal | Random intercepts + all random slopes justified by design | Default starting point | Barr et al., 2013 |
| Parsimonious | Remove random correlations first, then random slopes that explain ~0 variance | When maximal model fails to converge | Bates et al., 2015; Matuschek et al., 2017 |

Convergence protocol (Barr et al., 2013; Bates et al., 2015):

  1. Fit maximal model (all by-subject and by-item random slopes for within-unit factors)
  2. If convergence fails: remove correlations between random effects (use || in lme4)
  3. If still fails: remove the random slope with the smallest variance component
  4. Report the final model structure and note any simplifications

Distributional Considerations

Reading times are right-skewed and bounded below by zero. Options:

| Approach | When to Use | Citation |
| --- | --- | --- |
| Log-transform | Simple; commonly used; adequate for many datasets | Standard in psycholinguistics |
| Inverse transform (-1000/RT) | Can outperform log for skewed RT data | Baayen & Milin, 2010 |
| Generalized LMM (Gamma) | Models the skewness directly; avoids back-transformation issues | Lo & Andrews, 2015 |
| Raw RT with residual checks | When effects are large and residuals are approximately normal | Baayen et al., 2008 |

Recommendation: Start with raw reading times in the LMM. Check residual plots. If residuals are non-normal, apply log-transformation or fit a GLMM with Gamma family and identity link (Lo & Andrews, 2015).
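To make the two transformations concrete, the sketch below applies the log and the -1000/RT inverse transform to a small set of invented gaze durations and compares sample skewness. It illustrates the transformations only; the Gamma GLMM option needs a mixed-modelling library and is not shown. The helper names and the data are assumptions for this example.

```python
import math
from statistics import mean, pstdev


def log_rt(rt_ms):
    """Natural-log transform of a reading time in milliseconds."""
    return math.log(rt_ms)


def inverse_rt(rt_ms):
    """Inverse transform -1000/RT; the sign keeps slow times numerically larger."""
    return -1000.0 / rt_ms


def skewness(values):
    """Population skewness (standardized third moment)."""
    mu, sd = mean(values), pstdev(values)
    return mean(((v - mu) / sd) ** 3 for v in values)


# Hypothetical right-skewed gaze durations (ms).
raw = [180, 200, 210, 220, 240, 260, 300, 420, 650]
logged = [log_rt(v) for v in raw]
inverse = [inverse_rt(v) for v in raw]

skew_raw = skewness(raw)
skew_log = skewness(logged)
skew_inverse = skewness(inverse)
```

On these values the skewness shrinks from the raw scale to the log scale and again to the inverse scale. Residual plots from the fitted model, not the raw distribution alone, should decide which scale to use.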

Multiple Comparisons

When analyzing multiple reading measures on the same data:

  • Do not apply Bonferroni correction across measures -- each measure tests a different theoretical question (Clifton et al., 2007)
  • Do correct within each measure if testing multiple contrasts
  • Report effect sizes and confidence intervals alongside p-values
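The correction scope described above (within a measure, not across measures) can be sketched as follows. The text does not name a correction method; the Holm step-down procedure is used here purely as an example, and the p-values, contrast labels, and the function name `holm` are invented.

```python
def holm(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[i] = running_max
    return adjusted


# Hypothetical p-values for three contrasts, analyzed on two measures.
p_by_measure = {
    "GD": {"A_vs_B": 0.010, "A_vs_C": 0.040, "B_vs_C": 0.030},
    "GPT": {"A_vs_B": 0.002, "A_vs_C": 0.200, "B_vs_C": 0.015},
}

# Correct across contrasts WITHIN each measure; measures are kept separate.
adjusted_by_measure = {}
for measure, contrasts in p_by_measure.items():
    names = list(contrasts)
    adj = holm([contrasts[name] for name in names])
    adjusted_by_measure[measure] = dict(zip(names, adj))
```

Each measure is corrected for its own three contrasts only, so the family size is 3 rather than 6.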

Typical Fixation Duration Benchmarks

These values serve as sanity checks for data quality (Rayner, 1998, 2009):

| Measure | Typical Range (Silent Reading) | Citation |
| --- | --- | --- |
| Average fixation duration | 200-250 ms | Rayner, 1998, 2009 |
| Average saccade length | 7-9 characters (~2 degrees) | Rayner, 1998, 2009 |
| Regression rate | 10-15% of all saccades | Rayner, 1998 |
| Word skipping rate | Content words ~15%; function words ~65% (fixated only ~35% of the time) | Rayner, 1998, 2009 |
| Fixation duration range | 50-500 ms (bulk of distribution) | Rayner, 1998 |

If your data substantially deviates from these benchmarks, check calibration quality, task instructions, and participant compliance.
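A sanity check against these benchmarks might look like the sketch below. The ranges are copied from the table above and are typical values, not hard limits; the dictionary keys, the function name `check_benchmarks`, and the example summaries are assumptions for illustration.

```python
# Typical ranges for silent reading, taken from the benchmark table.
BENCHMARKS = {
    "mean_fixation_ms": (200, 250),
    "mean_saccade_chars": (7, 9),
    "regression_rate": (0.10, 0.15),
}


def check_benchmarks(summary):
    """Return a list of warnings for values outside the typical ranges.

    summary: dict with any of the BENCHMARKS keys (hypothetical layout);
             keys that are absent are simply not checked.
    """
    flags = []
    for name, (low, high) in BENCHMARKS.items():
        value = summary.get(name)
        if value is not None and not low <= value <= high:
            flags.append(f"{name}={value} outside typical range {low}-{high}")
    return flags


# Hypothetical dataset summaries.
ok = check_benchmarks(
    {"mean_fixation_ms": 232, "mean_saccade_chars": 8.1, "regression_rate": 0.12})
suspect = check_benchmarks(
    {"mean_fixation_ms": 310, "mean_saccade_chars": 8.1, "regression_rate": 0.12})
```

A flag is a prompt to inspect calibration and task compliance, not an automatic exclusion rule, since some populations and materials legitimately fall outside these ranges.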

Common Pitfalls

  1. Using only total reading time: TRT conflates early and late processing. If you only report TRT, you cannot determine when the effect arose. Always report at least one first-pass measure (GD) and one late measure (GPT or TRT) (Clifton et al., 2007).

  2. Ignoring spillover effects: Many effects appear 1-2 words downstream of the critical word, especially for syntactic manipulations. Always analyze the spillover region (Rayner, 1998; Rayner & Pollatsek, 1989).

  3. Substituting zero for skipped words: Skipped words should be treated as missing data for first-pass measures, not as zero reading time. Substituting zero artificially deflates means and inflates variance.

  4. Using ANOVA instead of LMMs: F1/F2 ANOVA is outdated for psycholinguistic data. LMMs with crossed random effects properly handle the variance structure (Baayen et al., 2008; Barr et al., 2013).

  5. Over-interpreting first fixation duration: FFD is contaminated by refixation planning. When a substantial proportion of words receive multiple first-pass fixations, GD is more informative (Rayner, 2009).

  6. Defining ROIs post-hoc: Selecting regions of interest after seeing the data inflates Type I error. Define ROIs a priori based on linguistic theory.

  7. Ignoring comprehension accuracy: If participants are not reading for comprehension (accuracy < 80%), eye-movement patterns are not interpretable as reflecting normal reading processes (Rayner et al., 2006).

  8. Not reporting data loss: Always report the percentage of trials excluded at each cleaning step and the percentage of words skipped in the critical region.
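For the data-loss reporting in pitfall 8, a minimal bookkeeping sketch is shown below. The step labels, counts, and the function name `exclusion_report` are invented; percentages are expressed relative to the starting number of trials, which is one of several reasonable conventions.

```python
def exclusion_report(n_start, steps):
    """Tabulate data loss per cleaning step.

    n_start: number of trials before any exclusion.
    steps:   list of (label, n_removed) in the order the steps were applied.
    Returns rows of (label, n_removed, percent_of_start) and the number
    of trials remaining after all steps.
    """
    remaining = n_start
    rows = []
    for label, removed in steps:
        rows.append((label, removed, 100.0 * removed / n_start))
        remaining -= removed
    return rows, remaining


# Hypothetical counts for one experiment.
rows, remaining = exclusion_report(
    n_start=2000,
    steps=[("track loss", 40), ("blink on critical region", 60)],
)
total_pct = sum(pct for _, _, pct in rows)
```

The same table can be extended with one row per fixation-level cutoff so that every cleaning decision appears in the methods section.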

Minimum Reporting Checklist

Based on Clifton et al. (2007) and current standards in psycholinguistics:

  • Eye-tracker model and sampling rate (1000 Hz recommended; 500 Hz acceptable; Rayner, 2009)
  • Viewing distance and display specifications (font size, characters per degree)
  • Calibration procedure and accuracy threshold (typically < 0.5 degrees average error)
  • Fixation duration cutoffs (lower and upper bounds) with citations
  • Data cleaning steps and percentage of data excluded at each step
  • Skipping rates for the critical region by condition
  • ROI definitions with linguistic justification
  • All relevant reading measures (at minimum: GD, GPT, TRT for the critical region; GD for spillover)
  • Statistical model specification (random effects structure, any transformations)
  • Software for data analysis (with version)
  • Comprehension question accuracy (mean, exclusion threshold)
  • Number of participants and items after exclusions

References

  • Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59, 390-412.
  • Baayen, R. H., & Milin, P. (2010). Analyzing reaction times. International Journal of Psychological Research, 3, 12-28.
  • Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68, 255-278.
  • Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2015). Parsimonious mixed models. arXiv:1506.04967.
  • Clark, H. H. (1973). The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior, 12, 335-359.
  • Clifton, C., Staub, A., & Rayner, K. (2007). Eye movements in reading words and sentences. In R. P. G. van Gompel, M. H. Fischer, W. S. Murray, & R. L. Hill (Eds.), Eye movements: A window on mind and brain. Amsterdam: Elsevier.
  • Drieghe, D., Rayner, K., & Pollatsek, A. (2008). Mislocated fixations can account for parafoveal-on-foveal effects in eye movements during reading. Quarterly Journal of Experimental Psychology, 61, 1239-1249.
  • Lo, S., & Andrews, S. (2015). To transform or not to transform: Using generalized linear mixed models to analyse reaction time data. Frontiers in Psychology, 6, 1171.
  • Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305-315.
  • McConkie, G. W., & Rayner, K. (1975). The span of the effective stimulus during a fixation in reading. Perception & Psychophysics, 17, 578-586.
  • Rayner, K. (1975). The perceptual span and peripheral cues in reading. Cognitive Psychology, 7, 65-81.
  • Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372-422.
  • Rayner, K. (2009). The 35th Sir Frederick Bartlett Lecture: Eye movements and attention in reading, scene perception, and visual search. Quarterly Journal of Experimental Psychology, 62, 1457-1506.
  • Rayner, K., Chace, K. H., Slattery, T. J., & Ashby, J. (2006). Eye movements as reflections of comprehension processes in reading. Scientific Studies of Reading, 10, 241-255.
  • Rayner, K., & Pollatsek, A. (1989). The psychology of reading. Englewood Cliffs, NJ: Prentice Hall.

See references/measure-computation-guide.md for step-by-step computation procedures and worked examples.
