Forensic Audio Research Audio Voice Recovery Best Practices

A guide to audio forensics and voice recovery for low-quality, low-volume, or damaged recordings. It contains 45 rules across 8 categories, prioritized by impact, to guide audio enhancement, forensic analysis, and transcription workflows.

When to Apply

Reference these guidelines when:

Recovering voice from noisy or low-quality recordings
Enhancing audio for transcription or legal evidence
Performing forensic audio authentication
Analyzing recordings for tampering or splices
Building automated audio processing pipelines
Transcribing difficult or degraded speech

Rule Categories by Priority

| Priority | Category | Impact | Prefix | Rules |
|----------|----------|--------|--------|-------|
| 1 | Signal Preservation & Analysis | CRITICAL | signal- | 5 |
| 2 | Noise Profiling & Estimation | CRITICAL | noise- | 5 |
| 3 | Spectral Processing | HIGH | spectral- | 6 |
| 4 | Voice Isolation & Enhancement | HIGH | voice- | 7 |
| 5 | Temporal Processing | MEDIUM-HIGH | temporal- | 5 |
| 6 | Transcription & Recognition | MEDIUM | transcribe- | 5 |
| 7 | Forensic Authentication | MEDIUM | forensic- | 5 |
| 8 | Tool Integration & Automation | LOW-MEDIUM | tool- | 7 |

Quick Reference

Signal Preservation & Analysis (CRITICAL)

signal-preserve-original - Never modify original recording
signal-lossless-format - Use lossless formats for processing
signal-sample-rate - Preserve native sample rate
signal-bit-depth - Use maximum bit depth for processing
signal-analyze-first - Analyze before processing

Noise Profiling & Estimation (CRITICAL)

noise-profile-silence - Extract noise profile from silent segments
noise-identify-type - Identify noise type before reduction
noise-adaptive-estimation - Use adaptive estimation for non-stationary noise
noise-snr-assessment - Measure SNR before and after
noise-avoid-overprocessing - Avoid over-processing and musical artifacts
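
For noise-snr-assessment, the before/after measurement can be sketched as below. This is a minimal illustration, not the bundled scripts' method: it assumes roughly stationary noise, float samples, and that you have already selected a noise-only segment and a speech-plus-noise segment by hand. The function names are hypothetical.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a sequence of float samples."""
    if not samples:
        raise ValueError("segment is empty")
    return sum(s * s for s in samples) / len(samples)


def estimate_snr_db(speech_segment, noise_segment):
    """Estimate SNR in dB from a speech+noise segment and a noise-only segment.

    Assumption: the noise is roughly stationary, so the noise-only power can
    be subtracted from the mixed power to approximate the speech power.
    """
    p_mix = mean_power(speech_segment)
    p_noise = mean_power(noise_segment)
    if p_noise == 0.0:
        return math.inf
    # Guard against a zero or negative estimate when the segments are similar.
    p_speech = max(p_mix - p_noise, 1e-12)
    return 10.0 * math.log10(p_speech / p_noise)
```

Running the same function on the original and on the processed file, with identical time ranges, gives a comparable pair of figures. The difference between them is only meaningful when the segments are the same in both measurements.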

Spectral Processing (HIGH)

spectral-subtraction - Apply spectral subtraction for stationary noise
spectral-wiener-filter - Use Wiener filter for optimal noise estimation
spectral-notch-filter - Apply notch filters for tonal interference
spectral-band-limiting - Apply frequency band limiting for speech
spectral-equalization - Use forensic equalization to restore intelligibility
spectral-declip - Repair clipped audio before other processing
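
As an illustration of spectral-notch-filter, the sketch below removes a single tone (for example mains hum) with a second-order notch. It uses the widely published audio-EQ "cookbook" biquad formulas. The function names, the default Q of 10, and the pure-Python sample loop are choices made here for clarity, not something taken from the rule files.

```python
import math


def notch_coefficients(freq_hz, sample_rate, q=10.0):
    """Return normalized biquad coefficients (b0, b1, b2, a1, a2) for a notch."""
    w0 = 2.0 * math.pi * freq_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    return (1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0,
            -2.0 * cos_w0 / a0, (1.0 - alpha) / a0)


def apply_biquad(samples, coeffs):
    """Filter float samples with a direct-form I biquad."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def rms(samples):
    """Root-mean-square level of a non-empty sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

A higher Q narrows the notch and spares more of the neighbouring speech energy, at the cost of a longer settling time at the start of the file. Hum usually has harmonics, so one notch per harmonic may be needed.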

Voice Isolation & Enhancement (HIGH)

voice-rnnoise - Use RNNoise for real-time ML denoising
voice-dialogue-isolate - Use source separation for complex backgrounds
voice-formant-preserve - Preserve formants during pitch manipulation
voice-dereverb - Apply dereverberation for room echo
voice-enhance-speech - Use AI speech enhancement services for quick results
voice-vad-segment - Use VAD for targeted processing
voice-frequency-boost - Boost frequency regions for specific phonemes
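
To make voice-vad-segment concrete, here is a deliberately simple energy-threshold detector. It is only a sketch: the 20 ms frame size and the -40 dBFS threshold are illustrative assumptions, the function names are hypothetical, and a level threshold alone is unreliable at low SNR, where a trained VAD model is the better choice.

```python
import math


def frame_rms(samples, frame_len):
    """RMS level of each complete, non-overlapping frame."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech_regions(samples, sample_rate, frame_ms=20, threshold_db=-40.0):
    """Return (start_s, end_s) tuples where the frame level exceeds threshold_db.

    samples are floats normalized to [-1.0, 1.0]; threshold_db is in dBFS.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    threshold = 10.0 ** (threshold_db / 20.0)
    levels = frame_rms(samples, frame_len)
    regions = []
    start = None
    for idx, level in enumerate(levels):
        active = level > threshold
        if active and start is None:
            start = idx  # first active frame of a new region
        elif not active and start is not None:
            regions.append((start * frame_len / sample_rate,
                            idx * frame_len / sample_rate))
            start = None
    if start is not None:  # region still open at end of file
        regions.append((start * frame_len / sample_rate,
                        len(levels) * frame_len / sample_rate))
    return regions
```

The returned time ranges can be logged as candidate regions of interest and used to restrict heavier processing to the parts of the recording that contain signal.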

Temporal Processing (MEDIUM-HIGH)

temporal-dynamic-range - Use dynamic range compression for level consistency
temporal-noise-gate - Apply noise gate to silence non-speech segments
temporal-time-stretch - Use time stretching for intelligibility
temporal-transient-repair - Repair transient damage (clicks, pops, dropouts)
temporal-silence-trim - Trim silence and normalize before export

Transcription & Recognition (MEDIUM)

transcribe-whisper - Use Whisper for noise-robust transcription
transcribe-multipass - Use multi-pass transcription for difficult audio
transcribe-segment - Segment audio for targeted transcription
transcribe-confidence - Track confidence scores for uncertain words
transcribe-hallucination - Detect and filter ASR hallucinations
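
One way to approach transcribe-confidence and transcribe-hallucination is to screen the per-segment statistics that the openai-whisper package returns in result["segments"] (avg_logprob, compression_ratio, no_speech_prob). The sketch below works on plain dictionaries so it runs without the model. The function name is hypothetical, and the thresholds are assumptions to tune per case, chosen here to match the values commonly cited as Whisper's own decoding defaults.

```python
def flag_suspect_segments(segments, max_compression_ratio=2.4,
                          min_avg_logprob=-1.0, max_no_speech_prob=0.6):
    """Return segments that look unreliable, each with the reasons it was flagged.

    segments: iterable of dicts shaped like Whisper's result["segments"].
    Flagged segments are candidates for manual review, not automatic deletion.
    """
    flagged = []
    for seg in segments:
        reasons = []
        if seg.get("compression_ratio", 0.0) > max_compression_ratio:
            reasons.append("repetitive text")
        if seg.get("avg_logprob", 0.0) < min_avg_logprob:
            reasons.append("low confidence")
        if seg.get("no_speech_prob", 0.0) > max_no_speech_prob:
            reasons.append("likely non-speech")
        if reasons:
            flagged.append({
                "start": seg.get("start"),
                "end": seg.get("end"),
                "text": seg.get("text", ""),
                "reasons": reasons,
            })
    return flagged
```

In an evidentiary transcript it is usually safer to mark flagged passages as uncertain and re-listen to them than to drop them silently.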

Forensic Authentication (MEDIUM)

forensic-enf-analysis - Use ENF analysis for timestamp verification
forensic-metadata - Extract and verify audio metadata
forensic-tampering - Detect audio tampering and splices
forensic-chain-custody - Document chain of custody for evidence
forensic-speaker-id - Extract speaker characteristics for identification

Tool Integration & Automation (LOW-MEDIUM)

tool-ffmpeg-essentials - Master essential FFmpeg audio commands
tool-sox-commands - Use SoX for advanced audio manipulation
tool-python-pipeline - Build Python audio processing pipelines
tool-audacity-workflow - Use Audacity for visual analysis and manual editing
tool-install-guide - Install audio forensic toolchain
tool-batch-automation - Automate batch processing workflows
tool-quality-assessment - Measure audio quality metrics

Essential Tools

| Tool | Purpose | Install |
|------|---------|---------|
| FFmpeg | Format conversion, filtering | brew install ffmpeg |
| SoX | Noise profiling, effects | brew install sox |
| Whisper | Speech transcription | pip install openai-whisper |
| librosa | Python audio analysis | pip install librosa |
| noisereduce | Spectral-gating noise reduction | pip install noisereduce |
| Audacity | Visual editing | brew install audacity |

Workflow Scripts (Recommended)

Use the bundled scripts to generate objective baselines, create a workflow plan, and verify results.

scripts/preflight_audio.py - Generate a forensic preflight report (JSON or Markdown).
scripts/plan_from_preflight.py - Create a workflow plan template from the preflight report.
scripts/compare_audio.py - Compare objective metrics between baseline and processed audio.

Example usage:

1) Analyze and capture baseline metrics

python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

2) Generate a workflow plan template

python3 skills/.experimental/audio-voice-recovery/scripts/plan_from_preflight.py --preflight preflight.json --out plan.md

3) Compare baseline vs processed metrics

python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md

Forensic Preflight Workflow (Do This Before Any Changes)

Align preflight with SWGDE Best Practices for the Enhancement of Digital Audio (20-a-001) and SWGDE Best Practices for Forensic Audio (08-a-001). Establish an objective baseline and plan the workflow so that processing does not introduce clipping or artifacts, and so that work is not declared complete prematurely. Use scripts/preflight_audio.py to capture baseline metrics, and preserve the report with the case file.

Capture and record before processing:

Record evidence identity and integrity: path, filename, file size, SHA-256 checksum, source, format/container, codec
Record signal integrity: sample rate, bit depth, channels, duration
Measure baseline loudness and levels: LUFS/LKFS, true peak, peak, RMS, dynamic range, DC offset
Detect clipping and document clipped-sample percentage, peak headroom, exact time ranges
Identify noise profile: stationary vs non-stationary, dominant noise bands, SNR estimate
Locate the region of interest (ROI) and document time ranges and changes over time
Inspect spectral content and estimate speech-band energy and intelligibility risk
Scan for temporal defects: dropouts, discontinuities, splices, drift
Evaluate channel correlation and phase anomalies (if stereo)
Extract and preserve metadata: timestamps, device/model tags, embedded notes
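
The identity and signal-integrity fields in the list above can be gathered with the Python standard library alone. The sketch below is not the implementation of scripts/preflight_audio.py. It is a hypothetical helper that assumes an uncompressed PCM WAV file, which is the only format the standard wave module reads.

```python
import hashlib
import os
import wave


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large recordings do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_wav(path):
    """Collect identity and signal-integrity fields for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "path": os.path.abspath(path),
            "filename": os.path.basename(path),
            "file_size_bytes": os.path.getsize(path),
            "sha256": sha256_of_file(path),
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }
```

Calling describe_wav("evidence.wav") and serializing the result with json.dumps gives a record that can be stored with the case file. Codec, container, and embedded metadata for other formats need a tool such as ffprobe.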

Procedure:

1. Prepare a forensic working copy, verify hashes, and preserve the original untouched.
2. Locate ROI and target signal; document exact time ranges and changes across the recording.
3. Assess challenges to intelligibility and signal quality; map challenges to mitigation strategies.
4. Identify required processing and plan a workflow order that avoids unwanted artifacts. Generate a plan draft with scripts/plan_from_preflight.py and complete it with case-specific decisions.
5. Measure baseline loudness and true peak per ITU-R BS.1770 / EBU R 128 and record peak/RMS/DC offset.
6. Detect clipping and dropouts; if clipping is present, declip first or pause and document limitations.
7. Inspect spectral content and noise type; collect representative noise profile segments and estimate SNR.
8. If stereo, evaluate channel correlation and phase; document anomalies.
9. Create a baseline listening log (multiple devices) and define success criteria for intelligibility and listenability.
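
The peak, RMS, DC offset, and clipping checks in the procedure can be sketched as follows. This is a simplified illustration with hypothetical names: it reports sample peak rather than the oversampled true peak defined in ITU-R BS.1770, it does not compute LUFS, and the 0.999 clip threshold is an assumption.

```python
import math


def to_dbfs(value):
    """Convert a linear amplitude (1.0 = full scale) to dBFS."""
    return 20.0 * math.log10(value) if value > 0 else -math.inf


def baseline_levels(samples, clip_threshold=0.999):
    """Basic level metrics for float samples normalized to [-1.0, 1.0]."""
    count = len(samples)
    if count == 0:
        raise ValueError("no samples")
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / count)
    # Samples at or above the threshold are counted as clipped.
    clipped = sum(1 for s in samples if abs(s) >= clip_threshold)
    return {
        "peak_dbfs": to_dbfs(peak),
        "rms_dbfs": to_dbfs(rms),
        "dc_offset": sum(samples) / count,
        "clipped_percent": 100.0 * clipped / count,
        "headroom_db": -to_dbfs(peak),
    }
```

A non-zero clipped percentage is the cue to declip first or to document the limitation, as step 6 of the procedure requires. Exact time ranges of clipping would need the sample indices as well, which this sketch does not return.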

Failure-pattern guardrails:

Do not process until every preflight field is captured.
Document every process, setting, software version, and time segment to enable repeatability.
Compare each processed output to the unprocessed input and assess progress toward intelligibility and listenability.
Avoid over-processing; review removed signal (filter residue) to avoid removing target signal components.
Keep intermediate files uncompressed and preserve sample rate/bit depth when moving between tools.
Perform a final review against the original; if unsatisfactory, revise or stop and report limitations.
If the request is not achievable, communicate limitations and do not declare completion.
Require objective metrics and A/B listening before declaring completion.
Do not rely solely on objective metrics; corroborate with critical listening.
Take listening breaks to avoid ear fatigue during extended reviews.
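
The filter-residue review mentioned above amounts to inspecting input minus output. A minimal sketch, assuming the two signals are time-aligned, equal in length, and at the same sample rate (processing that adds latency or resamples breaks this assumption). The function names are hypothetical.

```python
import math


def residue(before, after):
    """Sample-wise difference: what the processing removed (or added)."""
    if len(before) != len(after):
        raise ValueError("signals must be time-aligned and equal in length")
    return [b - a for b, a in zip(before, after)]


def residue_level_db(before, after):
    """Energy of the residue relative to the input, in dB."""
    removed = residue(before, after)
    energy_in = sum(b * b for b in before)
    energy_removed = sum(r * r for r in removed)
    if energy_in == 0.0 or energy_removed == 0.0:
        return -math.inf
    return 10.0 * math.log10(energy_removed / energy_in)
```

The number only says how much was removed, not what. Writing the residue to a file and listening to it is what reveals whether speech components went out with the noise.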

Quick Enhancement Pipeline

1. Analyze original (run preflight and capture baseline metrics)

python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

2. Create working copy with checksum

cp evidence.wav working.wav
sha256sum evidence.wav > evidence.sha256

3. Apply enhancement

ffmpeg -i working.wav -af "
highpass=f=80,
adeclick=w=55:o=75,
afftdn=nr=12:nf=-30:nt=w,
equalizer=f=2500:t=q:w=1:g=3,
loudnorm=I=-16:TP=-1.5:LRA=11
" enhanced.wav

4. Transcribe

whisper enhanced.wav --model large-v3 --language en

5. Verify original unchanged

sha256sum -c evidence.sha256

6. Verify improvement (objective comparison + A/B listening)

python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md

How to Use

Read individual reference files for detailed explanations and code examples:

Section definitions - Category structure and impact levels
Rule template - Template for adding new rules

Reference Files

| File | Description |
|------|-------------|
| AGENTS.md | Complete compiled guide with all rules |
| references/_sections.md | Category definitions and ordering |
| assets/templates/_template.md | Template for new rules |
| metadata.json | Version and reference information |
