Audio Mixing Patterns

Install skill "audio-mixing-patterns" with this command: npx skills add yonatangross/orchestkit/yonatangross-orchestkit-audio-mixing-patterns

Comprehensive guide to audio mixing for video production using ffmpeg. Covers narration/music balancing, automatic ducking, timing control, and loudness normalization.

Core Principle

Quality Audio = Clear Narration + Supportive Music + Appropriate Levels

The human voice has a fundamental frequency of roughly 85-255 Hz, with harmonics extending up to about 8 kHz. Music must support, not compete.

Volume Balancing Formula

Standard Video Mix Ratios:

Narration: 100% (reference level)
Music: 15-20% of narration level
SFX: 70-100% of narration level (contextual)

dB Relationships:

Narration: -14 LUFS (dialogue standard)
Music bed: -30 to -35 LUFS (under narration)
Music only: -16 LUFS (no narration sections)
SFX: -18 to -20 LUFS

Volume Multiplier Quick Reference

| Ratio | Multiplier | Use Case |
|-------|------------|----------|
| 100% | 1.0 | Full volume (narration) |
| 70% | 0.7 | Prominent SFX |
| 50% | 0.5 | Equal blend |
| 30% | 0.3 | Noticeable background |
| 20% | 0.2 | Subtle bed (recommended music) |
| 15% | 0.15 | Minimal presence |
| 10% | 0.1 | Barely audible |
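
The multipliers and the loudness targets above are two views of the same gain: a linear multiplier m corresponds to 20 * log10(m) decibels. The sketch below shows the conversion in both directions using awk. The function names `mult_to_db` and `db_to_mult` are hypothetical helpers, not part of ffmpeg.

```shell
# Convert a linear volume multiplier to decibels: dB = 20 * log10(multiplier).
mult_to_db() {
  awk -v m="$1" 'BEGIN { printf "%.2f\n", 20 * log(m) / log(10) }'
}

# Convert a gain in decibels back to a linear multiplier: 10^(dB / 20).
db_to_mult() {
  awk -v d="$1" 'BEGIN { printf "%.3f\n", exp(d * log(10) / 20) }'
}

# Print the dB equivalent of each multiplier in the quick reference.
for m in 1.0 0.7 0.5 0.3 0.2 0.15 0.1; do
  printf '%s -> %s dB\n' "$m" "$(mult_to_db "$m")"
done
```

For instance, the 16 dB gap between narration at -14 LUFS and a music bed at -30 LUFS works out to a multiplier of about 0.158, which lines up with the 15-20% guideline only if both source tracks start at comparable loudness.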

Basic ffmpeg Mixing Commands

Two-Track Mix (Narration + Music)

# Basic mix: narration at full volume, music at 15%
ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "[0:a]volume=1.0[narr];[1:a]volume=0.15[music];[narr][music]amix=inputs=2:duration=first" \
  -c:a aac -b:a 192k output.m4a

Three-Track Mix (Narration + Music + SFX)

ffmpeg -i narration.mp3 -i music.mp3 -i sfx.mp3 \
  -filter_complex "
    [0:a]volume=1.0[narr];
    [1:a]volume=0.15[music];
    [2:a]volume=0.7[sfx];
    [narr][music][sfx]amix=inputs=3:duration=first:weights='3 1 2'" \
  -c:a aac -b:a 192k output.m4a

Timing with adelay Filter

The adelay filter positions audio at precise timestamps.

Syntax

adelay=delays[|delays...][:all=1]

delays: per-channel delays, in milliseconds by default or in samples (with 'S' suffix)

all=1: apply the last specified delay to all remaining channels
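
Since cue sheets are usually written in seconds while adelay takes milliseconds per channel, a small conversion helper can reduce arithmetic slips. This is a minimal sketch; `adelay_arg` is a hypothetical name, and it assumes every channel should receive the same delay.

```shell
# Hypothetical helper: build an adelay argument from a start time in seconds.
# Usage: adelay_arg SECONDS [CHANNELS]   (CHANNELS defaults to 2, i.e. stereo)
adelay_arg() {
  awk -v s="$1" -v ch="${2:-2}" 'BEGIN {
    ms = sprintf("%.0f", s * 1000)          # adelay expects milliseconds by default
    out = ms
    for (i = 1; i < ch; i++) out = out "|" ms   # one delay value per channel
    print "adelay=" out
  }'
}

adelay_arg 5      # stereo, 5 seconds
adelay_arg 5.5 1  # mono, 5.5 seconds
```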

Position Music at Specific Time

# Start music at 5 seconds
ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "
    [0:a]volume=1.0[narr];
    [1:a]adelay=5000|5000,volume=0.15[music];
    [narr][music]amix=inputs=2:duration=first" \
  output.m4a

Multiple Timed Audio Cues

# Narration starts at 0, music at 2s, SFX at 5.5s
ffmpeg -i narration.mp3 -i music.mp3 -i sfx.wav \
  -filter_complex "
    [0:a]volume=1.0[narr];
    [1:a]adelay=2000|2000,volume=0.15[music];
    [2:a]adelay=5500|5500,volume=0.7[sfx];
    [narr][music][sfx]amix=inputs=3:duration=longest" \
  output.m4a
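
When the number of cues grows, the filter graph can be generated rather than typed by hand. The sketch below assumes stereo inputs and a hypothetical `VOLUME:DELAY_MS` argument format, one argument per input in the same order as the `-i` flags; `build_timed_mix` is an illustrative name.

```shell
# Hypothetical helper: each argument is "VOLUME:DELAY_MS" for input N (0-based).
build_timed_mix() {
  graph=""
  labels=""
  i=0
  for spec in "$@"; do
    vol=${spec%%:*}      # text before the first colon
    delay=${spec#*:}     # text after the first colon
    chain="volume=${vol}"
    # Only insert adelay when a non-zero delay was requested.
    if [ "$delay" -gt 0 ]; then
      chain="adelay=${delay}|${delay},${chain}"
    fi
    graph="${graph}[${i}:a]${chain}[a${i}];"
    labels="${labels}[a${i}]"
    i=$((i + 1))
  done
  printf '%s%samix=inputs=%d:duration=longest\n' "$graph" "$labels" "$i"
}

# Same cue layout as the command above: narration at 0, music at 2s, SFX at 5.5s.
build_timed_mix 1.0:0 0.15:2000 0.7:5500
```

The printed string can then be passed to ffmpeg as `-filter_complex "$(build_timed_mix 1.0:0 0.15:2000 0.7:5500)"`.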

Audio Ducking

Automatically lower music when speech is present.

Simple Sidechain Compression (Ducking)

ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "
    [0:a]asplit=2[narr][sc];
    [1:a][sc]sidechaincompress=threshold=0.02:ratio=10:attack=50:release=500[ducked];
    [narr][ducked]amix=inputs=2:duration=first" \
  output.m4a

Parameters Explained

| Parameter | Value | Effect |
|-----------|-------|--------|
| threshold | 0.02 (default 0.125) | Lower = more sensitive to speech |
| ratio | 10:1 | How much to reduce (10:1 = significant duck) |
| attack | 50ms | How fast to duck when speech starts |
| release | 500ms | How fast to return after speech ends |
| knee | 2.82843 (default) | Softness of compression curve |
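
To get a feel for how threshold and ratio interact, the steady-state gain reduction can be estimated with the textbook hard-knee compressor formula: reduction = overshoot * (1 - 1/ratio), where overshoot is how far the sidechain level sits above the threshold in dB. This is an approximation under stated assumptions; it ignores the knee, attack, release, and detector mode, so ffmpeg's actual curve will differ near the threshold. `duck_reduction_db` is a hypothetical helper name.

```shell
# Textbook hard-knee estimate; ignores knee, attack, release and detector mode.
# Usage: duck_reduction_db THRESHOLD_LINEAR RATIO SIDECHAIN_LEVEL_DB
duck_reduction_db() {
  awk -v t="$1" -v r="$2" -v lvl="$3" 'BEGIN {
    thr_db = 20 * log(t) / log(10)   # linear threshold -> dBFS
    over = lvl - thr_db              # how far the speech exceeds the threshold
    if (over < 0) over = 0           # below threshold: no ducking
    printf "%.1f\n", over * (1 - 1 / r)
  }'
}

# Speech at -20 dBFS against threshold=0.02 (about -34 dBFS) and ratio=10.
duck_reduction_db 0.02 10 -20
```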

Advanced Ducking with Precise Control

ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "
    [0:a]asplit=2[narr][sc];
    [1:a]volume=0.5[music_pre];
    [music_pre][sc]sidechaincompress=threshold=0.015:ratio=15:attack=30:release=800:makeup=1:knee=6[ducked];
    [narr][ducked]amix=inputs=2:duration=first:weights='1 0.4'" \
  output.m4a

Mix Ratios by Content Type

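| Content Type | Narration | Music | SFX | Notes |
|--------------|-----------|-------|-----|-------|
| Tutorial/How-to | 100% | 10% | 50% | Voice clarity critical |
| Corporate/Business | 100% | 15% | 60% | Professional presence |
| Social Media | 100% | 20% | 80% | Higher energy |
| Documentary | 100% | 25% | 100% | Cinematic feel |
| Promo/Advertising | 100% | 30% | 100% | Impactful |
| Music Video | 50% | 100% | 80% | Music dominant |
| Podcast | 100% | 5% | 30% | Minimal distraction |
| E-learning | 100% | 8% | 40% | Focus on retention |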
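
One way to apply these ratios consistently is a lookup that turns a content type into a three-track filter graph. The sketch below is illustrative: the short keys (`tutorial`, `corporate`, and so on) and the function names are hypothetical, and it assumes the inputs are ordered narration, music, SFX.

```shell
# Hypothetical lookup mirroring the table above: prints "NARRATION MUSIC SFX".
mix_levels_for() {
  case "$1" in
    tutorial)    echo "1.00 0.10 0.50" ;;
    corporate)   echo "1.00 0.15 0.60" ;;
    social)      echo "1.00 0.20 0.80" ;;
    documentary) echo "1.00 0.25 1.00" ;;
    promo)       echo "1.00 0.30 1.00" ;;
    music-video) echo "0.50 1.00 0.80" ;;
    podcast)     echo "1.00 0.05 0.30" ;;
    elearning)   echo "1.00 0.08 0.40" ;;
    *) echo "unknown content type: $1" >&2; return 1 ;;
  esac
}

# Build a narration + music + SFX filter graph for the given content type.
mix_filter_for() {
  levels=$(mix_levels_for "$1") || return 1
  set -- $levels   # split into the three multipliers
  printf '[0:a]volume=%s[narr];[1:a]volume=%s[music];[2:a]volume=%s[sfx];[narr][music][sfx]amix=inputs=3:duration=first\n' "$1" "$2" "$3"
}

mix_filter_for tutorial
```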

Loudness Normalization (LUFS)

LUFS (Loudness Units relative to Full Scale) is the standard unit for measuring perceived loudness in broadcast and streaming.

Target Levels by Platform

| Platform | Target LUFS | True Peak | Notes |
|----------|-------------|-----------|-------|
| YouTube | -14 LUFS | -1 dB TP | Auto-normalized |
| Spotify | -14 LUFS | -1 dB TP | Loudness penalty applied |
| Apple Music | -16 LUFS | -1 dB TP | Sound Check |
| Broadcast TV | -24 LUFS | -2 dB TP | ATSC A/85 (US); EBU R128 targets -23 LUFS, -1 dB TP |
| Podcast | -16 to -19 LUFS | -1 dB TP | Apple spec |
| TikTok/Reels | -14 LUFS | -1 dB TP | Mobile optimization |
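
The per-platform targets map directly onto the `I` and `TP` options of ffmpeg's loudnorm filter. The sketch below wraps that mapping in a hypothetical `loudnorm_for` helper. The platform keys are illustrative, the podcast entry assumes the upper end of the listed range (-16 LUFS), and `LRA=11` is simply the value used in this guide's commands.

```shell
# Hypothetical helper: print a loudnorm filter string for a target platform.
loudnorm_for() {
  case "$1" in
    youtube|spotify|tiktok|reels) i=-14; tp=-1 ;;
    apple-music|podcast)          i=-16; tp=-1 ;;
    broadcast)                    i=-24; tp=-2 ;;
    *) echo "unknown platform: $1" >&2; return 1 ;;
  esac
  # LRA=11 matches the value used in this guide's commands.
  echo "loudnorm=I=${i}:TP=${tp}:LRA=11"
}

loudnorm_for youtube
```

A call such as `ffmpeg -i input.mp3 -af "$(loudnorm_for podcast)" output.m4a` would then pick up the right targets.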

Loudness Normalization Command

# Normalize to -14 LUFS (YouTube/Spotify standard)
ffmpeg -i input.mp3 \
  -af loudnorm=I=-14:TP=-1:LRA=11 \
  -c:a aac -b:a 192k output.m4a

Two-Pass Normalization (More Accurate)

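# Pass 1: Analyze (the input_* fields of the JSON report feed pass 2)
ffmpeg -i input.mp3 \
  -af loudnorm=I=-14:TP=-1:LRA=11:print_format=json \
  -f null - 2>&1 | grep -A 12 "input_i"

# Pass 2: Apply measured values (replace the example numbers with those from pass 1;
# the reported target_offset can also be supplied as offset=)
ffmpeg -i input.mp3 \
  -af "loudnorm=I=-14:TP=-1:LRA=11:measured_I=-18.5:measured_TP=-2.3:measured_LRA=8.2:measured_thresh=-28.5:linear=true" \
  -c:a aac -b:a 192k output.m4a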
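
Copying the measured values by hand between passes is error-prone, so the pass-2 filter string can be assembled from the pass-1 report. The sketch below assumes the report has already been captured as text and uses the field layout loudnorm prints with `print_format=json`. The helper names are hypothetical, and the sample report is trimmed and illustrative (the `target_offset` value in particular is made up for the example).

```shell
# Pull one string value out of loudnorm's JSON report.
# Usage: json_field FIELD_NAME REPORT_TEXT
json_field() {
  printf '%s\n' "$2" | sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Build the pass-2 filter for a -14 LUFS target from a pass-1 report.
loudnorm_pass2() {
  report=$1
  printf 'loudnorm=I=-14:TP=-1:LRA=11:measured_I=%s:measured_TP=%s:measured_LRA=%s:measured_thresh=%s:offset=%s:linear=true\n' \
    "$(json_field input_i "$report")" \
    "$(json_field input_tp "$report")" \
    "$(json_field input_lra "$report")" \
    "$(json_field input_thresh "$report")" \
    "$(json_field target_offset "$report")"
}

# Illustrative, trimmed report; a real one also contains output_* fields.
sample='{
  "input_i" : "-18.50",
  "input_tp" : "-2.30",
  "input_lra" : "8.20",
  "input_thresh" : "-28.50",
  "target_offset" : "0.50"
}'

loudnorm_pass2 "$sample"
```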

Multi-Track Production Pipeline

Complete Video Audio Mix

ffmpeg -i video.mp4 -i narration.wav -i music.mp3 -i sfx.wav \
  -filter_complex "
    [1:a]volume=1.0,aformat=sample_fmts=fltp:sample_rates=48000:channel_layouts=stereo[narr];
    [2:a]volume=0.15,aformat=sample_fmts=fltp:sample_rates=48000:channel_layouts=stereo[music];
    [3:a]adelay=3000|3000,volume=0.7,aformat=sample_fmts=fltp:sample_rates=48000:channel_layouts=stereo[sfx];
    [narr][music][sfx]amix=inputs=3:duration=first:normalize=0[mixed];
    [mixed]loudnorm=I=-14:TP=-1:LRA=11[final]" \
  -map 0:v -map "[final]" \
  -c:v copy -c:a aac -b:a 192k \
  output.mp4

Audio-Only Master Mix

ffmpeg -i narration.wav -i music.mp3 -i intro_sfx.wav -i outro_sfx.wav \
  -filter_complex "
    [0:a]volume=1.0[narr];
    [1:a]volume=0.15[music];
    [2:a]adelay=0|0,volume=0.8[intro];
    [3:a]adelay=55000|55000,volume=0.8[outro];
    [narr][music][intro][outro]amix=inputs=4:duration=longest:weights='3 1 2 2'[mix];
    [mix]loudnorm=I=-14:TP=-1[final]" \
  -map "[final]" -c:a aac -b:a 256k master_audio.m4a

Quick Reference: Common Patterns

Pattern 1: Narration + Background Music

ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "[0:a]volume=1.0[n];[1:a]volume=0.15[m];[n][m]amix=inputs=2:duration=first" \
  output.m4a

Pattern 2: Music with Auto-Duck

ffmpeg -i narration.mp3 -i music.mp3 \
  -filter_complex "[0:a]asplit=2[n][sc];[1:a][sc]sidechaincompress=threshold=0.02:ratio=10[d];[n][d]amix=inputs=2" \
  output.m4a

Pattern 3: Timed Intro Music Fade

ffmpeg -i narration.mp3 -i intro_music.mp3 \
  -filter_complex "
    [1:a]afade=t=out:st=8:d=2,volume=0.3[intro];
    [0:a]adelay=10000|10000[narr];
    [intro][narr]amix=inputs=2:duration=longest" \
  output.m4a

Pattern 4: Crossfade Between Segments

# Fade in before delaying, so the fade applies to the audio rather than to the silent padding
ffmpeg -i segment1.mp3 -i segment2.mp3 \
  -filter_complex "
    [0:a]afade=t=out:st=28:d=2[s1];
    [1:a]afade=t=in:d=2,adelay=28000|28000[s2];
    [s1][s2]amix=inputs=2:duration=longest" \
  output.m4a
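
The crossfade timings follow from two numbers: the first segment's duration and the fade length. The overlap starts at duration minus fade, and the second segment is delayed by that amount. The sketch below derives the filter graph from those inputs, assuming whole seconds and stereo audio; `crossfade_filter` is a hypothetical helper name.

```shell
# Usage: crossfade_filter FIRST_DURATION_SECONDS FADE_SECONDS   (whole seconds)
crossfade_filter() {
  dur=$1
  fade=$2
  start=$((dur - fade))        # where the overlap begins, in seconds
  delay_ms=$((start * 1000))   # adelay expects milliseconds
  printf '[0:a]afade=t=out:st=%d:d=%d[s1];' "$start" "$fade"
  # Fade in first, then delay, so the fade lands on audio rather than on silent padding.
  printf '[1:a]afade=t=in:d=%d,adelay=%d|%d[s2];' "$fade" "$delay_ms" "$delay_ms"
  printf '[s1][s2]amix=inputs=2:duration=longest\n'
}

# A 30-second first segment with a 2-second crossfade.
crossfade_filter 30 2
```

ffmpeg also ships a dedicated `acrossfade` filter, which may be simpler when the two segments are simply played back to back.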

Troubleshooting

| Issue | Cause | Solution |
|-------|-------|----------|
| Clipping/distortion | Combined levels too high | Reduce individual volumes or add limiter |
| Narration buried | Music too loud | Reduce music to 10-15%, add ducking |
| Hollow/thin sound | Phase cancellation | Check mono compatibility |
| Pumping artifacts | Aggressive ducking | Increase attack/release times |
| Inconsistent levels | No normalization | Apply loudnorm filter |
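
For the clipping case, a rough worst-case check is to sum the volume multipliers of all tracks: if every track peaked at full scale at the same instant and the mix were a plain sum (as with `amix` and `normalize=0`), the combined peak would be 20 * log10(sum) dB above full scale. Real material rarely lines up this badly, so treat the figure as an upper bound. `worst_case_peak_db` is a hypothetical helper name.

```shell
# Usage: worst_case_peak_db MULTIPLIER [MULTIPLIER...]
# Prints the worst-case summed peak in dB relative to full scale.
worst_case_peak_db() {
  printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", 20 * log(sum) / log(10) }'
}

# Narration 1.0 + music 0.15 + SFX 0.7
worst_case_peak_db 1.0 0.15 0.7
```

A positive result suggests lowering the individual volumes or ending the chain with a limiter.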

Add Limiter to Prevent Clipping

ffmpeg -i input.mp3 \
  -af "alimiter=level_in=1:level_out=0.9:limit=0.95:attack=5:release=50" \
  output.m4a

Related Skills

  • video-pacing : Video rhythm and timing patterns

  • remotion-composer : Programmatic video generation

  • demo-producer : Product demo video production

  • thumbnail-first-frame : Video thumbnail optimization

References

  • ffmpeg Filters - Complete audio filter reference

  • Volume Balancing - Detailed formulas and calculations

  • Ducking Patterns - Automatic ducking implementation
