LLVM Optimization Skill

This skill covers LLVM optimization infrastructure, pass development, and performance tuning techniques.

Optimization Pipeline Overview

Pipeline Stages

Source → Frontend → LLVM IR → Optimization Passes → CodeGen → Machine Code
                                      ↓
                              [Transform Passes]
                              [Analysis Passes]

Optimization Levels

No optimization

clang -O0 source.c

Basic optimization (a limited set of passes, between -O0 and -O2)

clang -O1 source.c

Moderate optimization (enables most optimizations, including inlining and vectorization)

clang -O2 source.c

Maximum optimization (may increase code size)

clang -O3 source.c

Size optimization

clang -Os source.c   # Optimize for size
clang -Oz source.c   # Aggressive size optimization

Core Optimization Passes

Scalar Optimizations

Constant Propagation: Replace variables with known constant values
Dead Code Elimination (DCE): Remove unreachable or unused code
Common Subexpression Elimination (CSE): Avoid redundant computations
Instruction Combining: Merge multiple instructions into simpler forms
Scalar Replacement of Aggregates (SROA): Break up aggregate allocations
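
To make the first two items in the list above concrete, here is a minimal standalone sketch of constant propagation and dead code elimination. It does not use LLVM: the `Instr` type and the functions `propagateConstants` and `eliminateDeadCode` are hypothetical names invented for this illustration, and the toy IR is assumed to be in SSA form (each name assigned once) with side-effect-free instructions.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical three-address instruction: dst = lhs op rhs.
// Operands are variable names or non-negative integer literals.
struct Instr {
    std::string dst;
    char op;  // '+' or '*'
    std::string lhs, rhs;
};

// Returns the constant value of an operand if it is a literal or a
// variable already known to be constant.
static std::optional<long> constantOf(const std::string &operand,
                                      const std::map<std::string, long> &known) {
    assert(!operand.empty() && "operand must not be empty");
    if (std::isdigit(static_cast<unsigned char>(operand[0])))
        return std::stol(operand);
    auto it = known.find(operand);
    if (it != known.end())
        return it->second;
    return std::nullopt;
}

// Constant propagation with folding: a destination becomes a known
// constant when both of its operands are constants.
std::map<std::string, long> propagateConstants(const std::vector<Instr> &code) {
    std::map<std::string, long> known;
    for (const Instr &I : code) {
        auto L = constantOf(I.lhs, known);
        auto R = constantOf(I.rhs, known);
        if (L && R)
            known[I.dst] = (I.op == '+') ? *L + *R : *L * *R;
    }
    return known;
}

// Dead code elimination: keep only instructions whose result reaches
// a live-out value, walking backwards so uses are seen before defs.
std::vector<Instr> eliminateDeadCode(const std::vector<Instr> &code,
                                     const std::set<std::string> &liveOut) {
    std::set<std::string> live = liveOut;
    std::vector<Instr> kept;
    for (auto it = code.rbegin(); it != code.rend(); ++it) {
        if (!live.count(it->dst))
            continue;  // result is never used: drop the instruction
        live.insert(it->lhs);
        live.insert(it->rhs);
        kept.insert(kept.begin(), *it);
    }
    return kept;
}
```

For the sequence `a = 2 + 3; b = a * 4; c = x + 1; d = b + x` with `d` as the only live-out value, `a` and `b` fold to constants and `c` is removed because nothing uses it.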

Loop Optimizations

Loop Invariant Code Motion (LICM): Hoist invariant computations
Loop Unrolling: Duplicate loop body to reduce overhead
Loop Vectorization: Convert scalar loops to vector operations
Loop Fusion/Fission: Combine or split loops
Induction Variable Simplification: Optimize loop counters
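
As an illustration of loop invariant code motion from the list above, the sketch below shows the same loop before and after hoisting, written by hand in plain C++. The function names `fillNaive` and `fillHoisted` are made up for this example; in practice the optimizer would be expected to perform this rewrite on the IR rather than the programmer on the source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: scale * scale + offset does not depend on i, yet it is
// written inside the loop body and nominally recomputed each iteration.
std::vector<int> fillNaive(std::size_t n, int scale, int offset) {
    std::vector<int> out(n);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<int>(i) + (scale * scale + offset);
    return out;
}

// After: the invariant expression is computed once, ahead of the loop.
std::vector<int> fillHoisted(std::size_t n, int scale, int offset) {
    std::vector<int> out(n);
    const int invariant = scale * scale + offset;
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<int>(i) + invariant;
    return out;
}
```

Both versions produce identical results, for example `assert(fillNaive(8, 3, 1) == fillHoisted(8, 3, 1));`. Hoisting is safe here because the expression has no side effects and its operands are not modified inside the loop.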

Interprocedural Optimizations

Inlining: Replace call sites with function body
Dead Argument Elimination: Remove unused function parameters
Interprocedural Constant Propagation: Propagate constants across functions
Link-Time Optimization (LTO): Whole-program optimization

Writing Custom Optimization Passes

New Pass Manager (LLVM 13+)

#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Passes/PassPlugin.h"

struct MyOptimizationPass : public llvm::PassInfoMixin<MyOptimizationPass> {
    llvm::PreservedAnalyses run(llvm::Function &F,
                                llvm::FunctionAnalysisManager &FAM) {
        bool Changed = false;

        for (auto &BB : F) {
            // Early-increment iteration allows erasing I while walking the block
            for (auto &I : llvm::make_early_inc_range(BB)) {
                // Implement optimization logic
                if (optimizeInstruction(I)) {
                    Changed = true;
                }
            }
        }

        if (Changed)
            return llvm::PreservedAnalyses::none();
        return llvm::PreservedAnalyses::all();
    }

private:
    bool optimizeInstruction(llvm::Instruction &I) {
        // Example: Replace add x, 0 with x
        if (auto *BinOp = llvm::dyn_cast<llvm::BinaryOperator>(&I)) {
            if (BinOp->getOpcode() == llvm::Instruction::Add) {
                if (auto *C = llvm::dyn_cast<llvm::ConstantInt>(BinOp->getOperand(1))) {
                    if (C->isZero()) {
                        I.replaceAllUsesWith(BinOp->getOperand(0));
                        // The add is now unused; remove it from the block
                        I.eraseFromParent();
                        return true;
                    }
                }
            }
        }
        return false;
    }
};

// Plugin registration
extern "C" LLVM_ATTRIBUTE_WEAK ::llvm::PassPluginLibraryInfo
llvmGetPassPluginInfo() {
    return {LLVM_PLUGIN_API_VERSION, "MyOptPass", LLVM_VERSION_STRING,
            [](llvm::PassBuilder &PB) {
                PB.registerPipelineParsingCallback(
                    [](llvm::StringRef Name, llvm::FunctionPassManager &FPM,
                       llvm::ArrayRef<llvm::PassBuilder::PipelineElement>) {
                        if (Name == "my-opt") {
                            FPM.addPass(MyOptimizationPass());
                            return true;
                        }
                        return false;
                    });
            }};
}

Analysis Dependencies

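// Placeholder result type; a real analysis stores its computed facts here
struct MyAnalysisResult {};

struct MyAnalysis : public llvm::AnalysisInfoMixin<MyAnalysis> {
    using Result = MyAnalysisResult;

    Result run(llvm::Function &F, llvm::FunctionAnalysisManager &FAM) {
        // Compute analysis result
        return Result();
    }

    static llvm::AnalysisKey Key;
};

// The static key needs an out-of-class definition
llvm::AnalysisKey MyAnalysis::Key;

// Using analysis in a pass
// (needs "llvm/IR/Dominators.h", "llvm/Analysis/LoopInfo.h", "llvm/Analysis/AliasAnalysis.h")
llvm::PreservedAnalyses run(llvm::Function &F, llvm::FunctionAnalysisManager &FAM) {
    auto &DT = FAM.getResult<llvm::DominatorTreeAnalysis>(F);
    auto &LI = FAM.getResult<llvm::LoopAnalysis>(F);
    auto &AA = FAM.getResult<llvm::AAManager>(F);

    // Use analysis results...

    // Nothing was modified in this skeleton, so all analyses stay valid
    return llvm::PreservedAnalyses::all();
}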

Instruction Patterns

Strength Reduction

// Replace expensive operations with cheaper ones
// x * 2 → x << 1
// x / 4 → x >> 2   (unsigned division only)
// x % 8 → x & 7    (unsigned remainder only)

// Needs "llvm/IR/IRBuilder.h"
bool reduceStrength(llvm::BinaryOperator *BO) {
    if (BO->getOpcode() == llvm::Instruction::Mul) {
        if (auto *C = llvm::dyn_cast<llvm::ConstantInt>(BO->getOperand(1))) {
            if (C->getValue().isPowerOf2()) {
                unsigned Shift = C->getValue().exactLogBase2();
                // IRBuilder inserts the new shift immediately before BO
                llvm::IRBuilder<> Builder(BO);
                llvm::Value *Shl = Builder.CreateShl(
                    BO->getOperand(0),
                    llvm::ConstantInt::get(C->getType(), Shift));
                BO->replaceAllUsesWith(Shl);
                // BO is now dead; the caller can erase it or leave it to DCE
                return true;
            }
        }
    }
    return false;
}
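
The shift and mask rewrites above are exact only for unsigned values. The following standalone sketch (plain C++, no LLVM; all function names are hypothetical) shows the unsigned forms and one way a signed division by 4 can be expressed with a bias. It assumes `>>` on a negative signed integer is an arithmetic shift, which C++20 guarantees and mainstream compilers have long implemented.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned: each reduced form is exactly equivalent to the original.
constexpr std::uint32_t mul2(std::uint32_t x) { return x << 1; }  // x * 2
constexpr std::uint32_t div4(std::uint32_t x) { return x >> 2; }  // x / 4
constexpr std::uint32_t mod8(std::uint32_t x) { return x & 7u; }  // x % 8

// Signed: division rounds toward zero, while an arithmetic shift rounds
// toward negative infinity. Adding (divisor - 1) to negative inputs
// before shifting corrects the rounding direction.
constexpr std::int32_t sdiv4(std::int32_t x) {
    const std::int32_t bias = (x < 0) ? 3 : 0;
    return (x + bias) >> 2;
}
```

For example, `-7 / 4` is `-1` but `-7 >> 2` is `-2`, whereas `sdiv4(-7)` returns `-1`.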

Algebraic Simplification

// x + 0 → x
// x * 1 → x
// x * 0 → 0
// x - x → 0
// x | x → x
// x & 0 → 0
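
The identities above can be sketched as a small pattern matcher. This is a standalone illustration for integer operands only, not LLVM's InstCombine; the `Operand` type and the helpers `var`, `cst`, and `simplify` are hypothetical names, and constants are assumed to have been canonicalized to the right-hand side.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical operand: a named variable, or an integer constant
// (represented by an empty name).
struct Operand {
    std::string name;
    long value = 0;  // meaningful only for constants
    bool isConst() const { return name.empty(); }
    bool operator==(const Operand &o) const {
        return name == o.name && value == o.value;
    }
};

Operand var(const std::string &n) { return Operand{n, 0}; }
Operand cst(long v) { return Operand{"", v}; }

// Returns the replacement operand when `lhs op rhs` matches one of the
// identities, or nullopt when no rule applies.
std::optional<Operand> simplify(char op, const Operand &lhs, const Operand &rhs) {
    const bool rhsIs0 = rhs.isConst() && rhs.value == 0;
    const bool rhsIs1 = rhs.isConst() && rhs.value == 1;
    switch (op) {
    case '+':
        if (rhsIs0) return lhs;          // x + 0 → x
        break;
    case '*':
        if (rhsIs1) return lhs;          // x * 1 → x
        if (rhsIs0) return cst(0);       // x * 0 → 0
        break;
    case '-':
        if (lhs == rhs) return cst(0);   // x - x → 0
        break;
    case '|':
        if (lhs == rhs) return lhs;      // x | x → x
        break;
    case '&':
        if (rhsIs0) return cst(0);       // x & 0 → 0
        break;
    }
    return std::nullopt;
}
```

A call such as `simplify('-', var("x"), var("x"))` yields the constant 0, while `simplify('+', var("x"), cst(2))` yields no replacement.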

Dominator Tree Usage

Finding Optimization Opportunities

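// Hypothetical helper that performs the actual move, defined elsewhere
void hoistInstruction(llvm::Instruction *I, llvm::DominatorTree &DT);

bool canHoist(llvm::Instruction *I, llvm::DominatorTree &DT) {
    llvm::BasicBlock *DefBB = I->getParent();

    // Check all uses are dominated
    for (auto *U : I->users()) {
        if (auto *UI = llvm::dyn_cast<llvm::Instruction>(U)) {
            if (!DT.dominates(DefBB, UI->getParent())) {
                return false;
            }
        }
    }
    // Dominance alone is not sufficient: a real check must also confirm that
    // the operands are available at the target block and that no store may
    // clobber the loaded memory along the way.
    return true;
}

void optimizeWithDominators(llvm::Function &F, llvm::DominatorTree &DT) {
    // Use dominance for safe code motion
    for (auto &BB : F) {
        // Early-increment iteration stays valid when I is moved out of BB
        for (auto &I : llvm::make_early_inc_range(BB)) {
            if (auto *Load = llvm::dyn_cast<llvm::LoadInst>(&I)) {
                // Check if we can hoist this load
                if (canHoist(Load, DT)) {
                    hoistInstruction(Load, DT);
                }
            }
        }
    }
}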
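
To show what the dominance queries above rest on, here is a minimal standalone sketch of the classic iterative dataflow computation of dominator sets. It is not how `llvm::DominatorTree` is implemented; the function name `computeDominators` and the predecessor-list CFG encoding are assumptions for this example, and every block is assumed to be reachable from the entry block 0.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <utility>
#include <vector>

// preds[b] lists the predecessors of block b; block 0 is the entry.
// Returns dom, where dom[b] is the set of blocks that dominate b.
std::vector<std::set<int>> computeDominators(
    const std::vector<std::vector<int>> &preds) {
    const int n = static_cast<int>(preds.size());
    assert(n > 0 && "CFG needs an entry block");

    std::set<int> all;
    for (int b = 0; b < n; ++b)
        all.insert(b);

    // Start from "everything dominates everything" and shrink to a fixpoint.
    std::vector<std::set<int>> dom(n, all);
    dom[0] = {0};

    bool changed = true;
    while (changed) {
        changed = false;
        for (int b = 1; b < n; ++b) {
            // dom[b] = {b} ∪ intersection of dom[p] over all predecessors p
            std::set<int> meet = all;
            for (int p : preds[b]) {
                std::set<int> tmp;
                std::set_intersection(meet.begin(), meet.end(),
                                      dom[p].begin(), dom[p].end(),
                                      std::inserter(tmp, tmp.begin()));
                meet = std::move(tmp);
            }
            meet.insert(b);
            if (meet != dom[b]) {
                dom[b] = std::move(meet);
                changed = true;
            }
        }
    }
    return dom;
}
```

For a diamond-shaped CFG (0 branches to 1 and 2, both of which join at 3), block 3 is dominated only by 0 and itself, since neither branch is on every path to the join.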

Loop Optimization Techniques

Loop Analysis

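// Needs "llvm/Analysis/LoopInfo.h" and "llvm/Analysis/ScalarEvolution.h"
void analyzeLoops(llvm::Function &F, llvm::LoopInfo &LI,
                  llvm::ScalarEvolution &SE) {
    // Iterating LoopInfo visits top-level loops only
    for (auto *L : LI) {
        // Get loop trip count from ScalarEvolution (0 means not a known small constant)
        if (unsigned TC = SE.getSmallConstantTripCount(L)) {
            llvm::errs() << "Trip count: " << TC << "\n";
        }

        // Check if loop is simple
        if (L->isLoopSimplifyForm()) {
            llvm::BasicBlock *Header = L->getHeader();
            llvm::BasicBlock *Latch = L->getLoopLatch();
            llvm::BasicBlock *Exit = L->getExitBlock();  // null with multiple exit blocks
        }

        // Get induction variables
        llvm::PHINode *IV = L->getCanonicalInductionVariable();  // may be null
    }
}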

Loop Unrolling

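// Manually trigger loop unrolling
#pragma unroll 4
for (int i = 0; i < N; i++) {
    // Loop body will be unrolled 4x
}

; LLVM unroll metadata, attached to the loop's latch branch via !llvm.loop
br i1 %cond, label %loop.body, label %loop.exit, !llvm.loop !0

!0 = distinct !{!0, !1}
!1 = !{!"llvm.loop.unroll.count", i32 4}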
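
To illustrate what an unroll factor of 4 means, the sketch below unrolls a summation loop by hand, with a remainder loop for trip counts that are not a multiple of 4. This is a conceptual picture of the transformation, not the code LLVM emits; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rolled form: one add and one loop-condition check per element.
long sumRolled(const std::vector<int> &v) {
    long sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        sum += v[i];
    return sum;
}

// Unrolled by 4: one loop-condition check per four elements.
long sumUnrolled4(const std::vector<int> &v) {
    const std::size_t n = v.size();
    long sum = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        sum += v[i];
        sum += v[i + 1];
        sum += v[i + 2];
        sum += v[i + 3];
    }
    // Remainder loop handles the last n % 4 elements.
    for (; i < n; ++i)
        sum += v[i];
    return sum;
}
```

Both functions return the same value for any input, so a check like `assert(sumRolled(v) == sumUnrolled4(v));` holds. The unrolled body trades larger code for fewer branches, which is the size concern the optimization levels earlier in this document refer to.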

Vectorization

Auto-Vectorization Hints

// Enable vectorization
#pragma clang loop vectorize(enable)
for (int i = 0; i < N; i++) {
    a[i] = b[i] + c[i];
}

// Specify vector width
#pragma clang loop vectorize_width(8)
for (int i = 0; i < N; i++) {
    a[i] = b[i] * c[i];
}

SLP Vectorization

Superword Level Parallelism - vectorize straight-line code:

// Before SLP
a[0] = b[0] + c[0];
a[1] = b[1] + c[1];
a[2] = b[2] + c[2];
a[3] = b[3] + c[3];

// After SLP (conceptual)
<4 x float> tmp = load <4 x float> b
<4 x float> tmp2 = load <4 x float> c
<4 x float> result = fadd tmp, tmp2
store result to a

Debugging Optimizations

Viewing Pass Execution

Print passes being run

opt -debug-pass-manager input.ll -O2

Print IR after each pass

opt -print-after-all input.ll -O2

Print specific pass output

opt -print-after=instcombine input.ll -O2

Statistics

opt -stats input.ll -O2

Optimization Remarks

Enable all optimization remarks

clang -O2 -Rpass='.*' source.c

Specific remarks

clang -Rpass=loop-vectorize source.c
clang -Rpass-missed=inline source.c
clang -Rpass-analysis=loop-vectorize source.c

Link-Time Optimization (LTO)

Enabling LTO

Full LTO

clang -flto source1.c source2.c -o program

Thin LTO (faster, parallel)

clang -flto=thin source1.c source2.c -o program

LTO Benefits

Whole-program dead code elimination
Interprocedural constant propagation
Cross-module inlining
Better devirtualization

Correctness Verification

Alive2

Automatic verification of LLVM optimizations:

Verify transformation correctness

alive-tv before.ll after.ll

Check specific optimization

opt -passes=instcombine -S input.ll | alive-tv input.ll -

Resources

See Optimization section in README.md for specific commits and optimization-related projects.

Getting Detailed Information

When you need detailed and up-to-date resource links, tool lists, or project references, fetch the latest data from:

https://raw.githubusercontent.com/gmh5225/awesome-llvm-security/refs/heads/main/README.md

This README contains comprehensive curated lists of:

LLVM optimization commits and patches (Optimization section)
Alive2 and verification tools
Optimization courses and tutorials (CSCD70)
