mobile-verification

Automated testing workflow with pass@k metrics for mobile development. Detects flaky tests and ensures code reliability.

Install skill "mobile-verification" with this command: npx skills add ahmed3elshaer/everything-claude-code-mobile/ahmed3elshaer-everything-claude-code-mobile-mobile-verification

Mobile Verification Skill

Comprehensive testing workflow with pass@k metrics for Android development reliability.

Philosophy

Single test runs lie.

A test that passes once might fail tomorrow. Verification loops run tests multiple times to reveal:

  • Flaky tests (timing issues, async problems)
  • Intermittent failures (resource contention)
  • Reliability trends (improving vs degrading)

Pass@k Explained

Pass@k = proportion of the k iterations of a test that passed

Pass@3(test) = iterations_passed / 3

testLogin(): ✓✓✓ → Pass@3 = 3/3 = 1.0 (100%)
testLogout(): ✓✓✗ → Pass@3 = 2/3 = 0.67 (67%)
testRefresh(): ✗✗✗ → Pass@3 = 0/3 = 0.0 (0%)
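The arithmetic above can be sketched in a few lines of Kotlin. The function name `passAtK` and the boolean-list representation of iteration results are assumptions made for illustration; the skill's own implementation is not shown in this document.

```kotlin
// Hypothetical helper: the fraction of iterations that passed.
// Each Boolean is the outcome of one iteration of the same test.
fun passAtK(results: List<Boolean>): Double {
    require(results.isNotEmpty()) { "need at least one iteration" }
    return results.count { it }.toDouble() / results.size
}

fun main() {
    // The three example tests from the text, with k = 3.
    val runs = mapOf(
        "testLogin" to listOf(true, true, true),
        "testLogout" to listOf(true, true, false),
        "testRefresh" to listOf(false, false, false),
    )
    for ((name, results) in runs) {
        println("$name(): Pass@${results.size} = ${passAtK(results)}")
    }
}
```

Representing each iteration as a Boolean keeps the metric independent of the test runner that produced the outcomes.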

Verification Levels

Quick Verification (k=2)

Purpose: Fast feedback during development
Usage: /mobile-verify --k=2
Time: ~2 minutes
When: After small changes, before commit

Standard Verification (k=3)

Purpose: Standard confidence level
Usage: /mobile-verify --k=3
Time: ~5 minutes
When: Before push, after feature complete

Thorough Verification (k=5)

Purpose: High confidence, flaky-test detection
Usage: /mobile-verify --k=5
Time: ~10 minutes
When: Before release, after refactor

Release Verification (k=10)

Purpose: Maximum confidence
Usage: /mobile-verify --k=10
Time: ~20 minutes
When: Production release, critical bug fixes
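The four levels can be captured as data, which makes it easier to pick a k value from a script. This is only a sketch: the enum, the `levelFor` function, and the stage names ("commit", "push", "pre-release", "production") are hypothetical and not part of the skill. The k values and approximate times are copied from the levels above.

```kotlin
// Hypothetical mapping of the four verification levels to their k values.
enum class VerificationLevel(val k: Int, val approxMinutes: Int) {
    QUICK(2, 2),
    STANDARD(3, 5),
    THOROUGH(5, 10),
    RELEASE(10, 20);

    // Builds the slash command in the form shown in this document.
    fun command(): String = "/mobile-verify --k=$k"
}

// Assumed stage names, chosen for illustration only.
fun levelFor(stage: String): VerificationLevel = when (stage) {
    "commit" -> VerificationLevel.QUICK
    "push" -> VerificationLevel.STANDARD
    "pre-release" -> VerificationLevel.THOROUGH
    "production" -> VerificationLevel.RELEASE
    else -> VerificationLevel.STANDARD // default to the standard level
}

fun main() {
    println(levelFor("push").command()) // prints "/mobile-verify --k=3"
}
```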

Test Type Strategies

Unit Tests (JUnit)

Characteristics:

  • Fast: ~1-2 seconds per test
  • Isolated: No Android dependencies
  • Reliable: Should be Pass@k = 1.0

Target Pass@k: ≥ 0.95 (95%)

Common Flaky Causes:

  • Async operations without proper waiting
  • Date/time dependencies
  • Random data generation
  • Static state leakage

Fix Strategies:

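import kotlinx.coroutines.test.advanceUntilIdle
import kotlinx.coroutines.test.runTest
import org.junit.Assert.assertTrue
import org.junit.Test

// Bad: flaky. The assertion races the coroutine started by loadData().
@Test
fun testLoadData() {
    viewModel.loadData()
    assertTrue(viewModel.state.value is Loaded)
}

// Good: stable. advanceUntilIdle() runs all pending coroutines before asserting.
// Assumes the ViewModel's coroutines run on the test dispatcher.
@Test
fun testLoadData() = runTest {
    viewModel.loadData()
    advanceUntilIdle()
    assertTrue(viewModel.state.value is Loaded)
}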
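The coroutine fix above covers the async cause. For the date/time and random-data causes in the same list, one common approach is to inject the clock and the random source so that every iteration sees the same inputs. The sketch below uses only `java.time.Clock` and `kotlin.random.Random`; the `TokenFactory` class is a hypothetical stand-in for code under test.

```kotlin
import java.time.Clock
import java.time.Instant
import java.time.ZoneOffset
import kotlin.random.Random

// Hypothetical class under test: time and randomness are injected,
// not read from global state.
class TokenFactory(private val clock: Clock, private val random: Random) {
    fun newToken(): String = "${clock.instant().epochSecond}-${random.nextInt(1000)}"

    fun isExpired(issuedAt: Instant, ttlSeconds: Long): Boolean =
        clock.instant().isAfter(issuedAt.plusSeconds(ttlSeconds))
}

fun main() {
    val fixedNow = Instant.parse("2024-01-01T00:00:00Z")
    val clock = Clock.fixed(fixedNow, ZoneOffset.UTC)

    // Same seed and same clock give the same token on every iteration.
    val first = TokenFactory(clock, Random(42)).newToken()
    val second = TokenFactory(clock, Random(42)).newToken()
    check(first == second)

    // Expiry is decided against the fixed instant, not the wall clock.
    val factory = TokenFactory(clock, Random(42))
    check(!factory.isExpired(fixedNow.minusSeconds(30), ttlSeconds = 60))
    check(factory.isExpired(fixedNow.minusSeconds(90), ttlSeconds = 60))
    println("deterministic token: $first")
}
```

In production code the same constructor would typically receive `Clock.systemUTC()` and `Random.Default`.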

UI Tests (Espresso)

Characteristics:

  • Slow: ~5-10 seconds per test
  • Device-dependent: Need emulator/device
  • Fragile: UI changes break tests

Target Pass@k: ≥ 0.80 (80%)

Common Flaky Causes:

  • Idling resource not registered
  • Animation interference
  • Screen rotation
  • Network timeouts

Fix Strategies:

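// Register idling resources so Espresso waits for in-flight API calls
val countingIdlingResource = CountingIdlingResource("api")

@Before
fun registerIdlingResource() { IdlingRegistry.getInstance().register(countingIdlingResource) }

@After
fun unregisterIdlingResource() { IdlingRegistry.getInstance().unregister(countingIdlingResource) }

// Disable animations (DisableAnimationsRule is a project-defined rule, not part of Espresso)
@get:Rule
val disableAnimationsRule = DisableAnimationsRule()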

Compose Tests

Characteristics:

  • Fast: ~1-3 seconds per test
  • UI-level: Tests Composable behavior
  • Modern: Uses Compose Testing framework

Target Pass@k: ≥ 0.90 (90%)

Common Flaky Causes:

  • Recomposition timing
  • State hoisting issues
  • Animation interference

Fix Strategies:

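// Wraps the content under test with LocalInspectionMode = true, so composables
// that check it can skip animations or other asynchronous work
@Composable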
fun TestComposable(content: @Composable () -> Unit) {
    CompositionLocalProvider(
        LocalInspectionMode provides true
    ) {
        content()
    }
}

Verification Workflow

During Development

# 1. Write test
# 2. Quick verify
/mobile-verify --class=NewTest --k=2

# 3. Fix if fails
# 4. Standard verify
/mobile-verify --class=NewTest --k=3

Before Commit

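# Verify only the module containing the first changed file
# (assumes the top-level directory name is the module name)
/mobile-verify --module=$(git diff --name-only | head -1 | cut -d/ -f1) --k=2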

Before Push

# Full verification
/mobile-verify --k=3

Before Release

# Thorough verification with flaky detection
/mobile-verify --k=5 --flaky

Interpreting Results

Pass@k Scores

| Score | Meaning | Action |
|---|---|---|
| 1.0 | Perfect | Celebrate |
| 0.8-0.9 | Excellent | Monitor |
| 0.6-0.7 | Good | Investigate |
| 0.4-0.5 | Fair | Fix needed |
| 0.0-0.3 | Poor | Block release |
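The score bands above can be turned into a small lookup. The table leaves gaps between bands (for example between 0.9 and 1.0), so this sketch assumes each band's lower bound acts as a threshold and that in-between scores fall into the band below. That reading, along with the names `Verdict` and `interpret`, is an assumption and is not stated by the source.

```kotlin
data class Verdict(val meaning: String, val action: String)

// Assumption: each band's lower bound is treated as a threshold,
// so a score such as 0.95 is reported as "Excellent".
fun interpret(score: Double): Verdict {
    require(score in 0.0..1.0) { "score must be within 0.0..1.0" }
    return when {
        score >= 1.0 -> Verdict("Perfect", "Celebrate")
        score >= 0.8 -> Verdict("Excellent", "Monitor")
        score >= 0.6 -> Verdict("Good", "Investigate")
        score >= 0.4 -> Verdict("Fair", "Fix needed")
        else -> Verdict("Poor", "Block release")
    }
}

fun main() {
    // testLogout() from the earlier example: 2 of 3 iterations passed.
    println(interpret(2.0 / 3.0)) // prints "Verdict(meaning=Good, action=Investigate)"
}
```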

Trends

Track pass@k over time:

Week 1: Pass@3 = 0.85
Week 2: Pass@3 = 0.87  ↗ Improving
Week 3: Pass@3 = 0.82  ↘ Degraded - investigate!
Week 4: Pass@3 = 0.88  ↗ Recovered
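A trend label like the arrows above can be derived by comparing each week with the one before it. The sketch below is one simple way to do that. The function name `trends` and the label strings are hypothetical, and a real tracker might prefer a tolerance band to avoid flagging tiny changes.

```kotlin
// Hypothetical helper: labels each score against the previous one.
fun trends(scores: List<Double>): List<String> =
    scores.zipWithNext { prev, next ->
        when {
            next > prev -> "improving"
            next < prev -> "degraded"
            else -> "flat"
        }
    }

fun main() {
    // Weekly Pass@3 values from the example above.
    val weekly = listOf(0.85, 0.87, 0.82, 0.88)
    println(trends(weekly)) // prints "[improving, degraded, improving]"
}
```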

Flaky Test Patterns

| Pattern | Likely Cause |
|---|---|
| Fails on iteration 1 only | Cold start issue |
| Fails randomly | Async timing |
| Fails on specific iteration | Resource leak |
| Fails in parallel only | Shared state |
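Part of this pattern matching can be automated from the per-iteration outcomes. The sketch below separates stable tests from flaky ones and flags the "fails on iteration 1 only" case. It is a rough heuristic under stated assumptions: the enum and function names are hypothetical, and telling apart the remaining patterns (specific iteration, parallel only) would need data from several runs, which is not modeled here.

```kotlin
enum class Stability { STABLE_PASS, STABLE_FAIL, COLD_START_SUSPECT, FLAKY }

// Classifies one test from its ordered iteration outcomes (true = passed).
fun classify(results: List<Boolean>): Stability {
    require(results.isNotEmpty()) { "need at least one iteration" }
    return when {
        results.all { it } -> Stability.STABLE_PASS
        results.none { it } -> Stability.STABLE_FAIL
        // Only the first iteration failed: matches the cold start pattern.
        !results.first() && results.drop(1).all { it } -> Stability.COLD_START_SUSPECT
        else -> Stability.FLAKY
    }
}

fun main() {
    println(classify(listOf(false, true, true, true, true))) // prints "COLD_START_SUSPECT"
    println(classify(listOf(true, false, true, true, false))) // prints "FLAKY"
}
```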

Fixing Flaky Tests

Step 1: Identify Pattern

/mobile-verify --flaky --k=10

Look for patterns in failures.

Step 2: Add Diagnostics

@Test
fun flakyTest() = runTest {
    val startTime = System.currentTimeMillis()
    // ... test code ...
    val duration = System.currentTimeMillis() - startTime
    Log.d("Test", "Duration: $duration ms")  // Check for timing issues
}

Step 3: Apply Fix

Common fixes:

  • Add advanceUntilIdle() for coroutines
  • Add IdlingResource for network
  • Disable animations for UI tests
  • Use @UiThreadTest for main thread work
  • Add explicit waits for async operations

Step 4: Verify Fix

/mobile-verify --class=FixedTest --k=5

Target: Pass@5 = 1.0

Integration

With Checkpoints

Create checkpoint before verification:

/mobile-checkpoint save pre-verify
/mobile-verify --k=3

With Memory

Track pass@k in memory:

{
    "test-coverage": {
        "passAt3": 0.87,
        "trend": "improving",
        "flakyTests": []
    }
}

With Instincts

Learn testing patterns:

{
    "id": "test-coroutine-async",
    "description": "Always use runTest + advanceUntilIdle for ViewModel tests",
    "confidence": 0.95
}

Thresholds by Context

| Context | Pass@k Threshold | Rationale |
|---|---|---|
| Unit tests | 0.95 | Should be deterministic |
| UI tests | 0.80 | More fragile, device-dependent |
| Compose tests | 0.90 | Better than Espresso, more stable |
| Integration tests | 0.70 | Complex, more variables |
| E2E tests | 0.60 | Full system, many variables |
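These thresholds lend themselves to a simple gate. In the sketch below the threshold values are copied from the table above, while the enum, the function name `meetsThreshold`, and the idea of passing in a single aggregate score per context are assumptions made for illustration.

```kotlin
// Thresholds copied from the table above; all names here are hypothetical.
enum class TestContext(val threshold: Double) {
    UNIT(0.95),
    UI(0.80),
    COMPOSE(0.90),
    INTEGRATION(0.70),
    E2E(0.60)
}

// True when the measured pass@k is at or above the context's threshold.
fun meetsThreshold(context: TestContext, passAtK: Double): Boolean =
    passAtK >= context.threshold

fun main() {
    val score = 0.87
    for (context in TestContext.values()) {
        val verdict = if (meetsThreshold(context, score)) "ok" else "below threshold"
        println("$context (>= ${context.threshold}): $verdict")
    }
}
```

With a score of 0.87, the UI, integration, and E2E contexts pass, while the unit and Compose contexts fall below their thresholds.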

Best Practices

  1. Start High, Go Low: Use k=5 when investigating, k=3 for routine runs
  2. Fix Flaky Fast: Don't tolerate flaky tests
  3. Track Trends: Monitor pass@k over time
  4. Context Matters: UI tests can have lower thresholds than unit
  5. Block Release: Failed verification should block releases

Remember: A test that sometimes passes is worse than no test at all. It gives false confidence.
