
Matrix Testing: Covering All Configurations Without Exploding Your CI Bill

Browser × OS × Version matrices can create thousands of jobs. Learn smart strategies for comprehensive coverage on a budget.

David Kim · December 20, 2025 · 5 min read

Matrix Testing: Comprehensive Coverage on a Budget

The promise of matrix testing is compelling: define your configurations once, and the CI system tests every combination. The reality is often a 36-job monster that takes 4 hours and costs a fortune. Let's fix that.


The Matrix Explosion Problem

A typical web application might need to test:

  • 3 browsers: Chrome, Firefox, Safari
  • 3 operating systems: Windows, Linux, macOS
  • 4 version combinations: Current, Previous, Mobile, Edge cases

That's 3 × 3 × 4 = 36 configurations.
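In code, the full matrix is just a Cartesian product over the factor lists (the version labels here are illustrative):

```python
from itertools import product

browsers = ["Chrome", "Firefox", "Safari"]
operating_systems = ["Windows", "Linux", "macOS"]
versions = ["Current", "Previous", "Mobile", "EdgeCase"]

# Every combination becomes one CI job.
matrix = list(product(browsers, operating_systems, versions))
print(len(matrix))  # 36
```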

With a 1,000-test suite that takes roughly 10 minutes per configuration:

  • Sequential: 36 × 10 = 360 minutes — about 6 hours per run
  • Cost: significant compute and agent time
  • Feedback: far too slow for developer productivity

We need smarter strategies.


Strategy 1: Risk-Based Selection

Not all configurations are equally important. Prioritize based on:

User Analytics

Look at your actual user base:

  • 70% Chrome on Windows? That's your primary config.
  • 2% Safari on iOS 14? Maybe test it less frequently.
  • 0.1% IE11? Time to drop support entirely.

Historical Failures

Some configurations fail more often:

  • Safari has rendering quirks? Prioritize Safari testing.
  • Firefox 115 introduced regressions? Add it to the matrix.

Track which configurations catch bugs and weight accordingly.

Code Change Impact

Not every change affects all configurations:

  • CSS change? The browser matrix matters, the OS doesn't.
  • Backend API change? Any single browser is sufficient.
  • Platform-specific code? Test only the affected platforms.
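A minimal sketch of analytics-driven selection, assuming you have per-configuration user shares (the numbers and the 5% threshold below are hypothetical):

```python
# Hypothetical user-analytics share per (browser, OS) configuration.
user_share = {
    ("Chrome", "Windows"): 0.70,
    ("Chrome", "Linux"): 0.12,
    ("Safari", "macOS"): 0.10,
    ("Firefox", "Windows"): 0.05,
    ("Safari", "iOS"): 0.02,
}

def split_by_risk(shares, threshold=0.05):
    """Configs at or above the threshold run on every PR; the rest run nightly."""
    per_pr = [cfg for cfg, share in shares.items() if share >= threshold]
    nightly = [cfg for cfg, share in shares.items() if share < threshold]
    return per_pr, nightly

per_pr, nightly = split_by_risk(user_share)
```

In practice you would refresh the shares from your analytics pipeline and tune the threshold to your risk tolerance.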


Strategy 2: Pairwise Testing

Empirical insight: most bugs are triggered by the interaction of TWO factors, not all factors simultaneously.

Pairwise testing ensures every pair of values is tested at least once:

Full matrix: 36 configurations
Pairwise: roughly 12 configurations, while still exercising every two-factor interaction

Example pairwise set:

  1. Chrome + Windows + Current
  2. Chrome + Linux + Previous
  3. Firefox + Windows + Previous
  4. Firefox + macOS + Current
  5. Safari + Linux + Current
  6. Safari + Windows + Mobile
  ... (continues until all pairs are covered)

Tools like PICT (from Microsoft) or AllPairs can generate near-optimal pairwise sets automatically, including constraints to exclude impossible combinations (Safari only ships on Apple platforms, for example).
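To see why so few configurations suffice, here is a greedy covering sketch — not as tight as PICT's output, and ignoring platform constraints, but it shows the mechanic: keep picking the configuration that covers the most still-uncovered pairs.

```python
from itertools import combinations, product

factors = [
    ["Chrome", "Firefox", "Safari"],           # browser
    ["Windows", "Linux", "macOS"],             # operating system
    ["Current", "Previous", "Mobile", "Edge"], # version band (illustrative)
]

def pairs_of(config):
    # The two-factor (position, value) pairs a single configuration exercises.
    return set(combinations(enumerate(config), 2))

all_configs = list(product(*factors))
uncovered = set().union(*(pairs_of(c) for c in all_configs))

# Greedy cover: repeatedly pick the config covering the most uncovered pairs.
suite = []
while uncovered:
    best = max(all_configs, key=lambda c: len(pairs_of(c) & uncovered))
    suite.append(best)
    uncovered -= pairs_of(best)

print(f"{len(suite)} configs cover all pairs (vs {len(all_configs)} full matrix)")
```

For this 3 × 3 × 4 matrix the greedy suite lands in the low teens — the theoretical floor is 12, since there are 12 distinct OS × version pairs and each configuration covers only one of them.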


Strategy 3: Progressive Expansion

Run different matrix sizes at different stages:

On Every Commit (Quick Feedback)

  • Core configuration only (Chrome + Linux)
  • Duration: 5 minutes
  • Purpose: catch obvious breakages fast

On PR Merge (Standard Validation)

  • Top 5 configurations by user analytics
  • Duration: 30 minutes
  • Purpose: reasonable confidence before main branch

On Release Branch (Full Coverage)

  • Complete matrix
  • Duration: 4-6 hours (overnight)
  • Purpose: comprehensive validation before deployment

This gives developers fast feedback while still achieving full coverage.
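The staged matrices above can be sketched as a simple stage-to-configuration mapping (the configuration names and the "top 5" list are illustrative — in practice the latter comes from your analytics):

```python
from itertools import product

FULL_MATRIX = list(product(
    ["Chrome", "Firefox", "Safari"],
    ["Windows", "Linux", "macOS"],
    ["Current", "Previous", "Mobile", "Edge"],
))

# Assumed: top configurations, pre-sorted by user analytics.
TOP_CONFIGS = [
    ("Chrome", "Windows", "Current"),
    ("Chrome", "Linux", "Current"),
    ("Safari", "macOS", "Current"),
    ("Firefox", "Windows", "Current"),
    ("Chrome", "Windows", "Previous"),
]

STAGE_MATRIX = {
    "commit": [("Chrome", "Linux", "Current")],  # fast core config
    "pr_merge": TOP_CONFIGS,                     # top 5 by analytics
    "release": FULL_MATRIX,                      # everything, overnight
}

def matrix_for(stage):
    """Return the configurations to run for a given pipeline stage."""
    return STAGE_MATRIX[stage]
```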


Strategy 4: Parallel Execution

Instead of running configs sequentially, distribute across agents:

36 configurations ÷ 30 agents = 1.2 configs per agent

Wall-clock time: ~10-15 minutes instead of hours

Key considerations:

  • Agent provisioning time adds overhead
  • Some tests have resource requirements (GPU, memory)
  • Load balancing by historical duration improves efficiency

Cloud providers make scaling easy:

  • Spin up agents on demand
  • Pay only for compute time used
  • Auto-scale based on queue depth
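Load balancing by historical duration can be as simple as a longest-first greedy assignment: sort configurations by past runtime, then hand each one to the currently least-loaded agent. The durations below are made-up minutes.

```python
import heapq

def balance(durations, n_agents):
    """Greedy longest-first: give each config to the least-loaded agent."""
    heap = [(0, i, []) for i in range(n_agents)]  # (load, agent id, assigned configs)
    heapq.heapify(heap)
    for cfg in sorted(durations, key=durations.get, reverse=True):
        load, i, assigned = heapq.heappop(heap)
        heapq.heappush(heap, (load + durations[cfg], i, assigned + [cfg]))
    return heap

# Hypothetical historical durations (minutes) for four configs, two agents.
durations = {"safari-macos": 18, "chrome-win": 10, "firefox-linux": 9, "chrome-linux": 8}
agents = balance(durations, n_agents=2)
```

Here the slow Safari job is paired with the shortest job rather than another long one, so the agents finish at roughly the same time.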


Strategy 5: Selective Re-runs

When a test fails, don't re-run the entire matrix:

Instead:

  • Re-run only failing tests
  • Re-run only on the failing configuration
  • Use smart retry logic (2-3 attempts with backoff)

If a test fails on Chrome but passes everywhere else, re-running it on Firefox is wasted compute.


Strategy 6: Configuration Sampling

For very large matrices, sample randomly:

  • Full matrix: 1000 configurations
  • Sample: 50 random configurations per run
  • Over time: All configurations get tested

This works well when:

  • Configurations are relatively independent
  • You have good monitoring to catch missed bugs
  • Historical data shows few cross-configuration bugs
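One way to sketch this: seed the sampler with the CI run number, so each run tests a different reproducible slice and the whole matrix gets exercised over successive runs.

```python
import random

def sample_run(full_matrix, k=50, run_id=0):
    """Pick k configurations for this run, seeded by run number.

    Same run_id -> same sample (reproducible); different run_id -> a
    different slice, so coverage accumulates across runs.
    """
    rng = random.Random(run_id)
    return rng.sample(full_matrix, min(k, len(full_matrix)))
```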


Putting It Together

A practical matrix strategy for a team:

Development (feature branches):

  • Single config: Chrome + Linux
  • Duration: 5 min
  • Full unit + smoke integration tests

PR Review (before merge):

  • 5 configs: top browser + OS combinations
  • Duration: 20 min
  • Full integration tests

Main Branch (after merge):

  • 12 configs: pairwise coverage
  • Duration: 45 min
  • Full E2E tests

Release (weekly):

  • Full matrix: all 36 configs
  • Duration: overnight
  • Full regression suite


Cost Optimization

Track and optimize:

  • Cost per test minute: Benchmark your CI provider
  • Capacity utilization: Are agents sitting idle?
  • Redundant testing: Are you testing the same thing twice?
  • Cache hit rate: Are artifacts being reused?

Small optimizations compound. A 10% reduction in test time saves thousands of compute hours annually.
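The "thousands of compute hours" claim is easy to sanity-check with back-of-the-envelope numbers (all figures below are hypothetical — plug in your own):

```python
# Hypothetical team numbers — adjust to your own pipeline.
runs_per_day = 50
configs_per_run = 12       # pairwise matrix on main
minutes_per_config = 30

agent_minutes_per_year = runs_per_day * configs_per_run * minutes_per_config * 365
saved_hours = agent_minutes_per_year * 0.10 / 60  # a 10% runtime reduction
print(round(saved_hours))  # ~11,000 agent-hours saved per year
```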


Conclusion

Matrix testing doesn't have to be all-or-nothing. Smart strategies let you achieve comprehensive coverage without breaking the bank or slowing down your team.

The key principles:

  • Prioritize the configurations that matter
  • Use pairwise testing to tame the combinatorial explosion
  • Expand the matrix progressively for faster feedback
  • Parallelize aggressively
  • Be selective about re-runs

Your CI bill and your developers will thank you.
