
Mobile Testing Best Practices: 12 Practices Organized by When to Use Them

Mobile testing best practices aren't a flat checklist. This guide organizes 12 practices by release phase: during development, before release, and in production. Each includes the metric it moves and a link to a deeper guide.
Posted on: May 15, 2026
Read time: 13 minutes

This guide organizes 12 mobile testing best practices into three phases: during development (every commit and PR), before release (the final gate), and in production (after the app ships). Each practice includes a concrete impact, a real metric where available, and a link to a deeper guide if you want the full picture.

During development (every commit, every PR)

These run continuously. They catch bugs while they're cheap to fix.

1. Write tests in plain English, not just code

Tests written in natural language can be authored by the entire team, not just automation engineers. One QA Engineering Lead, Morgan Ellis, described the difference: "Writing tests in plain English made automation something the whole team could contribute to. We shipped 20 tests in a single day."

Impact: Test authoring speed climbs from roughly 15 tests per month to 200 per month per QA engineer. More people write tests, so coverage grows faster. For a deeper look at how plain-English test authoring compares to traditional approaches, see our guide to test automation frameworks.

2. Run smoke tests on every build

Smoke testing is a broad, shallow check that the build's core functions work: app launches, login completes, home screen loads. If smoke fails, the build is rejected before anyone spends time on deeper testing. Automate this in CI/CD so broken builds never reach QA.

Impact: Catches catastrophic build failures in 2-3 minutes instead of 15-20 minutes of manual verification. Saves the QA team from wasting a full sprint morning on a build that crashes at login.
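As an illustration, here is a minimal smoke test sketch using pytest and the Appium Python client. The package name, activity, account, and element IDs are placeholders you would swap for your own app, and it assumes an Appium server running locally.

```python
# Minimal smoke test sketch: launch, log in, confirm home screen renders.
# Assumes an Appium server at localhost:4723 and appium-python-client >= 2.x.
# The package, activity, credentials, and resource IDs below are placeholders.
import pytest
from appium import webdriver
from appium.options.android import UiAutomator2Options
from appium.webdriver.common.appiumby import AppiumBy
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


@pytest.fixture
def driver():
    options = UiAutomator2Options()
    options.app_package = "com.example.app"    # placeholder
    options.app_activity = ".MainActivity"     # placeholder
    drv = webdriver.Remote("http://127.0.0.1:4723", options=options)
    yield drv
    drv.quit()


def test_smoke_launch_login_home(driver):
    wait = WebDriverWait(driver, 15)
    # 1. App launches: login screen is visible.
    wait.until(EC.presence_of_element_located(
        (AppiumBy.ID, "com.example.app:id/login_button")))
    # 2. Login completes with a seeded test account.
    driver.find_element(AppiumBy.ID, "com.example.app:id/email").send_keys("qa@example.com")
    driver.find_element(AppiumBy.ID, "com.example.app:id/password").send_keys("test-password")
    driver.find_element(AppiumBy.ID, "com.example.app:id/login_button").click()
    # 3. Home screen loads.
    wait.until(EC.presence_of_element_located(
        (AppiumBy.ID, "com.example.app:id/home_feed")))
```

Wired into CI, this runs on every build and rejects anything that fails before deeper testing starts.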

3. Run regression tests on every PR

Regression testing confirms that a code change didn't break existing features. Run selective regression (covering the changed module and its dependencies) on every pull request. Save complete regression for nightly builds.

Impact: Teams that automate regression testing see 24% lower operational costs and catch side-effect bugs before they merge into the main branch.
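One way to wire up selective regression is to map changed source paths to pytest markers and run only those suites on the PR. The sketch below assumes tests are already tagged per module and that the PR branches off origin/main; the path-to-marker mapping is a placeholder for your repo layout.

```python
# Sketch: select regression suites for a PR based on which modules changed.
# Assumes tests carry pytest markers named after app modules
# (e.g. @pytest.mark.checkout) and that the PR branches off origin/main.
import subprocess
import sys

MODULE_MARKERS = {                 # placeholder mapping for your repo layout
    "app/checkout/": "checkout",
    "app/payments/": "payments",
    "app/onboarding/": "onboarding",
}

def changed_files(base: str = "origin/main") -> list[str]:
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True)
    return out.stdout.splitlines()

def markers_for(files: list[str]) -> set[str]:
    return {marker for path, marker in MODULE_MARKERS.items()
            if any(f.startswith(path) for f in files)}

if __name__ == "__main__":
    selected = markers_for(changed_files())
    if not selected:
        print("No mapped modules changed; skipping selective regression.")
        sys.exit(0)
    # Run only the regression suites for the touched modules.
    sys.exit(subprocess.call(["pytest", "-m", " or ".join(sorted(selected))]))
```

The nightly build still runs the complete suite; this only trims what blocks the PR.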

4. Use the testing pyramid as a budget

Not every test should be an E2E test. E2E tests are slow and expensive. Unit tests are fast and cheap. Aim for 70% unit, 20% integration, 10% E2E. Push bugs down to the cheapest layer that can catch them. If a bug can be caught by a unit test, don't wait for E2E to find it.

Impact: Suites that follow the pyramid run faster and have less flakiness. Teams that invert the pyramid (mostly E2E, few unit tests) end up with slow CI pipelines and test suites nobody trusts.
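For instance, a currency-rounding bug belongs in a unit test, not an E2E run. The sketch below uses an illustrative formatting function to show the kind of bug the cheapest layer catches in milliseconds rather than in a slow checkout E2E test.

```python
# Sketch: push a rounding bug down to the unit layer.
# format_price is an illustrative app function; in an inverted pyramid this
# off-by-a-cent bug would only surface in a slow checkout E2E test.
from decimal import Decimal, ROUND_HALF_UP

def format_price(amount: Decimal, currency: str = "USD") -> str:
    cents = amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return f"{cents} {currency}"

def test_format_price_rounds_half_up():
    # Runs in milliseconds on every commit; no device or emulator needed.
    assert format_price(Decimal("19.995")) == "20.00 USD"
    assert format_price(Decimal("0.005")) == "0.01 USD"
```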

5. Run sanity tests after every targeted fix

After a developer fixes a specific bug, run sanity tests on the changed area, including edge cases around the fix. This is narrower than regression. It confirms the specific fix works before triggering the broader regression suite.

Impact: Catches incomplete fixes before they reach the full regression cycle. Saves the team from running 200 regression tests only to find the original bug wasn't fully resolved.

Before release (the final gate)

These run once per release candidate. They're the last check before the app ships.

6. Test on real devices, not just emulators

Emulators run stock Android. They don't reproduce Samsung One UI font scaling, Xiaomi HyperOS security popups, real GPU rendering, or budget-device memory pressure. One team found that 23% of their test failures came from device-specific rendering differences that emulators couldn't surface.

Impact: Testing on real devices catches the bugs your users will actually hit. Run your automated suite on at least a Samsung (One UI), a Pixel (stock Android), a budget device (3-4 GB RAM), and an iPhone.
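A common way to run the same suite across that matrix is to parametrize the Appium session over a small device pool. In the sketch below, the UDIDs, device names, and app identifiers are placeholders; an iPhone entry would use XCUITestOptions instead of the Android options shown.

```python
# Sketch: run one Appium suite across a small real-device matrix.
# UDIDs, device names, and app identifiers are placeholders; plug in your
# own pool (or a device-cloud capability set) via pytest parametrization.
import pytest
from appium import webdriver
from appium.options.android import UiAutomator2Options

DEVICE_POOL = [
    {"udid": "SAMSUNG-A14-UDID", "name": "Samsung Galaxy A14 (One UI, 4 GB)"},
    {"udid": "PIXEL-8-UDID",     "name": "Pixel 8 (stock Android)"},
    {"udid": "BUDGET-UDID",      "name": "Budget device (3 GB RAM)"},
]

@pytest.fixture(params=DEVICE_POOL, ids=lambda d: d["name"])
def driver(request):
    options = UiAutomator2Options()
    options.udid = request.param["udid"]
    options.app_package = "com.example.app"   # placeholder
    options.app_activity = ".MainActivity"    # placeholder
    drv = webdriver.Remote("http://127.0.0.1:4723", options=options)
    yield drv
    drv.quit()

def test_app_launches_on_each_device(driver):
    # Smoke-level check repeated once per device in the pool.
    assert driver.current_package == "com.example.app"
```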

7. Run exploratory testing sessions with a charter

Exploratory testing finds the bugs that scripted tests can't predict. Use a time-boxed session (20-60 minutes) with a written charter that defines the mission, device, and areas to explore. BBST research shows that chartered sessions find 40-60% more actionable bugs than unguided exploration.

Impact: In our walkthrough, a single 20-minute session on a Samsung Galaxy A14 found 4 bugs, including a critical payment state loss that no automated test had covered. The bugs discovered in exploratory sessions become automated regression tests through Drizz, which means they never recur.

8. Test interrupts, network transitions, and background/foreground

Mobile apps face constant interruptions that web apps don't: phone calls, SMS, low battery warnings, switching between Wi-Fi and cellular, and moving to background and back. Test your critical flows (checkout, payment, onboarding) while triggering these interrupts.

Impact: Interrupt testing catches state-loss bugs that are invisible in clean test environments. A banking app might show transfer confirmation after a phone call even though the API call timed out while the app was backgrounded.
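A rough way to drive these interrupts from a test is through adb, as sketched below. The package name is a placeholder, and a real suite would assert on your app's actual checkout state after each interruption.

```python
# Sketch: exercise a critical flow while forcing interrupts via adb.
# Package name is a placeholder; svc/keyevent commands require a connected
# device or emulator with adb debugging enabled.
import subprocess
import time

PACKAGE = "com.example.app"  # placeholder

def adb(*args: str) -> None:
    subprocess.run(["adb", "shell", *args], check=True)

def background_then_foreground(seconds: float = 5.0) -> None:
    adb("input", "keyevent", "KEYCODE_HOME")             # send app to background
    time.sleep(seconds)
    adb("monkey", "-p", PACKAGE,
        "-c", "android.intent.category.LAUNCHER", "1")   # bring it back

def drop_to_cellular(seconds: float = 10.0) -> None:
    adb("svc", "wifi", "disable")                        # force Wi-Fi -> cellular
    time.sleep(seconds)
    adb("svc", "wifi", "enable")

# Usage inside a checkout test: start the payment, trigger an interrupt,
# then assert the order state matches what the backend actually recorded.
```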

9. Validate performance on budget devices

Performance testing on a flagship phone tells you what your app CAN do. Performance testing on a budget phone tells you what most users EXPERIENCE. Cold start above 3 seconds loses the user. Crash-free rate below 99% is a retention crisis. Memory usage past the OS kill threshold causes silent app terminations.

Impact: Budget devices (3-4 GB RAM) are where cold starts are slowest, memory kills are most frequent, and frame drops are most visible. If 40% of your users are on these devices and you only test on a Pixel 9, you're missing the problems they face daily.
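One low-overhead way to check cold start on a budget device is `adb shell am start -W`, which reports launch timing. The sketch below assumes placeholder package and activity names and uses the 3-second budget mentioned above as the pass/fail line.

```python
# Sketch: measure cold start with `adb shell am start -W` and fail past 3 s.
# Package/activity names are placeholders; run this against your budget device.
import re
import subprocess

PACKAGE = "com.example.app"            # placeholder
ACTIVITY = f"{PACKAGE}/.MainActivity"  # placeholder
COLD_START_BUDGET_MS = 3000

def cold_start_ms() -> int:
    # Kill the app first so the next launch is a true cold start.
    subprocess.run(["adb", "shell", "am", "force-stop", PACKAGE], check=True)
    out = subprocess.run(
        ["adb", "shell", "am", "start", "-W", "-n", ACTIVITY],
        capture_output=True, text=True, check=True).stdout
    match = re.search(r"TotalTime:\s*(\d+)", out)
    if match is None:
        raise RuntimeError(f"Could not parse launch timing:\n{out}")
    return int(match.group(1))

def test_cold_start_within_budget():
    assert cold_start_ms() <= COLD_START_BUDGET_MS
```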

10. Test with accessibility settings enabled

Turn on system font scaling (1.3x and 1.5x), enable screen readers (TalkBack on Android, VoiceOver on iOS), and verify that touch targets are at least 44x44 pt on iOS and 48x48 dp on Android. Button labels truncate, error messages overflow containers, and navigation elements overlap with icons when font scaling is active. Over 30% of users aged 45+ use larger font sizes.

Impact: Accessibility testing catches layout bugs that affect real users in real conditions. An "Add to Cart" button that works at default font size might truncate to "Add to C..." at 1.3x scaling and become unusable.
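Font scaling can be flipped on from the command line before running the suite. The sketch below uses the Android `font_scale` system setting and assumes the device default is 1.0; the suite you run inside each block is whatever covers your layout-sensitive screens.

```python
# Sketch: run UI tests at 1.3x and 1.5x font scale via the Android
# system setting, then restore the default. Requires adb access.
import subprocess
from contextlib import contextmanager

def set_font_scale(scale: float) -> None:
    subprocess.run(
        ["adb", "shell", "settings", "put", "system", "font_scale", str(scale)],
        check=True)

@contextmanager
def font_scale(scale: float, default: float = 1.0):
    set_font_scale(scale)
    try:
        yield
    finally:
        set_font_scale(default)

# Usage: rerun your layout-sensitive tests at each scale.
# with font_scale(1.3):
#     run_checkout_suite()   # hypothetical suite runner
# with font_scale(1.5):
#     run_checkout_suite()
```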

In production (after release)

These run continuously after the app ships. They catch regressions and real-world failures that pre-release testing missed.

11. Monitor crash-free rates and set alerts

Use Crashlytics, Sentry, or Instabug to track crash-free session rates in production. Set alerts for when the rate drops below 99.5%. Instabug's 2025 data shows that 99.95% crash-free is the industry target. Google Play penalizes apps with a user-perceived crash rate above 1.09% by reducing discoverability.

Impact: A crash rate regression after a release is the single fastest way to lose users. Catching it within hours (via monitoring alerts) vs days (via user reviews) is the difference between a hotfix and a brand reputation hit.
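The alerting logic itself is simple once your monitoring tool exposes the number. In the sketch below, `fetch_crash_free_rate` and the webhook URL are hypothetical stand-ins for whatever query your Crashlytics, Sentry, or Instabug setup provides and wherever your team wants to be paged.

```python
# Sketch: page the team when crash-free sessions dip below threshold.
# fetch_crash_free_rate and ALERT_WEBHOOK are hypothetical stand-ins for
# your monitoring tool's query API and your chat/incident webhook.
import json
import urllib.request

CRASH_FREE_THRESHOLD = 99.5   # percent, per the guidance above
ALERT_WEBHOOK = "https://example.com/hooks/qa-alerts"  # placeholder

def fetch_crash_free_rate() -> float:
    """Return the last hour's crash-free session rate (percent).

    Hypothetical: replace with a query against Crashlytics, Sentry,
    or Instabug for your app and release version.
    """
    raise NotImplementedError

def check_and_alert() -> None:
    rate = fetch_crash_free_rate()
    if rate < CRASH_FREE_THRESHOLD:
        payload = json.dumps({
            "text": f"Crash-free sessions at {rate:.2f}% "
                    f"(threshold {CRASH_FREE_THRESHOLD}%)"}).encode()
        req = urllib.request.Request(
            ALERT_WEBHOOK, data=payload,
            headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)
```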

12. Feed production data back into your test suite

When production monitoring reveals a crash on a specific device or a failure in a specific flow, write an automated test for it. This closes the loop: analytics identifies the problem, testing validates the fix, and the new test prevents the regression from recurring.

Impact: Over time, your regression suite grows with tests grounded in real production bugs. These are the tests that actually prevent user-facing failures, not hypothetical scenarios. Teams using Drizz report flakiness at ~5% on these production-derived test suites compared to ~15% on traditional selector-based suites.

The checklist version

If you want a copy-paste version for your team's sprint process, we have a full mobile app testing checklist that maps these practices to specific testing actions per sprint.

For a full breakdown of every testing type mentioned above, see our guide to types of mobile app testing.

FAQ

What are the most common mobile testing best practices?

Test on real devices, automate regression in CI/CD, follow the testing pyramid (70% unit, 20% integration, 10% E2E), run exploratory sessions before release, and monitor crash-free rates in production.

How many devices should I test on?

At minimum: one Samsung (One UI), one Pixel (stock Android), one budget device (3-4 GB RAM), and one iPhone. Cover your top 5-10 devices by user analytics for broader validation.

Should mobile testing be manual or automated?

Both. Automate smoke, regression, and cross-device functional tests. Keep exploratory, usability, and interrupt testing manual. The split is roughly 70% automated, 30% manual for most teams.

What's the biggest mistake in mobile testing?

Testing only on emulators. Emulators miss OEM-specific behavior, real GPU rendering, and budget-device performance. A test that passes on a stock Android emulator can fail on a real Samsung or Xiaomi device.

How do I reduce flaky mobile tests?

Remove selector dependencies (use Vision AI instead), replace hardcoded waits with adaptive waits, isolate test data per test, and run on real devices. Teams switching to Drizz see flakiness drop from ~15% to ~5%.
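The hardcoded-wait fix in particular is mechanical: replace `time.sleep` with a condition-based wait. A sketch using the Appium Python client and Selenium's `WebDriverWait` (the element ID is a placeholder):

```python
# Sketch: replace a hardcoded sleep with an adaptive, condition-based wait.
# Element ID is a placeholder; assumes appium-python-client with selenium.
import time
from appium.webdriver.common.appiumby import AppiumBy
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

CONFIRMATION = (AppiumBy.ID, "com.example.app:id/order_confirmation")

def flaky_wait(driver):
    time.sleep(8)                               # a guess: too short on budget devices,
    return driver.find_element(*CONFIRMATION)   # wasted time on fast ones

def adaptive_wait(driver, timeout: int = 20):
    # Polls until the element appears, then returns immediately.
    return WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located(CONFIRMATION))
```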

When should mobile testing start in the development cycle?

At the first commit. Unit tests and static analysis from day one. Widget/component tests as UI is built. E2E and exploratory testing as features stabilize. Monitoring from launch onward.
