John McDole b755641559
Address frame policy benchmark flakes (#155130)
Recently the microbenchmarks were flaky, but the cause was an older bug. It turns out `LiveTestWidgetsFlutterBindingFramePolicy` defaults to `fadePointers`, with this fun note:

> This can result in additional frames being pumped beyond those that
> the test itself requests, which can cause differences in behavior

Both `text_intrinsic_bench` and `build_bench` use a similar pattern:
* Load stocks app
* Open the menu
* Switch to the `benchmark` frame policy (see the sketch below)
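
A minimal sketch of that setup, assuming the benchmark runs under the live test binding on a device (app loading and menu interaction elided):

```dart
import 'package:flutter_test/flutter_test.dart';

Future<void> main() async {
  // The live binding is what's active when running on a real device.
  final LiveTestWidgetsFlutterBinding binding =
      TestWidgetsFlutterBinding.ensureInitialized() as LiveTestWidgetsFlutterBinding;

  // ... load the stocks app and open the menu here ...

  // Switch away from the default `fadePointers` policy before measuring,
  // so no extra pointer-fade frames are pumped during the benchmark.
  binding.framePolicy = LiveTestWidgetsFlutterBindingFramePolicy.benchmark;
}
```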

What happens, rarely, is that `LiveTestWidgetsFlutterBinding.pumpBenchmark()` calls `handleBeginFrame` and `handleDrawFrame` asynchronously. `handleDrawFrame` juggles a tri-state boolean (`null`, `false`, `true`), and this boolean is only reset to `null` when `handleDrawFrame` is called back to back, say, from an extra frame that was scheduled.
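
To make the failure mode concrete, here is a hedged sketch of that shape; it mirrors the structure of the problem, not the binding's actual private code:

```dart
// Illustrative "before" shape: a nullable bool encodes three states.
bool? _drawThisFrame; // null = awaiting begin frame, false = skip, true = draw

void handleBeginFrame() {
  // Decide whether the upcoming draw should actually paint.
  _drawThisFrame = computeShouldDraw();
}

void handleDrawFrame() {
  if (_drawThisFrame == null) {
    // Called back to back, e.g. by an extra frame that was scheduled.
    return;
  }
  if (_drawThisFrame!) {
    // ... paint the frame ...
  }
  _drawThisFrame = null; // the only reset point
}

bool computeShouldDraw() => true; // stand-in so the sketch is self-contained
```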

The fixes:

1. Switch the tri-state boolean to an enum; it's easier to read (see the sketch below).
2. Remove asserts that are compiled away in benchmarks (`--profile`).
3. Use `Error.throwWithStackTrace` to preserve stack traces.
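
A sketch of the same scenario with those fixes applied; `DrawFrameDecision` and `pumpGuarded` are illustrative names, while `Error.throwWithStackTrace` is the real `dart:core` API:

```dart
// Illustrative "after" shape: an enum makes each state explicit.
enum DrawFrameDecision { reset, skipFrame, drawFrame }

DrawFrameDecision _decision = DrawFrameDecision.reset;

void handleBeginFrame() {
  _decision = computeShouldDraw()
      ? DrawFrameDecision.drawFrame
      : DrawFrameDecision.skipFrame;
}

void handleDrawFrame() {
  final DrawFrameDecision decision = _decision;
  _decision = DrawFrameDecision.reset; // reset up front, no assert required
  if (decision == DrawFrameDecision.drawFrame) {
    // ... paint the frame ...
  }
}

bool computeShouldDraw() => true; // stand-in so the sketch is self-contained

// Rethrow an async failure without losing where it originally came from.
Future<void> pumpGuarded(Future<void> Function() pump) async {
  try {
    await pump();
  } catch (error, stackTrace) {
    Error.throwWithStackTrace(error, stackTrace);
  }
}
```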

I've been running this test on device lab hardware for hundreds of runs and have not hit a failure yet.

Fixes #150542
Fixes #150543 - throw stack!
2024-09-12 23:19:15 +00:00

microbenchmarks

To run these benchmarks on a device, first run `flutter logs` in one window to see the device logs, then, in a different window, run:

`flutter run -d $DEVICE_ID --profile lib/benchmark_collection.dart`

To run a subset of tests:

`flutter run -d $DEVICE_ID --profile lib/benchmark_collection.dart --dart-define=tests=foundation/change_notifier_bench.dart,language/sync_star_bench.dart`

To specify a seed value for shuffling tests:

`flutter run -d $DEVICE_ID --profile lib/benchmark_collection.dart --dart-define=seed=12345`
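
For context, here is one hypothetical way the entry point could pick up those `--dart-define` values via `String.fromEnvironment`; the actual `benchmark_collection.dart` may read them differently:

```dart
// --dart-define values are compile-time constants, read with fromEnvironment.
const String testsDefine = String.fromEnvironment('tests');
const String seedDefine = String.fromEnvironment('seed');

void main() {
  final List<String> tests =
      testsDefine.isEmpty ? <String>[] : testsDefine.split(',');
  final int? seed = seedDefine.isEmpty ? null : int.tryParse(seedDefine);
  print('running ${tests.isEmpty ? 'all tests' : tests} with seed $seed');
  // ... shuffle the selected benchmarks with `seed` and run them ...
}
```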

The results should be in the device logs.

Avoid changing the names of the benchmarks

Each microbenchmark is identified by a name, for example `catmullrom_transform_iteration`. Changing the name passed to `BenchmarkResultPrinter.addResult` effectively removes the old benchmark and creates a new one, losing the historical data associated with the old benchmark in the process.
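
As an illustration, a reporting call has roughly this shape; `BenchmarkResultPrinter` is the harness's printer, but the parameter names and the import path here are my assumptions:

```dart
import 'package:microbenchmarks/common.dart'; // harness helper (assumed path)

void report(double totalMicroseconds, int iterations) {
  final BenchmarkResultPrinter printer = BenchmarkResultPrinter();
  printer.addResult(
    description: 'CatmullRomSpline.transform iteration speed',
    value: totalMicroseconds / iterations,
    unit: 'µs per iteration',
    name: 'catmullrom_transform_iteration', // renaming this orphans the history
  );
  printer.printToStdout();
}
```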