Description
Following up on some discussions about the cost of thread switches, I instrumented some of our code in various places, including the scene swap. The code I used prints a message whenever we spend more than 1ms in the swap hook (I don't think it would be useful to land as is). It is a simple modification in render_backend.rs:
let swap_hook_start_ns = precise_time_ns();
let (resume_tx, resume_rx) = channel();
tx.send(SceneSwapResult::Complete(resume_tx)).unwrap();
// Block until the post-swap hook has completed on
// the scene builder thread. We need to do this before
// we can sample from the sampler hook which might happen
// in the update_document call below.
resume_rx.recv().ok();
let swap_hook_time_ns = precise_time_ns() - swap_hook_start_ns;
if swap_hook_time_ns > 1_000_000 {
    println!("Swap hook took {} nanoseconds ({}ms)", swap_hook_time_ns, swap_hook_time_ns as f32 * 0.000001);
}
I browsed for a few minutes on the 12-core desktop computer I use for work:
Swap hook took 7046548 nanoseconds (7.046548ms)
Swap hook took 1215051 nanoseconds (1.215051ms)
Swap hook took 1343500 nanoseconds (1.3435ms)
Swap hook took 3820637 nanoseconds (3.820637ms)
Swap hook took 1281691 nanoseconds (1.281691ms)
Swap hook took 1086652 nanoseconds (1.086652ms)
Swap hook took 1387635 nanoseconds (1.387635ms)
Swap hook took 1495307 nanoseconds (1.495307ms)
Swap hook took 1414208 nanoseconds (1.414208ms)
Swap hook took 2053617 nanoseconds (2.053617ms)
Swap hook took 2142136 nanoseconds (2.142136ms)
Swap hook took 9683727 nanoseconds (9.683727ms)
Swap hook took 1278505 nanoseconds (1.278505ms)
Swap hook took 1569375 nanoseconds (1.569375ms)
Gecko was compiled with ac_add_options --enable-optimize --disable-debug --enable-release
This doesn't prove my point about context switches being bad. After notifying the render backend, the swap hook calls into some other code, so there is a chance that the scene builder is still in that code when the message from the render backend arrives. In other words, this isn't strictly measuring the thread switches and channel overhead.
That said, occasionally spending more than 1 millisecond in the swap hook is already worth investigating. Notice that there are also 7ms and 9ms samples in there. I didn't even have to run stress or start a build on the side to give the scheduler a more challenging workload; this is just browsing.
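
To separate the two effects mentioned above (time spent in the hook's own code versus thread wake-up and channel overhead), one option would be a standalone test where the other thread does nothing but reply. The following is only a rough sketch under that assumption, not WebRender code: measure_round_trips and the helper thread are hypothetical names, and it uses std::sync::mpsc and std::time::Instant rather than precise_time_ns. It mirrors the resume_tx/resume_rx handshake by sending a fresh reply sender with each request.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: measures `iterations` request/reply round trips
/// between the calling thread and a helper thread that replies immediately
/// and does no other work.
fn measure_round_trips(iterations: usize) -> Vec<Duration> {
    // Each request carries the sender on which the reply should be sent,
    // similar to how SceneSwapResult::Complete carries resume_tx.
    let (request_tx, request_rx) = channel::<Sender<()>>();

    let worker = thread::spawn(move || {
        // Reply as soon as a request arrives; the loop ends once the
        // request sender has been dropped.
        for reply_tx in request_rx {
            reply_tx.send(()).ok();
        }
    });

    let mut samples = Vec::with_capacity(iterations);
    for _ in 0..iterations {
        let start = Instant::now();
        let (reply_tx, reply_rx) = channel();
        request_tx.send(reply_tx).unwrap();
        // Block until the helper thread has answered.
        reply_rx.recv().unwrap();
        samples.push(start.elapsed());
    }

    // Dropping the sender lets the helper thread exit its loop.
    drop(request_tx);
    worker.join().unwrap();
    samples
}

fn main() {
    let samples = measure_round_trips(10_000);
    let threshold = Duration::from_millis(1);
    let slow = samples.iter().filter(|d| **d > threshold).count();
    let max = samples.iter().max().unwrap();
    println!(
        "{} of {} round trips took more than 1ms (max: {:?})",
        slow,
        samples.len(),
        max
    );
}
```

A test like this runs on an otherwise idle process, so it would only give a lower bound on the handshake cost. It would not reproduce the scheduling pressure of a real browsing session.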