Description
I have observed a performance regression in the OpenMC scientific simulation application (which uses OpenMP target offloading) on the AMD MI250 due to commit 7dbd6cd, which is part of PR #114357. The regression is not observed on NVIDIA GPUs.
This commit makes OpenMC about 20% slower overall on a typical benchmark problem (measured by particles/second), and the most expensive kernel in the simulation becomes about 1.9x slower.
OpenMC can be installed and its performance benchmark run using the following script: https://github.com/jtramm/openmc_offloading_builder/tree/main
FOM (figure of merit) before this commit (higher is better):
Calculation Rate (inactive) = 239870.0 particles/second
FOM after this commit:
Calculation Rate (inactive) = 192707.0 particles/second
Main kernel timing before this commit:
XS lookups (Fuel) = 1.8720e+01 seconds
Main kernel timing after this commit:
XS lookups (Fuel) = 3.5231e+01 seconds
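For reference, the relative changes can be recomputed from the numbers quoted above. This is only a small arithmetic sketch; the variable names are made up for illustration and are not part of OpenMC.

```python
# Figures copied from the measurements quoted in this report.
fom_before = 239870.0        # particles/second, before the commit
fom_after = 192707.0         # particles/second, after the commit
kernel_before = 1.8720e+01   # seconds, "XS lookups (Fuel)", before the commit
kernel_after = 3.5231e+01    # seconds, "XS lookups (Fuel)", after the commit

# Relative drop in the figure of merit (higher FOM is better).
fom_drop_pct = 100.0 * (fom_before - fom_after) / fom_before

# Slowdown factor of the main kernel (lower time is better).
kernel_slowdown = kernel_after / kernel_before

print(f"FOM drop: {fom_drop_pct:.1f}%")
print(f"Main kernel slowdown: {kernel_slowdown:.2f}x")
```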
rocprof shows a similar slowdown for this kernel, which it lists as:
__omp_offloading_25_7c638f81__ZN6openmc32process_calculate_xs_events_fuelEv_l256.kd
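To map that symbol back to the source, the name can be split into its parts. The sketch below assumes the usual LLVM/Clang layout for offload entry names (`__omp_offloading_<device id>_<file id>_<mangled parent function>_l<line>`) and treats the trailing `.kd` as the AMD kernel descriptor suffix. Both are assumptions on my part rather than something taken from the rocprof output, and the helper name is hypothetical.

```python
import re

# Assumed layout (not confirmed by the profiler output itself):
#   __omp_offloading_<device id>_<file id>_<mangled parent function>_l<line>[.kd]
KERNEL_RE = re.compile(
    r"^__omp_offloading_(?P<device>[0-9a-f]+)_(?P<file>[0-9a-f]+)_"
    r"(?P<parent>.+)_l(?P<line>\d+)(?:\.kd)?$"
)


def split_offload_kernel_name(name):
    """Split an OpenMP offload kernel symbol into its assumed components."""
    match = KERNEL_RE.match(name)
    if match is None:
        raise ValueError(f"not an OpenMP offload kernel name: {name!r}")
    return match.groupdict()


parts = split_offload_kernel_name(
    "__omp_offloading_25_7c638f81__ZN6openmc32process_calculate_xs_events_fuelEv_l256.kd"
)
print(parts["parent"])  # mangled name of the enclosing C++ function
print(parts["line"])    # source line of the target region
```

Passing the extracted mangled name through a demangler such as `c++filt` should give `openmc::process_calculate_xs_events_fuel()`, with the target region at line 256 of the file that defines it.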