Use specialized intrinsics for `dot4{I, U}8Packed` on SPIR-V and HLSL #7574
base: trunk
e06d774 to 705dc6d
When checking for capabilities in SPIR-V, `capabilities_available == None` indicates that all capabilities are available. However, some capabilities are not even defined for all language versions, so we still need to check if the requested capabilities even exist in the language version we're using.
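The check described in this commit message can be sketched roughly as follows. This is a minimal illustration, not naga's actual code: the `Capability` enum, `min_lang_version`, and `capability_usable` are hypothetical names, and the 1.6 threshold is taken from the `lang_version >= 1.6` note in the PR description.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for a SPIR-V capability enum (not naga's real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Capability {
    Shader,
    DotProduct,
    DotProductInput4x8BitPacked,
}

impl Capability {
    // Lowest language version in which this sketch treats the capability
    // as defined (assumption: 1.6 for the dot product capabilities).
    fn min_lang_version(self) -> (u8, u8) {
        match self {
            Capability::Shader => (1, 0),
            Capability::DotProduct | Capability::DotProductInput4x8BitPacked => (1, 6),
        }
    }
}

// `None` means "all capabilities are available", but only those that
// exist in the language version being targeted.
fn capability_usable(
    cap: Capability,
    lang_version: (u8, u8),
    capabilities_available: Option<&HashSet<Capability>>,
) -> bool {
    if lang_version < cap.min_lang_version() {
        return false;
    }
    match capabilities_available {
        None => true,
        Some(set) => set.contains(&cap),
    }
}
```

With this shape, `capabilities_available == None` no longer enables a capability that the targeted language version does not define.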
c93b2dd to be9debd
In case the above text is confusing: I pushed be9debd as an example of how an additional check for |
This comment was marked as resolved.
The sign issue has been resolved. The implementation should be correct as is. Turns out SPIR-V doesn't care about signedness in this case: "all signed and unsigned operations always work on unsigned types, and the semantics of operation come from the opcode" (credits to Nicol Bolas, see this StackOverflow answer).
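To illustrate why the operation rather than the argument type determines the result, here is a plain-Rust reference model of the two built-ins. It is only a sketch: the function names are made up for this example, and it is neither naga's polyfill nor the generated SPIR-V. Both functions take the same `u32` bit patterns and differ only in how each byte is interpreted.

```rust
// Signed variant: each byte of the packed inputs is read as an i8.
fn dot4_i8_packed(a: u32, b: u32) -> i32 {
    (0..4u32)
        .map(|i| {
            let x = (a >> (8 * i)) as u8 as i8 as i32;
            let y = (b >> (8 * i)) as u8 as i8 as i32;
            x * y
        })
        .sum()
}

// Unsigned variant: each byte of the packed inputs is read as a u8.
fn dot4_u8_packed(a: u32, b: u32) -> u32 {
    (0..4u32)
        .map(|i| {
            let x = (a >> (8 * i)) & 0xff;
            let y = (b >> (8 * i)) & 0xff;
            x * y
        })
        .sum()
}
```

For the inputs `0xFFFF_FFFF` and `0x0101_0101`, the signed variant yields `-4` (four times `-1 * 1`) and the unsigned variant yields `1020` (four times `255 * 1`), even though the argument bits are identical.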
See gfx-rs#7574, in particular <gfx-rs#7574 (comment)>. Adds `FeaturesWGPU::NATIVE_PACKED_INTEGER_DOT_PRODUCT`, which is available on `Adapter`s that support the specialized implementations for `dot4I8Packed` and `dot4U8Packed` implemented in gfx-rs#7574 (currently, this includes DX12 with Shader Model >= 6.4 and Vulkan with device extension "VK_KHR_shader_integer_dot_product"). If this feature is requested during `Device` creation, the device is set up such that `dot4I8Packed` and `dot4U8Packed` will be compiled to their respective specialized instructions. This means that, on a Vulkan `Device`, the SPIR-V language version is set to 1.6, and the required SPIR-V capabilities are marked as available (on DX12, requesting the feature doesn't change anything since availability of the feature already guarantees that Shader Model >= 6.4, which is all we need to generate specialized code).
Using the specialized instructions from <gfx-rs/wgpu#7574>.
I separated out the above issue of how these naga optimizations can be used in |
Connections
`dot4U8Packed` and `dot4I8Packed`: #7494 (comment)
Description
Replace the polyfills for `dot4I8Packed` and `dot4U8Packed` from #7494 with specialized instructions where they are available. More precisely:
- On HLSL, we check whether Shader Model >= 6.4 is used, in which case we use `dot4add_{i,u}8packed` (the specification explicitly states: "no separate capability bit check is required, beyond assuring the use of Shader Model 6.4").
- On SPIR-V, we check whether the capabilities `DotProduct` and `DotProductInput4x8BitPacked` are available, in which case we use the "Packed Vector Format" argument for `Op{S,U}Dot` (and we emit the corresponding `OpCapability` and `OpExtension` statements at the beginning of the output).

If either of these tests fails, or if we are on Metal or GLSL, we fall back to the existing polyfills.
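The selection logic in this description can be summarized in a small sketch. The `Target` and `Lowering` enums and the `choose_lowering` function are hypothetical and exist only for illustration; the actual backends make this decision inside their respective writers.

```rust
// Hypothetical description of the output backend and what it supports.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Target {
    Hlsl { shader_model: (u8, u8) },
    // True if both DotProduct and DotProductInput4x8BitPacked are available.
    SpirV { dot_product_caps: bool },
    Msl,
    Glsl,
}

// How the packed dot product built-ins would be emitted.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Lowering {
    HlslDot4Add,    // dot4add_{i,u}8packed
    SpirvPackedDot, // Op{S,U}Dot with "Packed Vector Format"
    Polyfill,       // existing polyfill from #7494
}

fn choose_lowering(target: Target) -> Lowering {
    match target {
        Target::Hlsl { shader_model } if shader_model >= (6, 4) => Lowering::HlslDot4Add,
        Target::SpirV { dot_product_caps: true } => Lowering::SpirvPackedDot,
        // Older HLSL, SPIR-V without the capabilities, Metal, and GLSL.
        _ => Lowering::Polyfill,
    }
}
```

Keeping the polyfill as the catch-all arm means any backend or configuration that is not explicitly recognized stays on the existing, known-good path.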
Testing
I added two tests with `.toml` configuration files that explicitly enable/prevent these specializations.
Squash or Rebase?
I think each commit should pass all CI tests.
Open Questions / Notes
- The specialized SPIR-V code requires `lang_version >= 1.6`, but I didn't check for a high enough `lang_version` because I already check the `capabilities_available` field. This follows precedent in other places in the code base, but I'm not sure if it's sufficient to only check `capabilities_available`. If `capabilities_available == None`, then comments in the code (here and here) indicate that this should be interpreted as "all capabilities are permitted", and so the specialized code will be generated. But I guess "all capabilities" is still restricted to "all capabilities that are defined for `lang_version`". Should I check the `lang_version` in this case somehow, or is `capabilities_available == None` only used for debugging anyway? Unfortunately, `lang_version` isn't exposed anymore at the stage where we do the check, so I'd have to add it as an extra field to the `Writer`.
- Unclear whether `OpSDot` with "Packed Vector Format" expects signed or unsigned arguments in spv (see comment below). [Update: resolved in this comment]
- [...] `packed_char4` in Table 2.4 in the specification).

Checklist
- Run `cargo fmt`.
- Run `taplo format`.
- Run `cargo clippy --tests`. If applicable, add: `--target wasm32-unknown-unknown`
- Run `cargo xtask test` to run tests.
- Add a `CHANGELOG.md` entry.