HLSL Get number of threadGroups and numthreads in code
Asked by Ilia

My question concerns compute shaders, HLSL code in particular. `DeviceContext.Dispatch(X, Y, Z)` spawns X * Y * Z groups, each of which has x * y * z individual threads, as set in the attribute `[numthreads(x,y,z)]`. The question is: how can I get the total number of thread groups dispatched and the number of threads in a group? Let me explain why I want this. The amount of data I intend to process may vary significantly, so my methods should adapt to the size of the input arrays. Of course, I can send the `Dispatch` arguments in a constant buffer to make them available to the HLSL code, but what about the number of threads in a group? I am looking for methods like `GetThreadGroupNumber()` and `GetThreadNumberInGroup()`. I appreciate any help.

There is 1 answer
The number of threads in a group is simply the product of the `numthreads` dimensions. For example, `numthreads(32,8,4)` will have `32*8*4 = 1024` threads per group. This can be determined statically at compile time.

The ID for a particular thread group can be determined by adding a `uint3` input argument with the `SV_GroupID` semantic.

The ID for a particular thread within a thread group can be determined by adding a `uint3` input argument with the `SV_GroupThreadID` semantic, or a `uint` with `SV_GroupIndex` if you prefer a flattened version.

As far as providing information to each thread on the total size of the dispatch, using a constant buffer is your best bet. This is analogous to the graphics pipeline, where the pixel shader doesn't naturally know the viewport dimensions.
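As a rough sketch of how these pieces fit together: the shader below takes the group size from compile-time constants, reads the group and thread IDs from system-value semantics, and reads the dispatch dimensions from a constant buffer. The names `DispatchParams`, `g_NumGroups`, `g_Output`, `CSMain` and the `GROUP_SIZE_*` macros are assumptions made for illustration. The application is assumed to write the same X, Y, Z into the buffer that it passes to `Dispatch`.

```hlsl
// Thread-group size, shared by the attribute and the shader body.
#define GROUP_SIZE_X 32
#define GROUP_SIZE_Y 8
#define GROUP_SIZE_Z 4

// Hypothetical constant buffer: the application fills it with the same
// X, Y, Z values it passes to Dispatch(X, Y, Z).
cbuffer DispatchParams : register(b0)
{
    uint3 g_NumGroups;   // thread groups dispatched in each dimension
    uint  g_Padding;     // pads the buffer to 16 bytes
};

// Assumed to hold at least (total groups * threads per group) elements.
RWStructuredBuffer<uint> g_Output : register(u0);

[numthreads(GROUP_SIZE_X, GROUP_SIZE_Y, GROUP_SIZE_Z)]
void CSMain(uint3 groupId       : SV_GroupID,        // which group this is
            uint3 groupThreadId : SV_GroupThreadID,  // 3D index inside the group
            uint  groupIndex    : SV_GroupIndex)     // flattened index inside the group
{
    // Known at compile time: 32 * 8 * 4 = 1024 threads per group.
    const uint threadsPerGroup = GROUP_SIZE_X * GROUP_SIZE_Y * GROUP_SIZE_Z;

    // Known only through the constant buffer.
    const uint totalGroups = g_NumGroups.x * g_NumGroups.y * g_NumGroups.z;

    // Flatten the 3D group ID, then build a unique flat index for this thread.
    const uint flatGroup = groupId.x
                         + groupId.y * g_NumGroups.x
                         + groupId.z * g_NumGroups.x * g_NumGroups.y;
    const uint flatThread = flatGroup * threadsPerGroup + groupIndex;

    // Placeholder work: record the total group count at this thread's slot.
    g_Output[flatThread] = totalGroups;
}
```

Defining the group size as macros keeps the `numthreads` attribute and any arithmetic in the shader body in sync, since HLSL offers no built-in query for it.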
It's also worth mentioning that if you do find yourself in a position where each thread needs to know the overall dispatch size, you should consider restructuring your algorithm. In general, it's better to dispatch a variable number of thread groups, each with a fixed amount of work, rather than dispatching a fixed number of threads with a variable amount of work. There are of course exceptions, but this will tend to provide better utilization of the hardware.
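A minimal sketch of the "variable number of groups, fixed work per thread" approach might look like the following. Each thread handles exactly one element, and the application scales the group count with the input size. The names `WorkParams`, `g_ElementCount`, `g_Input`, `g_Output` and `GROUP_SIZE`, as well as the doubling operation, are placeholders rather than anything from the question.

```hlsl
#define GROUP_SIZE 256

// Hypothetical constant buffer: only the element count is needed,
// not the dispatch dimensions.
cbuffer WorkParams : register(b0)
{
    uint  g_ElementCount;  // number of valid elements in the buffers
    uint3 g_Padding;       // pads the buffer to 16 bytes
};

StructuredBuffer<float>   g_Input  : register(t0);
RWStructuredBuffer<float> g_Output : register(u0);

// Assumed host side: round the group count up so every element is covered,
//   groups = (elementCount + GROUP_SIZE - 1) / GROUP_SIZE;
//   Dispatch(groups, 1, 1);
[numthreads(GROUP_SIZE, 1, 1)]
void CSMain(uint3 dispatchId : SV_DispatchThreadID)
{
    // Global thread index across the whole dispatch.
    const uint i = dispatchId.x;

    // The last group may extend past the end of the data; skip those threads.
    if (i >= g_ElementCount)
        return;

    // Placeholder for the fixed, per-element work.
    g_Output[i] = g_Input[i] * 2.0f;
}
```

With this layout the shader never needs the total group count. Note that Direct3D 11 limits each `Dispatch` dimension to 65535 groups, so very large inputs would need a 2D or 3D dispatch or more than one element per thread.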