List Question
20 TechQA 2023-11-05T18:00:44.897000
On today's GPUs, can warps be recombined dynamically?
47 views
Asked by Armin Rigo
CUDA __shfl_down_sync does not work with __match_any_sync
178 views
Asked by SnowSR
What is warp shuffling in CUDA and why is it useful?
2.1k views
Asked by gonidelis
Compute per-warp histogram without shared memory
159 views
Asked by pem
Why is my CUDA warp shuffle sum using the wrong offset for one shuffle step?
619 views
Asked by nanofarad
Are threads in multi-dimensional CUDA kernel blocks packed to fill warps?
546 views
Asked by einpoklum
Monitor active warps and threads during a divergent CUDA run
442 views
Asked by Silicomancer
Pre 8.x equivalent of __reduce_max_sync() in CUDA
259 views
Asked by Serge Rogatch
What's the alternative for __match_any_sync on compute capability 6?
1k views
Asked by Johan
Why use thread blocks larger than the number of cores per multiprocessor
299 views
Asked by Numaerius
CUDA shared memory and warp synchronization
2k views
Asked by nglee
__activemask() vs __ballot_sync()
4.5k views
Asked by Fabio T.
OpenGL compute shader mapping to nVidia warps
1k views
Asked by Danol
Warp scheduling in Kepler GPU
84 views
Asked by StrikeW
Warp shuffling for CUDA
3.8k views
Asked by Timocafé
CUDA Reduction: Warp Unrolling (School)
1.8k views
Asked by Michael Choi
How do I do the converse of shfl.idx (i.e. warp scatter instead of warp gather)?
315 views
Asked by einpoklum
Do modern nVIDIA GPUs perform sub-warp scheduling of work?
734 views
Asked by einpoklum
Some intrinsics named with `_sync()` appended in CUDA 9; semantics same?
645 views
Asked by einpoklum