Constant cache vs Texture cache for broadcasting behaviour in CUDA

Question

Asked by user1096294 on 28 February 2014 at 00:07 · 576 views

I am interested in the differences between the constant cache and the texture cache on devices of compute capability 3.5, particularly their broadcasting behaviour. When all threads in a warp issue a request for the same data element from constant memory and it hits in the cache, the value is broadcast to all threads in a single cycle. What is the behaviour of the texture cache in this case? Do the loads get serialised?

Also, am I correct in thinking that both the constant cache and the texture cache are per multiprocessor, and hence shared by multiple blocks?

Original Q&A

There is 1 answer

**Greg Smith** · Answer 1 · 2014-02-28T06:24:50+00:00

NVIDIA does not provide additional details on the size or location of the constant cache.

The number of texture caches varies with compute capability (CC):

CC 2.0: 1 texture unit per SM
CC 2.1: 2 texture units per SM (1 per warp scheduler)
CC 3.0/3.5: 4 texture units per SM (1 per warp scheduler)
CC 3.2/GK208: 2 texture units per SM (1 per 2 warp schedulers)
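To see which row applies to a given device, the compute capability can be read at runtime. The following is a minimal sketch using `cudaGetDeviceProperties`; the helper name `query_compute_capability` is made up for illustration, and the sketch only reports the compute capability and SM count, not a texture unit count.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: reads the compute capability and SM count of `device`.
// Returns false if the properties could not be queried (e.g. no CUDA device).
bool query_compute_capability(int device, int *major, int *minor, int *sm_count) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return false;
    *major = prop.major;                   // e.g. 3 for CC 3.5
    *minor = prop.minor;                   // e.g. 5 for CC 3.5
    *sm_count = prop.multiProcessorCount;  // number of SMs on the device
    return true;
}

int main() {
    int major = 0, minor = 0, sms = 0;
    if (!query_compute_capability(0, &major, &minor, &sms)) {
        std::printf("no usable CUDA device\n");
        return 1;
    }
    std::printf("device 0: CC %d.%d, %d SMs\n", major, minor, sms);
    return 0;
}
```

The per-SM figures in the list above would then be multiplied by the reported SM count to estimate the device-wide total.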

The warps of a block are distributed across the warp schedulers of an SM.

If all 32 threads in a warp perform an indexed constant read to the same address, the read is performed in 1 instruction issue, provided the request hits in the cache.

If all 32 threads in a warp perform an LDG to the same address through the CC 3.5 texture cache, the data will be requested and returned over 8 cycles.
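The two access patterns compared above can be sketched as a pair of kernels in which every thread of a warp reads the same element, once from `__constant__` memory and once through `__ldg`. This is an illustrative sketch, not a benchmark: the kernel names, the array `c_coeff`, and the helper `uniform_reads_match` are made up for the example, and it assumes a device of CC 3.5 or newer (required for `__ldg`). It checks only that both paths return the same value; it does not measure the issue or cycle counts quoted above.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__constant__ float c_coeff[16];  // hypothetical coefficient table in constant memory

// Every thread reads c_coeff[k]; k is a kernel argument, so the address is
// uniform across the warp (the same-address constant read case).
__global__ void scale_const(const float *in, float *out, int n, int k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * c_coeff[k];
}

// Same uniform read, but from global memory via __ldg, which requests a load
// through the read-only data (texture) cache on CC 3.5+.
__global__ void scale_ldg(const float *in, float *out, int n,
                          const float *__restrict__ coeff, int k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * __ldg(&coeff[k]);
}

// Runs both kernels and returns true if they agree with the host reference.
bool uniform_reads_match() {
    const int n = 1024, k = 3;
    std::vector<float> h_in(n), h_a(n), h_b(n), h_coeff(16);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);
    for (int j = 0; j < 16; ++j) h_coeff[j] = 0.5f * (j + 1);

    float *d_in = nullptr, *d_a = nullptr, *d_b = nullptr, *d_coeff = nullptr;
    bool ok = cudaMalloc(&d_in, n * sizeof(float)) == cudaSuccess
           && cudaMalloc(&d_a, n * sizeof(float)) == cudaSuccess
           && cudaMalloc(&d_b, n * sizeof(float)) == cudaSuccess
           && cudaMalloc(&d_coeff, 16 * sizeof(float)) == cudaSuccess;
    if (ok) {
        cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(d_coeff, h_coeff.data(), 16 * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpyToSymbol(c_coeff, h_coeff.data(), 16 * sizeof(float));

        scale_const<<<n / 256, 256>>>(d_in, d_a, n, k);
        scale_ldg<<<n / 256, 256>>>(d_in, d_b, n, d_coeff, k);

        ok = cudaMemcpy(h_a.data(), d_a, n * sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaMemcpy(h_b.data(), d_b, n * sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess;
        for (int i = 0; ok && i < n; ++i)
            ok = (h_a[i] == h_b[i]) && (h_a[i] == h_in[i] * h_coeff[k]);
    }
    cudaFree(d_in); cudaFree(d_a); cudaFree(d_b); cudaFree(d_coeff);
    return ok;
}

int main() {
    std::printf("%s\n", uniform_reads_match() ? "results match" : "mismatch or CUDA error");
    return 0;
}
```

To compare the two paths in practice, one would profile each kernel separately (for example with NVIDIA's profiling tools) rather than rely on this correctness check.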
