I'm just starting to use Julia's CUDArt package to manage GPU computing.  I am wondering how to ensure that when I pull data from the GPU (e.g. using to_host()) I don't do so before all of the necessary computations have been performed on it.
Through some experimentation, it seems that to_host(CudaArray) will block while the particular CudaArray is still being updated.  So perhaps relying on that is enough to ensure safety?  But it seems a bit chancy.
Right now, I am using the launch() function to run my kernels, as depicted in the package documentation.  
The CUDArt documentation gives an example using Julia's @sync macro, which seems like it could be lovely.  But for the purposes of @sync, my "work" is done and I am ready to move on as soon as the kernel is launched with launch(), not once it finishes.  As far as I understand the operation of launch(), there isn't a way to change this behavior (e.g. to make it wait to receive the output of the function it "launches").
How can I accomplish such synchronization?
                        
I think the more canonical way is to make a stream for each device:
streams = [(device(dev); Stream()) for dev in devlist]

and then, inside the @async block, after you tell it to do the computations, you use the wait(stream) function to tell it to wait for that stream to finish its computations. See the Streams example in the README.
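A minimal sketch of that pattern, assuming you already have a compiled kernel and device arrays in hand (kernel, grid_size, block_size, d_input, and d_output below are placeholders, not names from CUDArt itself):

```julia
using CUDArt

# Pick devices with compute capability >= 2.0 (adjust to taste)
devlist = devices(dev -> capability(dev)[1] >= 2)

# One stream per device, created while that device is active
streams = [(device(dev); Stream()) for dev in devlist]

@sync begin
    for (i, dev) in enumerate(devlist)
        @async begin
            device(dev)
            # launch() queues the kernel on this device's stream and
            # returns immediately
            launch(kernel, grid_size, block_size, (d_input, d_output);
                   stream=streams[i])
            # wait() yields to other Julia tasks until the stream's
            # queued work is done, so the copy below sees the final data
            wait(streams[i])
            host_result = to_host(d_output)
        end
    end
end
```

The key point is that wait(stream) blocks only the current task, so the other @async blocks keep feeding their own devices while each one waits for its stream; to_host() is then guaranteed to run after the kernel has finished.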