I understand that I can obtain the wave pointer by using Wavedata(wavehandle), but can I do the reverse and change the memory that a wave points to?
This would be useful (even critical), as I want to use the buffer allocated by the OpenCL functions, which is better suited to how they operate.
In particular, I am exploring the zero-copy feature on an AMD APU (Raven Ridge), which would let me leverage the 1.7 TFLOPS of the GPU.
[quote=hrodstein]If I understand your question, no. An XOP can not change the memory associated with a wave.[/quote]
You understood correctly. It's not the end of the world if I can't, but it would make life easier for me.
Thanks for the reply.
It is about proper alignment, fyi:
Motivation: http://pc-internet-zone.blogspot.com/2011/08/cpu-to-gpu-data-transfers-…
Requirement: https://software.intel.com/en-us/articles/getting-the-most-from-opencl-…
https://arrayfire.com/zero-copy-on-integrated-gpus/
I am using AMD GPUs. In particular, I am trying to use an APU, where the GPU and CPU effectively share memory.
As a result, it is possible to perform zero copy between the two, which (if I understand it correctly) ultimately allows them to share physical memory, resulting in high-speed DSP that is not limited by the PCI-E bus.
At the moment, the APU is significantly faster than a discrete card for simple kernels such as scaling and digital down-conversion.
To do this, however, there are two alignment requirements:
1. The data has to be aligned to a 4k byte boundary;
2. The data should have a length in bytes that is an integer multiple of 64.
For 1, I dropped the first strip of data by pointing to the next 4k byte boundary.
For 2, after taking into account the reduced length due to 1, I dropped the last strip of data until the length was divisible by 64 bytes.
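For future readers, the trimming described above can be sketched in C roughly as follows. This is only a minimal sketch under stated assumptions: `AlignedView` and `align_view` are hypothetical names, not part of the XOP Toolkit or OpenCL, and the 4096 and 64 constants are simply the two requirements listed above. It also assumes the number of skipped bytes is a whole number of elements, which depends on how the wave data itself is aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_ALIGNMENT  4096u  /* requirement 1: start on a 4k byte boundary */
#define LENGTH_MULTIPLE 64u    /* requirement 2: length is a multiple of 64 bytes */

/* Hypothetical helper type: the usable, aligned part of a larger buffer. */
typedef struct {
    void  *start;  /* first usable byte, NULL if nothing usable remains */
    size_t bytes;  /* usable length in bytes, a multiple of LENGTH_MULTIPLE */
} AlignedView;

/* Drop the leading bytes up to the next 4k boundary, then drop trailing
 * bytes until the remaining length is divisible by 64. No data is copied. */
static AlignedView align_view(void *data, size_t bytes)
{
    AlignedView v = { NULL, 0 };
    assert(data != NULL);

    uintptr_t addr = (uintptr_t)data;
    /* Round the address up to the next multiple of PAGE_ALIGNMENT. */
    uintptr_t aligned = (addr + (PAGE_ALIGNMENT - 1))
                        & ~(uintptr_t)(PAGE_ALIGNMENT - 1);
    size_t skipped = (size_t)(aligned - addr);

    if (skipped >= bytes)
        return v;                       /* buffer ends before the boundary */

    size_t usable = bytes - skipped;
    usable -= usable % LENGTH_MULTIPLE; /* trim the tail */
    if (usable == 0)
        return v;

    v.start = (char *)data + skipped;
    v.bytes = usable;
    return v;
}
```

The returned `start` and `bytes` would then be what gets handed to OpenCL in place of the raw wave pointer and length; up to 4095 bytes at the front and 63 bytes at the end are left unprocessed.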
Instead of the above, if we could choose the memory that a wave points to, we could use the OpenCL flag CL_MEM_ALLOC_HOST_PTR, as in clCreateBuffer(..ALLOC_HOST_PTR..), which handles the alignment automatically.
For AMD's implementation, see option 5 on page 1-20, which applies if we can use a pointer supplied by the API:
http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming…
Would it result in too much of a performance hit to copy data from the Igor wave to the memory block allocated by OpenCL and then back again after performing whatever operations you need from OpenCL?
[quote=jtigor]Would it result in too much of a performance hit to copy data from the Igor wave to the memory block allocated by OpenCL and then back again after performing whatever operations you need from OpenCL?[/quote]
That’s something I am trying to avoid.
I am trying to use the GPU to perform some element-wise wave multiplication, so the operation is largely bound by memory bandwidth; any copying makes it take significantly longer (I tried that as well).
At the moment, I think the alignment can still be done by hand, and it is working properly. But I can also imagine that in some cases alignment without copying is impossible, so it may be better if the pointer could be chosen, say if the wave was created in C.
[quote=Sandbo] But I can also imagine in some cases alignment without copying is impossible, so it maybe better if pointer can be selected, say if the wave was created in C.[/quote]
If you are not close to the physical memory limit, you can always manually align the data as you described.
[quote=thomas_braun][quote=Sandbo] But I can also imagine in some cases alignment without copying is impossible, so it maybe better if pointer can be selected, say if the wave was created in C.[/quote]
If you are not close the physical memory limit you can always manually align the data as you described.[/quote]
I may be wrong about this, as I am actually new to C programming, but could this case happen:
My data elements have a size of 4 bytes (e.g. float), and somehow I need to align them to a 16-byte boundary (maybe a bit extreme).
If for some reason the memory were allocated starting 2 bytes past a boundary, it seems there is no whole number of elements I can drop to make it align (for example, I am 14 bytes from the next boundary).
I would appreciate any method, for future reference.
[quote]If for some reason the memory were allocated starting 2 bytes from the boundary, it seems there is no integer multiple of data I can drop to make it align[/quote]
This is correct. However, I just checked, and our code aligns wave data on 8 byte boundaries.
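The arithmetic behind this can be checked with a small C sketch. `elements_to_drop` is a hypothetical helper written for illustration only, not an XOP Toolkit or OpenCL function, and it assumes the element size divides the alignment (true for 4-byte floats and a 16-byte or 4k boundary).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns how many whole elements must be skipped, starting at addr, to
 * reach the next multiple of alignment, or -1 if no whole number of
 * elements gets there. Assumes alignment is a power of two and elem_size
 * divides alignment; under that assumption, skipping further boundaries
 * cannot help, because every boundary is the same distance from the
 * element grid. */
static long elements_to_drop(uintptr_t addr, size_t elem_size, size_t alignment)
{
    assert(elem_size > 0 && alignment > 0);
    assert((alignment & (alignment - 1)) == 0); /* power of two */
    assert(alignment % elem_size == 0);

    /* Bytes from addr up to the next boundary (0 if already aligned). */
    size_t gap = (size_t)((alignment - (addr % alignment)) % alignment);

    if (gap % elem_size != 0)
        return -1;  /* the boundary falls in the middle of an element */
    return (long)(gap / elem_size);
}
```

With the example from the question, an address 2 bytes past a 16-byte boundary leaves a 14-byte gap, which is not a multiple of 4, so the helper returns -1. If the data starts on an 8-byte boundary, the gap to any larger power-of-two boundary is a multiple of 8, so 4-byte and 8-byte elements can always be aligned by dropping whole elements.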
July 6, 2018 at 10:44 am - Permalink
July 6, 2018 at 11:20 am - Permalink
July 7, 2018 at 05:00 am - Permalink
July 7, 2018 at 11:08 pm - Permalink
July 9, 2018 at 05:22 am - Permalink
July 9, 2018 at 07:19 am - Permalink
July 9, 2018 at 10:22 am - Permalink
July 10, 2018 at 07:32 am - Permalink
July 10, 2018 at 07:56 pm - Permalink
In reply to hrodstein
Thanks for the check, so it seems I don't have to worry about the offset for now.
July 12, 2018 at 07:23 am - Permalink