Within the XOP, can I change the pointer of a wave object?

I understand that I can obtain a wave's data pointer by calling WaveData(waveHandle), but can I do the reverse and change the pointer that the wave points to? This would be useful (even critical) because I want to use a buffer allocated by the OpenCL functions, which is more compatible with their operation. In particular, I am exploring zero-copy on an AMD APU (Raven Ridge), so that I can leverage the 1.7 TFLOPS of the GPU.
If I understand your question, no. An XOP cannot change the memory associated with a wave.
hrodstein wrote:
If I understand your question, no. An XOP cannot change the memory associated with a wave.
You understood correctly. It's not the end of the world if I can't, but it would have made life easier for me. Thanks for the reply.
@Sandbo: Are the OpenCL functions in question using a custom allocator or is it more about proper alignment?
It is about proper alignment. For reference:

Motivation: http://pc-internet-zone.blogspot.com/2011/08/cpu-to-gpu-data-transfers-…
Requirement: https://software.intel.com/en-us/articles/getting-the-most-from-opencl-… https://arrayfire.com/zero-copy-on-integrated-gpus/

I am using AMD GPUs. In particular, I am trying to use an APU, where the GPU and CPU effectively share memory. As a result, it is possible to perform zero-copy between the two, which (if I understand it correctly) ultimately allows them to share physical memory, giving high-speed DSP that is not limited by the PCI-E bus. At the moment, the APU is significantly faster than a discrete card for simple kernels such as scaling and digital down-conversion.

To do this, however, two alignment conditions must be met:
1. The data has to be aligned to a 4 KB boundary.
2. The data length in bytes has to be an integer multiple of 64.

For 1, I dropped the first strip of data by pointing to the next 4 KB boundary.
For 2, after taking into account the reduced length due to 1, I dropped the last strip of data until the length was divisible by 64 bytes.

If instead we could select the pointer of a wave, we could use the OpenCL flag clCreateBuffer(.., CL_MEM_ALLOC_HOST_PTR, ..), which handles the alignment automatically. In particular, see option 5 on page 1-20 for AMD's implementation, which applies if we can use a pointer supplied by the API: http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming…
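The manual trimming described in steps 1 and 2 can be sketched in plain C as follows. This is only a minimal illustration: `AlignedSpan` and `align_span` are hypothetical names, not part of the XOP Toolkit or OpenCL, and the sketch works on raw bytes, so it assumes the skipped prefix also happens to be a whole number of wave elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the usable part of a buffer after trimming. */
typedef struct {
    void  *start;   /* first byte on an `alignment` boundary, or NULL */
    size_t nbytes;  /* usable length, a multiple of `granule`; 0 if nothing fits */
} AlignedSpan;

/* Skip forward to the next `alignment`-byte boundary (step 1), then drop
 * trailing bytes until the length is a multiple of `granule` (step 2).
 * `alignment` must be a power of two, e.g. 4096; `granule` e.g. 64. */
static AlignedSpan align_span(void *data, size_t nbytes,
                              size_t alignment, size_t granule)
{
    AlignedSpan span = { NULL, 0 };
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    assert(granule != 0);

    uintptr_t addr = (uintptr_t)data;
    size_t mask = alignment - 1;
    /* Bytes to skip; 0 if the buffer is already aligned. */
    size_t skip = (size_t)((alignment - (addr & mask)) & mask);
    if (skip >= nbytes)
        return span;                 /* buffer ends before the boundary */

    size_t usable = nbytes - skip;
    usable -= usable % granule;      /* trim the tail */
    if (usable == 0)
        return span;

    span.start  = (unsigned char *)data + skip;
    span.nbytes = usable;
    return span;
}
```

The returned `start` and `nbytes` would then be what gets handed to OpenCL instead of the full wave; the dropped head and tail would have to be processed separately or ignored.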
Would it result in too much of a performance hit to copy data from the Igor wave to the memory block allocated by OpenCL and then back again after performing whatever operations you need from OpenCL?
jtigor wrote:
Would it result in too much of a performance hit to copy data from the Igor wave to the memory block allocated by OpenCL and then back again after performing whatever operations you need from OpenCL?
That's something I am trying to avoid. I am using the GPU to perform element-wise wave multiplication, so the operation is largely bound by memory bandwidth; any copying makes it take significantly longer (I have tried this too). At the moment, I think the alignment can still be done manually, and it is working properly. But I can also imagine that in some cases alignment without copying is impossible, so it may be better if the pointer could be selected, say if the wave was created in C.
Sandbo wrote:
But I can also imagine that in some cases alignment without copying is impossible, so it may be better if the pointer could be selected, say if the wave was created in C.
If you are not close to the physical memory limit, you can always manually align the data as you described.
thomas_braun wrote:
Sandbo wrote:
But I can also imagine that in some cases alignment without copying is impossible, so it may be better if the pointer could be selected, say if the wave was created in C.
If you are not close to the physical memory limit, you can always manually align the data as you described.
I may be wrong about this, as I am new to C programming, but could this case happen? My data elements are 4 bytes each (e.g. float), and I need to align them to a 16-byte boundary (maybe a bit extreme). If for some reason the memory was allocated starting 2 bytes past a boundary, it seems there is no integer number of elements I can drop to reach alignment (for example, I am 14 bytes from the next boundary). I would appreciate any method, for future reference.
Quote:
If for some reason the memory were allocated starting 2 bytes from the boundary, it seems there is no integer multiple of data I can drop to make it align
This is correct. However, I just checked, and our code aligns wave data on 8-byte boundaries.
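To make the arithmetic concrete, here is a small C sketch that checks whether a boundary can be reached by dropping whole elements. `elements_to_skip` is a hypothetical helper, not an XOP Toolkit call, and it assumes the element size divides the alignment (true for 4-byte floats or 8-byte doubles with 16-byte or 4096-byte boundaries).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns how many whole elements must be skipped so that the next element
 * starts on an `alignment`-byte boundary, or -1 if no boundary can be
 * reached by skipping whole elements.
 * Assumes `alignment` is a power of two and a multiple of `elem_size`. */
static long elements_to_skip(uintptr_t addr, size_t elem_size, size_t alignment)
{
    assert(elem_size != 0);
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    assert(alignment % elem_size == 0);

    size_t mask = alignment - 1;
    /* Distance in bytes to the next boundary; 0 if already aligned. */
    size_t gap = (size_t)((alignment - (addr & mask)) & mask);

    /* Every later boundary is a whole number of elements further on, so if
     * this gap is not a whole number of elements, none of them is. */
    if (gap % elem_size != 0)
        return -1;
    return (long)(gap / elem_size);
}
```

With the example from the question, `elements_to_skip(0x1002, 4, 16)` returns -1, because the gap is 14 bytes. If the data instead starts on an 8-byte boundary, as stated above for wave data, the gap is always a multiple of 8, so for 4-byte elements the check succeeds: `elements_to_skip(0x1008, 4, 4096)` returns 1022.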