Within the XOP, can I change the pointer of a wave object?

hrodstein

If I understand your question, no. An XOP cannot change the memory associated with a wave.

July 6, 2018 at 10:44 am - Permalink

Sandbo

hrodstein wrote:
If I understand your question, no. An XOP cannot change the memory associated with a wave.

You understood the question correctly. It's not the end of the world if I can't, but it would make life easier for me. Thanks for the reply.

July 6, 2018 at 11:20 am - Permalink

thomas_braun

@Sandbo: Do the OpenCL functions in question use a custom allocator, or is it more about proper alignment?

July 7, 2018 at 05:00 am - Permalink

Sandbo

It is about proper alignment. For reference:

Motivation: http://pc-internet-zone.blogspot.com/2011/08/cpu-to-gpu-data-transfers-…
Requirement: https://software.intel.com/en-us/articles/getting-the-most-from-opencl-… https://arrayfire.com/zero-copy-on-integrated-gpus/

I am using AMD GPUs; in particular, I am trying to use an APU, where the GPU and CPU effectively share memory. As a result, it is possible to perform zero copy between the two, which (if I understand it correctly) ultimately allows them to share physical memory, giving high-speed DSP that is not limited by the PCI-E bus. At the moment, for simple kernels such as scaling and digital down-conversion, the APU is significantly faster than a discrete card.

To do this, however, two alignment requirements must be met:
1. The data has to be aligned to a 4k-byte boundary.
2. The length of the data in bytes has to be an integer multiple of 64.

For 1, I dropped the first strip of data by pointing to the next 4k-byte boundary. For 2, after taking into account the reduced length due to 1, I dropped the last strip of data until the length is divisible by 64 bytes.

Instead of the above, if we could select the pointer of a wave, then we could use the OpenCL flag clCreateBuffer(..ALLOC_HOST_PTR..), which will handle the alignment automatically. In particular, see option 5 on page 1-20 for AMD's implementation if we can use a pointer supplied by the API: http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming…
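The manual trimming described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `AlignedSpan` and `trim_to_aligned_span` are hypothetical names, not part of the XOP Toolkit or OpenCL, and the data is assumed to be 4-byte floats. The returned span is what one would presumably hand to OpenCL as a host pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the usable, aligned part of a float buffer. */
typedef struct {
    float *start;  /* first element on a 4096-byte boundary, or NULL */
    size_t count;  /* elements kept; count * sizeof(float) is a multiple of 64 */
} AlignedSpan;

/* Drop leading elements up to the next 4096-byte boundary, then drop
 * trailing elements until the byte length is a multiple of 64.
 * Returns an empty span (NULL, 0) if no usable region exists. */
static AlignedSpan trim_to_aligned_span(float *data, size_t count)
{
    const uintptr_t kStartAlign = 4096;  /* requirement 1 */
    const size_t kLengthAlign = 64;      /* requirement 2 */
    AlignedSpan span = { NULL, 0 };

    assert(data != NULL);

    /* Bytes between the buffer start and the next 4k boundary. */
    uintptr_t addr = (uintptr_t)data;
    size_t skipBytes = (size_t)((kStartAlign - addr % kStartAlign) % kStartAlign);

    /* The gap must be a whole number of elements, otherwise no
     * element lands on the boundary. */
    if (skipBytes % sizeof(float) != 0)
        return span;

    size_t skipElems = skipBytes / sizeof(float);
    if (skipElems >= count)
        return span;

    /* Round the remaining byte length down to a multiple of 64. */
    size_t remainingBytes = (count - skipElems) * sizeof(float);
    size_t keptBytes = remainingBytes - remainingBytes % kLengthAlign;
    if (keptBytes == 0)
        return span;

    span.start = data + skipElems;
    span.count = keptBytes / sizeof(float);
    return span;
}
```

In the worst case this discards just under 4096 bytes at the front and just under 64 bytes at the end, so it only makes sense for waves that are much larger than that.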

July 7, 2018 at 11:08 pm - Permalink

jtigor

Would it result in too much of a performance hit to copy data from the Igor wave to the memory block allocated by OpenCL and then back again after performing whatever operations you need from OpenCL?

July 9, 2018 at 05:22 am - Permalink

Sandbo

jtigor wrote:
Would it result in too much of a performance hit to copy data from the Igor wave to the memory block allocated by OpenCL and then back again after performing whatever operations you need from OpenCL?

That’s something I am trying to avoid. I am trying to use the GPU to perform element-wise wave multiplication, so the operation is largely bound by memory bandwidth; any copying makes it take significantly longer (I tried that as well). At the moment, I think the alignment can still be done manually, and it is working properly. But I can also imagine that in some cases alignment without copying is impossible, so it may be better if the pointer could be selected, say if the wave was created in C.

July 9, 2018 at 07:19 am - Permalink

thomas_braun

Sandbo wrote:
But I can also imagine that in some cases alignment without copying is impossible, so it may be better if the pointer could be selected, say if the wave was created in C.

If you are not close to the physical memory limit, you can always manually align the data as you described.

July 9, 2018 at 10:22 am - Permalink

Sandbo

thomas_braun wrote:

Sandbo wrote:
But I can also imagine that in some cases alignment without copying is impossible, so it may be better if the pointer could be selected, say if the wave was created in C.
If you are not close to the physical memory limit, you can always manually align the data as you described.

I could be wrong about this, as I am actually new to C programming, but could this case happen? My data elements have a size of 4 bytes (e.g. float), and somehow I need to align the data to a 16-byte boundary (maybe a bit extreme). If for some reason the memory were allocated starting 2 bytes past a boundary, it seems there is no integer number of elements I can drop to make it align (for example, I am 14 bytes from the next boundary). I would appreciate any method for handling this, for future reference.

July 10, 2018 at 07:32 am - Permalink

hrodstein

Quote:
If for some reason the memory were allocated starting 2 bytes from the boundary, it seems there is no integer multiple of data I can drop to make it align

This is correct. However, I just checked, and our code aligns wave data on 8-byte boundaries.
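The arithmetic behind this exchange can be checked with a small sketch. `elements_to_skip` is a hypothetical helper written for illustration, not an Igor or XOP Toolkit function. It reports how many whole elements must be dropped to reach the next boundary, or -1 when the gap is not a whole number of elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of whole elements to skip so that the
 * next element starts on a multiple of `boundary`, or -1 if the gap
 * to the boundary is not divisible by the element size. */
static long elements_to_skip(uintptr_t addr, size_t elemSize, size_t boundary)
{
    assert(elemSize > 0 && boundary > 0);

    /* Bytes from addr up to the next boundary (0 if already aligned). */
    size_t gap = (size_t)((boundary - addr % boundary) % boundary);

    if (gap % elemSize != 0)
        return -1;  /* e.g. 14 bytes to go with 4-byte elements */

    return (long)(gap / elemSize);
}
```

For the case in the question, an address 2 bytes past a 16-byte boundary with 4-byte elements leaves a gap of 14 bytes, so the function returns -1. With data starting on an 8-byte boundary, the gap to any 4k boundary is a multiple of 8, and therefore a whole number of 4-byte or 8-byte elements.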

July 10, 2018 at 07:56 pm - Permalink

Sandbo

Thanks for checking. So it seems I don't have to worry about the offset for now.

July 12, 2018 at 07:23 am - Permalink