Memory management - effect of Redimension and other built-ins

My experiment files tend to bloat in size very quickly (*.pxp files routinely head into 2-3GB range), especially during the exploration phase of data analysis. Because of this, I am trying to optimize my functions to use memory more effectively and to stay clear of fragmentation issues. As such, I had a question about how some of the built-in functions affect memory.

My most pressing question is if a call to Redimension that reduces the number of points in a wave frees the memory holding the leftover points? I assume that a Redimension that preserves the number of points in the original wave has no effect on memory, and that a wave extended by Redimension stays in the same memory location if there is enough free space following the original memory block. If there is not enough room in the current memory location, is a new block allocated and the old one freed after the data is copied over?

Also, does a KillWaves/MoveWave combination have the same memory effect as a Duplicate/O call? Is one far preferable to the other? KillWaves can't handle killing waves that are in use while Duplicate/O *can* overwrite waves that are in use, suggesting that there is a difference in how these operations work under the hood.

I discovered this difference when I recently coded some functions that create a free wave and then return a wave reference to that wave. With that design, I call KillWaves on the existing global wave and then MoveWave on the free wave to mimic the effect of a Duplicate/O call when I need to persist the free wave.
yamad wrote:
My experiment files tend to bloat in size very quickly (*.pxp files routinely head into 2-3GB range), especially during the exploration phase of data analysis.


The data written to disk is a copy of the data in RAM, and is written as efficiently as possible. The precise layout of the data in RAM is immaterial.

So either I'm totally misunderstanding your post, or you're misunderstanding the relation between data in memory and data on disk.

There aren't all that many options to reduce the size of these pxp files, short of using compression (look at ImageTransform) or adjusting the number type of the data (e.g. if your data contains integers then you can redimension to an integer wave – just be careful of overflows). I think the hassle is typically not worth it.
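As a rough illustration of the number-type suggestion, the sketch below narrows a wave to 16-bit signed integers only when every value is a whole number within range. The function name NarrowToInt16 is hypothetical, and the sketch assumes a real-valued numeric wave without NaNs or INFs. Note that the check itself makes a temporary copy of the data.

```igor
// Hypothetical helper: returns 1 if the wave was converted, 0 if it was left alone.
Function NarrowToInt16(w)
	WAVE w

	// Refuse if any value falls outside the signed 16-bit range (overflow risk).
	if (WaveMin(w) < -32768 || WaveMax(w) > 32767)
		return 0
	endif

	// Refuse if any value has a fractional part.
	Duplicate/FREE w, frac
	frac = abs(frac - round(frac))
	if (WaveMax(frac) > 0)
		return 0
	endif

	// Change the number type in place to 16-bit signed integer.
	Redimension/W w
	return 1
End
```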

If your main concern is the time that it takes to save, then you can convert to the unpacked format. However, besides being less convenient to handle, the unpacked format loses some features, such as saving waves containing wave references.
741 wrote:
The data written to disk is a copy of the data in RAM. However, the precise layout of the data in RAM is immaterial.


Sorry, I was just using the size of the files on disk as a way to indicate that memory is a precious resource while these files are open. If I understand correctly, Igor will load all of the waves, variables, etc in an experiment file into RAM when the experiment is open. In that situation, if the experiment is large and I use a lot of my own functions, the way those functions handle memory is critically related to the degree of memory fragmentation and how soon I will run into memory limits. Isn't that right, or am I missing something?
Quote:
My most pressing question is if a call to Redimension that reduces the number of points in a wave frees the memory holding the leftover points?


Redimensioning to a smaller size does not free memory. This is because the underlying OS memory manager calls don't free memory in that case.

Mac OS 9 and Windows 95 did free memory but Mac OS X and Windows XP and later don't. I think reclaiming the memory is a headache, as it requires taking an existing block in the heap and splitting it into two blocks - one kept by the original pointer and the other free. So, I suspect, the programmers who write low-level OS code decided it was best to not do that.

To reclaim the memory you would have to shrink with Redimension, then Duplicate to a new wave, then kill the original wave. Igor will free the memory for the killed wave when nothing holds a reference to it. This means that, if you have a wave reference to the killed wave in a function, the memory won't be freed until you leave the function or clear the wave reference using WaveClear.
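The sequence described above might be sketched as follows. The function name ShrinkAndReclaim and the compactName parameter are hypothetical, and the sketch assumes no wave named compactName already exists in the current data folder and that the original wave is not in use in a graph or table.

```igor
// Hypothetical helper following the shrink / duplicate / kill sequence.
Function ShrinkAndReclaim(w, numPoints, compactName)
	WAVE w
	Variable numPoints
	String compactName	// hypothetical name for the compact copy

	// Step 1: shrink in place. Per the explanation above, this alone
	// does not release the memory behind the dropped points.
	Redimension/N=(numPoints) w

	// Step 2: copy into a new wave in the current data folder,
	// which is allocated at the reduced size.
	Duplicate w, $compactName

	// Step 3: kill the original. This generates an error if the wave is in use.
	KillWaves w

	// Clear this function's reference so it does not keep the killed wave alive.
	WaveClear w
End
```

Any other wave reference to the original, for example one held by the calling function, would still delay the release until it is cleared or goes out of scope.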

Quote:
KillWaves can't handle killing waves that are in use while Duplicate/O *can* overwrite waves that are in use, suggesting that there is a difference in how these operations work under the hood.


Duplicate/O, even when the destination wave already exists, doesn't kill anything so there is no restriction like KillWaves. Duplicate/O redimensions the destination and copies data from the source to the destination. There is no killing involved.

Quote:
My experiment files tend to bloat in size very quickly (*.pxp files routinely head into 2-3GB range), especially during the exploration phase of data analysis.


If you really need all of that data in memory at one time, and if you are running on Windows, you would be a good candidate for IGOR64. Read the IGOR64 ReadMe carefully and use it only if you really need gigabytes of data in memory at one time.



Thanks, that info is really useful.

Quote:
Duplicate/O, even when the destination wave already exists, doesn't kill anything so there is no restriction like KillWaves. Duplicate/O redimensions the destination and copies data from the source to the destination. There is no killing involved.


If I understand this correctly, a Duplicate/O acts on the original memory location when the destination wave already exists. That is, it requires no additional memory allocation if a destination wave is already found. In the KillWaves case, I am freeing the original memory location and then MoveWave turns my local reference into a global reference but does not actually copy data from one memory location to another. Is that right?

Quote:
If you really need all of that data in memory at one time, and if you are running on Windows, you would be a good candidate for IGOR64. Read the IGOR64 ReadMe carefully and use it only if you really need gigabytes of data in memory at one time.


Yes, I have IGOR64 installed, but I want to be able to pass around packed experiments with other members of the lab and all the "only if you really need it" warnings had me scared off from using it extensively.
Quote:
If I understand this correctly, a Duplicate/O acts on the original memory location when the destination wave already exists. That is, it requires no additional memory allocation if a destination wave is already found.


It will require additional memory if the source wave is bigger than the destination wave before the duplication. Otherwise it does not require additional memory. Duplicate resizes the destination wave, if necessary, and then copies bytes from the source to the destination.

Quote:
In the KillWaves case, I am freeing the original memory location and then MoveWave turns my local reference into a global reference but does not actually copy data from one memory location to another. Is that right?


I don't know what that means. You can't do MoveWave after KillWaves. All wave references in functions are local variables that refer to global objects (waves).

MoveWave does not copy the wave - it just removes it from the linked list for its original data folder and appends it to the linked list of the destination data folder.

KillWaves marks a wave to be killed, and its memory deleted, when there are no more wave references pointing to the wave. The wave references, which you can think of as pointers to the memory for the wave, may be in user-defined functions, in WAVE waves, or stored internally in Igor.
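For concreteness, here is a minimal sketch of the two ways of persisting a free wave that the question appears to describe. All function names and the destination root:result are hypothetical. Route 1 kills the existing global wave and moves the free wave into its place, so no data is copied. Route 2 copies the data into the existing global wave, resizing it if needed.

```igor
// Hypothetical helper that builds a result in a free wave and returns a reference to it.
Function/WAVE MakeFreeResult()
	Make/FREE/N=100 result = p
	return result
End

// Route 1: kill any existing global wave, then move the free wave into the data folder.
Function PersistByMove()
	WAVE/Z old = root:result
	if (WaveExists(old))
		KillWaves old	// generates an error if the wave is in use, e.g. in a graph
		WaveClear old
	endif

	WAVE fw = MakeFreeResult()
	MoveWave fw, root:result	// the free wave becomes the global wave root:result
End

// Route 2: copy the free wave's data into the (possibly existing) global wave.
Function PersistByCopy()
	WAVE fw = MakeFreeResult()
	Duplicate/O fw, root:result	// works even if root:result is in use
End
```

Route 2 is the only option when the destination is displayed in a graph or table, at the cost of a copy. Route 1 avoids the copy but depends on KillWaves succeeding.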