Laggy performance with large data sets

Is there a way to prevent or manage the laggy performance that Igor exhibits with large data sets? I have a number of experiment files that are ~250MB in size. Each contains about 12,000 waves that are ~5000 points long, displayed in 10 separate graphs with ~600 waves per graph. With this organization, it takes Igor about 40-50 seconds to load the PXP file and run the recreation macros.

Slow loading in itself would not be much of a problem, but afterward, many operations, such as appending a wave to a graph, opening the Curve Fit panel, or performing the fit itself, are accompanied by a pause lasting a few seconds.

This issue is not caused by the system I'm using (17-inch MacBook Pro 2011, with 16GB RAM, 2.5GHz quad-core i7, 512GB SSD with ~100GB free, running OS X 10.8.4), since other applications run quickly even while Igor is crunching numbers on these experiment files. Igor is also using only about 500MB of RAM, significantly less than I'd expect it could use, given that 32-bit applications have a ~4GB limit.

Is there some way to increase Igor's memory allocation if this is a contributing factor, or otherwise make loading and using large data sets faster?
The total amount of memory that you are using is more than likely not affecting your performance here.

I suspect the main issue is that you have many waves in one data folder. Before you do anything else, I recommend that you close the Data Browser window. The Data Browser continually inspects your data, and with 12k waves that can take a while. The presence of a large number of waves also slows the execution of any command that applies to one or more waves, since the wave names have to be looked up. To get around this, you might consider concatenating your individual waves into a 2D matrix in which each wave is a single column. The columns are still easy to display, and this could reduce the overall number of waves by a factor of ~600 if you are displaying that many waves per graph.
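
For example, here is a minimal sketch of that approach (the "sweep_*" naming pattern and the wave name bigMatrix are hypothetical, and it assumes all source waves have the same length):

    Function CollapseToMatrix()
        String srcList = WaveList("sweep_*", ";", "")   // the ~600 source waves
        Concatenate/O srcList, bigMatrix                // each 1D wave becomes one column of a 2D wave
        Wave m = bigMatrix
        Display                                         // start an empty graph
        Variable i, n = DimSize(m, 1)
        for (i = 0; i < n; i += 1)
            AppendToGraph m[][i]                        // show each column as a trace
        endfor
    End

Displaying a matrix column as a trace (the m[][i] subrange) requires Igor 6.1 or later.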

I hope this helps,

A.G.
WaveMetrics, Inc.
Simply drawing a graph with that many traces and points in it may be slow. A curve fit that adds a fit curve to a graph will cause the graph to redraw on every iteration of the fit. To prevent that, try adding the /N flag to the CurveFit or FuncFit command. You could also remove the /D flag from the end of the command if you don't need the fit curve.
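
For example (the wave name myData is a placeholder):

    CurveFit/N gauss, myData /D    // /N: no graph updates while iterating; fit_myData still appears at the end
    CurveFit/N gauss, myData       // without /D, no fit curve is created at all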

Anything else that causes data in the graph to change will cause the graph to redraw. You could try closing the graph and saving the recreation macro while you work on the analysis.
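
A sketch of one way to script that (the window and notebook names are placeholders):

    Function StashAndCloseGraph(winName)
        String winName
        String recMacro = WinRecreation(winName, 0)    // the graph's recreation commands as text
        if (WinType("SavedGraphs") == 0)
            NewNotebook/F=0/N=SavedGraphs              // plain notebook to collect the macros
        endif
        Notebook SavedGraphs, text=recMacro
        DoWindow/K $winName                            // kill the graph so it no longer redraws
    End

When you're done with the analysis, paste the saved text into the procedure window and run the recreation macro to get the graph back.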

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Thanks for the tips. Closing the Data Browser does not seem to have an effect, and the curve fit tweaks do help a touch, but this issue is an overall slowdown, not just during curve fitting.

If I cut out 4,000 of the waves so there are only 8,000, responsiveness increases greatly, even though the waves are still displayed 600 to a graph. This suggests it is how the waves are stored more than anything else, and that there is some threshold beyond which Igor gets bottlenecked. For now I will try organizing the waves into subfolders. The 2D wave approach might also help, but in the long term it would mean redoing close to 5 years and thousands of lines of Igor programming (ugh!). Still, I'll give it a shot, at least for the routines I'm using for this one project.
Igor stores waves in each data folder as a linked list. When it needs to look up a wave name, it must traverse that linked list. When you have thousands of waves in a data folder, this lookup can be slow. That's why splitting your waves across data folders or concatenating multiple 1D waves into a single 2D wave can dramatically decrease lookup time.

I don't think there's much you can do to make loading of experiments you've already saved be faster. But maybe it's possible for you to load each experiment and reorganize the waves in that experiment into multiple data folders.

In any code you write, use functions rather than macros as much as possible; functions are compiled, whereas macros are interpreted and run much more slowly.
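
For instance, here is a sketch of the reorganization written as a function (the folder naming scheme and the group size of 500 are arbitrary, and the $(...) path syntax assumes Igor 6.1 or later):

    Function SplitIntoFolders()
        String waves = WaveList("*", ";", "")    // all waves in the current data folder
        Variable i, n = ItemsInList(waves)
        Variable perFolder = 500                 // arbitrary group size
        String name, sub
        for (i = 0; i < n; i += 1)
            name = StringFromList(i, waves)
            sub = "group" + num2istr(floor(i/perFolder))
            NewDataFolder/O $sub                 // /O: no error if the folder already exists
            Wave w = $name
            MoveWave w, :$(sub):                 // relocate the wave into the subfolder
        endfor
    End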
aclight wrote:
Igor stores waves in each data folder as a linked list. When it needs to look up a wave name, it must traverse that linked list. When you have thousands of waves in a data folder, this lookup can be slow.


Since there is a big re-write going on for IP7, I wonder if this is the time to jump to using a C++ container? I think a map container offers O(ln N) performance, which would be faster than the O(N) of a linked list. There is also unordered_map.
andyfaff wrote:
Since there is a big re-write going on for IP7, I wonder if this is the time to jump to using a C++ container? I think a map container offers O(ln N) performance, which would be faster than the O(N) of a linked list. There is also unordered_map.

And a hash map has constant-time lookup.

It's on the list on the whiteboard in my office, but it is relatively low priority compared to getting a working application out to beta. This topic comes up from time to time; I wonder how many people are affected by lengthy lookups when there are many waves.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
I think thousands of waves in a data folder is quite common. I would normally use data folders, but to each his own.
A map container in C++ would store a wave keyed by its name. The map container is in the STL and is available on most compilers/platforms, whereas hash_map is not. Perhaps I got big O wrong; unordered_map may also be a contender.

From Stack Overflow (treat with caution):
"Some of the key differences are in the complexity requirements.
A map requires O(log(N)) time for inserts and finds.
An unordered_map requires an 'average' time of O(1) for inserts and finds but is allowed to have a worst case time of O(N).
So, usually, unordered_map will be faster, but depending on the keys and the hash function you store, can become much worse."
That's all correct. To get a hashed map (std::unordered_map) as part of the C++ standard library, you need C++11, which requires newer versions of Visual Studio and Xcode than we can require at present. There are other hashed maps available; we will look at the various benefits when this gets high enough on the priority list.

std::map uses a binary search tree, which results in O(log N) lookups. A hash map gets constant time by using the hash as a direct index into an array. The possibility of O(N) comes from hash collisions: when more than one item has the same hash, the colliding items end up in the same bucket, and the lookup goes back to simply walking the bucket's contents. But if your hash is reasonably good, the buckets will be very small.

We also have to be careful that, in implementing a fast wave lookup, we don't slow down wave *creation*. Some containers feature fast lookup but slow insertion.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Just one minor comment:

unordered_map (aka hash map) in namespace std::tr1:: was introduced with C++98 TR1, which, e.g., VS2008 already supports, and I'd guess recent Xcode versions do as well.