Performance issues when loading data from HDF5 files

Has anyone else noticed major performance problems when loading data from HDF5 files larger than one or two hundred MB? The data can take many minutes to load, and occasionally Igor will simply crash at the end. In my colleagues' experience and my own, 64-bit Windows is the worst affected, 32-bit Windows is also affected but marginally better, and OS X is the fastest and least crash-prone (but still far slower than seems reasonable).

The natural suspect is a memory bottleneck, but the activity monitors always show plenty of available memory. Likewise, CPU usage stays low throughout.

Moreover, if I open any of the same files in the free Java-based HDFView, it does the job in seconds. You might suspect that HDFView is not actually loading the whole file up front (I don't know whether that's true or not), but it's easy to check that this doesn't account for the difference: ask HDFView to export all the data to ASCII, which forces a full read, and it generally takes only a small fraction of the time Igor needs just to load the data. That also argues against there being something "wrong" with our files.

Similar experiences? Any ideas? Thanks.
Could you make one of the HDF5 files available?

I have browsed and loaded HDF5 files of 10 to 20 GB in 32-bit Igor Pro 6 and never had any problems.
To investigate, I would need a sample file and instructions on how you loaded it.
Howard,

I have the same question about very slow loading of 100+ MB HDF5 files.

I am using the HDF5 XOP's "Load Group" to load two HDF5 files. They are formatted identically. One is 5 MB and loads in 2 s. The other is 103 MB and loads in 3 min 40 s (an eternity!).
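Roughly, the load is equivalent to the following (a minimal sketch; the file path is a placeholder, and in practice I use the menu item):

	Function LoadWholeFile(pathStr)
		String pathStr						// full Igor path to the HDF5 file, e.g. "C:Data:h1.h5" (placeholder)
		Variable fileID
		HDF5OpenFile /R fileID as pathStr	// open the file read-only
		HDF5LoadGroup /R :, fileID, "/"		// recursively load the root group into the current data folder
		HDF5CloseFile fileID
	End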

I am running 64-bit Windows 7. I posted the two files here so you can check them out: http://web.gps.caltech.edu/~rebeccaw/HDF5_Example_Files/

Thanks,
Rebecca
Rebecca: The problem is that your files contain large arrays of strings (specifically the Timestamp dataset), combined with the fact that the HDF5 XOP uses a simple but slow method to store text data. That is probably because I did not consider the possibility of loading very large string datasets (about 1 million elements in your h1.h5 file) when I wrote it.

If I load your larger file in Igor7 (currently in beta testing), it is very fast, because we changed the way text wave elements are stored in Igor7.

It would be possible to improve the HDF5 XOP in this regard, but I don't know if or when I will have time to work on it.
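In the meantime, one possible workaround is to skip the Timestamp dataset and load only the numeric datasets individually with HDF5LoadData. A sketch (the dataset paths are hypothetical placeholders for whatever your files actually contain):

	Function LoadSkippingTimestamp(pathStr)
		String pathStr						// full Igor path to the HDF5 file
		Variable fileID
		HDF5OpenFile /R fileID as pathStr	// open the file read-only
		// Load each numeric dataset by name; "/Data1" and "/Data2" are hypothetical
		HDF5LoadData /O fileID, "/Data1"	// /O overwrites any existing wave of the same name
		HDF5LoadData /O fileID, "/Data2"
		HDF5CloseFile fileID
	End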

If you want to run the Igor7 beta, see http://www.igorpro.net/beta-signup/
Howard,

Thanks for the quick reply. These are our own HDF5 files (written from LabVIEW), so I'll be able to change the strings to numerics for future data. I also signed up as a beta tester for Igor 7.
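For data we've already collected, something like this should convert a loaded Timestamp text wave to a numeric wave after the fact (a sketch that assumes each timestamp is a plain numeric string; the parsing would need to match our actual format):

	Function ConvertTimestamps()
		WAVE/T tsText = Timestamp					// text wave created by the HDF5 load, in the current data folder
		Make /O /D /N=(numpnts(tsText)) tsNumeric	// double-precision numeric wave of the same length
		tsNumeric = str2num(tsText[p])				// assumes each element is a numeric string
	End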

- Rebecca