FindDuplicates improvements for text waves

Apologies if this has been posted already. 

FindDuplicates is pretty handy, but it seems to be missing some features when used on text waves.

1. It would be nice if there was a flag to turn off case sensitivity.  StringMatch is already case insensitive, and strsearch and cmpstr can be either, so adding this feature would make the behavior of FindDuplicates more consistent with the other functions.

2. It would also be nice if there was a text wave equivalent of the UN and UNC flags for numeric waves.  (Also, a description of those flags isn't in either the pdf of the manual or in the help files in Igor).
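
For reference, here is a short sketch of the case handling that item 1 refers to, as I understand the documented defaults of these built-in functions. The string literals are only illustrative.

```igor
// Case handling in the three built-in functions named in item 1.
Print StringMatch("ABC", "abc")      // 1: always case-insensitive
Print cmpstr("abc", "ABC")           // 0: case-insensitive by default
Print cmpstr("abc", "ABC", 1) == 0   // 0: flag 1 makes the comparison case-sensitive
Print strsearch("ABC", "b", 0)       // -1: case-sensitive by default
Print strsearch("ABC", "b", 0, 2)    // 1: option 2 ignores case
```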

 

Maybe you can use Extract?

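Make/O/N=100/T junk
junk[0,;2]="w"+num2str(p)	// even points get a lowercase prefix
junk[1,;2]="W"+num2str(p)	// odd points get an uppercase prefix
// Keep entries whose first two characters equal "w1"; CmpStr flag 1 = case-sensitive
Extract junk, lowercase, CmpStr((junk)[0,1], "w1", 1)==0
print lowercase
  lowercase[0]= {"w10","w12","w14","w16","w18"}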

 

I should have mentioned that I have workarounds using lowerstr and upperstr. Adding a case-insensitivity flag is more about convenience and having nice compact code than about not being able to do something.
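
A minimal sketch of what such a workaround might look like, assuming the wave contains only ASCII text. The function name and the wave names srcLower and uniqueLower are made up for illustration; /RT is the FindDuplicates flag that names a text output wave with duplicates removed.

```igor
// Hypothetical helper (not a built-in): unique entries of a text wave, ignoring case.
// Assumes src holds plain ASCII text, so LowerStr is a safe normalization.
Function UniqueIgnoringCase(src)
	WAVE/T src

	// Build a lower-cased copy so that "W1" and "w1" compare as equal.
	Make/O/T/N=(numpnts(src)) srcLower
	srcLower = LowerStr(src[p])

	// uniqueLower receives the lower-cased entries with duplicates removed.
	FindDuplicates/RT=uniqueLower srcLower
End
```

The cost is that the output holds the lower-cased strings rather than the original spellings, which is part of why a built-in flag would be more compact.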

I second the request of KZarzana. Both would be nice improvements.

@KZarzana: What do the UN/UNC flags do?

The conversion of arbitrarily encoded text to upper/lower case is not simple.  If you happen to know that the contents of your wave are ASCII characters, it is simple enough for you to convert before calling FindDuplicates as mentioned above.  I will add your request to the wish list.

 

A.G.

Hi,

I noticed today that if the input wave has a length of 1, FindDuplicates throws an error about an insufficient number of points.  Since I am only looking for unique values, a wave with length 1 should return that one point.  I have coded around it, but it would be nice if FindDuplicates could handle a length of 1.
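
One possible shape for such a guard, as a sketch only. The function name and the output wave name uniqueValues are hypothetical; the check simply avoids calling FindDuplicates on a wave with fewer than two points.

```igor
// Hypothetical wrapper (not a built-in).
// Returns 1 if uniqueValues was created, 0 if src was too short to search.
Function UniqueTextValues(src)
	WAVE/T src

	if (numpnts(src) < 2)
		// Zero or one point: nothing can be duplicated, so the caller
		// can use src directly as the set of unique values.
		return 0
	endif

	// uniqueValues receives the entries of src with duplicates removed.
	FindDuplicates/RT=uniqueValues src
	return 1
End
```

When scanning many input sets, the return value lets the calling loop decide how to treat the short waves instead of stopping on an error.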

Andy

Hi Andy,

I find it logically impossible to look for duplicates when you have fewer than two points in the wave.

A.G.

We will have to disagree on what it means to be "robust".  

My programming philosophy is that I want an operation or function to return an error as soon as one is encountered.  Otherwise you may find that something did not work 15 steps later and you would have to trace it all the way to an operation or function that silently returned a zero point wave or a NaN.  In other words, sooner or later you need to implement some tests in your code.  They could be before you use bad input data or after.

 

Can we compromise and at least make a note in the documentation?  It took some troubleshooting to figure out that that specific entry had only 1 point.  I was scanning 6000+ input sets and it barfed in the middle.

Andy

It's always a good idea to improve the documentation.  Please email me a specific suggestion.

A.G.