Searching characters through a string

Hi, 

I have a 1D text wave containing chemical formulas that I want to sort into categories. For example, the formulas look like CxHyNzOp, where x, y, z and p are all numbers and z can also be zero. From this superset, I want to extract the subset of formulas of the form CxHyOp and put them in a new 1D text wave. Please advise how to do this.

Many Thanks, 

Peeyush

Here is what I came up with:

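Function ExtractFormula(Wave in)
    // One or more digits after each element, anchored at both ends
    String searchFor = "^[cC][0-9]+[hH][0-9]+[oO][0-9]+$"
   
    // Flatten the wave into a semicolon-separated list
    String wList
    wfprintf wList, "%s;", in
   
    // Keep only the list items that match the pattern
    String strResult = GrepList(wList,searchFor)
    Wave/T wResult = ListToTextWave(strResult,";")
    Duplicate/O wResult, $(NameOfWave(in)+"_out")
End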

Adjust the Grep string 'searchFor' to match your desired chemical formula.

If the formulas are always written with proper capitalization, the search string can be shortened to this:

String searchFor = "^C[0-9]+H[0-9]+O[0-9]+$"
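
A quick way to check a pattern before running it over the whole wave is GrepString, which returns 1 on a match and 0 otherwise. The sketch below is only illustrative: the function name and the sample formulas are made up, and the pattern is a variant that allows multi-digit counts ([0-9]+) and anchors the end of the string with $.

```igor
Function TestFormulaPattern()
    // Variant pattern: one or more digits per element, anchored at both ends
    String searchFor = "^C[0-9]+H[0-9]+O[0-9]+$"

    // Hypothetical sample formulas
    Print GrepString("C2H6O1", searchFor)     // CxHyOp form, should match
    Print GrepString("C10H22O2", searchFor)   // multi-digit counts, should match
    Print GrepString("C3H7N1O2", searchFor)   // contains N, should not match
End
```

Note that this variant requires an explicit count after every element, so a formula written as CH4O (with the implicit 1 left out) would need a pattern such as "^C[0-9]*H[0-9]*O[0-9]*$" instead.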

These are amazing insights and they did the job! Thank you so much, chozo, jjweimer and tony! I deeply appreciate you taking the time to respond and share your thoughts.

It was working without declaring Wave/T for the input. Doesn't a function input accept both numeric and text waves, or is that considered a 'bug'?

Huh. I guess wfprintf is one of those baffling Igor inconsistencies. In fact, of course, wfprintf needs to work with any wave type, so it probably determines the wave's type at run-time.
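
If you would rather catch a wrong wave type up front instead of relying on run-time behavior, one option is an explicit check with WaveType. This is a minimal sketch, assuming that WaveType with selector 1 reports 2 for text waves as described in the Igor documentation; the function name is hypothetical.

```igor
Function RequireTextWave(Wave in)
    // WaveType(w, 1) reports the general kind of wave; 2 is assumed to mean text
    if (WaveType(in, 1) != 2)
        Abort "Expected a text wave as input"
    endif
End
```

Calling such a check at the top of ExtractFormula would stop with a clear message if a numeric wave were passed in by mistake.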

And if you want to avoid the list conversion and just return the result as a free wave:

Function/WAVE ExtractFormula(Wave/T in)
    String searchFor = "^[CHO0-9]+$"
   
    Make/FREE/T/N=0 result
    Grep/E=searchFor in as result
   
    return result
End
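
Since a free wave disappears once nothing references it, the caller has to keep the returned reference or copy it somewhere. A possible usage sketch for the Function/WAVE version above; the function name, the sample formulas and the output wave name are all made up for illustration.

```igor
Function DemoExtractFormula()
    // Hypothetical sample data; replace with your own text wave
    Make/O/T formulaList = {"C2H6O1", "C3H7N1O2", "C6H12O6", "C5H5N5"}

    // Returned free wave holds only the entries made up of C, H, O and digits
    Wave/T filtered = ExtractFormula(formulaList)

    // Copy the free wave into the current data folder to keep it
    Duplicate/O filtered, formulaList_CHO
    Print filtered
End
```

With this sample data, only C2H6O1 and C6H12O6 should pass the "^[CHO0-9]+$" filter, because the other two entries contain N.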

It's always interesting to see how a problem gets successively ground down to the bare minimum here in the forum. I was somehow under the impression that Grep is only for files. Good to learn a new trick.