Running combinations (e.g. assigning chemical formulas to molar masses)

Hi, 

I have a list of molecular masses (accurate to four decimal places) to which I need to assign chemical formulas. Let's call this list A. I want to write a program that can do this. Basically, I will have a list of the elements C, H, N, O, and S with their molar masses (list B), and my program needs to pick the combination of elements that comes closest to each molecular mass in list A. 

I am wondering what the best way to do this is, without a bunch of nested for loops going in circles to narrow down on the mass. Fundamentally, it is just a matter of trying out different combinations of elements to get closest to the measured mass. Is there a library in IGOR that can perform this kind of task? Or a simpler technique? 

As an idea, I am wondering whether writing something like C(x) + H(y) + N(z) + O(z1) + S(z2) = mass, where each of x, y, z, z1, and z2 is produced by a random number generator such as abs(enoise), could do this.  

Thanks a ton, 

Peeyush 

I don't know of a cunning function to do this, but if it is just a few calculations then running loops is not too bad.

For example, run the following with:

FindBestCombo(1000)

Function MakeCHNOS()
    Make/D/O/N=5 wAtomicMasses
    SetDimLabel 0,0,S,wAtomicMasses
    SetDimLabel 0,1,O,wAtomicMasses
    SetDimLabel 0,2,N,wAtomicMasses
    SetDimLabel 0,3,C,wAtomicMasses
    SetDimLabel 0,4,H,wAtomicMasses
    wAtomicMasses[%S] = 32.066
    wAtomicMasses[%O] = 15.9994
    wAtomicMasses[%N] = 14.00674
    wAtomicMasses[%C] = 12.0107
    wAtomicMasses[%H] = 1.00794
End

Function FindBestCombo(vMr)
    variable vMr // target Molecular mass
    MakeCHNOS()
    wave/D wAtomicMasses
   
    variable vMaxS, vMaxO, vMaxN, vMaxC
   
    variable vRows =1e5 // some large number
   
    Make/O/W/U/N=(vRows, 5) wCount
    SetDimLabel 1,0,S,wCount
    SetDimLabel 1,1,O,wCount
    SetDimLabel 1,2,N,wCount
    SetDimLabel 1,3,C,wCount
    SetDimLabel 1,4,H,wCount
    Make/O/D/N=(vRows) wResidual
    wResidual = NaN
   
    variable vRow = 0
    variable vResMassS, vResMassO, vResMassN, vResMassC
    variable vS, vO, vN, vC, vH
   
    // floor() with "<=" so that an exact multiple of an atomic mass is also tried;
    // max(0, ...) guards against a residual that rounds to just below zero
    vMaxS = floor(vMr / wAtomicMasses[%S])
    for(vS = 0; vS <= vMaxS; vS +=  1)
        vResMassS = vMr - vS * wAtomicMasses[%S]
        vMaxO = max(0, floor( vResMassS / wAtomicMasses[%O] ))
        for(vO = 0; vO <= vMaxO; vO +=  1)
            vResMassO = vResMassS - vO * wAtomicMasses[%O]
            vMaxN = max(0, floor( vResMassO / wAtomicMasses[%N] ))
            for(vN = 0; vN <= vMaxN; vN +=  1)
                vResMassN = vResMassO - vN * wAtomicMasses[%N]
                vMaxC = max(0, floor( vResMassN / wAtomicMasses[%C] ))
                for(vC = 0; vC <= vMaxC; vC +=  1)
                    vResMassC = vResMassN - vC * wAtomicMasses[%C]
                    vH = round(vResMassC / wAtomicMasses[%H])
                    wCount[vRow][%S] = vS
                    wCount[vRow][%O] = vO
                    wCount[vRow][%N] = vN
                    wCount[vRow][%C] = vC
                    wCount[vRow][%H] = vH
                    wResidual[vRow] = vResMassC - vH * wAtomicMasses[%H]
                    vRow += 1
                    // add a load more rows if needed
                    if (vRow >= DimSize(wResidual,0))
                        InsertPoints DimSize(wResidual,0), vRows, wResidual
                        InsertPoints DimSize(wResidual,0), vRows, wCount
                    endif
                endfor
            endfor
        endfor
    endfor
    DeletePoints vRow,DimSize(wResidual,0)-vRow, wCount,wResidual
    Duplicate/O wResidual, wAbsRes, wIndexSort
    wAbsRes[] = abs(wResidual[p])
    MakeIndex wAbsRes, wIndexSort
   
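    // Reorder from copies: assigning a wave to a permutation of itself in place
    // would read rows that have already been overwritten
    Duplicate/O wCount, wCountTmp
    Duplicate/O wResidual, wResidualTmp
    wCount[][] = wCountTmp[wIndexSort[p]][q]
    wResidual[] = wResidualTmp[wIndexSort[p]]
    KillWaves/Z wCountTmp, wResidualTmp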
    Edit wCount.ld
    AppendToTable wResidual
End
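
As a rough check on "not too bad" (a leading-order estimate, not a measured timing): the four nested loops visit about as many (S, O, N, C) combinations as there are lattice points in a four-dimensional simplex, so for a target mass M the number of rows is approximately

```latex
N_{\text{rows}} \approx \frac{M^{4}}{4!\; m_S\, m_O\, m_N\, m_C}
= \frac{1000^{4}}{24 \times 32.066 \times 15.9994 \times 14.00674 \times 12.0107}
\approx 5 \times 10^{5} \qquad (M = 1000)
```

The count scales with the fourth power of the target mass, so halving the mass cuts the work by roughly a factor of 16.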

EDIT: Deleted my nonsense comment about isotopes.

 

 

The problem seems to me to be prone to fail either by finding nonsense local minima (e.g. CH32 instead of CS) or by taking an inordinate amount of time through what amounts to roughly an n^5 search grid (n candidate counts for each of the five elements).

Have you considered generating the set of possible combinations of masses in advance, sorting them by mass, and then doing a simple FindLevel operation?

M     value
1     H
2     H2
12    C
13    CH
14    CH2
...
 

You can always continue to "improve upon" (expand) the M (numeric molar mass) and value (text) waves in a spreadsheet, import that sheet, and work with it in Igor Pro. I imagine this is how most library searches work: not by searching over a space using a random walk with enoise in a multi-parameter function fit, but by searching a manually pre-built library.
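
The pre-built library idea could be sketched in Igor roughly as follows. This is a minimal, untested sketch rather than a definitive implementation: ElementTerm, BuildFormulaLibrary, and LookupFormula are hypothetical names, as are the waves wLibMass and wLibFormula. It reuses MakeCHNOS() from the earlier post and applies no chemical plausibility rules, so the table grows quickly with the mass limit and vMaxMass should be kept small.

```igor
// Hypothetical helper: ("C", 2) -> "C2"; a count of 1 gives the bare symbol, 0 gives ""
Function/S ElementTerm(sSymbol, vCount)
    String sSymbol
    Variable vCount
    if (vCount == 0)
        return ""
    elseif (vCount == 1)
        return sSymbol
    endif
    return sSymbol + num2istr(vCount)
End

// Hypothetical: tabulate every C/H/N/O/S combination up to vMaxMass, sorted by mass
Function BuildFormulaLibrary(vMaxMass)
    Variable vMaxMass
    MakeCHNOS()    // from the earlier post; creates wAtomicMasses
    WAVE wAtomicMasses
    Variable vMS = wAtomicMasses[%S], vMO = wAtomicMasses[%O], vMN = wAtomicMasses[%N]
    Variable vMC = wAtomicMasses[%C], vMH = wAtomicMasses[%H]
    Variable vChunk = 1e4, vRow = 0
    Make/O/D/N=(vChunk) wLibMass
    Make/O/T/N=(vChunk) wLibFormula
    Variable vS, vO, vN, vC, vH, vBase
    for(vS = 0; vS*vMS <= vMaxMass; vS += 1)
        for(vO = 0; vS*vMS + vO*vMO <= vMaxMass; vO += 1)
            for(vN = 0; vS*vMS + vO*vMO + vN*vMN <= vMaxMass; vN += 1)
                for(vC = 0; vS*vMS + vO*vMO + vN*vMN + vC*vMC <= vMaxMass; vC += 1)
                    vBase = vS*vMS + vO*vMO + vN*vMN + vC*vMC
                    for(vH = 0; vBase + vH*vMH <= vMaxMass; vH += 1)
                        if (vS + vO + vN + vC + vH > 0)    // skip the empty formula
                            if (vRow >= numpnts(wLibMass))    // grow both waves together
                                InsertPoints vRow, vChunk, wLibMass, wLibFormula
                            endif
                            wLibMass[vRow] = vBase + vH*vMH
                            wLibFormula[vRow] = ElementTerm("C", vC) + ElementTerm("H", vH) + ElementTerm("N", vN) + ElementTerm("O", vO) + ElementTerm("S", vS)
                            vRow += 1
                        endif
                    endfor
                endfor
            endfor
        endfor
    endfor
    if (numpnts(wLibMass) > vRow)    // trim the unused tail
        DeletePoints vRow, numpnts(wLibMass) - vRow, wLibMass, wLibFormula
    endif
    Sort wLibMass, wLibMass, wLibFormula    // ascending mass, formulas kept in step
End

// Hypothetical: return the library formula whose mass is closest to vMass
Function/S LookupFormula(vMass)
    Variable vMass
    WAVE wLibMass
    WAVE/T wLibFormula
    Variable vIdx = BinarySearch(wLibMass, vMass)
    if (vIdx == -3)        // empty library
        return ""
    elseif (vIdx == -1)    // below the first entry
        vIdx = 0
    elseif (vIdx == -2)    // above the last entry
        vIdx = numpnts(wLibMass) - 1
    elseif (vIdx < numpnts(wLibMass) - 1)
        // BinarySearch returns the lower bracketing point; check the upper neighbour too
        if (abs(wLibMass[vIdx + 1] - vMass) < abs(wLibMass[vIdx] - vMass))
            vIdx += 1
        endif
    endif
    return wLibFormula[vIdx]
End
```

With this sketch, the library would be built once, e.g. BuildFormulaLibrary(100), and each measured mass then looked up with Print LookupFormula(44.01). A spreadsheet-built table imported into the same two waves could be searched with LookupFormula in the same way, provided it is sorted by mass.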

Thank you so much for these great suggestions. I'll try them out and see what works for me. I really appreciate the help!