Running combinations (e.g. assigning chemical formulas to molar masses)


I have a list of molecular masses (accurate to four decimal places) to which I need to assign chemical formulas. Let's call this list A. I want to write a program that can do this. Basically, I will have a list of the elements C, H, N, O, and S with their molar masses (list B), and my program needs to pick the combination of elements that gets closest to each molecular mass in list A.

I am wondering what the best way to do this is without a bunch of nested for loops going in circles to narrow down on the mass. Fundamentally, it is just a matter of trying different combinations of elements to get closest to the measured mass. Is there a library in Igor that can perform this kind of task? Or a simpler technique?

As an idea, I am wondering whether something like C(x) + H(y) + N(z) + O(z1) + S(z2) = mass would work, where x, y, z, z1, and z2 are each drawn from a random number generator such as abs(enoise).

Thanks a ton, 


I don't know of a cunning function to do this, but if it is just a few calculations then running loops is not too bad.

For example, run MakeCHNOS() once to create the atomic-mass wave, then call FindBestCombo(vMr) with your target mass:

Function MakeCHNOS()
    // Create a labelled wave of average atomic masses, heaviest element first
    Make/D/O/N=5 wAtomicMasses
    SetDimLabel 0,0,S,wAtomicMasses
    SetDimLabel 0,1,O,wAtomicMasses
    SetDimLabel 0,2,N,wAtomicMasses
    SetDimLabel 0,3,C,wAtomicMasses
    SetDimLabel 0,4,H,wAtomicMasses
    wAtomicMasses[%S] = 32.066
    wAtomicMasses[%O] = 15.9994
    wAtomicMasses[%N] = 14.00674
    wAtomicMasses[%C] = 12.0107
    wAtomicMasses[%H] = 1.00794
End

Function FindBestCombo(vMr)
    variable vMr // target Molecular mass
    wave/D wAtomicMasses // created by MakeCHNOS() in the current data folder
    variable vMaxS, vMaxO, vMaxN, vMaxC
    variable vRows =1e5 // some large number
    Make/O/W/U/N=(vRows, 5) wCount
    SetDimLabel 1,0,S,wCount
    SetDimLabel 1,1,O,wCount
    SetDimLabel 1,2,N,wCount
    SetDimLabel 1,3,C,wCount
    SetDimLabel 1,4,H,wCount
    Make/O/D/N=(vRows) wResidual
    wResidual = NaN
    variable vRow = 0
    variable vResMassS, vResMassO, vResMassN, vResMassC
    variable vS, vO, vN, vC, vH
    // Loop over the heavy atoms; each loop only tries counts whose mass
    // stays below what is left of the target at that level
    vMaxS = ceil(vMr / wAtomicMasses[%S])
    for(vS = 0; vS < vMaxS; vS +=  1)
        vResMassS = vMr - vS * wAtomicMasses[%S]
        vMaxO = ceil( vResMassS / wAtomicMasses[%O] )
        for(vO = 0; vO < vMaxO; vO +=  1)
            vResMassO = vResMassS - vO * wAtomicMasses[%O]
            vMaxN = ceil( vResMassO / wAtomicMasses[%N] )
            for(vN = 0; vN < vMaxN; vN +=  1)
                vResMassN = vResMassO - vN * wAtomicMasses[%N]
                vMaxC = ceil( vResMassN / wAtomicMasses[%C] )
                for(vC = 0; vC < vMaxC; vC +=  1)
                    vResMassC = vResMassN - vC * wAtomicMasses[%C]
                    // fill the remaining mass with the nearest whole number of H
                    vH = round(vResMassC / wAtomicMasses[%H])
                    wCount[vRow][%S] = vS
                    wCount[vRow][%O] = vO
                    wCount[vRow][%N] = vN
                    wCount[vRow][%C] = vC
                    wCount[vRow][%H] = vH
                    wResidual[vRow] = vResMassC - vH * wAtomicMasses[%H]
                    vRow += 1
                    // add a load more rows if needed
                    if (vRow >= DimSize(wResidual,0))
                        InsertPoints vRow, vRows, wResidual, wCount
                    endif
                endfor
            endfor
        endfor
    endfor
    // trim the unused rows
    DeletePoints vRow,DimSize(wResidual,0)-vRow, wCount,wResidual
    // sort both waves by absolute residual, best match first
    Duplicate/O wResidual, wAbsRes, wIndexSort
    wAbsRes[] = abs(wResidual[p])
    MakeIndex wAbsRes, wIndexSort
    // reorder from copies: assigning a wave from itself in place would
    // read rows that have already been overwritten
    Duplicate/FREE wCount, wCountUnsorted
    Duplicate/FREE wResidual, wResidualUnsorted
    wCount[][] = wCountUnsorted[wIndexSort[p]][q]
    wResidual[] = wResidualUnsorted[wIndexSort[p]]
    Edit wCount.ld
    AppendToTable wResidual
End

EDIT: Deleted my nonsense comment about isotopes.



The problem seems to me prone to failure, either by finding nonsense local minima (e.g. CH32 instead of CS) or by taking an inordinate amount of time on a search grid that grows roughly as N^5 for N candidate counts of each of the five elements.

Have you considered generating the set of possible combinations in advance, sorting them by mass, and then doing a simple find-level operation on the sorted list?

M     value
1     H
2     H2
12    C
13    CH
14    CH2

You can always continue to expand the M (numeric molar mass) and value (text) waves in a spreadsheet, import that sheet, and work with it in Igor Pro. I imagine this is how most library searches work: not by a random walk with enoise over a multi-parameter function, but by looking up values in a pre-built library.
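
As a rough illustration of the pre-built library idea, here is a minimal sketch in Python rather than Igor, purely to show the logic. The function names (build_library, nearest_formula) and the per-element count limits are made up for this example; the atomic masses are the ones from MakeCHNOS() above. It builds every combination once, sorts by mass, and then finds the nearest entry with a binary search instead of looping for each target.

```python
from bisect import bisect_left
from itertools import product

# Average atomic masses, copied from MakeCHNOS() earlier in the thread.
ATOMIC_MASSES = {"C": 12.0107, "H": 1.00794, "N": 14.00674,
                 "O": 15.9994, "S": 32.066}


def build_library(max_counts):
    """Return (masses, formulas), sorted by mass, for every combination of
    element counts from 0 up to the per-element limit in max_counts.

    max_counts is a hypothetical input such as {"C": 4, "H": 10, ...}.
    """
    elements = list(max_counts)
    entries = []
    for counts in product(*(range(max_counts[e] + 1) for e in elements)):
        if not any(counts):
            continue  # skip the empty formula
        mass = sum(n * ATOMIC_MASSES[e] for e, n in zip(elements, counts))
        formula = "".join(e + (str(n) if n > 1 else "")
                          for e, n in zip(elements, counts) if n)
        entries.append((mass, formula))
    entries.sort()  # sort once, by mass
    masses = [m for m, _ in entries]
    formulas = [f for _, f in entries]
    return masses, formulas


def nearest_formula(masses, formulas, target):
    """Binary-search the sorted library and return the closest (formula, mass)."""
    i = bisect_left(masses, target)
    # The closest entry is either just below or just above the insertion point.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(masses)]
    best = min(candidates, key=lambda j: abs(masses[j] - target))
    return formulas[best], masses[best]


# Illustrative limits only; a real library would need chemically sensible bounds.
masses, formulas = build_library({"C": 4, "H": 10, "N": 2, "O": 2, "S": 1})
print(nearest_formula(masses, formulas, 16.04))
```

The library only has to be built once, so each lookup for a mass in list A is a single binary search. The catch is that the library is only as good as the limits used to build it, and nothing here filters out chemically impossible formulas.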

Thank you so much for these great suggestions. I'll try them out and see what works for me. I really appreciate the help!