Running combinations (e.g. assigning chemical formulas to molar masses)
Mon, 05/16/2022 - 03:09 am
I have a list of molecular masses (accurate to four decimal places) to which I need to assign chemical formulas. Let's call this list A. I want to write a program that can do this. Basically, I will have a list of the elements C, H, N, O and S with their molar masses (list B), and my program needs to pick the combination of elements that gets closest to each molecular mass in list A.
I am wondering what the best way to do this is without a bunch of nested for loops going in circles to narrow down on the mass. Fundamentally, it is just about trying out different combinations of elements to get closest to the measured mass. Is there a library in IGOR that can perform this kind of task? Or a simpler technique?
As an idea, I am wondering whether something like x*C + y*H + z*N + z1*O + z2*S = mass could do this, where C, H, N, O and S stand for the molar masses and the atom counts x, y, z, z1 and z2 are each drawn from the random number generator abs(enoise).
Thanks a ton,
I don't know of a cunning function to do this, but if it is just a few calculations then running loops is not too bad.
For example, run something like the following, passing the target molecular mass:
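function FindFormulas(vMr) // function name is illustrative
	variable vMr // target molecular mass

	Make/O/D/N=5 wAtomicMasses // atomic masses, indexed by dimension label
	SetDimLabel 0, 0, S, wAtomicMasses
	SetDimLabel 0, 1, O, wAtomicMasses
	SetDimLabel 0, 2, N, wAtomicMasses
	SetDimLabel 0, 3, C, wAtomicMasses
	SetDimLabel 0, 4, H, wAtomicMasses
	wAtomicMasses[%S] = 32.066
	wAtomicMasses[%O] = 15.9994
	wAtomicMasses[%N] = 14.00674
	wAtomicMasses[%C] = 12.0107
	wAtomicMasses[%H] = 1.00794

	variable vMaxS, vMaxO, vMaxN, vMaxC
	variable vTol = 0.01 // small allowance so an exact match on a heavy atom is not skipped
	variable vRows = 1e5 // some large number
	Make/O/W/U/N=(vRows, 5) wCount // one row per candidate, one column per element
	SetDimLabel 1, 0, S, wCount
	SetDimLabel 1, 1, O, wCount
	SetDimLabel 1, 2, N, wCount
	SetDimLabel 1, 3, C, wCount
	SetDimLabel 1, 4, H, wCount
	Make/O/D/N=(vRows) wResidual
	wResidual = NaN
	variable vRow = 0
	variable vResMassS, vResMassO, vResMassN, vResMassC
	variable vS, vO, vN, vC, vH

	vMaxS = floor( (vMr + vTol) / wAtomicMasses[%S] )
	for(vS = 0; vS <= vMaxS; vS += 1)
		vResMassS = vMr - vS * wAtomicMasses[%S]
		vMaxO = floor( (vResMassS + vTol) / wAtomicMasses[%O] )
		for(vO = 0; vO <= vMaxO; vO += 1)
			vResMassO = vResMassS - vO * wAtomicMasses[%O]
			vMaxN = floor( (vResMassO + vTol) / wAtomicMasses[%N] )
			for(vN = 0; vN <= vMaxN; vN += 1)
				vResMassN = vResMassO - vN * wAtomicMasses[%N]
				vMaxC = floor( (vResMassN + vTol) / wAtomicMasses[%C] )
				for(vC = 0; vC <= vMaxC; vC += 1)
					vResMassC = vResMassN - vC * wAtomicMasses[%C]
					vH = round(vResMassC / wAtomicMasses[%H]) // fill the remainder with hydrogen
					wCount[vRow][%S] = vS
					wCount[vRow][%O] = vO
					wCount[vRow][%N] = vN
					wCount[vRow][%C] = vC
					wCount[vRow][%H] = vH
					wResidual[vRow] = vResMassC - vH * wAtomicMasses[%H]
					vRow += 1
					// add a load more rows if needed
					if (vRow >= DimSize(wResidual,0))
						InsertPoints vRow, vRows, wCount, wResidual
					endif
				endfor
			endfor
		endfor
	endfor

	DeletePoints vRow, DimSize(wResidual,0)-vRow, wCount, wResidual // trim unused rows
	// sort the candidates by absolute residual, best match first
	Duplicate/O wResidual, wAbsRes, wIndexSort
	wAbsRes = abs(wResidual[p])
	MakeIndex wAbsRes, wIndexSort
	// read from copies: assigning a wave from itself would overwrite rows before they are read
	Duplicate/FREE wCount, wCountCopy
	Duplicate/FREE wResidual, wResCopy
	wCount = wCountCopy[wIndexSort[p]][q]
	wResidual = wResCopy[wIndexSort[p]]
end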
EDIT: Deleted my nonsense comment about isotopes.
May 16, 2022 at 08:35 am
The problem seems to me to be prone to fail, either by finding nonsense local minima (e.g. CH32 instead of CS) or by taking an inordinate amount of time working through what amounts to about a 5^N search grid.
Have you considered generating the sets of possible combinations of masses in advance, sorting them by mass, and then doing a simple FindLevel operation?
You can always continue to "improve upon" (expand) the M (numeric molar mass) and value (text) waves in a spreadsheet, import that sheet, and work with it in Igor Pro. I imagine this is how most library searches work: not by searching over a space using a random walk with enoise in a multi-parameter function fit, but by searching a manually pre-built library.
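To make the pre-built library idea concrete, here is a minimal sketch. It is written in Python rather than Igor purely as a language-neutral illustration; in Igor the same steps would be a numeric mass wave, a text formula wave, a sort, and a lookup. The function names `build_library` and `closest_formula`, the `max_counts` limits, and the hydrogen cap (H <= 2C + N + 2, which assumes neutral molecules with ordinary valences) are my own assumptions and not something from the posts above. The atomic masses are the ones used in the loop example earlier in the thread.

```python
from bisect import bisect_left
from itertools import product

# Average atomic masses, copied from the wAtomicMasses values earlier in the thread.
ATOMIC_MASS = {"C": 12.0107, "H": 1.00794, "N": 14.00674, "O": 15.9994, "S": 32.066}


def build_library(max_counts):
    """Return (masses, formulas), sorted by mass, for every combination
    of atom counts from 0 up to the per-element limits in max_counts."""
    elements = list(max_counts)
    entries = []
    for counts in product(*(range(max_counts[e] + 1) for e in elements)):
        if not any(counts):
            continue  # skip the empty formula
        n = dict(zip(elements, counts))
        # Assumed plausibility filter (not from the thread): at most
        # 2C + N + 2 hydrogens, which rules out entries such as CH32.
        if n.get("H", 0) > 2 * n.get("C", 0) + n.get("N", 0) + 2:
            continue
        mass = sum(ATOMIC_MASS[e] * k for e, k in n.items())
        formula = "".join(f"{e}{k if k > 1 else ''}" for e, k in n.items() if k)
        entries.append((mass, formula))
    entries.sort()  # sort once by mass; every later lookup is a binary search
    masses = [m for m, _ in entries]
    formulas = [f for _, f in entries]
    return masses, formulas


def closest_formula(target, masses, formulas):
    """Return (formula, library mass minus target) for the nearest library entry."""
    i = bisect_left(masses, target)
    # The nearest entry is either just below or just above the insertion point.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(masses)]
    best = min(candidates, key=lambda j: abs(masses[j] - target))
    return formulas[best], masses[best] - target


# Example limits, chosen arbitrarily for the illustration.
masses, formulas = build_library({"C": 3, "H": 8, "N": 2, "O": 3, "S": 1})
print(closest_formula(46.0684, masses, formulas))
```

The library is built and sorted once, so each measured mass in list A costs only a binary search. In practice you would probably also want to return every entry within the measurement tolerance rather than only the single nearest one, since several formulas can fall within a few thousandths of a mass unit of each other.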
May 16, 2022 at 10:18 am
Thank you so much for these great suggestions. I'll try them out and see what works for me. Really appreciate the help!
May 19, 2022 at 04:01 pm