Calculating Percentiles of 1 wave grouped based on another wave

Hello!

I am using Igor 9.02 and am not great at programming so thought I would ask here for advice.

I have a list of points that have an associated age (modeled_cr) and an associated unit type (Unit). I am trying to calculate the 10th percentile of modeled_cr values based on the Unit. I tried to use the percentiles and box plot function but I can only get percentiles based on the entire list, not divided up by unit. Is it possible to separate them out? I am attaching a subset of my data as a text file.

Thanks in advance for any help! 

Rachael

UnitAgeCondensed.txt

You can use the Extract operation to create individual waves with a given string from the Unit wave. For instance, I created a wave of the values corresponding to "A29" like this:

Extract/INDX Unit, Unit_A29, CmpStr(Unit, "A29")==0
Make/N=4/D modeled_cr_A29 = modeled_cr[Unit_A29[p]]      // Unit_A29 has four points
print modeled_cr_A29
  modeled_cr_A29[0]= {4.8787,4.71625,3.97224,4.19104}

This could be encapsulated into a user-defined function to make it easier to use.

I think this will do what you want. I've basically put John's code into a loop.

function extractUnitData(Wave/T unit, Wave modeled_cr)

    // Remove duplicates to get a unique list of units
    FindDuplicates/FREE/RT=unitList unit
   
    // Extract the data from modeled_cr that corresponds to each unit
    // into its own wave
    for (string theUnit : unitList)
        Extract/INDX/FREE unit, unitLocations, CmpStr(unit, theUnit) == 0

        string outputName = "modeled_cr_" + theUnit
        Make/N=(DimSize(unitLocations,0))/O/D $outputName/Wave = extractedWave
        extractedWave = modeled_cr[unitLocations[p]]
    endfor
end

 

Make sure you're providing the function with valid waves. To be safe, provide the full path to each wave. Running this from the command line might look like: 

extractUnitData(root:Unit, root:modeled_cr)

where root:Unit is a text wave with the units, and root:modeled_cr is a numeric wave. This is assuming you loaded each column of your text file into a separate wave, of course.

This function:

Function quantiles(String wlist)
    Variable i
    Variable nwaves = ItemsInList(wlist)
    Make/O/N=(nwaves)/T tenthWaves          // names of the input waves
    Make/O/N=(nwaves)/D tenthPercentiles    // tenth percentile of each input wave
    for (i = 0; i < nwaves; i++)
        String wname = StringFromList(i, wlist)
        Wave w = $wname
        Duplicate/FREE w, sortedw
        Sort sortedw, sortedw
        tenthWaves[i] = wname
        Variable tenthIndex = (numpnts(w)-1)/10
        tenthPercentiles[i] = sortedw[tenthIndex]
    endfor
end

will make two waves: a text wave with the names of the waves contained in wlist, and a numeric wave with the tenth percentile value from each of the waves in the list. The tenth percentile is determined in an unsophisticated way: it simply gives you the interpolated value at the position one tenth of the way from the smallest to the largest value in the sorted values.
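To make that interpolation step explicit, here is a minimal sketch that computes the same quantity by hand. It assumes the input wave has at least one point and contains no NaNs. The function name tenthPercentileExplicit and its local variable names are hypothetical and are not part of the code above.

```igor
// Hypothetical helper: spells out the linear interpolation between the two
// sorted points that bracket the "one tenth of the way" position.
Function tenthPercentileExplicit(Wave w)
    Duplicate/FREE w, sortedw
    Sort sortedw, sortedw

    Variable lastIndex = numpnts(sortedw) - 1
    Variable fracIndex = lastIndex / 10          // e.g. 0.3 for a 4-point wave
    Variable iLow = floor(fracIndex)             // sorted point just below the position
    Variable iHigh = min(iLow + 1, lastIndex)    // sorted point just above, clamped to the end
    Variable weight = fracIndex - iLow           // how far between the two points

    return sortedw[iLow] + weight * (sortedw[iHigh] - sortedw[iLow])
End
```

For the four "A29" values shown earlier, the sorted values start at 3.97224 and 4.19104 and the position is 0.3, so the result is 3.97224 + 0.3 * (4.19104 - 3.97224), or roughly 4.038. Other percentile definitions, such as those used by statistics packages, can give a different number for small samples.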

I used Ben's function to extract the individual waves, then invoked my function like this:

quantiles(WaveList("modeled_cr_*", ";", ""))

You may wish to think whether this is truly what you want :)

Building off the code you all provided, I tried to write additional code that uses the Percentile function to loop through each new unique wave name and calculate the 10th percentile. I am getting an error for this line:

String uniqueWaveList = WaveList("modeled_cr_*") 

that says "expected comma", and if I add a comma it says "expected string variable or string function." Any idea what's going on?

Thanks!

function extractUnitData(Wave/T unit, Wave modeled_cr)

    // Remove duplicates to get a unique list of units
    FindDuplicates/FREE/RT=unitList unit
   
    // Extract the data from modeled_cr that corresponds to each unit
    // into its own wave
    for (string theUnit : unitList)
        Extract/INDX/FREE unit, unitLocations, CmpStr(unit, theUnit) == 0

        string outputName = "modeled_cr_" + theUnit
        Make/N=(DimSize(unitLocations,0))/O/D $outputName/Wave = unitWave
        unitWave = modeled_cr[unitLocations[p]]
    endfor
   
    // Calculate the 10th percentile for each unique wave
    string uniqueWaveList = WaveList("modeled_cr_*")
   
    for (string waveName : uniqueWaveList)
        Wave thisWave = waveName
        Variable percentile = Percentile(thisWave, 10)
        Print "10th percentile of " + waveName + " is: " + num2str(percentile)
    endfor
end

 


 
DisplayHelpTopic "WaveList"

Running that from the command line will bring up the documentation for the WaveList function. Note that it expects 3 parameters: the matchStr, a separatorStr, and an optionStr. Your code is missing the last two parameters.

For future reference: in the documentation, any parameter enclosed in square brackets is optional; everything else is required.

Additionally, there is another mistake in your code and a detail you might want to reconsider:

for (string waveName : uniqueWaveList)
   Wave thisWave = waveName

First, you might notice that waveName is colored in orange. This is because 'WaveName' is the name of a built-in function. While such code works in principle, I cannot recommend reusing the names of built-in functions or operations as string or variable names, as this may lead to confusion and possibly bugs down the road.

The second line will not work. You cannot just equate a string and a wave reference in this way. You would need to use the '$' reference operator. Read more about this here:

DisplayHelpTopic "Converting a String into a Reference Using $"
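As a rough sketch of how the second loop could look with these points addressed: WaveList gets all three parameters, each name is pulled out of the semicolon-separated list with StringFromList, and '$' turns the name into a wave reference. The function name printTenthPercentiles and the variable wName are hypothetical. The percentile itself is computed with the simple sorted-index approach from the quantiles function above rather than a dedicated percentile routine, and the waves are assumed to be in the current data folder.

```igor
// Sketch only: printTenthPercentiles and wName are made-up names.
Function printTenthPercentiles()
    // WaveList needs all three parameters: match string, separator, options
    String list = WaveList("modeled_cr_*", ";", "")
    Variable n = ItemsInList(list)
    Variable i

    for (i = 0; i < n; i += 1)
        String wName = StringFromList(i, list)   // one name from the ";"-separated list
        Wave w = $wName                          // $ converts the string into a wave reference

        // Same unsophisticated tenth percentile as in the quantiles function
        Duplicate/FREE w, sortedw
        Sort sortedw, sortedw
        Variable tenth = sortedw[(numpnts(sortedw) - 1) / 10]

        Print "10th percentile of " + wName + " is: " + num2str(tenth)
    endfor
End
```

If a text wave of names is more convenient than a string list, ListToTextWave(list, ";") should convert one into the other.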

 


Thanks for the info, I just had this realization this morning ha

My current issue is that uniqueWaveList ends up as a single string. Is there an easy way to split this string up, or is there an alternative function that does it for me?

Function extractUnitData(Wave/T unit, Wave modeled_cr)
    // Remove duplicates to get a unique list of units
    FindDuplicates/FREE/RT=unitList unit
   
    // Extract the data from modeled_cr that corresponds to each unit
    // into its own wave
    for (string theUnit : unitList)
        Extract/INDX/FREE unit, unitLocations, CmpStr(unit, theUnit) == 0

        string outputName = "modeled_cr_" + theUnit
        Make/N=(DimSize(unitLocations,0))/O/D $outputName/Wave = unitWave
        unitWave = modeled_cr[unitLocations[p]]
    endfor
   
    // Calculate the 10th percentile for each unique wave
    Make/T uniqueWaveList
    uniqueWaveList = WaveList("modeled_cr_*", ",", "")
    print uniqueWaveList