Code for Standard Deviation, Averages and r statistic

Hello,
I have completed numerous experiments for analysis of H2O (water vapour mixing ratio) and D_DH (deuterium). For these experiments (often lasting in excess of 8 hours with 5 sec data points), I have performed variable temperature increases which are sometimes 40 mins, sometimes 60 mins, sometimes longer. These temperature increases cause differences in the data which I’m testing for to understand the effect of temperature on H2O and D_DH.

Currently I’m using WaveStats to generate the average and standard deviation over each temperature increase as well as the r statistic (StatsLinearCorrelationTest/T=1/Q H2O,D_DH). However, currently I’m doing this process manually, which is not only very time consuming, but can be prone to errors during data entry.

I’m wondering if anyone has an Idea to formulate a code with the input of a data point range which captures each temperature range corresponding to a time I have recorded for each temp setting (i.e. data points 122 – 350) that returns the standard deviation, average vale and ‘r’ statistic over the designated range of points. As well as this I would need to repeat this process multiple times for each temperature range in an experiment.

Any help would be greatly appreciated!
mean(dataWave, 122, 350) and sqrt(variance(dataWave, 122, 350)) will give you the mean and s.d. of a wave for 122 < x < 350. Here x is the scaled x value. If you use default wave scaling then x is identical to the point numbers, i.e. deltax(dataWave) == 1 and leftx(dataWave) == 0.
Yeah, I'd say this type of problem crops up so frequently, the solution probably gets reinvented every time somebody new learns how to program an igor function.

Here's my version that I end up using all too often. You'll need to modify this to meet your own needs (add a second raw data wave, add in the code to do StatsLinearCorrelationTest on subsets of the raw data, do anything else...?). I don't know if you started down the road of learning to write a function yet, but if not this will get you started:


function processstats(raw,startpts,endpts)
	wave raw,startpts,endpts
	variable idx
	make /o /d /n = (numpnts(startpts)) avgs, sdevs
	for (idx = 0; idx < numpnts(startpts); idx += 1)
		wavestats /Q /R=[startpts[idx],endpts[idx]] raw
		avgs[idx] = V_avg
		sdevs[idx] = V_sdev
	endfor
	return 0
End
Here's a function that will do the job of iterating.

You need to set it up by creating two waves expStart and expStop which represent the start (non-inclusive) and stop (inclusive) times of each desired averaging period. e.g., if you did an experiment from 11:00 to 12:00 today:


expStart = {date2secs(2013,07,24) + 11*3600}
expStop = {date2secs(2013,07,24) + 12*3600}


will need to be done before running
averageExperiments(data, dataTimes, expStart, expStop)
.

I'll leave the addition of the r-values code to you...

Note that using Extract means your times don't have to be ordered, but it makes the function way slower on large waves. The second helper function isn't really necessary but I had used it in my own function so I just threw it in here.


#pragma rtGlobals=3
#pragma IgorVersion=6.2
#pragma version=20130725

function averageExperiments(data, dataTimes, expStart, expStop)
 // Given two waves 'expStart' and 'expStop' that represent expt measurement 
 //	periods, averages wave 'data' for each time range in expStart/expStop.
 //
 // INPUT:
		// wave expStart  = experiment start times (type "dat", in seconds)
		// wave expStop   = experiment stop times
		// wave data	  = data to average
		// wave dataTimes = timestamps for wave data
 // OUTPUT: 
		// wave data+"_ExpAvg" = average of data for each experiment 
		//			 [one experiment is the time from expStart[i] to expStop[i], for 0 < i < numpnts(expStart)]
		// wave data+"_ExpAvgSD" = standard error of each average

	wave expStart
	wave expStop
	wave data
	wave dataTimes
	
	variable nExp = numpnts(expStart)
	variable i
	wave dtemp
	
	// sanity checks
	if (!(WaveExists(expStart) && WaveExists(expStop)))
			print "ERROR in averageExperiments(): expStop and expStart do not both exist."
			return -1
	endif
	
	for (i=0; i<nExp; i+=1)
		// throw an error if expStop[i]<expStart[i]
		if (expStop[i] < expStart[i])
			printf "ERROR in averageExperiments(): expStop times cannot precede expStart times."
			printf "Problem at: %s -- %s", secs2dt(expStart[i]), secs2dt(expStop[i])
			return -1
		endif
		// throw a warning if experiments overlap
		if (i>0) 
			if (expStart[i] < expStop[i-1]) // if next exp starts before previous ends
				printf "// WARNING: averageExperiments() detected overlapping experiments: "
				printf "(%s -- %s ", secs2dt(expStart[i]), secs2dt(expStop[i]) 
				printf "overlaps with the next expt at %s). ", secs2dt(expStop[i-1])
				printf "Continuing...\r"
			endif
		endif
	endfor
	
	// prepare waves for saving average and standard error of mean (sem)
	make /o/d/n=(numpnts(expStart)) $(nameOfWave(data)+"_ExpAvg")  /WAVE=expAvg
	make /o/d/n=(numpnts(expStart)) $(nameOfWave(data)+"_ExpAvgSD") /WAVE=expAvgSD
	expAvg  = nan
	expAvgSD = nan
	
	// Extract any points in 'dataTimes' that fall within expStart and expStop,
	// average 'data' at these points into the new waves labelled _expAvg and _expAvgSD
	for (i=0; i<nExp; i+=1)
		// select only times between expStart[i] and expStop[i]
		Extract /FREE/O data, dtemp, dataTimes>expStart[i] && dataTimes<=expStop[i]
		if (numpnts(dtemp)>0)
			WaveStats/Q dtemp 													
			// WaveStats sets the variables V_sdev and V_avg
		else
			V_avg = nan
			V_sdev = nan
		endif
		expAvg[i]   = V_avg
		expAvgSD[i] = V_sdev
	endfor
end

static function/s secs2dt(secs)
// Given a datetime in seconds, returns an ISO-8601--formatted datetime.
// 		e.g. secs2dt(date2secs(2012,1,31)) returns "2012-01-31 00:00:00"
	variable secs
	if (secs)
		return secs2date(secs,-2) + " " + secs2time(secs,3)
	endif
	return ""
end