Linear regression for non-overlapping segments of a wave

To analyse the persistence of share prices I have written a set of functions for the first time in my life. The function appears to work well and as always I am very impressed about the power of Igor Pro. Nevertheless, I need to re-write the last function (Fluctuationmagnitude), so that ultimately the regression analysis can be performed for segments of the wave instead of the entire wave, but I fail to see how this might be achieved.

What I would like to do is calculate the regression line first of all for a segment with the first 32 points, then for the subsequent 32 points, next for the third segment of 32 points, and so on, up to the 30th segment.

Per segment I would also like to calculate the Fluctuation magnitude (see variable Fluctuationmagn) and write the outcomes (a single real number for each of the 30 segments) to a wave, so that another curve fit can be performed. This time with the Fluctuation magnitude as the dependent factor and the segment number as the independent factor.

Any help is highly welcome. Many thanks in advance.

Eduard Kas



#pragma rtGlobals=1		// Use modern global access method.

Function PriceTrend(Startlevel)
	Variable Startlevel  	// Price of share at time = 0
	
	Variable Simulation = CreatePrices(Startlevel) 	// Fuction to simulate random walk of share 
													// prices	
	Variable Returncalc =  CalcReturn(input)		// Function to calculate  natural logs of return
												// per period	
	Variable Demeanedcalc = Demeaned(periodreturn)		// Function to calculate demeaned returns	
	Variable FFactor = Fluctuationmagnitude(squareddiff) 	// Function to calculate local trend
															// and fluctuation magnitude 
End


Function CreatePrices(Startlevel)	// Function to simulate random walk of share prices 
	Variable Startlevel	//Share price at time = 0 in euros	
	
	Make/O/N=961  input	// Create wave for share prices 
	
	Input[0] = Startlevel	// Share price at time = 0	
	Input[1,960] =Input(p-1) + gnoise(0.15)	// Share prices in subsequent period, where price is 
											// price in preceding period plus period return drawn
											// from a Gaussian (noise) distribution 
End


Function CalcReturn(input)	// Function to calculate natural logs of return per period 
	Wave Input	// Read share prices  generated by preceding function 
	String OutputName = NameOfWave(input)+"_logarithm"		// Store the name of wave plus
																// "logartihm" into the new wave 
																// OuputName 

	Duplicate/O input $OutputName	// Duplicate the wave with generated share prices as the 
									// wave named input_logarithm 
	
	Wave output = $OutputName		// Reference input_logarithm so that you can use it 	
	output = Ln(input)	// Calculate logarithmic values of wave 
	
	Make/O/N=960 Periodreturn
	
	Periodreturn[0]= 0 	// Assume return in first period to be zero
	Periodreturn[1,959] = output(p) - output(p-1)	// In next period logartihmic return equals logarithmic
											//  share price of this period minus logarithmic 
											// share price in previous period 
End





Function Demeaned(periodreturn)	// Function to calculate demeaned returns
	Wave periodreturn	// Read return values generated by previous function
	
	WaveStats/Q periodreturn	// Determine statistis of return wave
	Variable avg = V_avg	// Determine average return
	
	Make/O/N=960 Cumulative	// Create wave that will include each period's return above or
								//  below the average for all periods
	Cumulative[0] = Periodreturn[0]-V_avg	// Determine return below or above the average for 
										// the first period
	Cumulative[1,959] = Cumulative(p-1) + Periodreturn(p) - V_avg	// Determine return below or
																// above the average for all 
																// subsequent periods
End


Function Fluctuationmagnitude(cumulative)	// Function to calculate local trend and fluctuation magnitude  
	Wave cumulative	// Read cumulative demeaned returns from previous function 
	
	Make/O/N=960 Number 	// Create a new wave to show the period numbers
	
	Variable i = 0	// Initialise i
	
	do 	

	Number[i] = i +1 	// Add the number 1 to the level in the preceding period, so that the first
						// period carries the number 1  
	
	i += 1
	
	while(i+1<959)
	
	CurveFit line cumulative[0,959] /x=NUMBER /D // Generate a linear regression function in which 
													// the y values (demeaned cumulative 
													// returns are regressed against the x values) 
													// (the period numbers)
													
													
	Wave Fitted = fit_Cumulative	// Read fitted regression line into Wave named 'Fitted'
	
	Wave Squareddiff	// Create wave squared differences 
	
	Squareddiff = (Cumulative - Fitted)^2	// Calculate squared differences between
														//  cumulative demeaned returns and fitted 
														// regression line 
	
	WaveStats Squareddiff	// Dtermine statistics of wave squared differences 
	
	Variable Sumofsquareddiff 	// Create parameter sum of squared differences 
	
	Sumofsquareddiff = V_Sum //Determine sum
	
	Variable Fluctuationmagn 	// Create wave Fluctuation magnitude 
	
	Fluctuationmagn = Sqrt(V_Sum/960)	// Calculate fluctuation magnitude 
End

What I would like to do is calculate the regression line first of all for a segment with the first 32 points, then for the subsequent 32 points, next for the third segment of 32 points, and so on, up to the 30th segment.


You can limit a curve fit to a subset of a wave using square brackets like this:

Make/O testWave = p + gnoise(5)
Display testWave
CurveFit line  testWave[40,80] /D 
ModifyGraph rgb(fit_testWave)=(0,0,65535), lsize(testWave)=3


I recommend changing your Fluctuationmagnitude to take startPoint, endPoint parameters passed in from a new, higher-level function. The new function would create the wave now created in Fluctuationmagnitude. It would then go into a loop calling Fluctuationmagnitude 30 times, once for each segment, passing startPoint and endPoint as parameters.

In the higher level function you can create a 30-point wave to store the outcome from each invocation of Fluctuationmagnitude. Fluctuationmagnitude would return this outcome result.

BTW, this:

	Make/O/N=960 Number 	// Create a new wave to show the period numbers
	Variable i = 0	// Initialise i
	do 	
		Number[i] = i +1 	// Add the number 1 to the level in the preceding period, so that the first
						// period carries the number 1  
	i += 1
	while(i+1<959)


can be written like this:

Make/O/N=960 Number = p+1


For details, execute:

DisplayHelpTopic "Waveform Arithmetic and Assignment"


Also, this:

Wave Fitted = fit_Cumulative	// Read fitted regression line into Wave named 'Fitted'

does not read anything. It creates a reference to fit_Cumulative.

Likewise, this does not create any wave. Rather it creates a reference to an existing wave:

Wave Squareddiff	// Create wave squared differences


A wave reference is a symbol local to a function the references a wave. For details:

DisplayHelpTopic "Wave References"




I'm not sure where you want the result to go, but here's a skeleton of what you need to do, assuming that what you want is to save the slope and intercept:

Function SegmentedLineFit(Ywave, Xwave, seglength)
	Wave Ywave, Xwave
	Variable seglength
	
	Variable nsegments = floor(numpnts(Ywave)/seglength)
	Variable i
	
	Make/O/D/N=(nsegments, 2) fitAandB		// you probably want to make this more general by allowing a customized name
	for (i = 0; i < nsegments; i += 1)
		CurveFit/Q/W=2 line, Ywave[i*seglength, (i+1)*seglength - 1] /X=Xwave
		Wave W_coef
		fitAandB[i][] = W_coef[q]
	endfor
end

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
The advice of both of you has been really helpful. On that basis I created the functions below.

#pragma rtGlobals=1		// Use modern global access method.

Function HurstCalculation(cumulative, Xwave, length, segmlength,nsegments)	// Create Hurst Calculation function 	
	Wave cumulative, Xwave		// Create waves 
	Variable segmlength, length, nsegments	// Create variables 
	Variable FittedLine = SegmentedLineFit(cumulative,Xwave, length, segmlength, nsegments) 	
End


Function SegmentedLineFit (cumulative,Xwave, length, segmlength, nsegments) 	// Function to calculate segmented Curve Fit 
	Wave cumulative, Xwave		// Create parameters cumulative (returns) and Xwave (= time scale)
	Variable segmlength, length, nsegments	// Create paramaters segmlength (= length of segment), length (= length of time 
											// series for price variable) and nsegments (= number of segments)

	Variable i 	// Create local variable i 
	
	Make/O/N=(length-1) LogWave = Ln(XWave)	// Transform time scale into natural logarithm 
	
	Variable Summedsquaredresids	// Create local variable sum of squared residuals 
	
	Make/O/N=(length-1) Squaredresids	// Create wave to which residuals for each of the successive curve fits are written down to 
	Make/O/N=(1, 1) Fluctuation	// Create wave to which square root of average residuals is written down to 
		for (i=0; i < nsegments; i += 1)	// Repeat cuve fit calculation for each equal size segment 
			CurveFit/W=2/Q line Cumulative[i*segmlength, (i+1)*segmlength-1] /X=LogWave /R	// Carry out curve 
				Wave Res_Cumulative	// Determine residuals for each curve 
				Squaredresids = Res_Cumulative^2	// Determe squared residuals for each curve fit 
		endfor
		WaveStats/Q Squaredresids 		// Determine statistics for the entire wave Squaredresids	
		Summedsquaredresids = V_avg	// Determine average value of wave
		Fluctuation = Sqrt(V_avg) 	// Take square root of the average value 
End



Eduard 

To play with the function I would like to change the Xwave depending on the length of the segment. In case of a segment length of 16, I would like to
set the first 16 points of the Xwave equal to 1 up to and including 16 and to repeat these values 60 times for an entire series length of 960 in terms of
return values (and 961 in terms of price values). (So far I created the Function SeriesCalc separately in order to test it. Later on, I would like to integrate
it with the other functions and determine the number of segments, depending on the length of a segment.)

The Function SeriesCalc does not behave like expected. If I call it by SeriesCalc(961, 16) for each of the 960 points of Xwave the value of -928 is created.
My question: what I am doing wrong.

Any help is highly welcome. Many thanks in advance.

Eduard


#pragma rtGlobals=1		// Use modern global access method.

Function SeriesCalc(serieslength, segmlength)	// Function to create variable lengths of segments 
	Variable serieslength, segmlength	// Create parameters 
	
	Make/O/N=(serieslength -1) Xwave	// Create X (time) variable equal to series length (for price series)
										// minus 1
	
	Variable nsegments = floor(serieslength-1)/segmlength	// Parameter nsemgents must be an integer
	
	Variable i, j 	// Create local variables 
	
	i=0 	
	do 
		j=0	
		do 
			Xwave = j+1 - (i*segmlength)	// Set values of Xwave equal to 1 up to latest segment value 
											// that is the value of 16 if segment length is equal to 16 
			j += 1 
		while (j < segmlength)
		i += 1
	while (i < nsegments)	// Repeat Xwave values for each of the n segments 
End


I'm not sure that I follow what you are trying to do but I suspect that this statement is wrong:

Xwave = j+1 - (i*segmlength)

This sets every point of Xwave every time through the loop. Perhaps you mean something like:

Xwave[i*segmlength+j] = j+1


This is equivalent to a single statement:

Make /O /N=(960) xWave = mod(p,16) +1


For details on this kind of statement:

DisplayHelpTopic "Waveform Arithmetic and Assignment"


You will probably have to read this section multiple times and do some experimentation for it to sink in.

Thank you for your suggestions. This delivers exactly what I wished to do. It is indeed a good idea to reread the text about wave references a variety of times to let it sink in and I have started to do just that.

Eduard