Scoped SetDataFolder calls

I like to use data folders to store waves and variables that are internal to my routines, that is, I don't want them to clutter the root data folder. The approach sanctioned by WaveMetrics seems to be that I should put that folder in root:Packages:, which is fine with me.

That means that I often find myself doing the following:

DFREF savDF = GetDataFolderDFR()
SetDataFolder root:Packages:myFolder

ImageTransform /P=0 getPlane, someWave // just an example
// ImageTransform doesn't allow me to tell where the wave is to be made,
// so it forces me to use explicit SetDataFolder calls
if (weShouldStopHere)
    SetDataFolder savDF // reset the active data folder to whatever it was so the user doesn't get confused
    return 0
endif

ImageTransform /P=1 getPlane, someWave // make some more waves

SetDataFolder savDF // also reset the active folder here
return 0


99.999% of the time I find SetDataFolder and friends to be obnoxious, and feel that they just clutter my code. Imagine if there were a way to have a scoped data folder call:

SetScopedDataFolder root:Packages:myFolder
// SetScopedDataFolder promises that the active data folder will be reset
// to whatever it was before it was called
// when we leave the current scope

ImageTransform /P=0 getPlane, someWave // just an example

if (weShouldStopHere)
    return 0
endif

ImageTransform /P=1 getPlane, someWave // make some more waves

return 0


How's that for boilerplate avoidance? The functionality is the same, the code is clearer, and more robust.

In a lot of cases I can attempt to make sure that the active data folder is reset by covering every exit path with a SetDataFolder call. But occasionally it's impossible to do so if an operation throws an error. Consider the following:

DFREF savDF = GetDataFolderDFR()
SetDataFolder root:Packages:myFolder

SomeOperationThatThrowsAnError
SetDataFolder savDF // I'm screwed, control never reaches here

return 0


There's no straightforward way to reset the active data folder before we crash! It's possible if I get really kludgy and start using GetRTError and the like. But honestly, that's just a shame.

So, please, WaveMetrics, provide SetScopedDataFolder for us to play with. I'm tired of sprinkling SetDataFolder calls all over my code just because I don't like loose ends. And even then I can't catch everything by the very nature of Igor's design.
In Igor Pro 6.10 or later you can use data folder references which make SetDataFolder calls mostly unnecessary.

I find it useful to create a routine that creates my private data folder and another that returns a reference to it, like this:


Function CreatePrivateDataFolder()
	NewDataFolder/O root:Packages
	NewDataFolder/O root:Packages:MyPackage
End

Function/DF GetPrivateDFR()
	DFREF dfr = root:Packages:MyPackage
	return dfr
End

Function CreateMyPrivateData()
	DFREF dfr = GetPrivateDFR()
	Make/O dfr:myWave = 0
	Variable/G dfr:myVariable = 0
End

Function UseMyPrivateData()
	DFREF dfr = GetPrivateDFR()
	Wave myWave = dfr:myWave
	NVAR myVariable = dfr:myVariable
End


For details, execute this:

DisplayHelpTopic "Data Folder References"


Hello Howard,

I assume that you're referring to a consistent use of dfref:something, e.g. in Make and WAVE calls. I already try to do this as much as possible, even going to the extent of using DataFolderAndName parameters in my XOPs.

But it's still impossible to avoid the use of SetDataFolder entirely. Some examples that come to mind are ImageTransform, MatrixLLS, and probably MatrixOP, as well as a significant number of others. I'm aware of the techniques that you refer to, but even with them the SetDataFolder calls remain a kludge.
An alternative to data folder switching is to move or duplicate the output wave from the current data folder to your package data folder and then clean up by deleting the output wave from the current folder. This example uses ImageHistogram. I would be interested if there are pitfalls to this method that I have overlooked.


Function MyTest()
	String MyPath = "root:Packages:MyPackage:"
	String MyWave = "MyWave"
	String sDestinationWave
	sDestinationWave = MyPath + MyWave
	WAVE MyImageWave = $(MyPath + "MyImageWave")
	ImageHistogram /P=0 MyImageWave
	Duplicate /O W_ImageHist, $sDestinationWave
	KillWaves /Z W_ImageHist
End
jtigor wrote: An alternative to data folder switching is to move or duplicate the output wave from the current data folder to your package data folder and then clean up by deleting the output wave from the current folder. This example uses ImageHistogram. I would be interested if there are pitfalls to this method that I have overlooked.

... code follows ....


Well this is certainly a way to avoid changing the data folder unintentionally. But let's be honest: it's still a kludge that exists because there is no way to have a scoped SetDataFolder call. I think we can all agree that these kludges reduce code readability, place an extra burden on the programmer, and increase complexity.
Kludge-ish maybe, but not a full-fledged kludge. Perhaps as an alternative to SetScopedDataFolder, operations of this type could permit the specification of the output folder. This option would probably require a lot of effort on the part of the WaveMetrics folks, unfortunately.
What I find useful is to NOT CHANGE the data folder at all; use absolute paths to your out-of-the-way data folder.

To make this less painful, I usually create a simple routine that takes a name and returns the full path, but sometimes use a SetDF-type routine to isolate the exact data folder path to one or two routines (in case I change my mind):


Function/S PathInDF(name)
	String name
	
	NewDataFolder/O root:Packages
	NewDataFolder/O root:Packages:myDataFolder
	return "root:Packages:myDataFolder:"+PossiblyQuoteName(name)
End

// If I need to change to the data folder, I use this:

Function/S SetMyDF()

	String oldDF= GetDataFolder(1)
	NewDataFolder/O root:Packages
	NewDataFolder/O/S root:Packages:myDataFolder
	return oldDF
End

// Example
Function doSomething(inputWaveinRoot)
	Wave inputWaveinRoot
	
	// Duplicate the input and normalize it to max = 1
	String out= PathInDF("output")
	Duplicate/O inputWaveinRoot, $out
	Wave wout= $out
	WaveStats/Q wout
	wout /= V_Max
	// Example: create a global variable in my data folder
	Variable/G $PathInDF("myGlobal") = 1	// initialize ("myGlobal" is just a placeholder name)
	NVAR nv=$PathInDF("myGlobal")		// to use or alter the global
	// show how to switch to DF and back
	String oldDF= SetMyDF()
	// do stuff
	SetDataFolder oldDF
	// etc
End




--Jim Prouty
Software Engineer, WaveMetrics, Inc.
I believe this is a problem only for operations that don't permit you to specify the output wave (like ImageTransform). Otherwise the DFREF technique that I posted above works well.

So I think the real solution is to add a /DEST flag to operations that lack them, such as ImageTransform.

Pending that, for those operations that lack /DEST or the equivalent, I think I would create a wrapper function that uses jtigor's idea (Duplicate/O followed by KillWaves/Z) to emulate the /DEST flag. This would allow you to use the DFREF technique throughout the rest of your code.
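
A minimal sketch of such a wrapper, using ImageHistogram as in jtigor's example. The function name ImageHistogramToDF, its parameters, and its return codes are hypothetical, and the sketch assumes the operation leaves its result in W_ImageHist in the current data folder:

```igor
// Hypothetical wrapper: emulates a /DEST flag for ImageHistogram.
// Assumes ImageHistogram writes W_ImageHist into the current data folder.
Function ImageHistogramToDF(srcImage, destDFR, destName)
	Wave srcImage
	DFREF destDFR		// destination data folder, e.g. from GetPrivateDFR()
	String destName		// name for the copied histogram wave

	if (DataFolderRefStatus(destDFR) == 0)
		return -1		// destination data folder does not exist
	endif

	ImageHistogram srcImage
	Wave/Z hist = $"W_ImageHist"	// look up the output in the current data folder
	if (!WaveExists(hist))
		return -2		// the operation produced no output
	endif

	Duplicate/O hist, destDFR:$destName
	KillWaves/Z hist		// clean up the current data folder
	return 0
End
```

It would be called as ImageHistogramToDF(myImage, GetPrivateDFR(), "myHist"). The current data folder is never changed, but note one pitfall: if the user already has a wave named W_ImageHist in the current data folder, it gets overwritten and then killed.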
Thanks for the suggestions. While they can save a bit of typing, fundamentally the burden is still on me to assure that the data folder gets reset in all possible code paths. An operation throwing an error will still mess this up.

The Duplicate technique is more robust, but carries with it the overhead of copying the data, and also adds code complexity.

The /DEST flag sounds like a more fundamental solution. However, a significant number of operations create more than one wave (e.g. FuncFit, MatrixLLS, ...), which means that the /DEST flag would be inadequate. An option in that case would be a /DF flag that allows one to pass in a DFREF. But there are plenty of possibilities for confusion, and overall it seems like a lot of operations would have to be changed.

So I still consider SetScopedDataFolder the best solution. With the addition of free data folders and free waves, scoped entities are already available in Igor, so it would not be unprecedented. It's easy on the programmer, robust, unobtrusive, doesn't clutter the code, and does not require any copying overhead. Furthermore there would be no need to modify the existing operations.
Another approach to solving this problem is to make sure that you handle any run time errors in your code so that operations that produce run time errors don't prevent any of your code from executing. In your example above you used
SomeOperationThatThrowsAnError
as an example, but obviously that isn't a real operation. So what operation(s) cause problems for you? Many operations accept a /Z flag which allows the programmer to handle any errors produced by the operation. I recommend using the /Z flag with the NVAR, SVAR, and WAVE variable types as well and then check that the variable exists (NVAR_Exists(), WaveExists(), SVAR_Exists()).
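
As a rough sketch of the /Z pattern, assuming the package folder and the names myWave and myVariable from the DFREF example earlier in the thread (the function name and return codes are made up for illustration):

```igor
// Hypothetical example: look up package data with /Z and check that it exists.
Function UseMyPrivateDataSafely()
	DFREF dfr = root:Packages:MyPackage
	if (DataFolderRefStatus(dfr) == 0)
		Print "Package data folder is missing"
		return -1
	endif

	Wave/Z myWave = dfr:myWave
	NVAR/Z myVariable = dfr:myVariable
	if (!WaveExists(myWave) || !NVAR_Exists(myVariable))
		Print "Package data is missing"
		return -2
	endif

	myWave = myVariable		// both references are valid here
	return 0
End
```

With /Z the failed lookups do not raise a run time error, so the function can decide for itself how to report the problem and return cleanly.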

You could also put your code that might produce run time errors within a try...catch...endtry construct. You may also need to append
;AbortOnRTE
to the end of statements with operations that may produce errors but which don't accept the /Z flag. Here is an example of how you would use this construct:


Constant kAbortButNoError = 30000
Function test()
	Variable weShouldStopHere
	WAVE someWave = someWave
	DFREF savDF = GetDataFolderDFR()
	
	try
		NewDataFolder/O root:Packages
		NewDataFolder/O/S root:Packages:myFolder
		 
		ImageTransform /P=0 getPlane, someWave;AbortOnRTE // just an example
		
		// The next line will cause an abort if weShouldStopHere is true.
		AbortOnValue weShouldStopHere, kAbortButNoError
		 
		ImageTransform /P=1 getPlane, someWave;AbortOnRTE // make some more waves
	catch
		// You may want to clear the error code, like so.
		Variable errorCode = GetRTError(1)
		
		if (V_AbortCode != kAbortButNoError)
			// You may need to do stuff here if you want
			// to do extra error handling.
			
		endif
	endtry
	
	// As long as you don't have any return statements above,
	// execution should always reach here. So the data folder
	// always gets reset.
	 
	SetDataFolder savDF // also reset the active folder here
	return 0
End
Hello aclight,

Where I find myself missing scoped data folders the most is in a package that includes both I/O and fairly complex calculations, which can naturally cause runtime errors. In this particular project XOP operations play a big role; however, the problem is the same for Igor's operations as well.

For what it's worth, I did add /Z flags to these operations. However, an important aspect is that I do not wish to suppress the error by any means. The error happened, it's significant, and the user should be notified about it right away. I could go about handling the error with try/catch and/or GetRTError, but I still want an alert dialog. Moreover, the meaning of the different error codes should not be hardcoded in the procedure file since this makes the code harder to maintain. But I need to get the exact error message so I can display an abort panel! Easy, right? GetErrMessage() to the rescue! Except that it doesn't recognize any XOP-specific error messages, which defeats the purpose.

So in the end I've just complicated my code even more to get something that approaches robustness (but requires other sacrifices). Also note that I'm still responsible for covering every code path with SetDataFolder calls, so I haven't made much progress either way.



Anyway, I've been doing some more thinking on how scoped data folders could behave:


Function level1()
    SetScopedDataFolder root:Packages:myPackage
    // active data folder is now root:Packages:myPackage
    
    // do some stuff here
    // if an error or return statement occurs here then we'll
    // be back in whatever data folder was set before
    
    level2()
    // level2 changes to another data folder
    // but it's scoped to be local to level2 so we don't care
    // whatever happens we're back in root:Packages:myPackage when we reach this line
    
    SetScopedDataFolder root:Packages:myPackage2
    // now we're in root:Packages:myPackage2
    // when the function goes out of scope we'll go back to the original data folder
    // but possibly passing through root:Packages:myPackage, which is transparent to the user
End

Function level2()
    SetScopedDataFolder root:someFolder
    // we're now in root:someFolder
    // this function is called from level1(), so when we leave this scope
    // the folder should be reset to root:Packages:myPackage
End

Function PotentialProblem()
    SetScopedDataFolder root:hello
    
    // do something
    
    SetDataFolder root:
    // There's a possible problem here, though it's likely a corner case:
    // the user wants to change the active 'static' data folder.
    // The best way to handle it seems to be for Igor to look at the current stack
    // of scoped data folder calls and to modify the highest-level one (the one that will go out of scope last),
    // so it will change the active DF to root: when it goes out of scope (i.e. when we stop executing a user function).
    // The idea here is that there is a distinction between changing the active data folder
    // for the user (globally) and locally within a function.
End
741 wrote: An option in that case would be a /DF flag that allows one to pass in a DFREF. But there's plenty of possibilities for confusion, and overall it seems like a lot of operations would have to be changed.

I also second this feature request.

Coming back to the original question of a SetScopedDataFolder function, I think this is not the correct solution for the problem. The standard way of how XOP runtime errors are handled is not very programmer friendly in my eyes. Having a dialog pop up by default is not what I expect if I call an XOP operation from a user defined function.
Adding support for the return value of an XOP and its error message to be read out by GetErrMessage(), and therefore turning off the automatic dialog popping up, would greatly reduce the cases where a "SetDataFolder $privateFolder" is unmatched.

Slightly offtopic:
I've chosen a similar solution for error handling in an XOP I'm currently coding. The return state of the operation is returned (as a number) in a V_flag variable. The corresponding error message can be retrieved as a string with MFR_GetXOPErrorMessage.

In the procedures this looks like:

Structure errorCode
	int32	SUCCESS 
	int32	ALREADY_FILE_OPEN
	int32	FILE_NOT_READABLE
        [...]
EndStructure

Function initStruct(errorCode)
	Struct errorCode &errorCode
	errorCode.SUCCESS =0
	errorCode.ALREADY_FILE_OPEN=10002
	errorCode.FILE_NOT_READABLE=10008
        [...]
end

Function mytest()

	Struct errorCode errorCode
	initStruct(errorCode)

	MFR_OpenResultFile
	if(V_flag != errorCode.SUCCESS)
		MFR_GetXOPErrorMessage
		Print S_errMsg
	endif
	[...]
End

I must admit it still looks a bit clumsy because of the required initStruct() call. As all my XOP operations always end with "return 0;" I don't have to fear that they stop at an unexpected moment and I still know how they exited.
The standard way of how XOP runtime errors are handled is not very programmer friendly in my eyes. Having a dialog pop up by default is not what I expect if I call an XOP operation from a user defined function. Adding support for the return value of an XOP and its error message to be read out by GetErrMessage(), and therefore turning off the automatic dialog popping up, would greatly reduce the cases where a "SetDataFolder $privateFolder" is unmatched.


XOP errors are handled the same as Igor errors. In both cases, the default is to stop procedure execution and display a dialog. In both cases, the Igor programmer can suppress the dialog and allow execution to continue using GetRTError(1). In both cases, GetRTErrMessage returns the error string. Try executing this:


Function Test()
	GBLoadWave /P=Igor /F=-1 "License Agreement.txt"
	String errMessage = GetRTErrMessage()
	Variable err = GetRTError(1)
	Print err, errMessage
	Print GetErrMessage(err)
End


The function result from an external operation or external function C routine, if non-zero, is intended to signify a normally fatal error that should stop procedure execution. You can make it non-fatal using GetRTError(1).

If you want to return status information, as opposed to a normally fatal error, use something other than the function result such as a V_ variable for an external operation or, for an external function, the p->result field or a pass-by-reference parameter.

thomas_braun wrote:
Coming back to the original question of a SetScopedDataFolder function, I think this is not the correct solution for the problem. The standard way of how XOP runtime errors are handled is not very programmer friendly in my eyes. Having a dialog pop up by default is not what I expect if I call an XOP operation from a user defined function.


I can't say I agree with this. My definition of an error is an event or circumstance that causes the result of the operation to be invalid. In my opinion displaying a popup is entirely appropriate in this situation.


However, this topic has sort of drifted away from my original intent, which is that the SetDataFolder spaghetti is annoying. The error handling just popped up as an extreme situation where the spaghetti fails even if set up properly.

So let me just summarize my points again:
1. The current SetDataFolder handling is tedious and unpleasant. Examples of this can be found in my own code but also in WaveMetrics procedures.
2. In over five years of daily Igor use and development I have never wanted to change the data folder globally from a procedure.
3. The way Igor is designed currently allows the use of SetDataFolder to be reduced, but not abolished.

In this thread we've highlighted two possible solutions to avoid the spaghetti:
1. A scoped SetDataFolder call.
2. Adding /DEST and/or /DF flags to every operation.

My preference is still with scoped SetDataFolder calls. It's elegant, bullet-proof (if implemented properly in Igor), and will do away with the SetDataFolder cluttering.

To convince you that this cluttering happens to the best, I submit the following code from MultiPeakFit in Igor 6.21 (some comments removed):


		resultDF = resultDFBase+"_0"
		if (!DataFolderExists(resultDF))
			return MPF2_Err_NoDataFolder
		endif
		SetDataFolder resultDF
		if (nBLParams > 0)
			Wave w = $BLCoefWaveName
			if (!WaveExists(w))
				SetDataFolder dataFolderPath
				return MPF2_Err_BLCoefWaveNotFound
			endif
			if (numpnts(w) != nBLParams)
				SetDataFolder dataFolderPath
				return MPF2_Err_BLCoefWrongNPnts
			endif
		endif
		
		// This loop counts the peaks as it checks the length of the coefficient waves
		nPeaks = 0
		do
			sprintf cwavename, PeakCoefWaveFormat, nPeaks
			Wave w = $cwavename
			if (!WaveExists(w))
				break;
			endif
			if (numpnts(w) != nPeakParams)
				SetDataFolder dataFolderPath
				return MPF2_Err_PeakCoefWrongNPnts	
			endif
			nPeaks += 1
		while(1)
		if (nPeaks == 0)
			SetDataFolder dataFolderPath
			return MPF2_Err_PeakCoefWaveNotFound
		endif


4 out of those 5 calls would be unnecessary with scoped SetDataFolders.
As the author of the code, I should point out that I wasn't very clever.

The multiple calls to SetDataFolder dataFolderPath could be eliminated by either:

1) using full paths (which have to be constructed in a string) with the Wave lookups.

2) using a DFREF variable to eliminate constructing that string.

That is, replace

Wave w = $BLCoefWaveName


with either

Wave w = $(resultDF+":"+ BLCoefWaveName)

or

DFREF df = $resultDF
...
Wave w = df:$BLCoefWaveName

That code you copied is from a much larger function; I would have to look it over carefully to make sure that none of the called functions expects the current data folder to be set to the resultDF, and I would have to go through all the code of the function to see if later code expects the data folder to be set in a particular way. But the code you copied can be cleaned up using these techniques. In my own defense, DFREF didn't exist when I wrote that code :)
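
Put together, the first check from the quoted excerpt might look roughly like the sketch below. This is only an illustration, not the actual MultiPeakFit code: the function name CheckBLCoefWave and the numeric return codes are invented stand-ins for the MPF2_Err_ constants.

```igor
// Hypothetical sketch: baseline coefficient check using a DFREF,
// so no SetDataFolder call is needed on any exit path.
Function CheckBLCoefWave(resultDF, BLCoefWaveName, nBLParams)
	String resultDF			// full path to the result data folder
	String BLCoefWaveName
	Variable nBLParams

	DFREF df = $resultDF
	if (DataFolderRefStatus(df) == 0)
		return -1			// stand-in for MPF2_Err_NoDataFolder
	endif
	if (nBLParams > 0)
		Wave/Z w = df:$BLCoefWaveName
		if (!WaveExists(w))
			return -2		// stand-in for MPF2_Err_BLCoefWaveNotFound
		endif
		if (numpnts(w) != nBLParams)
			return -3		// stand-in for MPF2_Err_BLCoefWrongNPnts
		endif
	endif
	return 0
End
```

Because the current data folder is never changed, every early return is safe without a matching reset.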

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
hrodstein wrote:
The function result from an external operation or external function C routine, if non-zero, is intended to signify a normally fatal error that should stop procedure execution. You can make it non-fatal using GetRTError(1).

If you want to return status information, as opposed to a normally fatal error, use something other than the function result such as a V_ variable for an external operation or, for an external function, the p->result field or a pass-by-reference parameter.

Thanks for these clarifications Howard.