Acceptable way to use global variables, strings

In a discussion with jjweimer (http://www.igorexchange.com/node/7112), I was reminded that global strings and variables are generally deplored. I would like to provide an example of the way I use them: when I have input, I like to put it in once and for all, store it globally, and then access it from the root folder in all called functions.

I have never run into problems with this technique. On the contrary, I really enjoy having functions with no input parameters: the order in which parameters are loaded doesn't matter my way, whereas with a call such as function0(param1,param2) you have to be careful about ordering. Please see below. Have I just not ventured into any territory that would get me into trouble, or am I limiting myself for future coding with this technique?

function function1(str0,var0)
    string str0
    variable var0

    // create (or overwrite) the globals in the current data folder
    string/g str1=str0
    variable/g var1=var0

    function2()
    function3()
end

function function2()
    svar str1    // looks up str1 in the current data folder
    print str1
end

function function3()
    nvar var1
    svar str1
    print num2str(var1)+"&"+str1
end

If you have input data that is shared between many functions, then it makes perfect sense to me to use global strings and variables for that input data. However, as your procedure grows and includes hundreds of functions, it can become difficult to keep track of all the global strings and variables, especially if you are also using Igor procedures written by other people.

I like to use SetVariable controls in a graph or panel for such input strings and variables. With SetVariable value=_NUM:xxx and SetVariable value=_STR:"xxx" you have the option to store the value in the control instead of in a global variable. I like that solution because it keeps the number of global variables low. It does mean the value is lost when the window hosting the control is closed; you will have to use a global variable to avoid that. Creating global variables in data folders other than root: might help to distinguish different global variables.
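
A minimal sketch of that approach, assuming a hypothetical panel named InputPanel with controls svGain and svName (none of these names come from the thread). The values live in the controls themselves and are read back with ControlInfo:

```igor
Function MakeInputPanel()
    // Hypothetical panel; /K=1 closes it without a confirmation dialog
    NewPanel/K=1/N=InputPanel/W=(100,100,400,200)
    // _NUM: and _STR: store the value inside the control, not in a global
    SetVariable svGain, win=InputPanel, pos={20,20}, size={220,20}, title="Gain", value=_NUM:1.5
    SetVariable svName, win=InputPanel, pos={20,50}, size={220,20}, title="Name", value=_STR:"sample A"
End

Function PrintInputs()
    // For a SetVariable control, ControlInfo returns the number in V_Value
    ControlInfo/W=InputPanel svGain
    Variable gainValue = V_Value
    // With the _STR: syntax, S_Value holds the string itself
    ControlInfo/W=InputPanel svName
    String sampleName = S_Value
    Print gainValue, sampleName
End
```

Run MakeInputPanel() once, edit the values in the panel, then call PrintInputs() from any function that needs them. This assumes the panel really ended up with the name InputPanel; if a window of that name already exists, Igor picks a different name.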

Using global variables to transfer temporary data between functions is usually not a good idea. You either end up with a huge number of global variables, or you end up using the same global variable for several different things, which increases the chance of two functions simultaneously using the same variable for different things.

The example you use is just a temporary transfer of data from function 1 to functions 2 and 3. You could imagine having a SetVariable control or global variable as input for function 1, but transferring the values to functions 2 and 3 is best done with local variables. Your approach works well for three functions, but once you have three hundred functions it's just not viable anymore.
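
For illustration, the example from the question could be sketched with parameters instead of globals. The function names RunAll, ShowString and ShowBoth are made up for this sketch:

```igor
Function RunAll(str0, var0)
    String str0
    Variable var0

    // Pass the values on explicitly instead of storing them globally
    ShowString(str0)
    ShowBoth(str0, var0)
End

Function ShowString(str)
    String str
    Print str
End

Function ShowBoth(str, num)
    String str
    Variable num
    Print num2str(num) + "&" + str
End
```

Each function now states in its signature exactly what it needs, and nothing is left behind in the data folder after RunAll returns.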

If you have to pass a huge number of variables to a function and you have problems remembering the order of all the different variables you might look at structures. I don't use them much, but they allow you to create structured groups of strings and variables, so you no longer have to keep track of whether the peak area is the 6th or 7th variable.
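
A small sketch of what that could look like, assuming a hypothetical PeakParams structure (the field names are invented for this example). The fields are accessed by name, so their order no longer matters:

```igor
Structure PeakParams
    Variable peakPos
    Variable peakWidth
    Variable peakArea
    String peakName
EndStructure

Function ReportPeak(p)
    STRUCT PeakParams &p    // structures are always passed by reference
    Printf "%s: position=%g, width=%g, area=%g\r", p.peakName, p.peakPos, p.peakWidth, p.peakArea
End

Function DemoPeak()
    STRUCT PeakParams p
    p.peakName = "peak1"
    p.peakPos = 12.5
    p.peakWidth = 0.8
    p.peakArea = 3.2
    ReportPeak(p)
End
```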

I don't like global variables and refrain from using them in most cases.

Because global variables

  1. make it difficult to reason about code as functions have non-local dependencies which are difficult to understand

  2. can always be modified as you can not declare them read-only

  3. clutter the datafolder hierarchy

  4. have slightly different semantics in some cases, e.g. you cannot pass NVAR/SVAR as pass-by-reference parameters



Having said that, one legitimate use case is accessing data in background functions for data acquisition. In these cases I prefer to use functions like

Function/S GetMyCounter()
    // look up the global; /Z suppresses the error if it does not exist yet
    NVAR/Z myCounter = root:myCounter
    if(!NVAR_Exists(myCounter))
        variable/G root:myCounter
    endif

    return "root:myCounter"
End


so that at the calling side you have

Function DoCalc()
    NVAR myCounter = $GetMyCounter()
    ...
End


This results in code in which you can easily search for all references to the global variable myCounter. It also ensures that myCounter always exists, which relieves you of tedious error checking in DoCalc.

When I have a panel that controls the operations in an experiment, I have taken to avoiding globals. I query the status of the panel instead.

When I have functions that interact with one another and require a common set of "initialization" parameters, I use globals in an appropriately named Packages folder. In such cases, I generally have an InitializePackage(...) function that sets up what is needed.

When you do use globals, remember the string/G and variable/G calls are used to define them and initialize them for the first time. After that, you should use SVAR and NVAR to address them. Also, to reduce clutter, I strongly recommend using an equivalent to this type of function to set up the globals ...

Function InitializeGlobals(...)

   DFREF cdf = GetDataFolderDFR()
   SetDataFolder root:
   NewDataFolder/O/S Packages
   NewDataFolder/O/S MyPackage
   string/G ...
   variable/G ...
   SetDataFolder cdf
   return 0
end
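
As a sketch of the "use NVAR and SVAR afterwards" part: the names gSampleRate and gSampleName below are hypothetical and stand in for whatever globals the initialization function actually creates in root:Packages:MyPackage.

```igor
Function PrintPackageSettings()
    // gSampleRate and gSampleName are assumed names, not from the post above
    NVAR/Z gSampleRate = root:Packages:MyPackage:gSampleRate
    SVAR/Z gSampleName = root:Packages:MyPackage:gSampleName
    if (!NVAR_Exists(gSampleRate) || !SVAR_Exists(gSampleName))
        Print "Package globals not found; run the initialization function first."
        return -1
    endif
    Print gSampleName, gSampleRate
    return 0
End
```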


--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH

Structures are actually only useful within a sequence of functions one calling another. For data storage they are less convenient: A structure variable, like all local variables in functions, disappears when its host function returns. (You can convert them to global variables, see "StructPut" and "StructGet". However, then you might just use dedicated data folders for global variables.)
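
A rough sketch of the StructPut/StructGet route, assuming a hypothetical numeric-only structure AcqSettings and a global string root:gAcqSettings (both names are invented here). The structure is packed into a string with the /S flag and unpacked again later:

```igor
Structure AcqSettings
    double sampleRate
    double pulseLength
EndStructure

Function SaveAcqSettings()
    STRUCT AcqSettings s
    String packed
    s.sampleRate = 1e6
    s.pulseLength = 2e-6
    StructPut/S s, packed                  // pack the structure into a string
    String/G root:gAcqSettings = packed    // keep it beyond this function call
End

Function LoadAcqSettings()
    STRUCT AcqSettings s
    SVAR/Z stored = root:gAcqSettings
    if (!SVAR_Exists(stored))
        return -1
    endif
    String packed = stored
    StructGet/S s, packed                  // unpack the string into the structure
    Print s.sampleRate, s.pulseLength
    return 0
End
```

The stored string is binary and not human-readable, which is one reason plain globals in a dedicated data folder can be the simpler choice.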

In my opinion, global variables are perfectly fine for "parameters", i.e. values which are initialized once in an experiment file (hence not a real variable), but might change between experiments (hence not a real constant).
Examples of parameters could be a pulse length or a sample rate (to check whether several loaded data files are compatible): things that are set on an actual experiment and remain constant within one data set.
Constants are, well, constants (Avogadro, Planck, etc.) and could be replaced by the actual number in program code and are independent of a specific experiment.
Everything else should be passed as real variables.

HJ

In addition to JJ's and HJ's comments:

I also like to initialise a "Package" folder when I have a larger set of functions, but I often use waves to store global values and then use dimension labels to access them.
E.g. I have a wave called "Settings", which can be saved, or another set of parameters can easily be loaded. There might be downsides performance-wise, but for me this is a very convenient way of handling globals (which I otherwise tend to avoid).
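
A minimal sketch of such a settings wave, assuming a package folder root:Packages:MyPackage and the made-up labels sampleRate and pulseLength:

```igor
Function MakeSettingsWave()
    NewDataFolder/O root:Packages
    NewDataFolder/O root:Packages:MyPackage
    Make/O/D/N=2 root:Packages:MyPackage:Settings
    Wave w = root:Packages:MyPackage:Settings
    // Label each row so values can be addressed by name instead of index
    SetDimLabel 0, 0, sampleRate, w
    SetDimLabel 0, 1, pulseLength, w
    w[%sampleRate] = 1e6
    w[%pulseLength] = 2e-6
End

Function PrintSettings()
    Wave/Z w = root:Packages:MyPackage:Settings
    if (!WaveExists(w))
        return -1
    endif
    Print w[%sampleRate], w[%pulseLength]
    return 0
End
```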

HJDrescher wrote:

In my opinion, global variables are perfectly fine for "parameters", i.e. values which are initialized once in an experiment file (hence not a real variable), but might change between experiments (hence not a real constant).
Examples of parameters could be a pulse length or a sample rate (to check whether several loaded data files are compatible): things that are set on an actual experiment and remain constant within one data set.
HJ


This was my mindset as well. I wasn't sure if there was a caveat I wasn't foreseeing by using them in this way.

Thanks for all the responses so far!

Here is my two cents.

Use globals to store settings that need to persist from one invocation of a procedure to the next.

Don't use globals to pass information from one procedure to another. Doing so creates a tangled web of implicit dependencies and makes understanding and debugging the code difficult. It also means that you cannot tell what the inputs to a procedure are by looking at the signature; you have to read the whole procedure. And to test a procedure, you first need to set a bunch of globals.

Ideally, a given function should depend only on its parameters. There may be exceptions, such as for settings that are used by nearly all of your functions and are changed only at well-defined times.

If a lot of functions need to access a lot of settings, I like to create a single function that collects the settings into a structure. I then call this function from each function that needs the settings. This technique is used in the HDF5 Browser procedures. See CreateHDF5BrowserGlobals and SetHDF5BrowserData (which probably should have been called GetHDF5BrowserData).
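
The general idea might look like the sketch below. It is not taken from the HDF5 Browser procedures; the structure MySettings and the globals gSampleRate and gSampleName are hypothetical and are assumed to already exist in root:Packages:MyPackage.

```igor
Structure MySettings
    Variable sampleRate
    String sampleName
EndStructure

// The one function that knows where the settings globals live
Function GetMySettings(s)
    STRUCT MySettings &s
    NVAR gSampleRate = root:Packages:MyPackage:gSampleRate
    SVAR gSampleName = root:Packages:MyPackage:gSampleName
    s.sampleRate = gSampleRate
    s.sampleName = gSampleName
End

// Any function that needs the settings asks for them through the structure
Function ReportSample()
    STRUCT MySettings s
    GetMySettings(s)
    Printf "%s sampled at %g Hz\r", s.sampleName, s.sampleRate
End
```

If the globals are later renamed or moved, only GetMySettings has to change.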

For packages that other people will use, store settings in a "package" data folder. Execute this for details:
DisplayHelpTopic "Managing Package Data"


It is a good idea to implement setter functions for setting globals as this provides a way to determine what is changing a global. This is often important during debugging when a global is being changed unexpectedly and you need to find out where.
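
A setter could be as simple as the following sketch, assuming a hypothetical global root:Packages:MyPackage:gSampleRate that has already been created. GetRTStackInfo(2) reports the name of the calling routine, which helps when tracking down an unexpected change:

```igor
Function SetSampleRate(newValue)
    Variable newValue

    // gSampleRate is an assumed name used only for this illustration
    NVAR gSampleRate = root:Packages:MyPackage:gSampleRate
    // Log the old value, the new value and who asked for the change
    Printf "SetSampleRate: %g -> %g (called from %s)\r", gSampleRate, newValue, GetRTStackInfo(2)
    gSampleRate = newValue
End
```

During debugging you can also set a breakpoint inside the setter to catch every write in one place.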

Settings globals should ideally be accessed only by high-level routines close to the user interface. These high-level routines should call worker routines that do not rely on globals. This organization localizes access to globals, so you don't need to search your entire program to see where they are used. It also preserves the generality of the worker functions, making testing, debugging, and reuse much easier.
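
As a sketch of that layering, with hypothetical names throughout (ScaleWaveBy, ScaleFromSettings and the global gScaleFactor are invented for this example):

```igor
// Worker: depends only on its parameters, so it is easy to test and reuse
Function ScaleWaveBy(w, factor)
    Wave w
    Variable factor
    w *= factor
End

// High-level routine: the only place that touches the settings global
Function ScaleFromSettings(w)
    Wave w
    NVAR/Z gScaleFactor = root:Packages:MyPackage:gScaleFactor
    if (!NVAR_Exists(gScaleFactor))
        return -1
    endif
    ScaleWaveBy(w, gScaleFactor)
    return 0
End
```

ScaleWaveBy can be called from the command line or a test with any wave and factor, without setting up any globals first.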

Searching for "why are global variables bad" comes up with a lot of hits. Here is a good one: http://www.learncpp.com/cpp-tutorial/4-2a-why-global-variables-are-evil/