Gaussian mean of waves

I don't know how the Gaussian mean differs from the arithmetic mean. The average waves package will generate an average wave from your input waves. It can detect differences in the range and number of points of the input waves. It interpolates the input waves onto evenly spaced x values and produces a result wave, with x scaling, that holds the average (arithmetic mean) of the interpolated input waves at each x point. Is this what you want?

January 24, 2020 at 03:09 am - Permalink

Igor

It would be helpful to define what you mean by "Gaussian mean". Are you trying to fit 6 numeric values to a Gaussian and then return the center of the Gaussian as the value of your "Gaussian mean"?

January 24, 2020 at 02:41 pm - Permalink

Andika Asyuda

Thanks, sjr51, for the answer. It is now clear to me how the "average waves" package works.

And you're right, Igor. The Gaussian average wave should be generated by fitting a distribution to the 6 numeric values (one from each wave) at the same x value. The fitted parameters (mean and standard deviation) are then used to generate the Gaussian average wave. In an ideal experiment, the arithmetic and Gaussian averages should be identical. However, I need to distinguish errors from real data based on their distribution. I want only the meaningful data to be included in the averaged wave.

January 25, 2020 at 07:02 am - Permalink

tony

Are the data independent in the x dimension, or should the Gaussians be somehow constrained in some kind of global fit?

January 27, 2020 at 04:44 am - Permalink

tony

If your 6 values are random samples of normally distributed values, the sample mean is an estimate of the population mean. Fitting will not improve upon this.
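As a minimal illustration of the sample mean as an estimator, the following Igor commands compute the mean and standard deviation of six values directly, with no fitting. The wave name SixValues and the numbers in it are made up for this example.

```igor
// Six illustrative values (hypothetical, not measured data)
Make/O/D SixValues = {1.2, 3.1, 3.4, 4.0, 5.0, 7.1}
WaveStats/Q SixValues      // /Q: do not print the full statistics to the history
Print V_avg, V_sdev        // sample mean and sample standard deviation
```

For these numbers the mean is 23.8/6, about 3.97.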

On the other hand, if the data collection is set up so that you are adjusting some quantity and collecting data that is distributed as a function of that quantity, then fitting a Gaussian will require some additional knowledge: what do your six waves represent (what is the independent variable)?

Even then, a least-squares fit of a Gaussian function to just 6 points sounds perilous to me!

January 27, 2020 at 05:14 am - Permalink

Andika Asyuda

I am not sure what you mean by "data independent in the x dimension". The data (wave) represent current density, which obviously depends on the x value (voltage). The Gaussian fit at each voltage is independent; in other words, there is no global fit.
In my current case, the Gaussian average and the arithmetic average give almost identical results. However, in the next case I might have considerable error (30%), which comes from short circuits and open circuits in my setup. I hope to distinguish the real data from the errors based on the normal distribution. With the arithmetic average, both errors and real data are included, which leads to some error in the final calculation.
I have already tried the procedure in Origin, combining interpolation, matrix transpose, and normal distribution fitting. It works but seems slow. I might have problems later when I try 30 or 200 data sets. However, the same procedure might also work in Igor.

Thank you very much for the suggestion, Tony. Currently I am exploring the method using 6 data sets, which are good. Once I know for sure that the method works, I will try it with more data. I am still new to Igor and need time to master the tool.

January 27, 2020 at 06:44 am - Permalink

KurtB

I am struggling to understand what you are doing, but if your data are (going to be) prone to outlier data points (open circuit or short circuit), then using the median as a measure of central tendency is a reasonably robust thing to do.
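One way the median could be taken at each voltage is sketched below. It assumes the sweeps have already been interpolated onto a common x axis and stored as the columns of a 2D wave containing no NaNs. The function name RowMedians and the output name MedianWave are hypothetical.

```igor
// Sketch: median across repeated sweeps at each x point.
// Assumes data2D has rows = voltage points and columns = sweeps,
// all already interpolated onto the same x scaling.
Function RowMedians(data2D)
	Wave data2D

	Variable nRows = DimSize(data2D, 0)
	Variable nCols = DimSize(data2D, 1)
	Make/O/D/N=(nRows) MedianWave      // output: one median per voltage
	Make/FREE/D/N=(nCols) oneRow       // scratch wave for the values at one voltage

	Variable i
	for(i = 0; i < nRows; i += 1)
		oneRow = data2D[i][p]          // values of all sweeps at row i
		MedianWave[i] = StatsMedian(oneRow)
	endfor

	// Give the result the same x scaling as the rows of the input
	SetScale/P x, DimOffset(data2D, 0), DimDelta(data2D, 0), MedianWave
End
```

Calling RowMedians on such a matrix would leave the result in MedianWave in the current data folder, ready to plot against voltage.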

January 27, 2020 at 07:10 am - Permalink

jjweimer

Can you post the equations that you would apply to determine an arithmetic mean and a Gaussian mean? Can you also post an example plot or data set that you test with, along with the values you expect from the arithmetic and Gaussian means?

January 27, 2020 at 08:36 am - Permalink

Andika Asyuda

Hi KurtB,

On the contrary, you get it perfectly in my opinion. The median is another robust way to handle my data, probably even better depending on the case. These matters have been discussed in detail by Reus et al. (J. Phys. Chem. C 2012, 116, 11, 6714-6733: Statistical Tools for Analyzing Measurements of Charge Transport). My experiment is transport characterization of a monolayer film, exactly as discussed in the paper. The Gaussian mean procedure might be preferable because most of the recent papers I have read use this method.

January 29, 2020 at 02:49 am - Permalink

serrano

So the "Gaussian Mean" as described in the paper consists of fitting a Gaussian to a histogram. You can do this by using the Analysis->Histogram and Analysis->Curve Fitting functions.

However, I should point out that doing this would introduce considerable variability, because the histogram binning is arbitrary. As others have pointed out, if the data contain outliers, the median would be the better choice.

As for using the curve fitting method, you could eliminate the binning issue by creating a cumulative histogram of only the data points and fitting that to a Gaussian CDF, i.e. an error function. However, if my statistics knowledge doesn't fail me, I'd expect the location value of this to converge to the MLE, which is the mean of your sample, so you wouldn't gain anything. Correct me on the last part if necessary.
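The cumulative approach described above could be sketched in Igor roughly as follows. This is an illustration under stated assumptions, not a tested recipe: the function names NormalCDFModel and FitEmpiricalCDF, the output wave names, and the plotting positions (p + 0.5)/n are choices made for this example. The points of an empirical CDF are correlated, so the reported fit uncertainties should not be taken at face value.

```igor
// Two-parameter normal CDF as a user-defined fit function
Function NormalCDFModel(w, x) : FitFunc
	Wave w
	Variable x
	// w[0] = location (mean), w[1] = scale (standard deviation)
	return StatsNormalCDF(x, w[0], w[1])
End

// Fit the empirical cumulative distribution of data to the normal CDF
Function FitEmpiricalCDF(data)
	Wave data

	Duplicate/O data, SortedData
	Sort SortedData, SortedData            // ascending order

	Variable n = numpnts(SortedData)
	Make/O/D/N=(n) CumFraction
	CumFraction = (p + 0.5) / n            // cumulative fraction at each sorted value

	// Start the fit from the sample mean and standard deviation
	WaveStats/Q data
	Make/O/D/N=2 CDFCoefs
	CDFCoefs[0] = V_avg
	CDFCoefs[1] = V_sdev

	FuncFit/Q NormalCDFModel, CDFCoefs, CumFraction /X=SortedData
	Print CDFCoefs                         // fitted location and scale
End
```

No binning is involved, which is the point of the approach. Comparing CDFCoefs[0] with V_avg for a given data set would show how much, if anything, the fit changes the location estimate.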

January 29, 2020 at 03:14 am - Permalink

tony

So you want to histogram six measurements and then fit a Gaussian with three adjustable parameters?

January 29, 2020 at 03:27 am - Permalink

Andika Asyuda

Hi jjweimer,

I only used the usual equations for both the arithmetic average and the Gaussian fitting (model). The arithmetic average is: y_average = (y₁+y₂+y₃+y₄+y₅+y₆)/6. For the equation used for the Gaussian distribution fitting, please see the attached figure normal_distribution. I am really sorry that I cannot post my data at the moment. In any case, it is still too ideal to demonstrate how the arithmetic and Gaussian means might give slightly different results. The paper I mentioned above (J. Phys. Chem. C 2012, 116, 11, 6714-6733) demonstrates quite nicely how it should work.

Thank you very much for the suggestion and explanation, serrano. I will keep it in mind.

Hi tony,
I believe the Gaussian equation has only 2 adjustable parameters, mean and standard deviation. Please correct me if I am wrong.

Attachments normal_distribution (2.04 KB)

January 30, 2020 at 12:13 am - Permalink

olelytken

Creating a histogram from just 6 data points is tricky. I don't think binning will work for that. What might work is to calculate the frequency of data points as the inverse distance between neighboring points.

Let's say we have 6 data points: (1.2), (3.1), (3.4), (4.0), (5.0), (7.1)

The frequency of data points half way between point 1 and point 2 can now be calculated as 1/(3.1 - 1.2) = 1/1.9 = 0.53

In terms of Igor code you would do:

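// Six example values; the calculation below needs them in ascending order
Make/O RawData={1.2, 3.1, 3.4, 4.0, 5.0, 7.1}
Sort RawData, RawData
// One histogram point per gap between neighbouring values
Make/O/N=(numpnts(RawData)-1) HistogramX, HistogramY
HistogramX=(RawData[p+1]+RawData[p])/2    // midpoint of each gap
HistogramY=1/(RawData[p+1]-RawData[p])    // inverse gap width; breaks down for repeated values
Display HistogramY vs HistogramX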

Now you have a histogram you can fit with a Gaussian.
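A sketch of that fit using Igor's built-in gauss function, continuing from the HistogramX and HistogramY waves above. Holding the baseline at zero is an assumption made here because five points cannot constrain all four coefficients of the built-in function. Even so, the fit may fail to converge or give a poorly determined result on so few points.

```igor
// Built-in gauss: y0 + A*exp(-((x - x0)/width)^2)
K0 = 0                                    // baseline y0, held at zero (assumption)
CurveFit/Q/H="1000" gauss, HistogramY /X=HistogramX /D
Print W_coef[2]                           // x0: fitted centre of the Gaussian
Print W_coef[3]/sqrt(2)                   // sigma, since width = sqrt(2)*sigma here
```

The /D flag appends the fitted curve to the graph, which makes it easy to judge by eye whether the result is believable.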

I still don't know if this is more accurate than simply calculating an average.

January 30, 2020 at 07:00 am - Permalink

olelytken

I guess calculating a simple average fails when the distribution is not symmetric but has extreme outliers in one direction because of a particular error which might occur in your measurement. In that case the average is more sensitive to the extreme asymmetric outliers than fitting the histogram is.
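The sensitivity of the simple average to a one-sided outlier is easy to see with made-up numbers. In the commands below, five values lie close together and a sixth stands in for a short circuit. All values and the wave name WithOutlier are hypothetical.

```igor
// Five plausible values plus one extreme value (all hypothetical)
Make/O/D WithOutlier = {3.1, 3.2, 3.3, 3.4, 3.5, 250}
WaveStats/Q WithOutlier
Print V_avg                      // pulled far above the bulk of the data
Print StatsMedian(WithOutlier)   // stays within the bulk of the data
```

Here the mean is 266.5/6, about 44.4, although five of the six values lie between 3.1 and 3.5.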

January 30, 2020 at 07:23 am - Permalink

tony

Andika Asyuda wrote:

I believe the Gaussian equation has only 2 adjustable parameters, mean and standard deviation. Please correct me if I am wrong.

Sure, the PDF has 2 parameters, so you can normalise the histogram by (bin width * number of samples) and fit the PDF.
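The normalisation and two-parameter fit could look roughly like this in Igor. It is a sketch under assumptions: the function names NormalPDFModel and FitNormalisedHistogram and the output wave names are hypothetical, the bin settings are left to the caller, and all data values are assumed to fall inside the binned range so that dividing by numpnts(data) is correct.

```igor
// Two-parameter normal PDF as a user-defined fit function
Function NormalPDFModel(w, x) : FitFunc
	Wave w
	Variable x
	// w[0] = mean, w[1] = standard deviation
	return StatsNormalPDF(x, w[0], w[1])
End

// Histogram the data, convert counts to probability density, fit the PDF
Function FitNormalisedHistogram(data, binStart, binWidth, nBins)
	Wave data
	Variable binStart, binWidth, nBins

	Make/O/D/N=(nBins) Hist
	Histogram/B={binStart, binWidth, nBins} data, Hist
	// Counts -> probability density: divide by (bin width * number of samples)
	Hist /= (binWidth * numpnts(data))

	// Explicit bin centres, so the fit does not depend on how Hist is x-scaled
	Make/O/D/N=(nBins) BinCentres
	BinCentres = binStart + (p + 0.5) * binWidth

	// Start the fit from the sample mean and standard deviation
	WaveStats/Q data
	Make/O/D/N=2 PDFCoefs
	PDFCoefs[0] = V_avg
	PDFCoefs[1] = V_sdev

	FuncFit/Q NormalPDFModel, PDFCoefs, Hist /X=BinCentres
	Print PDFCoefs                          // fitted mean and standard deviation
End
```

Because the amplitude is fixed by the normalisation, only the mean and standard deviation are adjusted. The result still depends on the choice of binStart and binWidth.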

January 30, 2020 at 07:43 am - Permalink

Igor

A histogram is usually employed to reduce the number of representative data points by dividing the original set into bins. If you only have 6 points, there is no advantage in using a histogram because that will reduce your number of data points even further. If you are trying to find a systematic approach to removing outliers (or their contribution), you may be able to use clustering (see the documentation for FPClustering, kMeans, etc.).

January 30, 2020 at 11:06 am - Permalink

Andika Asyuda

Hi all,

I really appreciate all the feedback. It seems the number of data sets can be an issue in my case. How many data sets are suitable for a histogram-based analysis? Currently I have 9-20 data sets. I am wondering whether I need to collect more data.

February 3, 2020 at 08:03 am - Permalink
