JointHistogram with non-normalized output

Edit: Solved - see the last comments. The fix turned out to be quite trivial and should also work if outliers are excluded (I think; I have not tested with outliers). End edit.

Dear all

I am using JointHistogram to create a 2D image from two waves holding x- and y-coordinates. This works like a charm, but I have run into one minor problem that should be easy to solve, yet I cannot figure it out.

The documentation for JointHistogram states:

"When outliers are excluded the output wave sums to 1. When they are included the sum of the output wave is smaller by the ratio of the number of outliers to the total number of points in the histogram."

In my data I have no outliers, so the output will sum to 1.

However, I would like to scale the output so that each bin holds the actual number of "counts" rather than a fraction of the total. I cannot see a flag for this, nor can I see how to get this multiplication factor from the raw x-y data waves...

I hope that someone has solved this already...

Best,

Johan S

Without thinking this through fully, wouldn't the scaling factor be just the sum of your input x,y data divided by the number of bins? After all, you have used all counts and the output from JointHistogram represents the relative distribution of these counts into each bin, right?

Actually, thinking again: I was under the impression that your highest bin is 1, but on rereading I see you say that the sum (area) of the output is 1. In that case I think the scaling factor should just be the sum of your input data (without dividing by the number of bins), because then you get sum of input = sum of histogram.

In reply to chozo

Since each x-y coordinate pair corresponds to one count, I think that the following code snippet will work.

Note that I do not use the sum of the data values, but rather the number of points. Thanks a lot for your input (again) :)

 

string s_temp = s_Path_To_DataFolder+s_DataFolder+":JS_VERITAS:Data:D2:"+s_Output_Name //This is the name of the output wave from JointHistogram
JointHistogram /BINS={v_bins_Column_0, v_bins_Column_1, 0, 0} /C /DEST=$s_temp /P=0 w_data_x, w_data_y
wave w_JointHistogram_Output = $s_temp
w_JointHistogram_Output *= numpnts(w_data_x) //This scales the "normalized" image to the actual number of counts. See https://www.wavemetrics.com/forum/general/jointhistogram-non-normalized…
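For anyone who wants to convince themselves of the arithmetic outside Igor, here is a minimal sketch in plain Python (not Igor code). It assumes the normalized output is simply "counts in the bin divided by the total number of points" and that no point falls outside the binned range. The function normalized_joint_histogram() is a hypothetical stand-in written for this illustration, not the JointHistogram operation itself.

```python
import random

def normalized_joint_histogram(xs, ys, nx, ny):
    """Hypothetical helper: bin (x, y) pairs on an nx-by-ny grid spanning the
    data range and return the fraction of all points that falls in each bin."""
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    counts = [[0] * ny for _ in range(nx)]
    for x, y in zip(xs, ys):
        # Clamp the index so that the maximum value lands in the last bin.
        i = min(int((x - x0) / (x1 - x0) * nx), nx - 1)
        j = min(int((y - y0) / (y1 - y0) * ny), ny - 1)
        counts[i][j] += 1
    n = len(xs)
    return [[c / n for c in row] for row in counts]

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(1000)]
ys = [random.gauss(0.0, 1.0) for _ in range(1000)]

hist = normalized_joint_histogram(xs, ys, 8, 8)
hist_sum = sum(sum(row) for row in hist)  # close to 1: no outliers here

# Same idea as multiplying the output wave by numpnts(w_data_x):
# fraction per bin times number of points gives counts per bin.
counts = [[round(p * len(xs)) for p in row] for row in hist]
total = sum(sum(row) for row in counts)   # equals the number of input points
```

The rounding only removes floating-point noise; under the stated assumption each rescaled bin is already a whole number of counts.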

 

FWIW: In response to this thread, JointHistogram in Igor Pro 10 (IP10) uses automatic multithreading for the 1D and 2D analysis.

In reply to Johan.Soderstrom

Edit: I did some preliminary tests, and at least on very simple example data similar to mine this approach seems to work well, so it is likely accurate in general. My data is not that sensitive to absolute numbers, so it is somewhat tricky to test. JointHistogram can also produce an M_JointHistogram whose sum is less than one (read the manual), and in these cases the above approach should also work to get the "real" scale... I think...
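The outlier case can be checked on paper with a tiny made-up example. The sketch below is plain Python, not Igor, and it assumes the quoted documentation sentence can be read literally: with outliers included each bin is count / total points, and with outliers excluded each bin is count / in-range points. The numbers are invented for illustration and this has not been tested against JointHistogram itself.

```python
# In-range counts per bin for a tiny 2 x 2 example, plus two points that
# fall outside the binned range (the "outliers").
counts = [[3, 1], [2, 4]]
n_outliers = 2
n_in_range = sum(sum(row) for row in counts)  # 10
n_total = n_in_range + n_outliers             # 12, the analogue of numpnts()

def total(grid):
    return sum(sum(row) for row in grid)

def rescale(grid, factor):
    # round() only removes floating-point noise from the multiplication.
    return [[round(p * factor) for p in row] for row in grid]

# Outliers included in the normalization: bins sum to 1 - n_outliers / n_total.
with_outliers = [[c / n_total for c in row] for row in counts]
# Outliers excluded from the normalization: bins sum to 1.
without_outliers = [[c / n_in_range for c in row] for row in counts]

recovered_a = rescale(with_outliers, n_total)        # factor: all points
recovered_b = rescale(without_outliers, n_in_range)  # factor: in-range points

# Multiplying the "sums to 1" variant by the total number of points instead
# would give a grand total of 12 rather than the 10 points actually binned.
inflated_total = total(without_outliers) * n_total
```

Under this reading, multiplying by the number of points is exact when the output sums to less than one, while an output that sums to 1 despite outliers would presumably need the number of in-range points as the factor.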

 

Original comment:

In case someone tries the same approach - this should not be multiplied by numpnts()... I think. I will think about how to get an accurate scale and get back to the forum once I have figured it out.