Remove outliers from box plot

Hi all,

I created an Igor Pro experiment file containing two synthetic samples, group1 and group2. I built a box plot in which the green line is the median, the red diamond is the mean, the whiskers span one standard deviation, and several points are flagged as outliers. The mean and standard deviation are currently calculated with the outliers included, but I think I should exclude the outliers from the statistics. I would therefore like to rebuild the box plot without the outliers it found. Is there an option to do this quickly? Or is the only way to determine manually which values in the samples are shown as outliers in the box plot, delete them from the table, and build the box plot again?

Please find attached the experiment file.

box plot with outliers

Another "hacky" way to do it is to right-click and open the modify box plot dialog. In the markers tab, set the color of the outliers and/or far outliers to the background color of the graph (white in your example).

Not removed, but visually hidden.

Andy

Thank you! But hiding the markers will not change the mean and standard deviation values. Do you usually leave outliers in and allow them to affect the statistics?

Hi,

I have no single method for handling outliers. Even the term "outlier" has a subjective connotation. Before I exclude any outliers from an analysis, I really have to convince myself that I have a root cause for their values, and for me this is a pretty high threshold.

To select the points that warrant a second look, there are a variety of techniques such as the jackknife and resampling (though your example has pretty limited sample sizes). If you want to automate the process, you will need to define a workflow or decision process for identifying which points constitute outliers. One such mechanism, for example:

Interquartile Rule for Outliers 

The interquartile range can be used to help detect outliers. All we need to do is the following:

  1. Calculate the interquartile range for our data
  2. Multiply the interquartile range (IQR) by the number 1.5
  3. Add 1.5 x (IQR) to the third quartile. Any number greater than this is a suspected outlier.
  4. Subtract 1.5 x (IQR) from the first quartile. Any number less than this is a suspected outlier.

I would use the rule to create a masking wave and then extract a new wave for your box plots.
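The rule and the masking step can be sketched in a few lines. This is only an illustration in plain Python, not Igor code: the function names `iqr_mask` and `apply_mask` and the sample values are made up, and the "inclusive" quartile convention is an assumption (other quartile definitions can move the fences slightly, especially for small samples). In Igor the mask would be a masking wave and the filtered list would be the extracted wave used for the new box plot.

```python
from statistics import mean, quantiles, stdev


def iqr_mask(values, k=1.5):
    """Return one boolean per value: True if it lies inside the IQR fences."""
    # Quartiles with the "inclusive" convention (an assumption, see above).
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr  # step 4: below this is a suspected outlier
    upper = q3 + k * iqr  # step 3: above this is a suspected outlier
    return [lower <= v <= upper for v in values]


def apply_mask(values, mask):
    """Keep only the values whose mask entry is True."""
    return [v for v, keep in zip(values, mask) if keep]


# Hypothetical sample standing in for group1.
group1 = [10, 12, 11, 13, 12, 11, 14, 12, 30]

mask = iqr_mask(group1)
trimmed = apply_mask(group1, mask)

print("with outliers:   ", mean(group1), stdev(group1))
print("without outliers:", mean(trimmed), stdev(trimmed))
```

Keeping the mask separate from the data leaves the original sample untouched, so the excluded points can still be inspected or restored later.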

Andy
