An outlier is a data value that lies more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. Specifically, if a value is less than ${Q_1 - 1.5 \times IQR}$ or greater than ${Q_3 + 1.5 \times IQR}$, then it is an outlier.
The outlier rule is given by the following condition:
${Outlier\ values\ are\ \lt Q_1 - 1.5 \times IQR\ (or)\ \gt Q_3 + 1.5 \times IQR }$
Where −
${Q_1}$ = First Quartile
${Q_3}$ = Third Quartile
${IQR}$ = Inter Quartile Range, i.e. ${Q_3 - Q_1}$
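The rule above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `outlier_fences` and `is_outlier` and the parameter `k` (the 1.5 multiplier) are assumptions introduced here, and the quartiles are taken as already-computed inputs.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) limits Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1  # inter quartile range
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(x, q1, q3, k=1.5):
    """True when x lies strictly below the lower limit or above the upper limit."""
    lower, upper = outlier_fences(q1, q3, k)
    return x < lower or x > upper
```

A value exactly equal to one of the limits is not flagged, which matches the strict inequalities in the rule.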
Problem Statement:
Consider a data set that records the periodic task counts of 8 different students. The task counts are 11, 13, 15, 3, 16, 25, 12 and 14. Find the outliers among the students' periodic task counts.
Solution:
Given data set is:
11 | 13 | 15 | 3 | 16 | 25 | 12 | 14 |
Arrange it in ascending order:
3 | 11 | 12 | 13 | 14 | 15 | 16 | 25 |
First Quartile Value (${Q_1}$), the median of the lower half 3, 11, 12, 13
${ Q_1 = \frac{(11 + 12)}{2} \\[7pt] \ = 11.5 }$
Third Quartile Value (${Q_3}$), the median of the upper half 14, 15, 16, 25
${ Q_3 = \frac{(15 + 16)}{2} \\[7pt] \ = 15.5 }$
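The two quartile values above are the medians of the lower and upper halves of the sorted data. A minimal Python sketch of that convention is shown here; the helper name `quartiles_median_of_halves` is an assumption made for illustration, and other quartile conventions (for example, interpolation-based ones used by some libraries) can give slightly different values for the same data.

```python
from statistics import median


def quartiles_median_of_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower_half = data[:half]
    # For an odd number of values the middle element is left out of both halves.
    upper_half = data[len(data) - half:]
    return median(lower_half), median(upper_half)


q1, q3 = quartiles_median_of_halves([11, 13, 15, 3, 16, 25, 12, 14])
print(q1, q3)  # 11.5 15.5
```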
Inter Quartile Range (IQR)
${ IQR = Q_3 - Q_1 \\[7pt] \ = 15.5 - 11.5 \\[7pt] \ = 4 }$
Lower Outlier Range (L)
${ Q_1 - 1.5 \times IQR \\[7pt] \ = 11.5 - (1.5 \times 4) \\[7pt] \ = 11.5 - 6 \\[7pt] \ = 5.5 }$
Upper Outlier Range (U)
${ Q_3 + 1.5 \times IQR \\[7pt] \ = 15.5 + (1.5 \times 4) \\[7pt] \ = 15.5 + 6 \\[7pt] \ = 21.5 }$
In the given data set, every value lies between the lower limit 5.5 and the upper limit 21.5 except 3 and 25: 3 is less than 5.5 and 25 is greater than 21.5.
Therefore, 3 and 25 are the outliers in the students' periodic task counts.
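The whole worked example can be checked with a short Python sketch. It assumes the same median-of-halves quartiles used in the solution; the function name `find_outliers` and the parameter `k` are hypothetical names chosen for this illustration.

```python
from statistics import median


def find_outliers(values, k=1.5):
    """Return (outliers, (lower, upper)) using median-of-halves quartiles."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])              # median of the lower half
    q3 = median(data[len(data) - half:])  # median of the upper half
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [x for x in data if x < lower or x > upper]
    return outliers, (lower, upper)


tasks = [11, 13, 15, 3, 16, 25, 12, 14]
outliers, limits = find_outliers(tasks)
print(limits)    # (5.5, 21.5)
print(outliers)  # [3, 25]
```

The sketch expects at least two values, since each half must be non-empty for its median to exist.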