- The five-number summary: five values that help describe the Center, Spread and Shape of the data.
 - X(smallest)
 - First Quartile (Q1)
 - Median (Q2)
 - Third Quartile (Q3)
 - X(largest)
 - BoxPlot is based on these five measures
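
As a rough illustration of how these five measures can be computed, here is a minimal Python sketch. The function name `five_number_summary` is hypothetical, and the quartile rule is an assumption: Q1 and Q3 are taken as the medians of the lower and upper halves, leaving the overall median out of both halves when the count is odd. Other tools may use different quartile conventions and return slightly different values.

```python
from statistics import median


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a non-empty list.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    with the overall median excluded from both halves when the count is odd.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    # A single value has empty halves; fall back to the value itself.
    q1 = median(lower) if lower else data[0]
    q3 = median(upper) if upper else data[-1]
    return data[0], q1, median(data), q3, data[-1]


print(five_number_summary([7, 1, 3, 9, 5, 11, 13]))  # (1, 3, 7, 11, 13)
```

The box of the plot would then span the second and fourth values of the returned tuple, with the central line at the third.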
 
- BoxPlot has Q1 and Q3 as its edges
 - BoxPlot can be horizontally or vertically plotted
 - If data is symmetric around the Median, the box and central line are centered between the endpoints as shown below
 
- BoxPlot can be Left-Skewed or Right-Skewed or Symmetric based on the data-set distribution as shown below
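
One simple way to read skew off the box is to compare the two halves of the box on either side of the median line. The sketch below is only a heuristic under that assumption: the function name `classify_skew` and the `tolerance` parameter are hypothetical, and the check ignores the whiskers, so it should be treated as a first impression and not a formal test of skewness.

```python
def classify_skew(q1, med, q3, tolerance=1e-9):
    """Label the box as symmetric, right-skewed or left-skewed.

    Heuristic: compare the distance from Q1 to the median with the
    distance from the median to Q3. Whiskers are not considered.
    """
    lower_half = med - q1   # width of the box below the median line
    upper_half = q3 - med   # width of the box above the median line
    if abs(upper_half - lower_half) <= tolerance:
        return "symmetric"
    return "right-skewed" if upper_half > lower_half else "left-skewed"


print(classify_skew(2, 4, 6))  # symmetric
print(classify_skew(2, 3, 5))  # right-skewed
print(classify_skew(2, 5, 6))  # left-skewed
```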
 
BoxPlot example showing an outlier
- BoxPlot plotted using the following data-set: 0, 2, 2, 2, 3, 3, 4, 5, 5, 9, 27
 - X(smallest) = 0
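 - First Quartile (Q1) = median of the lower five values (0, 2, 2, 2, 3) = 3rd element = 2
 - Median (Q2) = 6th of the 11 sorted elements = 3
 - Third Quartile (Q3) = median of the upper five values (4, 5, 5, 9, 27) = 9th element = 5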
 - X(largest) = 27
 - A value is considered an outlier if it lies more than 1.5 times the Inter-Quartile-Range (IQR) below Q1 or above Q3.
 - IQR = Q3 - Q1 = 5 - 2 = 3
 - Lower Limit for outlier below Q1 = Q1 - (1.5 * IQR) = 2 - 4.5 = -2.5
 - Upper Limit for outlier above Q3 = Q3 + (1.5 * IQR) = 5 + 4.5 = 9.5
 - 27 > 9.5, so 27 is an outlier; it is the only value in the data-set outside these limits
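
The outlier arithmetic above can be checked with a short Python sketch. The helper names `outlier_limits` and `find_outliers` and the parameter `k` are hypothetical; the quartiles Q1 = 2 and Q3 = 5 are passed in directly from the worked example instead of being recomputed.

```python
def outlier_limits(q1, q3, k=1.5):
    """Return (lower limit, upper limit) using the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """Return the values that fall outside the outlier limits."""
    lower, upper = outlier_limits(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


data = [0, 2, 2, 2, 3, 3, 4, 5, 5, 9, 27]
print(outlier_limits(2, 5))       # (-2.5, 9.5)
print(find_outliers(data, 2, 5))  # [27]
```

Note that 9 stays inside the limits because 9 < 9.5, so only 27 is flagged.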
 
BoxPlot usage
- Used when comparing segment performance
 - Used to identify the pattern (center, spread and skew) in the data-set and any outliers in it
 