05.08.2018

Averages, Quartiles, and the Five Number Summary

Descriptive statistics: A way to quickly summarize data within a set using just a few numbers.
Mean: The average of a set calculated by adding all the values in the set and dividing by the number of values in the set.
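Outlier: A value significantly higher or lower than the rest of the set. Outliers can skew the mean of a set.
Median: The middle value in a data set once the values are sorted. When the set has an even number of values, the median is the average of the two middle values.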
Mode: The value that appears most often in the set.
When a set has two modes it is called bimodal. When it has more than two modes, it is multimodal.
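The definitions above can be checked with a short Python sketch. The sample values are made up for illustration, and `multimode` assumes Python 3.8 or later. It is only meant to show how a single outlier pulls the mean much further than the median.

```python
from statistics import mean, median, multimode

# Made-up sample, not taken from the notes.
values = [2, 3, 3, 5, 7, 7, 8]

mean_clean = mean(values)        # sum of the values divided by their count
median_clean = median(values)    # middle value of the sorted data
modes = multimode(values)        # every value tied for most frequent

# Add one outlier and recompute.
with_outlier = values + [100]
mean_with_outlier = mean(with_outlier)
median_with_outlier = median(with_outlier)  # even count: average of the two middle values

print("mean:", mean_clean, "median:", median_clean, "modes:", modes)
print("with outlier -> mean:", mean_with_outlier, "median:", median_with_outlier)
```

Here the set is bimodal (3 and 7 each appear twice). Adding the outlier moves the mean from 5 to 16.875, while the median only moves from 5 to 6.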
Standard deviation: A measurement of the amount of variation from the mean in a data set.
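For example, if a data set has a mean of 50 units and a standard deviation of 20 units, most of the data will typically fall between 30 and 70 units, that is, within one standard deviation of the mean. For roughly bell-shaped (normal) data, this covers about 68% of the values.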
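A minimal sketch of the same idea in Python, using a made-up sample. It assumes the population standard deviation (`pstdev`); the notes do not say whether the population or sample formula is intended, and `stdev` would give the sample version.

```python
from statistics import mean, pstdev

# Made-up sample, not taken from the notes.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(data)
sigma = pstdev(data)  # population standard deviation (an assumption, see above)

# Interval within one standard deviation of the mean.
low, high = mu - sigma, mu + sigma
inside = [x for x in data if low <= x <= high]
share_within_one_sd = len(inside) / len(data)

print("mean:", mu, "standard deviation:", sigma)
print("share of values between", low, "and", high, ":", share_within_one_sd)
```

For this sample the mean is 5 and the standard deviation is 2, so the interval runs from 3 to 7 and holds 6 of the 8 values.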
Five number summary: The minimum, first quartile, median, third quartile, and maximum of a data set.
The quartiles divide the sorted data into four parts, each containing about 25% of the values.
The first and third quartiles can be found by identifying the medians of the lower and upper halves of the data.
Range: The distance between the maximum and minimum.
Interquartile range (IQR): The distance between the third and first quartiles.
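The five number summary, range, and IQR can be sketched in Python using the median-of-halves method described above. The function name `five_number_summary` and the sample data are hypothetical. For an odd number of values this sketch leaves the overall median out of both halves; that is one common convention, and other quartile methods can give slightly different results.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of the halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, skip the middle value so it belongs to neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return min(data), median(lower), median(data), median(upper), max(data)

# Made-up sample, not taken from the notes.
sample = [7, 15, 1, 8, 3, 12, 4, 10, 6]

summary = five_number_summary(sample)
minimum, q1, med, q3, maximum = summary

data_range = maximum - minimum  # distance between maximum and minimum
iqr = q3 - q1                   # distance between third and first quartiles

print("five number summary:", summary)
print("range:", data_range, "IQR:", iqr)
```

For this sample the lower half is 1, 3, 4, 6 and the upper half is 8, 10, 12, 15, giving a first quartile of 3.5, a median of 7, and a third quartile of 11.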

Graphical Organization

Boxplot: A graph representing the five number summary.
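The boxed area spans the IQR, from the first quartile to the third quartile, and a line inside the box marks the median, which is not necessarily at the center. Whiskers extend from the box to the minimum and maximum.
Frequency distribution: A table that sorts data into classes (intervals) of equal width.
Frequency: The number of data points that fall into each class.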
Cumulative frequency: The running total of the frequencies.
Relative frequency: The frequency divided by the total number of data points.
Cumulative relative frequency: The running total of the relative frequencies.
Histogram: A frequency distribution shown in graph form.
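The frequency terms above can be illustrated with a small Python sketch. The scores, the class width of 10, and the starting point of 50 are all made up for illustration. The row of `#` characters is only a rough text stand-in for a histogram bar.

```python
from itertools import accumulate

# Made-up sample, not taken from the notes.
scores = [52, 55, 61, 64, 67, 70, 71, 73, 78, 85]

start = 50        # lower limit of the first class (assumed)
class_width = 10  # every class has the same width (assumed)
n_classes = 4     # classes: 50-59, 60-69, 70-79, 80-89

# Frequency: number of data points in each class.
frequencies = [0] * n_classes
for s in scores:
    frequencies[(s - start) // class_width] += 1

total = len(scores)
cumulative = list(accumulate(frequencies))           # running total of frequencies
relative = [f / total for f in frequencies]          # frequency / number of data points
cumulative_relative = [c / total for c in cumulative]  # running total of relative frequencies

print("class   freq  cum  rel   cum rel  bar")
for i in range(n_classes):
    lo = start + i * class_width
    hi = lo + class_width - 1
    print(f"{lo}-{hi}   {frequencies[i]:>4} {cumulative[i]:>4}  "
          f"{relative[i]:.2f}  {cumulative_relative[i]:.2f}     {'#' * frequencies[i]}")
```

The last cumulative frequency equals the number of data points (10), and the last cumulative relative frequency equals 1.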

