Measures of Dispersion: Definition, Formulas and Examples

Edited By Komal Miglani | Updated on Jul 02, 2025 07:52 PM IST

Collecting data and summarising it with a few descriptive measures is a basic task in statistics. A measure of spread (dispersion) shows how much variation there is in the data: how widely the values are scattered and how far they deviate from a central value. Together with a measure of central tendency, these values describe the data more completely and help an analyst draw insights from it. Dispersion is one of the fundamentals of statistics and has applications in domains such as data analysis, weather forecasting and business.

This article covers measures of dispersion, an important topic in statistics that matters both for board exams and for various competitive exams.

Measures of the Dispersion of the Data

An important characteristic of any set of data is the variation in the data. The degree to which the numerical data tends to vary about an average value is called the dispersion or scatteredness of the data.

The following are the measures of dispersion:

  1. Range

  2. Mean Deviation

  3. Standard deviation and Variance

Range

Range is the difference between the highest and the lowest value in a set of observations.

The range of data gives us a rough idea of variability or scatter but does not tell about the dispersion of the data from a measure of central tendency.
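As a minimal sketch of the definition above, the range can be computed directly from the largest and smallest observations. The function name `data_range` is only an illustrative choice, not something from the article.

```python
def data_range(values):
    """Return the range: largest observation minus smallest observation."""
    if not values:
        raise ValueError("range is undefined for an empty data set")
    return max(values) - min(values)


# Data taken from Example 1 later in this article.
print(data_range([3, 8, 6, 5, 2, 1, 9, 3, 2]))  # 9 - 1 = 8
```

Only the two extreme values enter the calculation, which is why the range says nothing about how the remaining observations are spread.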

Mean Deviation

Mean deviation is the average of the absolute deviations of the observations from a chosen central value, usually the mean or the median.

Mean deviation for ungrouped data

Let the $n$ observations be $x_1, x_2, x_3, \ldots, x_n$.
If $x$ is a number, then its deviation from any given value $a$ is $|x-a|$.
To find the mean deviation of ungrouped data about the mean, the median or any other value $a$, the following steps are involved:
1. Calculate the measure of central tendency about which we need to find the mean deviation. Let it be ' $a$ '
2. Find the deviation of each $x_i$ from $a$, i.e., $\left|x_1-a\right|,\left|x_2-a\right|,\left|x_3-a\right|, \ldots,\left|x_n-a\right|$
3. Find the mean of these deviations. This mean is the mean deviation about ' $a$ ', i.e.,

Mean deviation about 'a', M.D. $(a)=\frac{1}{n} \sum_{i=1}^n\left|x_i-a\right|$
Mean deviation about mean, M.D. $(\bar{x})=\frac{1}{n} \sum_{i=1}^n\left|x_i-\bar{x}\right|$
Mean deviation about median, M.D.(Median) $=\frac{1}{n} \sum_{i=1}^n\left|x_i-\text{Median}\right|$
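The three steps above can be sketched in Python as follows. The function name `mean_deviation`, its `about` parameter and the sample data are illustrative assumptions; the central value comes from the standard-library `statistics` module.

```python
from statistics import mean, median


def mean_deviation(values, about=None):
    """Mean of |x_i - a|; `about` is the value a (defaults to the arithmetic mean)."""
    a = mean(values) if about is None else about
    return sum(abs(x - a) for x in values) / len(values)


data = [4, 7, 8, 9, 10, 12, 13, 17]  # hypothetical observations

md_about_mean = mean_deviation(data)                        # a = mean = 10
md_about_median = mean_deviation(data, about=median(data))  # a = median = 9.5
print(md_about_mean, md_about_median)
```

Passing any other number as `about` gives the mean deviation about that value, matching the general formula M.D.$(a)$.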

Mean deviation for ungrouped frequency distribution

Let the given data consist of $n$ distinct values $x_1, x_2, \ldots, x_n$ occurring with frequencies $f_1, f_2, \ldots, f_n$ respectively.

$
\begin{array}{lll}
x: x_1 & x_2 & x_3 \ldots x_n \\
f: f_1 & f_2 & f_3 \ldots f_n
\end{array}
$

1. Mean Deviation About Mean

First find the mean, i.e.

$
\bar{x}=\frac{\sum_{i=1}^n x_i f_i}{\sum_{i=1}^n f_i}=\frac{1}{\mathrm{~N}} \sum_{i=1}^n x_i f_i
$


Here, $N$ is the sum of all the frequencies.
Then, find the deviations of the observations $x_i$ from the mean $\bar{x}$ and take their absolute values, i.e., $\left|x_i-\bar{x}\right|$ for all $i=1,2, \ldots, n$.
After this, find the mean of the absolute values of the deviations:
$\operatorname{M.D.}(\bar{x})=\frac{\sum_{i=1}^n f_i\left|x_i-\bar{x}\right|}{\sum_{i=1}^n f_i}=\frac{1}{N} \sum_{i=1}^n f_i\left|x_i-\bar{x}\right|$


2. Mean Deviation About any value 'a'

$
\text { M.D.(a) }=\frac{1}{\mathrm{~N}} \sum_{i=1}^n f_i\left|x_i-\mathrm{a}\right|
$
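For a frequency distribution, each absolute deviation is weighted by its frequency. The sketch below follows the formulas above; the name `mean_deviation_freq` and the sample data are hypothetical.

```python
def mean_deviation_freq(values, freqs, about=None):
    """Mean deviation of a frequency distribution: (1/N) * sum(f_i * |x_i - a|)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    n_total = sum(freqs)  # N, the sum of all frequencies
    if about is None:     # default: deviation about the mean
        about = sum(x * f for x, f in zip(values, freqs)) / n_total
    return sum(f * abs(x - about) for x, f in zip(values, freqs)) / n_total


# Hypothetical data: x = 2, 4, 6 with frequencies 1, 2, 1.
# Mean = (2 + 8 + 6) / 4 = 4, so M.D. = (1*2 + 2*0 + 1*2) / 4 = 1.0
print(mean_deviation_freq([2, 4, 6], [1, 2, 1]))
```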

Mean deviation for grouped frequency distribution

The formula for mean deviation is the same as in the case of ungrouped frequency distribution. Here, $x_i$ is the midpoint of each class.
Note
The mean deviation about the median is the lowest as compared to the mean deviation about any other value.
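The note above can be checked numerically on a small data set. This is only an illustration on hypothetical data, not a proof: it compares the mean deviation about the median with the mean deviation about the mean and about a grid of trial values.

```python
from statistics import mean, median


def mean_deviation(values, a):
    """Mean of the absolute deviations |x_i - a|."""
    return sum(abs(x - a) for x in values) / len(values)


data = [1, 2, 3, 4, 100]  # hypothetical data with one extreme value

md_median = mean_deviation(data, median(data))  # about the median, 3
md_mean = mean_deviation(data, mean(data))      # about the mean, 22

# Scan trial values a = 0, 0.5, 1, ..., 100 and keep the smallest mean deviation found.
best = min(mean_deviation(data, k / 2) for k in range(201))

print(md_median, md_mean, best)
```

On this data the smallest value found in the scan coincides with the mean deviation about the median, while the mean deviation about the mean is noticeably larger.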

Standard Deviation

The standard deviation is a number that measures how far data values are from their mean.
The positive square root of the variance is called the standard deviation. The standard deviation is usually denoted by $\sigma$ and it is given by

$
\sigma=\sqrt{\frac{1}{n} \sum_{i=1}^n\left(x_i-\bar{x}\right)^2}
$

Variance

The mean of the squares of the deviations from the mean is called the variance and is denoted by $\sigma^2$ (read as sigma square). Variance is a quantity that leads to a proper measure of dispersion.

The variance of $n$ observations $x_1, x_2, \ldots, x_n$ is given by

$
\sigma^2=\frac{1}{n} \sum_{i=1}^n\left(x_i-\bar{x}\right)^2
$
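The two definitions above translate almost line for line into code. The sketch below uses hypothetical helper names and cross-checks against `pvariance` and `pstdev` from Python's `statistics` module, which also divide by $n$.

```python
from math import sqrt
from statistics import pstdev, pvariance


def variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / n


def standard_deviation(values):
    """Positive square root of the variance."""
    return sqrt(variance(values))


data = [3, 4, 5, 6, 7]           # mean is 5
print(variance(data))            # (4 + 1 + 0 + 1 + 4) / 5 = 2.0
print(standard_deviation(data))  # square root of 2.0

# Cross-check against the standard library's population versions.
print(pvariance(data), pstdev(data))
```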

Variance and Standard Deviation of an Ungrouped Frequency Distribution

The given data is

$
\begin{aligned}
& x: x_1, \quad x_2, \quad x_3, \quad \ldots \quad x_n \\
& f: f_1, f_2, f_3, \ldots f_n
\end{aligned}
$


In this case, Variance $\left(\sigma^2\right)=\frac{1}{N} \sum_{i=1}^n f_i\left(x_i-\bar{x}\right)^2$ and, Standard Deviation $(\sigma)=\sqrt{\frac{1}{N} \sum_{i=1}^n f_i\left(x_i-\bar{x}\right)^2}$ where, $\mathrm{N}=\sum_{i=1}^n f_i$
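A minimal sketch of these frequency-weighted formulas, with the hypothetical name `variance_freq` and made-up data:

```python
from math import sqrt


def variance_freq(values, freqs):
    """Variance of a frequency distribution: (1/N) * sum(f_i * (x_i - mean)^2)."""
    n_total = sum(freqs)  # N = sum of the frequencies
    x_bar = sum(x * f for x, f in zip(values, freqs)) / n_total
    return sum(f * (x - x_bar) ** 2 for x, f in zip(values, freqs)) / n_total


# Hypothetical data: x = 2, 4, 6 with frequencies 1, 2, 1 (mean 4).
var = variance_freq([2, 4, 6], [1, 2, 1])
print(var)        # (1*4 + 2*0 + 1*4) / 4 = 2.0
print(sqrt(var))  # standard deviation
```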

Variance and Standard deviation of a grouped frequency distribution

The formulas for variance and standard deviation are the same as in the case of an ungrouped frequency distribution. Here, $x_i$ is the midpoint of each class.
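For grouped data, the only extra step is replacing each class by its midpoint. The sketch below assumes the classes are given as (lower, upper) pairs; the function name and the class intervals are hypothetical.

```python
from math import sqrt


def grouped_variance(class_limits, freqs):
    """Variance of grouped data, taking x_i as the midpoint of each class."""
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    n_total = sum(freqs)
    x_bar = sum(m * f for m, f in zip(midpoints, freqs)) / n_total
    return sum(f * (m - x_bar) ** 2 for m, f in zip(midpoints, freqs)) / n_total


# Hypothetical classes 0-10, 10-20, 20-30 with frequencies 1, 2, 1.
classes = [(0, 10), (10, 20), (20, 30)]
freqs = [1, 2, 1]

var = grouped_variance(classes, freqs)
print(var)        # midpoints 5, 15, 25; mean 15; (100 + 0 + 100) / 4 = 50.0
print(sqrt(var))  # standard deviation
```

Because every observation in a class is treated as if it sat at the midpoint, the result is an approximation of the variance of the underlying raw data.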

Another formula for Standard Deviation

Variance
$\begin{aligned}
\sigma^2 & =\frac{1}{N} \sum_{i=1}^n f_i\left(x_i-\bar{x}\right)^2=\frac{1}{N} \sum_{i=1}^n f_i\left(x_i^2+\bar{x}^2-2 \bar{x} x_i\right) \\
& =\frac{1}{N}\left[\sum_{i=1}^n f_i x_i^2+\sum_{i=1}^n \bar{x}^2 f_i-\sum_{i=1}^n 2 \bar{x} f_i x_i\right] \\
& =\frac{1}{N}\left[\sum_{i=1}^n f_i x_i^2+\bar{x}^2 \sum_{i=1}^n f_i-2 \bar{x} \sum_{i=1}^n x_i f_i\right] \\
& =\frac{1}{N}\left[\sum_{i=1}^n f_i x_i^2+\bar{x}^2 N-2 \bar{x} \cdot N \bar{x}\right] \quad\left[\because \frac{1}{N} \sum_{i=1}^n x_i f_i=\bar{x} \text { or } \sum_{i=1}^n x_i f_i=N \bar{x}\right] \\
& =\frac{1}{N} \sum_{i=1}^n f_i x_i^2+\bar{x}^2-2 \bar{x}^2=\frac{1}{N} \sum_{i=1}^n f_i x_i^2-\bar{x}^2
\end{aligned}$

Standard Deviation

$\sigma = \sqrt{\frac{1}{\mathrm{~N}} \sum_{i=1}^n f_i x_i^2-\bar{x}^2}$
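As a quick numerical check, the shortcut formula and the defining formula should agree up to floating-point rounding. The function names and the data below are hypothetical.

```python
from math import isclose, sqrt


def sd_definition(values, freqs):
    """Standard deviation from sqrt((1/N) * sum(f_i * (x_i - mean)^2))."""
    n_total = sum(freqs)
    x_bar = sum(f * x for x, f in zip(values, freqs)) / n_total
    return sqrt(sum(f * (x - x_bar) ** 2 for x, f in zip(values, freqs)) / n_total)


def sd_shortcut(values, freqs):
    """Standard deviation from sqrt((1/N) * sum(f_i * x_i^2) - mean^2)."""
    n_total = sum(freqs)
    x_bar = sum(f * x for x, f in zip(values, freqs)) / n_total
    mean_of_squares = sum(f * x * x for x, f in zip(values, freqs)) / n_total
    return sqrt(mean_of_squares - x_bar ** 2)


values = [1, 2, 3, 4]  # hypothetical x_i
freqs = [2, 3, 4, 1]   # hypothetical f_i, N = 10

print(sd_definition(values, freqs), sd_shortcut(values, freqs))
print(isclose(sd_definition(values, freqs), sd_shortcut(values, freqs)))
```

The shortcut avoids computing each deviation, which is convenient by hand. In floating-point arithmetic it subtracts two nearly equal numbers when the spread is small relative to the mean, so the defining formula is usually the safer choice in code.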


Solved Examples Based on Measures of Dispersion

Example 1: What is the range of the data $3,8,6,5,2,1,9,3,2$ ?
1) $9$
2) $10$
3) $8$
4) $5$

Solution
Range: The range is the difference between the largest and the smallest observations. It is the simplest measure of dispersion.
Range $=9-1=8$

Hence, the answer is option 3.

Example 2: The mean of $5$ observations is $5$ and their variance is $12.4$. If three of the observations are $1, 2$ and $6$, then the mean deviation from the mean of the data is:
1) $2.4$
2) $2.8$
3) $2.5$
4) $2.6$

Solution
We use the following concepts:
Arithmetic Mean -
For the values $x_1, x_2, \ldots, x_n$ of the variate $x$, the arithmetic mean is given by

$
\bar{x}=\frac{x_1+x_2+x_3+\cdots+x_n}{n}
$

Mean Deviation -
If $x_1, x_2, \ldots, x_n$ are $n$ observations, then the mean deviation from the point $A$ is given by:

$
\frac{1}{n} \sum\left|x_i-A\right|
$

Variance -

In case of discrete data

$
\sigma^2=\left(\frac{\sum x_i^2}{n}\right)-\left(\frac{\sum x_i}{n}\right)^2
$
Now,

$
\begin{aligned}
& \frac{\sum x_i}{5}=5 \Rightarrow \sum x_i=25 \\
& \frac{\sum x_i^2}{n}-\left(\frac{\sum x_i}{n}\right)^2=12.4 \\
& \frac{\sum x_i^2}{5}-25=12.4 \\
& \sum x_i^2=37.4 \times 5=187
\end{aligned}
$

Let the two remaining observations be $a$ and $b$.

$
\begin{aligned}
& a+b+1+2+6=25 \\
& a+b=16 \\
& a^2+b^2+1^2+2^2+6^2=187 \\
& a^2+b^2+1+4+36=187 \\
& a^2+b^2=146
\end{aligned}
$

Since $2ab=(a+b)^2-\left(a^2+b^2\right)=256-146=110$, we get $ab=55$. So $a$ and $b$ are the roots of $t^2-16t+55=0$, i.e., $a=5$ and $b=11$.

$
\begin{aligned}
& \text { Mean deviation }=\frac{\sum\left|x_i-5\right|}{5}=\frac{|1-5|+|2-5|+|6-5|+|a-5|+|b-5|}{5} \\
& =\frac{8+|5-5|+|11-5|}{5}=\frac{8+6}{5}=2.8
\end{aligned}
$
Hence, the answer is option 2.

Example 3: If the mean deviation of the numbers $1,1+d, \ldots, 1+100 d$ from their mean is $255$ , then a value of $d$ is :
1) $10.1$
2) $20.2$
3) $10$
4) $5.05$

Solution
Mean Deviation -If $x_1, x_2, \ldots x_n$ are $n$ observations then the mean deviation from point $A$ is given by :

$
\frac{1}{n} \sum\left|x_i-A\right|
$

$
\text { Mean }=\frac{1+(1+d)+(1+2 d)+\cdots+(1+100 d)}{101}=1+50 d
$

Mean deviation
$
\begin{aligned}
& \frac{1}{101} \sum_{r=0}^{100}|(1+r d)-(1+50 d)|=\frac{d}{101} \sum_{r=0}^{100}|r-50|=255 \quad(\text{taking } d>0) \\
& \Rightarrow \frac{1}{101} \times 2 d \times \frac{50 \times 51}{2}=255 \\
& \Rightarrow d=10.1
\end{aligned}
$
Hence, the answer is option 1.
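The result of Example 3 can be sanity-checked numerically. The sketch below builds the 101 numbers for $d=10.1$ and computes their mean deviation about the mean; the function name is a hypothetical helper.

```python
from math import isclose


def mean_deviation_about_mean(values):
    """Mean of |x_i - mean| over all observations."""
    x_bar = sum(values) / len(values)
    return sum(abs(x - x_bar) for x in values) / len(values)


d = 10.1
numbers = [1 + r * d for r in range(101)]  # 1, 1 + d, ..., 1 + 100d

md = mean_deviation_about_mean(numbers)
print(md)                               # close to 255, up to floating-point rounding
print(isclose(md, 255, rel_tol=1e-9))
```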

Example 4: The mean deviation of the numbers $3,4,5,6,7$ is
1) $0$
2) $1.2$
3) $5$
4) $25$

Solution

$
\begin{aligned}
&\text { Here, the mean can be calculated as: }\\
&\bar{x}=\frac{3+4+5+6+7}{5}=5
\end{aligned}
$

$
\begin{array}{cc}
x & |x-\bar{x}| \\
3 & 2 \\
4 & 1 \\
5 & 0 \\
6 & 1 \\
7 & 2
\end{array}
$

$
\sum|x-\bar{x}|=6
$


Mean deviation from the mean

$
\begin{aligned}
& =\frac{6}{5} \\
& =1.2
\end{aligned}
$
Hence, the answer is option (2).


Example 5: Let $\bar{X}$ and M.D. be the mean and the mean deviation about $\bar{X}$ of $n$ observations $x_i, i=1,2, \ldots, n$. If each of the observations is increased by $5$, then the new mean and the mean deviation about the new mean, respectively, are:
1) $\bar{X}$, M.D.
2) $\bar{X}+5$, M.D.
3) $\bar{X}$, M.D. $+5$
4) $\bar{X}+5$, M.D. $+5$

Solution
Each observation is increased by $5$.

$
\text { New mean }=\frac{\text { new sum }}{n}=\frac{\left(x_1+5\right)+\left(x_2+5\right)+\ldots \ldots+\left(x_n+5\right)}{n}
$


$
\begin{aligned}
& =\frac{x_1+x_2+\ldots \ldots+x_n}{n}+\frac{5 n}{n} \\
& =\bar{X}+5
\end{aligned}
$


New mean deviation about the new mean:

$
\begin{aligned}
& =\frac{1}{n} \sum_{i=1}^n\left|\left(x_i+5\right)-(\bar{X}+5)\right| \\
& =\frac{1}{n} \sum_{i=1}^n\left|x_i-\bar{X}\right|
\end{aligned}
$

$=$ old mean deviation
So, the mean will be increased by 5 but there will be no change in M.D.
Hence, the answer is option (2).
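The conclusion of Example 5 can be illustrated on a small data set. The data and helper names below are hypothetical; the shift of 5 is the one used in the example.

```python
from math import isclose


def mean(values):
    """Arithmetic mean."""
    return sum(values) / len(values)


def mean_deviation(values):
    """Mean deviation about the mean."""
    x_bar = mean(values)
    return sum(abs(x - x_bar) for x in values) / len(values)


data = [2, 4, 6, 8, 10]           # hypothetical observations
shifted = [x + 5 for x in data]   # every observation increased by 5

print(mean(data), mean(shifted))                      # the mean moves up by 5
print(mean_deviation(data), mean_deviation(shifted))  # the mean deviation is unchanged
```

Adding a constant moves every observation and the mean by the same amount, so each deviation $x_i-\bar{X}$ is untouched.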

Frequently Asked Questions (FAQs)

1. What are the measures of dispersion?

The degree to which the numerical data tends to vary about an average value is called the dispersion or scatteredness of the data. The measures of dispersion are Range, Mean deviation, Variance and Standard deviation.

2. Is mean a measure of dispersion?

No. The mean is a measure of central tendency, not of dispersion. The measures of central tendency include the mean, median and mode.

3. What is Mean deviation?

Mean deviation is the average of the absolute deviations of the observations from a central value such as the mean or the median.

4. What is variance?

The mean of the squares of the deviations from the mean is called the variance.

5. What is standard deviation?

The standard deviation is a number that measures how far data values are from their mean.

6. What is the relationship between standard deviation and the empirical rule (68-95-99.7 rule) in normal distributions?
The empirical rule, also known as the 68-95-99.7 rule, relates directly to standard deviation in normal distributions. It states that approximately 68% of data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations. This rule demonstrates how standard deviation quantifies the spread of data in normal distributions and helps in understanding the probability of data falling within certain ranges.
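The percentages in the empirical rule can be reproduced from the normal cumulative distribution function. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `within_k_sd` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def within_k_sd(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {within_k_sd(k):.4f}")
```

The printed values round to roughly 68%, 95% and 99.7%. The rule applies to normal distributions only; other distributions can place very different proportions within the same number of standard deviations.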
7. How does the coefficient of variation (CV) differ from standard deviation?
The coefficient of variation (CV) is a relative measure of dispersion, while standard deviation is an absolute measure. CV is calculated as (standard deviation / mean) * 100, expressing variability as a percentage of the mean. This allows for comparison of dispersion between datasets with different units or vastly different means, which standard deviation alone cannot do.
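A short sketch of the CV formula follows. It assumes the population standard deviation (`statistics.pstdev`, which divides by $n$ as in the formulas earlier in this article); the function name and the height data are hypothetical.

```python
from math import isclose
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = (standard deviation / mean) * 100, using the population standard deviation."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / m * 100


heights_cm = [150, 160, 170]  # hypothetical heights in centimetres
heights_m = [1.5, 1.6, 1.7]   # the same heights in metres

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(cv_cm, cv_m)
print(isclose(cv_cm, cv_m, rel_tol=1e-9))  # the unit does not affect the CV
```

The standard deviations of the two lists differ by a factor of 100, but the CV is the same for both because the unit cancels in the ratio.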
8. In what situations might the mean absolute deviation (MAD) be preferred over standard deviation?
The mean absolute deviation (MAD) might be preferred over standard deviation when dealing with datasets that have outliers or non-normal distributions. MAD is less sensitive to extreme values because it uses absolute values instead of squares. It's also easier to interpret for some audiences, as it represents the average distance from the mean without the additional step of squaring and then taking the square root.
9. How do measures of dispersion help in identifying outliers?
Measures of dispersion help identify outliers by providing a reference for what's considered "normal" variability in the dataset. For example, in a normal distribution, about 95% of data points fall within two standard deviations of the mean. Values beyond this range might be considered outliers. Similarly, points beyond 1.5 times the interquartile range below Q1 or above Q3 are often flagged as potential outliers in box plots.
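The 1.5 × IQR rule mentioned above can be sketched as follows. Quartile conventions differ between textbooks and software; this example assumes the default method of `statistics.quantiles`, and the function name and data are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles; the middle one is the median
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]


data = [10, 12, 13, 14, 15, 16, 18, 95]  # hypothetical data with one extreme value
print(iqr_outliers(data))                # flags 95 as a potential outlier
```

Flagged points are candidates for a closer look, not automatic errors.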
10. What is the concept of degrees of freedom and how does it relate to measures of dispersion?
Degrees of freedom (df) represent the number of independent pieces of information that go into estimating a parameter. In the context of dispersion measures, df is often n-1 for a sample (where n is the sample size). This adjustment accounts for the fact that we've used one degree of freedom to estimate the mean. Understanding df is crucial for correctly calculating and interpreting measures like sample variance and for statistical inference based on these measures.
11. What are measures of dispersion and why are they important in statistics?
Measures of dispersion, also known as measures of variability, describe how spread out data points are in a dataset. They are important because they provide information about the distribution of data that measures of central tendency (like mean or median) alone cannot capture. Measures of dispersion help us understand the variability and consistency of data, which is crucial for making accurate interpretations and comparisons between datasets.
12. How does the range differ from other measures of dispersion?
The range is the simplest measure of dispersion, calculated as the difference between the highest and lowest values in a dataset. Unlike other measures like standard deviation or variance, the range only considers the extreme values and doesn't account for the distribution of data points in between. This makes it easy to calculate but less informative about the overall spread of the data compared to other measures.
13. Why might the interquartile range (IQR) be preferred over the range in some situations?
The interquartile range (IQR) is often preferred over the range because it's less sensitive to outliers. The IQR measures the spread of the middle 50% of the data, ignoring the top and bottom 25%. This makes it more robust and representative of the typical spread in the data, especially when extreme values or outliers are present that could skew the simple range calculation.
14. How does variance relate to standard deviation?
Variance and standard deviation are closely related measures of dispersion. Variance is calculated as the average squared deviation from the mean, while standard deviation is the square root of the variance. In other words, standard deviation = √variance. Standard deviation is often preferred because it's in the same units as the original data, making it easier to interpret, while variance is in squared units.
15. Why do we use n-1 instead of n in the sample variance formula?
We use n-1 instead of n in the sample variance formula to correct for bias. This adjustment, known as Bessel's correction, compensates for the fact that we're using the sample mean (which is an estimate) instead of the true population mean. Using n-1 makes the sample variance an unbiased estimator of the population variance, especially important for smaller sample sizes.
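The difference between the two divisors can be seen on a small sample. In the sketch below, `variance_with_divisor` and its `ddof` parameter are hypothetical names (the parameter name is borrowed from common usage in numerical libraries); the results are compared with `pvariance` and `variance` from the `statistics` module.

```python
from statistics import pvariance, variance


def variance_with_divisor(values, ddof=0):
    """Sum of squared deviations divided by (n - ddof).

    ddof=0 gives the population formula (divide by n);
    ddof=1 gives the sample formula with Bessel's correction (divide by n - 1).
    """
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - ddof)


sample = [3, 4, 5, 6, 7]                       # sum of squared deviations is 10
print(variance_with_divisor(sample, ddof=0))   # 10 / 5 = 2.0
print(variance_with_divisor(sample, ddof=1))   # 10 / 4 = 2.5
print(pvariance(sample), variance(sample))     # standard-library equivalents
```

The gap between the two results shrinks as the sample size grows, since dividing by $n$ and by $n-1$ become nearly the same.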
16. Why is it important to consider both measures of central tendency and measures of dispersion when analyzing data?
Considering both measures of central tendency and dispersion provides a more complete picture of the data. Central tendency measures (like mean or median) tell us about the typical or average value, while dispersion measures reveal how spread out the data is around that central value. Together, they give insight into the distribution's shape, variability, and potential outliers, allowing for more accurate interpretations and comparisons between datasets.
17. How does skewness relate to measures of dispersion?
Skewness describes the asymmetry of a distribution and is related to measures of dispersion. In skewed distributions, measures of dispersion can be affected differently. For instance, in a right-skewed distribution, the mean is pulled towards the tail, increasing the standard deviation. Understanding skewness helps interpret dispersion measures more accurately and choose appropriate measures (e.g., using median and IQR for highly skewed data instead of mean and standard deviation).
18. How do measures of dispersion change when data is transformed (e.g., squared or log-transformed)?
When data is transformed, measures of dispersion change in ways that depend on the specific transformation. For example:
19. Why might we use different measures of dispersion for ordinal vs. interval/ratio data?
We use different measures of dispersion for ordinal vs. interval/ratio data because of the nature of the measurement scales. For ordinal data, where the intervals between values may not be equal, measures like the interquartile range are more appropriate as they don't assume equal intervals. For interval/ratio data, where intervals are meaningful, measures like standard deviation can be used as they take advantage of the precise numerical differences between values.
20. How does sample size affect the reliability of measures of dispersion?
Sample size significantly affects the reliability of measures of dispersion. Larger sample sizes generally provide more reliable estimates of population dispersion. With small samples, measures like standard deviation can be heavily influenced by extreme values or sampling variability. As sample size increases, the estimates become more stable and closer to the true population values. This is why it's important to consider sample size when interpreting or comparing measures of dispersion.
21. How do measures of dispersion behave in multimodal distributions?
In multimodal distributions (those with multiple peaks), measures of dispersion can be misleading if interpreted without context. Standard measures like variance or standard deviation may indicate high dispersion, even if the data clusters tightly around multiple modes. In these cases, it's often more informative to use methods that can capture the complex structure, such as mixture models or non-parametric approaches, alongside traditional dispersion measures.
22. What is the difference between population and sample measures of dispersion?
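Population measures of dispersion describe the variability in an entire population, while sample measures estimate this variability from a subset of the population. The formulas differ slightly: the population variance divides the sum of squared deviations by the number of observations, whereas the sample variance divides it by n-1 (Bessel's correction, see question 15).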
23. What is the concept of homoscedasticity and how does it relate to measures of dispersion?
Homoscedasticity refers to the condition where the variability of a variable is uniform across the range of values of another variable that predicts it. In other words, the dispersion of the dependent variable should be consistent for all values of the independent variable. This concept is important in regression analysis and ANOVA. Violations of homoscedasticity (called heteroscedasticity) can be detected by examining how measures of dispersion change across different subgroups or values of predictors.
24. How do outliers affect different measures of dispersion?
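Outliers can significantly affect measures of dispersion, but the impact varies: the range depends only on the extreme values and the standard deviation can be heavily influenced by them, while the interquartile range and the median absolute deviation are more robust (see questions 13 and 27).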
25. What is the relationship between measures of dispersion and statistical power?
Measures of dispersion are closely related to statistical power, which is the ability to detect a true effect in a study. Generally, higher dispersion (larger standard deviation) reduces statistical power, making it harder to detect significant differences or relationships. This is because increased variability makes it more challenging to distinguish true effects from random fluctuations. Understanding this relationship is crucial for designing studies with adequate sample sizes to achieve desired power levels.
26. How can measures of dispersion be used to detect data quality issues or errors?
Measures of dispersion can help detect data quality issues or errors by identifying unexpected patterns of variability. Unusually high dispersion might indicate data entry errors, measurement inconsistencies, or the presence of outliers. Conversely, suspiciously low dispersion could suggest data fabrication or rounding errors. By comparing observed dispersion to expected values based on similar datasets or theoretical distributions, researchers can flag potential issues for further investigation.
27. What is the concept of robust measures of dispersion and why are they important?
Robust measures of dispersion are those that are less sensitive to outliers or departures from normality. Examples include the interquartile range (IQR) and median absolute deviation (MAD). These measures are important because they provide reliable estimates of spread even when data contains extreme values or follows non-normal distributions. Robust measures are particularly useful in real-world datasets where outliers or non-normality are common, ensuring that the dispersion estimate isn't unduly influenced by a few extreme points.
28. How do measures of dispersion relate to the concept of effect size in statistics?
Measures of dispersion play a crucial role in calculating and interpreting effect sizes, which quantify the magnitude of differences between groups or relationships between variables. Many effect size measures, such as Cohen's d or standardized mean difference, express the difference between groups in terms of standard deviations. Understanding the dispersion in your data is therefore essential for accurately calculating and interpreting effect sizes, which in turn are important for assessing practical significance beyond mere statistical significance.
29. What is the relationship between measures of dispersion and probability distributions?
Measures of dispersion are fundamental characteristics of probability distributions. Different distributions have specific relationships with dispersion measures:
30. How can measures of dispersion be used in cluster analysis or classification problems?
In cluster analysis and classification problems, measures of dispersion help quantify the spread within and between groups. They're used to:
31. What is the concept of dispersion matrices in multivariate statistics?
Dispersion matrices, such as the variance-covariance matrix, extend the concept of dispersion to multiple variables. These matrices capture not only the variability of individual variables (on the diagonal) but also the covariances between pairs of variables (off-diagonal elements). They're crucial in multivariate analyses, including:
32. How do measures of dispersion relate to the concept of entropy in information theory?
Measures of dispersion are conceptually related to entropy in information theory. Both quantify the amount of uncertainty or variability in a system. Higher dispersion generally corresponds to higher entropy, indicating more uncertainty or information content. While traditional dispersion measures focus on numerical spread, entropy considers the probabilities of different outcomes. Understanding this relationship can provide insights into data complexity and predictability, bridging concepts from statistics and information theory.
33. What is the role of measures of dispersion in financial risk assessment?
In financial risk assessment, measures of dispersion are crucial for quantifying uncertainty and potential variability in returns. Key applications include:
34. How can measures of dispersion be visualized effectively in data presentations?
Effective visualization of dispersion can greatly enhance data presentations. Some methods include:
35. What is the concept of dispersion indices in ecology and how do they relate to statistical measures of dispersion?
Dispersion indices in ecology measure how organisms are distributed in space. They relate to statistical measures of dispersion but are adapted for ecological contexts:
36. How do measures of dispersion change when data is aggregated or summarized?
When data is aggregated or summarized, measures of dispersion typically change:
37. What is the concept of heterogeneity and how does it relate to measures of dispersion?
Heterogeneity refers to the degree of dissimilarity or variability within a dataset or population. It's closely related to measures of dispersion:
38. How do measures of dispersion behave in time series data?
In time series data, measures of dispersion can reveal important patterns:
39. What is the relationship between measures of dispersion and the concept of statistical moments?
Measures of dispersion are closely related to statistical moments:
40. How can measures of dispersion be used to compare the consistency of different measurement methods or instruments?
Measures of dispersion can be used to assess the precision or consistency of measurement methods or instruments. Lower dispersion (e.g., smaller standard deviation or coefficient of variation) indicates higher consistency. For example, if two thermometers are used to take multiple readings of the same temperature, the one with lower dispersion in its measurements would be considered more precise. This application is crucial in fields like metrology and quality control.
