Measures of Central Tendency in Statistics

Komal Miglani, updated on 02 Jul 2025, 07:54 PM IST

In statistics, a central value summarizes a data set with a single number, such as the mean, around which the observations tend to cluster. Measures of central tendency make a data set easier to describe, compare, and analyze, which in turn helps an analyst draw insights from it. They are among the fundamentals of statistics and have applications in domains such as data analysis, weather forecasting, and business.

This article covers measures of central tendency, an important concept within the broader subject of Statistics. It matters not only for board exams but also for various competitive exams.

Central Value of Data (Central Tendency)

A measure of central tendency (or central value) is a single value that attempts to describe a set of data by identifying the central position within that set of data. Apart from mean (often called the average), there are other central values such as the median and the mode.

The mean, median, and mode are all valid measures of central tendency, but under different conditions, some measures of central tendency become more appropriate to use than others.

Mean

The mean is equal to the sum of all the values in the data set divided by the number of values in the data set. If we have $n$ values in a data set, i.e. $x_1, x_2, x_3, \ldots, x_n$, then its mean, usually denoted by $\bar{x}$ (pronounced "$x$ bar"), is:

$
\bar{x}=\frac{x_1+x_2+\cdots+x_n}{n}
$

Applications of Mean:
1. Calculating average income or expenditure.
2. Analyzing trends and patterns.

For example, to calculate the mean weight of 50 people, add the 50 weights together and divide by 50. Technically this is the arithmetic mean.

Mean of the Ungrouped Data

If the $n$ observations in the data are $x_1, x_2, x_3, \ldots, x_n$, then the arithmetic mean $\bar{x}$ is given by

$
\bar{x}=\frac{x_1+x_2+x_3+\cdots+x_n}{n}=\frac{1}{n} \sum\limits_{i=1}^n x_i
$
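As a minimal sketch of the formula above, the mean can be computed directly as a sum divided by a count. The function name `arithmetic_mean` and the sample weights are illustrative assumptions, not part of the original article; `statistics.fmean` from the Python standard library is shown only as a cross-check.

```python
from statistics import fmean


def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical weights in kg, chosen only for illustration.
weights = [62.0, 71.5, 58.0, 80.5]

print(arithmetic_mean(weights))  # direct use of the formula
print(fmean(weights))            # standard-library cross-check
```

Both calls should agree, since they implement the same definition.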

Mean of Ungrouped Frequency Distribution

If the observations in the data are $x_1, x_2, x_3, \ldots, x_n$ with respective frequencies $f_1, f_2, f_3, \ldots, f_n$, then

Sum of the values of the observations $=f_1 x_1+f_2 x_2+f_3 x_3+\cdots+f_n x_n$
and number of observations $=f_1+f_2+f_3+\cdots+f_n$.
The mean in this case is given by

$\bar{x}=\frac{f_1 x_1+f_2 x_2+f_3 x_3+\cdots+f_n x_n}{f_1+f_2+f_3+\cdots+f_n}=\frac{\sum\limits_{i=1}^n f_i x_i}{\sum\limits_{i=1}^n f_i}$
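The frequency-weighted mean can be sketched in a few lines of Python. The helper `frequency_mean` and the marks/counts data are hypothetical and serve only to illustrate the formula.

```python
def frequency_mean(values, freqs):
    """Mean of an ungrouped frequency distribution: sum(f_i * x_i) / sum(f_i)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total = sum(freqs)
    if total == 0:
        raise ValueError("the total frequency must be positive")
    return sum(f * x for x, f in zip(values, freqs)) / total


# Hypothetical data: marks obtained and the number of students with each mark.
marks = [2, 4, 6, 8]
counts = [1, 3, 4, 2]

# Numerator: 2*1 + 4*3 + 6*4 + 8*2 = 54, denominator: 1 + 3 + 4 + 2 = 10
print(frequency_mean(marks, counts))
```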

Grouped Frequency Distribution

For a grouped frequency distribution, each class (or interval) is represented by its midpoint $m_i$, which takes the place of $x_i$:

$
m_i=\frac{\text { lower boundary }+ \text { upper boundary }}{2}
$

Then, $\bar{x}=\frac{\sum\limits_{i=1}^n f_i m_i}{\sum\limits_{i=1}^n f_i}$

For example,
The frequency table below shows the results of a professor's last statistics test. The best estimate of the class mean is found as follows.

$
\begin{array}{|c|c|}
\hline \text { Grade Interval } & \text { Number of Students } \\
\hline 10-12 & 1 \\
\hline 12-14 & 2 \\
\hline 14-16 & 0 \\
\hline 16-18 & 4 \\
\hline 18-20 & 1 \\
\hline
\end{array}
$

First find the midpoints for all intervals

$
\begin{array}{|c|c|}
\hline \text { Grade Interval } & \text { Midpoint } \\
\hline 10-12 & 11 \\
\hline 12-14 & 13 \\
\hline 14-16 & 15 \\
\hline 16-18 & 17 \\
\hline 18-20 & 19 \\
\hline
\end{array}
$

Now calculate the sum of the product of each interval frequency and midpoint,

$
\begin{aligned}
& \sum\limits_{i=1}^n f_i m_i=11(1)+13(2)+15(0)+17(4)+19(1)=124 \\
& \sum\limits_{i=1}^n f_i=1+2+0+4+1=8 \\
& \bar{x}=\frac{\sum\limits_{i=1}^n f_i m_i}{\sum\limits_{i=1}^n f_i}=\frac{124}{8}=15.5
\end{aligned}
$
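The same calculation can be reproduced with a short script. This is a sketch using the grade table from the example; the function name `grouped_mean` is an assumption made for illustration.

```python
def grouped_mean(intervals, freqs):
    """Estimate the mean of grouped data from class midpoints and frequencies."""
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    return sum(f * m for f, m in zip(freqs, midpoints)) / sum(freqs)


# Grade intervals and student counts from the table above.
intervals = [(10, 12), (12, 14), (14, 16), (16, 18), (18, 20)]
freqs = [1, 2, 0, 4, 1]

print(grouped_mean(intervals, freqs))  # 15.5
```

The result is only an estimate of the class mean, because every score in a class is treated as if it sat at the class midpoint.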

Median

The median is the middle value for a set of data that has been arranged in ascending or descending order.

It is a number that separates ordered data into two equal halves. Half the values are less than or equal to the median, and half are greater than or equal to it.

For example, to find the median of the following data

$\begin{array}{lllllllllll}65 & 55 & 89 & 56 & 35 & 14 & 56 & 55 & 87 & 45 & 92\end{array}$

We first rearrange the data in ascending order:
$\begin{array}{lllllllllll}14 & 35 & 45 & 55 & 55 & 56 & 56 & 65 & 87 & 89 & 92\end{array}$
The median is the value exactly in the middle, in this case 56.
When $n$ is even, take the two middle values and average them.

The median is useful in income distribution analysis, since it is not pulled up or down by a few extreme incomes.

Median of Ungrouped Data

Let the number of observations be $n$. First arrange the observations in ascending or descending order.

If $n$ is odd:

$
\text { Median }=\left(\frac{n+1}{2}\right)^{\text{th}} \text { observation }
$

If $n$ is even:

$
\text { Median }=\frac{\text { Value of }\left(\frac{n}{2}\right)^{\text{th}} \text { observation }+ \text { Value of }\left(\frac{n}{2}+1\right)^{\text{th}} \text { observation }}{2}
$

For example,

Consider the following data: $1 ; 11.5 ; 6 ; 7.2 ; 4 ; 8 ; 9 ; 10 ; 6.8 ; 8.3 ; 2 ; 2 ; 10 ; 1$
Ordered from smallest to largest: $1 ; 1 ; 2 ; 2 ; 4 ; 6 ; 6.8 ; 7.2 ; 8 ; 8.3 ; 9 ; 10 ; 10 ; 11.5$
Since there are $n=14$ observations, the median is the average of the $\left(\frac{n}{2}\right)^{\text{th}}=7^{\text{th}}$ and $\left(\frac{n}{2}+1\right)^{\text{th}}=8^{\text{th}}$ terms. So the median is the average of $6.8$ and $7.2$, which equals $7$.

The median is seven. Half of the values are smaller than seven and half of the values are larger than seven.
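The odd and even cases can be sketched together in Python. The function name `median_ungrouped` is an assumption; the two data sets are the ones used in the examples above.

```python
def median_ungrouped(values):
    """Median of raw data: the middle value, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th observation (index mid, counting from 0).
        return ordered[mid]
    # Even n: average of the (n / 2)-th and (n / 2 + 1)-th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
even_data = [1, 11.5, 6, 7.2, 4, 8, 9, 10, 6.8, 8.3, 2, 2, 10, 1]

print(median_ungrouped(odd_data))   # middle of 11 values
print(median_ungrouped(even_data))  # average of the 7th and 8th values
```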

Median of Ungrouped Frequency Distribution

To find the median, first arrange the observations in ascending order and then obtain the cumulative frequencies.

Let the sum of the frequencies be denoted by $N$.
If $N$ is odd, identify the observation whose cumulative frequency is equal to or just greater than $\frac{N+1}{2}$. This observation lies in the middle of the data and is therefore the required median.

If $N$ is even, find two observations: the first whose cumulative frequency is equal to or just greater than $\frac{N}{2}$, and the second whose cumulative frequency is equal to or just greater than $\frac{N}{2}+1$. The median is the average of these two observations.
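The cumulative-frequency procedure can be sketched as follows. The names `frequency_median` and `value_at`, and the sample data, are hypothetical and used only to illustrate the steps.

```python
def frequency_median(values, freqs):
    """Median of an ungrouped frequency distribution via cumulative frequencies."""
    pairs = sorted(zip(values, freqs))  # arrange observations in ascending order
    total = sum(f for _, f in pairs)    # N, the sum of the frequencies
    if total == 0:
        raise ValueError("the total frequency must be positive")

    def value_at(position):
        """First observation whose cumulative frequency reaches `position`."""
        cumulative = 0
        for x, f in pairs:
            cumulative += f
            if cumulative >= position:
                return x
        raise ValueError("position exceeds the total frequency")

    if total % 2 == 1:
        return value_at((total + 1) // 2)
    return (value_at(total // 2) + value_at(total // 2 + 1)) / 2


# Hypothetical data: N = 10, so positions 5 and 6 are averaged.
print(frequency_median([2, 4, 6, 8], [1, 3, 4, 2]))
# Hypothetical data: N = 5, so position 3 is the median.
print(frequency_median([1, 2, 3], [2, 1, 2]))
```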

Median of Continuous Frequency Distribution

In this case, with the classes arranged in ascending order, the median class is the class whose cumulative frequency is equal to or just greater than $\frac{N}{2}$, and the median is given by

$
\text { Median }=l+\frac{\left(\frac{N}{2}-c f\right)}{f} \times h
$

where,
$l=$ lower limit of the median class,
$N=$ number of observations,
$cf=$ cumulative frequency of the class preceding the median class,
$f=$ frequency of the median class,
$h=$ class size (width), assuming all class sizes are equal.
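A minimal sketch of this formula in Python is given below, applied to the grade table used earlier for the grouped mean. The function name `grouped_median` is an assumption, and the median class is taken here as the first non-empty class whose cumulative frequency reaches $N/2$.

```python
def grouped_median(lower_limits, freqs, width):
    """Median of a continuous frequency distribution: l + ((N/2 - cf) / f) * h."""
    total = sum(freqs)   # N
    half = total / 2
    cumulative = 0       # cf: cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= half:
            # This is the median class.
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("the distribution has no observations")


# Classes 10-12, 12-14, 14-16, 16-18, 18-20 with equal width h = 2.
lower_limits = [10, 12, 14, 16, 18]
freqs = [1, 2, 0, 4, 1]

# N = 8, N/2 = 4, median class 16-18 (cf = 3, f = 4): 16 + (4 - 3) / 4 * 2
print(grouped_median(lower_limits, freqs, 2))  # 16.5
```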

Mode

The mode is the most frequent value in our data set.

Normally, the mode is used for categorical data, where we wish to know which category is the most common, but it can also be found for numerical data. For example, in the data set

$
\begin{array}{llllllllllll}
65 & 55 & 89 & 56 & 35 & 14 & 56 & 55 & 87 & 45 & 92 & 55
\end{array}
$

the mode is 55, which occurs three times, more often than any other value.

The mode is useful in market research.

Mode is that value among the observations which occurs most often, that is, the value of the observation having the maximum frequency.
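Finding the value with the maximum frequency can be sketched with `collections.Counter` from the Python standard library. The helper name `modes` is an assumption; it returns every value tied for the highest frequency, since a data set may have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the maximum frequency, in sorted order."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


data = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92, 55]
print(modes(data))          # [55]
print(modes([1, 1, 2, 2]))  # [1, 2], a bimodal data set
```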

In a grouped frequency distribution, it is not possible to determine the mode by looking at the frequencies. Here, we can only locate a class with the maximum frequency, called the modal class. The mode is a value inside the modal class, and is given by the formula:

Mode $=l+\left(\frac{f_1-f_0}{2 f_1-f_0-f_2}\right) \times h$
where
$l=$ lower limit of the modal class,
$h=$ size of the class interval (assuming all class sizes to be equal),
$f_1=$ frequency of the modal class,
$f_0=$ frequency of the class preceding the modal class,
$f_2=$ frequency of the class succeeding the modal class.
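The modal-class formula can be sketched in Python as below, using the grade table from the grouped mean example. The function name `grouped_mode` is hypothetical. Treating a missing neighbouring class as having frequency 0 is an assumption of this sketch, not something stated in the article.

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode of grouped data: l + ((f1 - f0) / (2*f1 - f0 - f2)) * h."""
    # Index of the modal class (first class with the maximum frequency).
    i = max(range(len(freqs)), key=freqs.__getitem__)
    f1 = freqs[i]
    # Assumption: a missing preceding or succeeding class counts as frequency 0.
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined: neighbouring classes match the modal class")
    return lower_limits[i] + (f1 - f0) / denominator * width


# Classes 10-12, 12-14, 14-16, 16-18, 18-20 with equal width h = 2.
lower_limits = [10, 12, 14, 16, 18]
freqs = [1, 2, 0, 4, 1]

# Modal class 16-18: f1 = 4, f0 = 0, f2 = 1, so mode = 16 + (4 / 7) * 2
print(grouped_mode(lower_limits, freqs, 2))
```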

Solved Examples Based On Central Value Of Data

Example 1: The median of the items $6,10,4,3,9,11,22,18$ is

1) $9$

2) $10$

3) $9.5$

4) $11$

Solution

A measure of location, or measure of central tendency, describes the average character of the data under study by a single quantity.

Arrange the items in ascending order: $3,4,6,9,10,11,18,22$.
The number of items is $n=8$, which is even.

$
\begin{aligned}
\text { Median } & =\text { average of the }\left(\frac{n}{2}\right)^{\text{th}} \text { and }\left(\frac{n}{2}+1\right)^{\text{th}} \text { terms } \\
& =\text { average of the } 4^{\text{th}} \text { and } 5^{\text{th}} \text { terms } \\
& =\frac{9+10}{2}=\frac{19}{2}=9.5
\end{aligned}
$

Hence, the answer is option 3.

Example 2: In a class of $100$ students there are $70$ boys whose average marks in a subject are $75$. If the average marks of the complete class is $72$, then what is the average of the girls?

1) $73$

2) $65$

3) $68$

4) $74$

Solution

Let $S_B$, $S_G$, and $S_T$ denote the total marks of the boys, the girls, and the whole class.

$
\begin{aligned}
& \frac{S_B}{70}=75 \Rightarrow S_B=5250 \\
& \frac{S_T}{100}=72 \Rightarrow S_T=7200 \\
& S_G=S_T-S_B=7200-5250=1950
\end{aligned}
$

The class has $100-70=30$ girls, so the mean marks for the girls are

$
\frac{S_G}{30}=\frac{1950}{30}=65
$

Hence, the correct option is option (2).
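The arithmetic in this example can be checked with a small helper. The function name `remaining_group_mean` and its parameter names are assumptions made for this sketch.

```python
def remaining_group_mean(total_count, total_mean, known_count, known_mean):
    """Mean of the remaining group, given the overall mean and one group's mean."""
    remaining = total_count - known_count
    if remaining <= 0:
        raise ValueError("the remaining group must contain at least one member")
    # Total of remaining group = overall total - known group's total.
    return (total_count * total_mean - known_count * known_mean) / remaining


# 100 students with mean 72, of whom 70 boys have mean 75.
print(remaining_group_mean(100, 72, 70, 75))  # 65.0
```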

Example 3: The mean of $5$ observations is $5$ and their variance is $124$. If three of the observations are $1, 2$ and $6$ ; then the mean deviation from the mean of the data is :

1) $2.4$

2) $2.8$

3) $2.5$

4) $2.6$

Solution

The solution uses the following concepts.

Arithmetic Mean -

For the values $x_1, x_2, \ldots, x_n$ of the variate $x$, the arithmetic mean is given by

$
\bar{x}=\frac{x_1+x_2+x_3+\cdots+x_n}{n}
$

Mean Deviation -

If $x_1, x_2, \ldots, x_n$ are $n$ observations, then the mean deviation from the point $A$ is given by

$
\frac{1}{n} \sum\left|x_i-A\right|
$

Variance -

In the case of discrete data,

$\sigma^2=\left(\frac{\sum x_i^2}{n}\right)-\left(\frac{\sum x_i}{n}\right)^2$

Now,

$\begin{aligned} & \frac{\sum x_i}{5}=5 \Rightarrow \sum x_i=25 \\ & \frac{\sum x_i^2}{n}-\left(\frac{\sum x_i}{n}\right)^2=124 \\ & \frac{\sum x_i^2}{5}-25=124 \\ & \sum x_i^2=149 \times 5=745\end{aligned}$

Let the two remaining observations be $a$ and $b$.

$
\begin{aligned}
& a+b+1+2+6=25 \\
& a+b=16 \\
& a^2+b^2+1^2+2^2+6^2=745 \\
& a^2+b^2+1+4+36=745 \\
& a^2+b^2=704
\end{aligned}
$

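$\begin{aligned} & \text { Mean deviation }=\frac{\sum\left|x_i-5\right|}{5}=\frac{|a-5|+|b-5|+|1-5|+|2-5|+|6-5|}{5} \\ & =\frac{8+|a-5|+|11-a|}{5}=\frac{8+6}{5}=2.8\end{aligned}$

Here $b=16-a$, so $|b-5|=|11-a|$. The step $|a-5|+|11-a|=6$ holds only when $5 \leq a \leq 11$, which is the assumption behind the keyed answer; note that $a+b=16$ and $a^2+b^2=704$ give $ab=-224$, so the values stated in the question do not actually satisfy it.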

Hence, the answer is the option 2.

Example 4: In a set of $2n$ distinct observations, each of the observations below the median of all the observations is increased by $5$ and each of the remaining observations is decreased by $3$. Then the mean of the new set of observations :

1) increases by $1$.

2) decreases by $1$.

3) decreases by $2$.

4) increases by $2$.

Solution

Let the observations, arranged in ascending order, be $x_1, x_2, \ldots, x_{2n}$. Since they are distinct, the $n$ observations $x_1, \ldots, x_n$ lie below the median.

New observations: $x_1+5, x_2+5, \ldots, x_n+5$

and $x_{n+1}-3, x_{n+2}-3, \ldots, x_{2n}-3$

$\begin{aligned} \therefore \bar{x}_{\text {new }} & =\frac{\sum x_i+5 n-3 n}{2 n} \\ & =\frac{\sum x_i}{2 n}+1 \\ & =\bar{x}_{\text {old }}+1\end{aligned}$

Hence, the answer is the option 1.
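The result can be checked numerically on any set of $2n$ distinct values. The sketch below uses a hypothetical helper, `shifted_mean_change`, and made-up data; it is a sanity check, not a proof.

```python
from statistics import fmean


def shifted_mean_change(observations, up=5, down=3):
    """Change in the mean when the lower half gains `up` and the rest lose `down`."""
    ordered = sorted(observations)
    if len(ordered) % 2 != 0 or len(set(ordered)) != len(ordered):
        raise ValueError("expected an even number of distinct observations")
    n = len(ordered) // 2
    # With 2n distinct values, exactly the n smallest lie below the median.
    new = [x + up for x in ordered[:n]] + [x - down for x in ordered[n:]]
    return fmean(new) - fmean(ordered)


# Hypothetical data with 2n = 6 distinct observations.
print(shifted_mean_change([3, 8, 1, 12, 20, 7]))
```

The change equals $(5n-3n)/2n=1$ regardless of the actual values.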

Example 5: All the students of a class performed poorly in Mathematics. The teacher decided to give grace marks of $10$ to each of the students. Which of the following statistical measures will not change even after the grace marks are given?

1) variance

2) mean

3) median

4) mode

Solution

Mean, median, and mode are measures of central tendency. When the same amount is added to every observation, each of them increases by that amount.

Variance is a measure of dispersion, that is, of how scattered the data are. It does not change if every observation changes by the same amount, because the deviations from the mean stay the same.

So the measures of central tendency will change, but the variance will not.

Hence, the answer is the option (1).
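This behaviour can be illustrated with Python's `statistics` module. The marks below are hypothetical and the helper name `summarize` is an assumption; the population variance (`pvariance`) is used, though the sample variance behaves the same way under a shift.

```python
from statistics import mean, median, mode, pvariance


def summarize(data):
    """Collect the three central values and the population variance."""
    return {
        "mean": mean(data),
        "median": median(data),
        "mode": mode(data),
        "variance": pvariance(data),
    }


marks = [10, 25, 25, 30, 40]        # hypothetical marks
graced = [m + 10 for m in marks]    # 10 grace marks for every student

before = summarize(marks)
after = summarize(graced)
for name in before:
    print(name, before[name], after[name])
```

Mean, median, and mode each rise by 10, while the variance is unchanged.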

Frequently Asked Questions (FAQs)

Q: How do measures of central tendency behave in multimodal distributions?
A:
In multimodal distributions (those with multiple peaks), measures of central tendency can be challenging to interpret:
Q: How do measures of central tendency relate to the concept of "typical value" in different contexts?
A:
The concept of a "typical value" can vary depending on the context and the nature of the data:
Q: What is the concept of a "trimean," and how does it relate to other measures of central tendency?
A:
The trimean is a measure of central tendency calculated as (Q1 + 2*Median + Q3) / 4, where Q1 and Q3 are the first and third quartiles. It combines aspects of the median and the quartiles, making it more robust than the mean but more sensitive to the distribution than the median alone. The trimean can be thought of as a weighted average of the median (weight 2) and the two quartiles (weight 1 each). It's particularly useful for slightly skewed distributions, providing a balance between the robustness of the median and the sensitivity of the mean.
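As a rough sketch, the trimean can be computed from quartiles returned by `statistics.quantiles`. Quartile conventions differ between textbooks and software, so the default "exclusive" method used here is an assumption, and other conventions may give slightly different values. The function name `trimean` is also illustrative.

```python
from statistics import quantiles


def trimean(values):
    """Trimean: (Q1 + 2 * median + Q3) / 4, using statistics.quantiles for quartiles."""
    q1, q2, q3 = quantiles(values, n=4)  # q2 is the median
    return (q1 + 2 * q2 + q3) / 4


# Hypothetical symmetric data: the trimean coincides with the median, 5.
print(trimean([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```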
Q: How do measures of central tendency behave in the presence of censored or truncated data?
A:
Censored data (where values beyond a certain point are imprecisely reported) and truncated data (where values beyond a certain point are completely missing) can significantly affect measures of central tendency:
Q: What is the relationship between measures of central tendency and data visualization techniques?
A:
Measures of central tendency are often incorporated into data visualization techniques to provide quick insights into the data's center:
Q: How can measures of central tendency be used in outlier detection?
A:
Measures of central tendency, particularly when used in conjunction with measures of spread, can be useful for outlier detection:
Q: What is the concept of a "population parameter" versus a "sample statistic" in the context of measures of central tendency?
A:
Population parameters are the true values that describe the entire population, while sample statistics are estimates of these parameters based on a subset of the population. For measures of central tendency:
Q: How do measures of central tendency relate to the concept of expected value in probability theory?
A:
The expected value in probability theory is closely related to the arithmetic mean in statistics. For a discrete random variable, the expected value is calculated by multiplying each possible value by its probability and summing these products. This is conceptually similar to calculating a weighted mean. In fact, for a large number of observations from a probability distribution, the sample mean tends to converge to the expected value (this is known as the law of large numbers). Understanding this connection helps bridge the concepts of descriptive statistics and probability theory.
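The link between the expected value and the sample mean can be illustrated with a simulated fair die. This is a sketch under stated assumptions: the die, the seed, and the function name `sample_mean_of_rolls` are all hypothetical choices.

```python
import random
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]

# Expected value: each face times its probability (1/6), summed exactly.
expected_value = sum(Fraction(x, 6) for x in faces)


def sample_mean_of_rolls(n, seed=0):
    """Mean of n simulated rolls of a fair die (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(rng.choice(faces) for _ in range(n)) / n


print(float(expected_value))          # 3.5
print(sample_mean_of_rolls(100))      # a small sample can be noticeably off
print(sample_mean_of_rolls(100_000))  # a large sample is close to 3.5
```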
Q: What is Simpson's Paradox, and how does it relate to measures of central tendency?
A:
Simpson's Paradox occurs when a trend appears in different groups of data but disappears or reverses when these groups are combined. This paradox can significantly affect measures of central tendency. For example, one population can have a higher mean than another within every subgroup and still have a lower mean when the subgroups are combined, because the combined mean weights each subgroup by its size. This paradox highlights the importance of considering subgroup analysis and not relying solely on aggregate measures. It also emphasizes the need to understand the context and structure of the data when interpreting measures of central tendency.
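A small numerical illustration of the paradox with means is sketched below. The two schools, their section sizes, and their section means are entirely hypothetical, as is the helper name `combined_mean`.

```python
def combined_mean(groups):
    """Mean of pooled data, given (count, mean) for each subgroup."""
    total = sum(count for count, _ in groups)
    return sum(count * group_mean for count, group_mean in groups) / total


# Hypothetical (number of students, mean score) per section.
school_x = {"section_1": (10, 80), "section_2": (90, 50)}
school_y = {"section_1": (90, 75), "section_2": (10, 45)}

# School X has the higher mean in each section taken separately...
for section in school_x:
    print(section, school_x[section][1], ">", school_y[section][1])

# ...but the lower mean overall, because most of its students are in the weaker section.
print(combined_mean(list(school_x.values())))  # 53.0
print(combined_mean(list(school_y.values())))  # 72.0
```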
Q: How do you choose the most appropriate measure of central tendency for a given dataset?
A:
Choosing the most appropriate measure depends on several factors: