A collection of numerical facts carrying particular information is called data. Consider the marks obtained by 20 students in the 8^{th} standard midterm examination in mathematics:
56, 31, 44, 78, 67, 74, 38, 60, 56, 59, 87, 73, 38, 77, 84, 80, 49, 60, 60, 71
The above data is a collection of numerical entries. Each entry is called an observation. Such a collection of observations, as originally recorded, is called raw data.
The difference between the highest and the lowest score is called the range. The range of the above data is (87 – 31) = 56.
The number of times a particular observation occurs in the data is called its frequency. A table that lists each observation together with its frequency is called a frequency distribution table.
Example 1: The marks scored by 20 students in a unit test out of 25 marks are given below.
12, 10, 08, 12, 04, 15, 18, 23, 18, 16, 16, 12, 23, 18, 12, 05, 16, 16, 12, 20. Prepare a frequency distribution table.
Solution:
The table looks like:
Marks  Tally marks  No. of students (frequency) 
23  II  2 
20  I  1 
18  III  3 
16  IIII  4 
15  I  1 
12  IIIII  5 
10  I  1 
08  I  1 
05  I  1 
04  I  1 
Total  20  20 
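The tally in Example 1 can also be checked mechanically. The following is a minimal Python sketch, not part of the original solution; the variable names `marks` and `frequency` are chosen here for illustration only.

```python
from collections import Counter

# Marks of the 20 students from Example 1 (out of 25).
marks = [12, 10, 8, 12, 4, 15, 18, 23, 18, 16,
         16, 12, 23, 18, 12, 5, 16, 16, 12, 20]

# Counter maps each distinct mark to the number of times it occurs.
frequency = Counter(marks)

# Print the table from the highest mark to the lowest, as in the solution.
for mark in sorted(frequency, reverse=True):
    print(f"{mark:>5}  {frequency[mark]}")

# The frequencies must add up to the number of students.
print("Total", sum(frequency.values()))
```

The total of the frequency column should always equal the number of observations, which is a quick way to catch a missed or double-counted score.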
1.5.2 Grouping Data
Organising the raw data into groups in the form of a frequency distribution table is called a grouped frequency distribution of the raw data.
Example 2: Consider the following marks (out of 50) scored in mathematics by 50 students of 8^{th} class:
41, 31, 33, 32, 28, 31, 21, 10, 30, 22, 33, 37, 12, 05, 08, 15, 39, 26, 41, 46, 34, 22, 09, 11,16, 22, 25, 29, 31, 39, 23, 31, 21, 45, 47, 30, 22, 17, 36, 18, 20, 22, 27, 39, 28, 17.
Prepare a frequency distribution table:
Solution:
[Note: If we prepare frequency distribution table for each observation, then the table would be too long. So for our convenience, we make groups of observations like 0 – 9, 10 – 19, and so on. We obtain a frequency distribution of number of observations coming under each group. In this way, we prepare a frequency distribution table for the above data as below:]
groups  tally marks  frequency 
0 – 9  III  03 
10 – 19  IIII IIII  10 
20 – 29  IIII IIII IIII I  16 
30 – 39  IIII IIII IIII  15 
40 – 49  IIII I  06 
50 – 59  (none)  00 
Total  50  50 
In the above table, marks are grouped into 0 – 9, 10 – 19, and so on. No scores overlap in any group. Each of these groups is called a class interval or a class. This method of grouping data is called the inclusive method.
Class limit: In the class interval, say (10 – 19), 9.5 is called the lower class limit and 19.5 is called the upper class limit.
Note: To find the class limits in the inclusive method, subtract 0.5 from the lower score to get the lower class limit and add 0.5 to the upper score to get the upper class limit.
Class size: The number of scores in the class interval, say (10 – 19), including 10 and 19, is called the class size or width of the class. In this example, the class size is 10.
Class mark: The midpoint of a class is called its class mark. It is obtained by adding the two limits and dividing by 2. For example, the class mark of (10 – 19) is (10+19)/2 = 14.5
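The three definitions above (class limits, class size, class mark) can be collected in one small helper. This is a sketch under the assumption that scores are whole numbers, so that neighbouring inclusive classes are 1 apart; the function name `inclusive_class_info` is hypothetical.

```python
def inclusive_class_info(lower, upper):
    """Return (lower limit, upper limit, class size, class mark)
    for an inclusive class such as 10 - 19 with whole-number scores."""
    actual_lower = lower - 0.5        # subtract 0.5 from the lower score
    actual_upper = upper + 0.5        # add 0.5 to the upper score
    size = upper - lower + 1          # counts both end scores
    mark = (lower + upper) / 2        # midpoint of the class
    return actual_lower, actual_upper, size, mark


print(inclusive_class_info(10, 19))   # (9.5, 19.5, 10, 14.5)
```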
The data can be grouped in class intervals like 0 – 10, 10 – 20, 20 – 30, and so on. The frequency distribution table will look like:
groups  tally marks  frequency 
0 – 10  III  3 
10 – 20  IIII IIII  10 
20 – 30  IIII IIII IIII III  18 
30 – 40  IIII IIII III  13 
40 – 50  IIII I  06 
Total  50  50 
Here we observe that 10 occurs in both the classes (0 – 10) and (10 – 20). But 10 cannot belong to both (0 – 10) and (10 – 20). To avoid this, we follow the convention that the common observation (here 10) belongs to (10 – 20) but not to (0 – 10). This method of grouping is called the exclusive method.
Class limit: In the class interval (10 – 20), 10 is called the lower limit and 20 is called the upper limit.
Class size: The difference between the upper limit and the lower limit is called the class size or width. The width of class in (10 – 20) is 20 – 10 = 10.
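The exclusive convention (a score equal to a shared boundary goes to the higher class) is easy to express with integer division. The sketch below regroups the marks of Example 1 into classes 0 – 5, 5 – 10, and so on; this regrouping is an illustration added here, not a table from the text, and `group_exclusive` is a hypothetical helper name.

```python
def group_exclusive(scores, start, width, n_classes):
    """Count scores in exclusive classes [start, start + width), ...
    A score on a shared boundary is counted in the higher class."""
    counts = [0] * n_classes
    for s in scores:
        index = (s - start) // width      # which class the score falls in
        if 0 <= index < n_classes:
            counts[index] += 1
    return [((start + k * width, start + (k + 1) * width), counts[k])
            for k in range(n_classes)]


marks = [12, 10, 8, 12, 4, 15, 18, 23, 18, 16,
         16, 12, 23, 18, 12, 5, 16, 16, 12, 20]

for (low, high), f in group_exclusive(marks, start=0, width=5, n_classes=5):
    print(f"{low:>2} - {high:<2}  {f}")
```

Note that the scores 5, 10, 15 and 20 all sit on a boundary and are each counted once, in the class that starts with them.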
Example 3: Forty candidates from the 10^{th} class of a school appear for a test. The number of questions (out of 60) attempted by them in forty-five minutes is given here:
52, 42, 40, 36, 12, 28, 15, 37, 35, 22, 39, 50, 54, 39, 21, 34, 46, 31, 10, 09, 13, 24, 29, 31, 49, 58, 40, 44, 37,28, 13, 16, 29, 36, 39, 41, 47, 55, 52, 09.
Prepare frequency distribution table with the class size 10 and answer the following questions:
 Which class has the highest frequency?
 Which class has the lowest frequency?
 Write the upper and lower limits of the class (20 – 29).
 Which two classes have the same frequency?
Solution:
Let us prepare the frequency distribution table for this data.
Class interval  Tally marks  Frequency 
0 – 9  II  2 
10 – 19  IIII I  6 
20 – 29  IIII II  7 
30 – 39  IIII IIII I  11 
40 – 49  IIII III  8 
50 – 59  IIII I  6 
Total  40  40 
Using this table, we can observe:
 The class interval (30 – 39) has the highest frequency.
 The class interval (0 – 9) has the lowest frequency.
 Upper limit is 29.5 and lower limit is 19.5.
 The class intervals (10 – 19) and (50 – 59) have the same frequency.
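The frequencies in Example 3 can be recomputed directly from the raw data. The following Python sketch groups the 40 scores into inclusive classes of size 10 and then picks out the classes asked about in the questions; the names `attempted` and `table` are chosen here for illustration.

```python
# Number of questions attempted by the 40 candidates in Example 3.
attempted = [52, 42, 40, 36, 12, 28, 15, 37, 35, 22,
             39, 50, 54, 39, 21, 34, 46, 31, 10, 9,
             13, 24, 29, 31, 49, 58, 40, 44, 37, 28,
             13, 16, 29, 36, 39, 41, 47, 55, 52, 9]

# Inclusive classes 0 - 9, 10 - 19, ..., 50 - 59.
table = {}
for low in range(0, 60, 10):
    high = low + 9
    table[(low, high)] = sum(low <= x <= high for x in attempted)

for (low, high), f in table.items():
    print(f"{low:>2} - {high:<2}  {f}")

highest = max(table, key=table.get)   # class with the largest frequency
lowest = min(table, key=table.get)    # class with the smallest frequency
print("Highest frequency:", highest, "Lowest frequency:", lowest)
```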
Example 4: The heights of 25 children in centimetres are given below:
174, 168, 110, 142, 156, 119, 110, 101, 190, 102, 111, 172, 140, 136, 174, 128, 124, 136, 147, 168, 192, 101, 129, 114.
Prepare a frequency distribution table, taking the size of the class interval as 20, and answer the following:
 Mention the class intervals of highest and lowest frequency.
 What does the frequency 6 corresponding to class interval (160 – 180) indicate?
 Find out the class mark (or midpoint) of ( 140 – 160)
 What is the range of heights?
Solution:
The frequency distribution for the given data is as follows:
Class interval  Tally marks  Frequency 
100 – 120  IIII III  8 
120 – 140  IIII  5 
140 – 160  III  3 
160 – 180  IIII I  6 
180 – 200  III  3 
Total = 25  25 
Using this table, we can observe,
 Highest frequency (100 – 120) and lowest frequency (140 – 160) and (180 – 200)
 There are 6 children whose heights are in range 160 cm to 180 cm
 Midpoint = (140 + 160)/2 = 150
 Range = highest score – lowest score = 192 – 101 = 91
Exercise 1.5.2
 The marks scored by 40 candidates in an examination (out of 100) are given below:
75, 65, 57, 50, 32, 54, 75, 67, 75, 88, 80
42, 40, 41, 34, 78, 43, 61, 42, 46, 68, 52,
43, 49, 59, 67, 34, 33, 87, 97, 47, 46, 54,
48, 45, 51, 47, 41, 43.
Prepare a frequency distribution table with the class size 10. Take the class intervals as (30 – 39), (40 – 49), … and answer the following questions:
(i) Which class intervals have the highest and lowest frequency?
(ii) Write the upper and lower limits of the class interval 30 – 39.
(iii) What is the range of the given distribution?
Solution:
Lowest score = 32
Highest score = 97
Range = 97 – 32 = 65
Class interval  Tally marks  Frequency 
30 – 39  IIII  4 
40 – 49  IIII IIII IIII I  16 
50 – 59  IIII II  7 
60 – 69  IIII  5 
70 – 79  IIII  4 
80 – 89  III  3 
90 – 99  I  1 
Total = 40  40 
 Highest frequency = (40 – 49) and lowest frequency = (90 – 99)
 Upper limit = 39.5 and lower limit = 29.5
 Range = Highest score – Lowest score = 97 – 32 = 65.
2. Prepare the frequency distribution table for the given set of scores:
39, 16, 30, 37, 53, 15, 16, 60, 58, 26, 28, 19, 20, 12, 14, 24, 59, 21, 57, 38, 25, 36, 34, 15, 25, 41, 52, 45, 52, 45, 60, 63, 18, 26, 43, 36,18, 27, 59, 63, 46, 48, 25, 33, 46, 27, 46, 42, 48, 35, 64, 24.
Take class intervals as (10 – 20), (20 – 30), … and answer the following:
(i) What does the frequency corresponding to the third class interval mean?
(ii) What is the size of each class interval? Find the midpoint of the class interval 30 – 40.
(iii) What is the range of the given set of scores?
Solution:
Lowest score = 12
Highest score = 64
Range = 64 – 12 = 52
Class interval  Tally marks  Frequency 
10 – 20  IIII IIII  9 
20 – 30  IIII IIII II  12 
30 – 40  IIII IIII  9 
40 – 50  IIII IIII  10 
50 – 60  IIII II  7 
60 – 70  IIII  5 
Total = 52  52 
i. The frequency corresponding to the third interval is 9, i.e., there are 9 scores from 30 up to (but not including) 40.
ii. The size of each interval is 10 and the midpoint of the interval (30 – 40) is (30 + 40)/2 = 70/2 = 35.
iii. Range = Highest score – Lowest score = 64 – 12 = 52
1.5.3 Histogram
A histogram is a representation of a frequency distribution by means of rectangles whose widths represent class intervals and whose areas are proportional to the corresponding frequencies. In a histogram, frequency is plotted against class interval. Thus, a histogram is a two-dimensional graphical representation of data. If all the class intervals have the same length, then the frequency is proportional to the height of the rectangle.
Construction of a histogram
Let us study how to construct histograms by taking some examples:
Example 5: Draw the histogram of the following frequency distribution:
Class interval  frequency 
0 – 9  5 
10 – 19  8 
20 – 29  12 
30 – 39  18 
40 – 49  22 
50 – 59  10 
Solution:
The given distribution is in inclusive form. It should be converted into exclusive form. This can be done by applying a correction factor d/2, where,
d = (lower limit of a class) – (upper limit of a class before it)
Here, we have,
Actual upper limit = stated upper limit + d/2
Actual lower limit = stated lower limit – d/2
For example:
Consider the class interval 10 – 19. We get,
d = (lower limit of a class) – (upper limit of a class before it)
= 10 – 9 = 1
Hence, d = 1 or d/2 = 0.5
Now,
Actual upper limit = (stated upper limit) + d/2 = 19 + 0.5 = 19.5
Actual lower limit = (stated lower limit) – d/2 = 10 – 0.5 = 9.5
Converting into exclusive form, we get the table as below:
Stated class interval  Actual class interval  frequency 
0 – 9  -0.5 – 9.5  5 
10 – 19  9.5 – 19.5  8 
20 – 29  19.5 – 29.5  12 
30 – 39  29.5 – 39.5  18 
40 – 49  39.5 – 49.5  22 
50 – 59  49.5 – 59.5  10 
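The correction factor d/2 can be applied to every class at once. The following Python sketch converts the stated (inclusive) limits of Example 5 into actual limits; the names `stated` and `actual` are illustrative, and the code assumes all neighbouring classes are separated by the same gap d.

```python
# Stated (inclusive) class intervals of Example 5.
stated = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]

# d = (lower limit of a class) - (upper limit of the class before it)
d = stated[1][0] - stated[0][1]       # 10 - 9 = 1

# Subtract d/2 from each lower limit and add d/2 to each upper limit.
actual = [(low - d / 2, high + d / 2) for low, high in stated]

for (low, high), (a_low, a_high) in zip(stated, actual):
    print(f"{low:>2} - {high:<2}  ->  {a_low} - {a_high}")
```

After the conversion, the upper limit of each class coincides with the lower limit of the next one, so the rectangles of the histogram touch without gaps.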
Construction of a histogram:
 Draw the x-axis and the y-axis. Choose a proper scale for each axis, say, on the x-axis: 1 cm = 10 units and on the y-axis: 1 cm = 5 units.
 Mark the class intervals on the x-axis: (-0.5 – 9.5), (9.5 – 19.5) and the like.
 Draw a rectangle on the first class interval with height corresponding to the frequency 5. Draw a second rectangle with height corresponding to the frequency 8 on the second class interval, and follow the same procedure for the rest of the class intervals and the corresponding frequencies.
Then, the histogram takes the form as shown below:
Example 6:
Draw the histogram for the following frequency distribution.
Class interval  Frequency 
0 – 5  5 
5 – 10  8 
10 – 15  15 
15 – 20  4 
20 – 25  10 
Solution:
The given distribution is in exclusive form. So we can take the class intervals as (0 – 5), (5 – 10) etc., along the x-axis and frequency along the y-axis. Choosing a proper scale, we can construct a histogram as explained in the previous example.
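Since all classes in Example 6 have the same width, the height of each rectangle is proportional to its frequency. As a rough illustration only, the sketch below prints a sideways text histogram in which each `#` stands for one unit of frequency; it is not a substitute for the drawn graph, and `text_histogram` is a hypothetical helper name.

```python
# Exclusive class intervals and frequencies of Example 6.
classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25)]
frequencies = [5, 8, 15, 4, 10]


def text_histogram(classes, frequencies, symbol="#"):
    """Return one text line per class; bar length equals the frequency."""
    lines = []
    for (low, high), f in zip(classes, frequencies):
        lines.append(f"{low:>3} - {high:<3} | {symbol * f} {f}")
    return lines


for line in text_histogram(classes, frequencies):
    print(line)
```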
Exercise 1.5.3
 Draw a histogram to represent the following frequency distribution.
Class interval  Frequency 
20 – 25  5 
25 – 30  10 
30 – 35  18 
35 – 40  14 
40 – 45  12 
 Draw a histogram to represent the following frequency distribution.
Class interval  Frequency 
10 – 19  7 
20 – 29  10 
30 – 39  20 
40 – 49  5 
50 – 59  15 
Solution:
Class interval  Actual class interval  Frequency 
10 – 19  9.5 – 19.5  7 
20 – 29  19.5 – 29.5  10 
30 – 39  29.5 – 39.5  20 
40 – 49  39.5 – 49.5  5 
50 – 59  49.5 – 59.5  15 
1.5.4 Mean, Median, Mode
Now we study three important quantities associated with statistical data. They give a clear picture of the data and are generally called measures of central tendency.
Mean: The mean, or average, is a number that measures the central tendency of a given set of numbers.
Mean for ungrouped data: If x_{1}, x_{2}, x_{3}, …, x_{N} are the N observed values, then,
Mean = (sum of all values of observations)/(number of observations)
The sum of the N values of x is represented by ∑x. Here ∑ stands for summation. Therefore,
X̄ = ∑x/N
Example 7: Find the mean of first six even natural numbers.
Solution:
The first six even natural numbers are 2, 4, 6, 8, 10, 12. There are six scores. Therefore, N = 6. The observations are x_{1 }= 2, x_{2 }= 4, x_{3 }= 6, x_{4 }= 8, x_{5 }= 10, x_{6 }= 12.
Hence,
∑x = 2 + 4 + 6 + 8 + 10 + 12 = 42
Hence the mean is given by,
X̄ = ∑x/N = 42/6 = 7
Example 8: Marks scored by Hari in 5 tests (out of 25 marks) are given below:
24, 22, 23, 23, 25. Find his average score.
Solution:
Hari's marks in 5 tests are given, so there are five scores. Therefore, N = 5. The observations are x_{1} = 24, x_{2} = 22, x_{3} = 23, x_{4} = 23, x_{5} = 25.
Hence,
∑x = 24 + 22 + 23 + 23 + 25 = 117
Hence the mean is given by,
X̄ = ∑x/N
= 117/5
= 23.4
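The mean of ungrouped data is simply the sum divided by the count. The following minimal Python sketch applies this to the data of Examples 7 and 8; the function name `mean` is chosen here for illustration.

```python
def mean(values):
    """Mean of ungrouped data: (sum of observations) / (number of observations)."""
    return sum(values) / len(values)


even_numbers = [2, 4, 6, 8, 10, 12]       # Example 7
hari_marks = [24, 22, 23, 23, 25]         # Example 8

print(sum(even_numbers), len(even_numbers), mean(even_numbers))
print(sum(hari_marks), len(hari_marks), mean(hari_marks))
```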
Mean of a grouped data:
Example 9: The number of goals scored by a hockey team in 20 matches is given here:
4, 6, 3, 2, 2, 4, 1, 5, 3, 0, 4, 5, 4, 5, 4, 0, 4, 3, 6, 4.
Find the mean.
Solution:
To find the mean, let us prepare a frequency distribution table first.
[Note: Some scores are repeated. So, to find the sum of all the scores, we multiply each score by its frequency and then add the products.]
Scores  tally marks  frequency 
0  II  2 
1  I  1 
2  II  2 
3  III  3 
4  IIII II  7 
5  III  3 
6  II  2 
N = 20 
Scores(x)  frequency  fx 
0  2  0 
1  1  1 
2  2  4 
3  3  9 
4  7  28 
5  3  15 
6  2  12 
N = 20  ∑fx = 69 
Therefore, mean, X̄ = ∑fx/N = 69/20
= 3.45
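The fx column of Example 9 can be reproduced as follows. This is a minimal Python sketch with illustrative variable names; it builds the frequency table from the raw goals and then computes ∑fx/N.

```python
from collections import Counter

# Goals scored in the 20 matches of Example 9.
goals = [4, 6, 3, 2, 2, 4, 1, 5, 3, 0, 4, 5, 4, 5, 4, 0, 4, 3, 6, 4]

frequency = Counter(goals)

# Multiply each score x by its frequency f and add the products.
sum_fx = 0
for x in sorted(frequency):
    f = frequency[x]
    print(f"{x}  {f}  {f * x}")
    sum_fx += f * x

n = sum(frequency.values())
mean_goals = sum_fx / n
print("N =", n, " sum fx =", sum_fx, " mean =", mean_goals)
```

The score 0 contributes 0 to ∑fx, however many times it occurs.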
Example 10: Find the mean for the given frequency distribution table:
class interval  frequency 
0 – 4  3 
5 – 9  5 
10 – 14  7 
15 – 19  4 
20 – 24  6 
N = 25 
Solution:
To find the mean, first we have to find the midpoint of each class interval. For example, the midpoint of 0 – 4 = (0 + 4)/2 = 4/2 = 2.
class interval  Mid point of CI (x)  frequency  fx 
0 – 4  2  3  6 
5 – 9  7  5  35 
10 – 14  12  7  84 
15 – 19  17  4  68 
20 – 24  22  6  132 
N = 25  ∑fx = 325 
Therefore, Mean, X̄ = ∑fx/N
X̄ = 325/25 = 13
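For class intervals, the midpoint of each class plays the role of x. The sketch below repeats the calculation of Example 10 in Python; the list name `classes` and the tuple layout (lower limit, upper limit, frequency) are assumptions made for this illustration.

```python
# (lower limit, upper limit, frequency) for each class of Example 10.
classes = [(0, 4, 3), (5, 9, 5), (10, 14, 7), (15, 19, 4), (20, 24, 6)]

n = 0
sum_fx = 0
for low, high, f in classes:
    midpoint = (low + high) / 2       # class mark used as x
    print(f"{low:>2} - {high:<2}  {midpoint}  {f}  {f * midpoint}")
    n += f
    sum_fx += f * midpoint

grouped_mean = sum_fx / n
print("N =", n, " sum fx =", sum_fx, " mean =", grouped_mean)
```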
Median:
To find the mean, add up the values in the data set and then divide by the number of values that you added. To find the median, list the values of the data set in numerical order and identify which value appears in the middle of the list.
Median for ungrouped data:
Example 11: Find the median of the data: 26, 31, 33, 37, 43, 38, 42.
Solution:
Arranging the scores in ascending order, we have 26, 31, 33, 37, 38, 42, 43.
Here the number of terms is 7. The middle term is the 4^{th} one and it is 37. Therefore median is 37.
Example 12: Find the median of the data: 32, 30, 28, 31, 22, 26, 27, 21.
Solution:
Arranging in descending order, we obtain 32, 31, 30, 28, 27, 26, 22, 21.
There are 8 terms.
Therefore, the median is the average of the two middle terms, which are 28 and 27. Thus the median is (27 + 28)/2 = 27.5
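Both cases (odd and even number of terms) can be handled by one short function. This is a minimal Python sketch; `median` is an illustrative name, and the first list is the seven scores as they appear in the solution of Example 11.

```python
def median(values):
    """Middle term of the sorted data; average of the two middle
    terms when the number of terms is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([26, 31, 33, 37, 38, 42, 43]))        # 7 terms, 4th term
print(median([32, 30, 28, 31, 22, 26, 27, 21]))    # 8 terms, Example 12
```

Sorting in ascending or descending order gives the same median, because the middle position does not change.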
Median for a grouped data:
Example 13: Find the median for the grouped data
Class interval  frequency 
1 – 5  4 
6 – 10  3 
11 – 15  6 
16 – 20  5 
21 – 25  2 
N = 20 
Solution:
Class interval  frequency  Cumulative frequency 
1 – 5  4  4 
6 – 10  3  4 + 3 = 7 
11 – 15  6  7 + 6 = 13 
16 – 20  5  13 + 5 = 18 
21 – 25  2  18 + 2 = 20 
N = 20 
Observe that the cumulative frequency corresponding to the last class interval is equal to N. Here N/2 = 10. Counting frequencies from the first class interval downwards, we find that the 10^{th} score lies in the class interval (11 – 15). This class interval (11 – 15) is called the median class.
The frequency corresponding to this class is 6. Its actual lower limit is 10.5. The cumulative frequency above this class is 7.
We know the formula to find the median:
Median = L + [(N/2 – f_{c})/f_{m}] x i
Now,
Actual lower limit, L = 10.5
Frequency of the median class, f_{m} = 6
Cumulative frequency above the median class, f_{c} = 7
Size of the interval, i = 5
Therefore,
Median = 10.5 + [(10 – 7)/6] x 5 = 10.5 + 2.5 = 13.
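The steps of Example 13 (cumulative frequencies, median class, then the formula) can be sketched in Python. The function assumes the usual textbook formula Median = L + ((N/2 – f_c)/f_m) x i, which reproduces the numbers used above; the name `grouped_median` and the `gap` parameter (1 for inclusive whole-number classes, 0 for exclusive classes) are hypothetical choices for this illustration.

```python
def grouped_median(classes, gap=1):
    """classes: list of (lower, upper, frequency) in ascending order,
    using the stated limits. gap is the distance between the upper
    limit of one class and the lower limit of the next."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cumulative = 0                     # cumulative frequency above the class
    for lower, upper, f in classes:
        if cumulative + f >= half:     # the (N/2)-th score lies in this class
            actual_lower = lower - gap / 2
            size = (upper - lower) + gap
            return actual_lower + (half - cumulative) / f * size
        cumulative += f


example_13 = [(1, 5, 4), (6, 10, 3), (11, 15, 6), (16, 20, 5), (21, 25, 2)]
print(grouped_median(example_13))      # median class 11 - 15
```

For exclusive classes such as 0 – 5, 5 – 10, the stated limits are already the actual limits, so the call would use `gap=0`.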
Mode:
The mode is the score that occurs most often in the data. It is found by organizing the data and counting the frequency of each score.
Mode for an ungrouped data:
Example 15: Find the mode for the data: 15, 20, 22, 25, 30, 20, 15, 20, 12, 20.
Solution:
Here 20 appears the maximum number of times (4 times). Therefore, the mode is 20.
Mode for grouped data:
For grouped data, the score having the maximum frequency is the mode.
Example 17: Find the mode of the following data:
Number  12  13  14  15  16  17 
frequency  7  9  6  22  20  19 
Solution:
Here the maximum frequency is 22. Therefore, the number 15 corresponding to maximum frequency is the mode.
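Finding the mode amounts to locating the largest frequency. The following minimal Python sketch handles both the raw data of Example 15 and the frequency table of Example 17; the function names `modes` and `mode_from_table` are illustrative. Returning a list in `modes` is a design choice made here so that data with more than one most frequent score is not silently reduced to a single value.

```python
from collections import Counter


def modes(values):
    """Return every score that occurs the maximum number of times."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def mode_from_table(table):
    """table maps each score to its frequency; return the score
    with the maximum frequency."""
    return max(table, key=table.get)


print(modes([15, 20, 22, 25, 30, 20, 15, 20, 12, 20]))               # Example 15
print(mode_from_table({12: 7, 13: 9, 14: 6, 15: 22, 16: 20, 17: 19}))  # Example 17
```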
Exercise 1.5.4
 Runs scored by 10 batsmen in a one day cricket match are given. Find the average run scored
23, 54, 08, 94, 60, 18, 29, 44, 05, 86
Solution:
X̄ = ∑x/N = (23 + 54 + 08 + 94 + 60 + 18 + 29 + 44 + 05 + 86)/10 = 421/10 = 42.1
 Find the mean weight from the following table:
weight(kg)  29  30  31  32  33 
No. of children  2  1  4  3  5 
Solution:
Weight (kg)  No. of children (f)  fx 
29  2  58 
30  1  30 
31  4  124 
32  3  96 
33  5  165 
X̄ = ∑fx/N = (58 + 30 + 124 + 96 + 165)/15 = 473/15 = 31.53
 Calculate the mean for the following frequency distribution:
Marks  10 – 20  20 – 30  30 – 40  40 – 50  50 – 60  60 – 70  70 – 80 
frequency  3  7  10  6  8  2  4 
Solution:
Marks  Midpoint (x)  Frequency (f)  fx 
10 – 20  15  3  45 
20 – 30  25  7  175 
30 – 40  35  10  350 
40 – 50  45  6  270 
50 – 60  55  8  440 
60 – 70  65  2  130 
70 – 80  75  4  300 
N = 40  ∑fx = 1710
X̄ = ∑fx/N = 1710/40 = 42.75
 Calculate the mean for the following frequency distribution:
Marks  15 – 19  20 – 24  25 – 29  30 – 34  35 – 39  40 – 44 
frequency  6  5  9  12  6  2 
Solution:
Marks  Midpoint (x)  Frequency (f)  fx 
15 – 19  17  6  102 
20 – 24  22  5  110 
25 – 29  27  9  243 
30 – 34  32  12  384 
35 – 39  37  6  222 
40 – 44  42  2  84 
N = 40  ∑fx = 1145
X̄ = ∑fx/N = 1145/40 = 28.625
 Find the median of the data: 15, 22, 9, 20, 6, 18, 11, 25, 14.
Solution:
Arranging in ascending order,
6, 9, 11, 14, 15, 18, 20, 22, 25. There are 9 terms, so the middle term is the 5^{th} one. Therefore, the median is 15.
 Find the median of the data: 22, 28, 34, 49, 44, 57, 18, 10, 33, 41, 66, 59.
Solution:
Arranging in ascending order: 10, 18, 22, 28, 33, 34, 41, 44, 49, 57, 59, 66
Here, there are 12 terms, so we have 2 middle terms, 34 and 41. Median = (34 + 41)/2 = 75/2 = 37.5
 Find the median for the following frequency distribution table:
class interval  110 – 119  120 – 129  130 – 139  140 – 149  150 – 159  160 – 169 
frequency  6  8  15  10  6  5 
Solution:
class interval  frequency  Cumulative frequency 
110 – 119  6  6 
120 – 129  8  14 
130 – 139  15  29 
140 – 149  10  39 
150 – 159  6  45 
160 – 169  5  50 
N = 50 
Here N/2 = 25, so the median class is (130 – 139).
L = 129.5,
N = 50,
f_{c} = 14,
f_{m} = 15,
i = 10
Median = 129.5 + [(25 – 14)/15] x 10
= 129.5 + [11/15] x 10
= 136.83
 Find the median for the following frequency distribution table:
class interval  0 – 5  5 – 10  10 – 15  15 – 20  20 – 25  25 – 30 
frequency  5  3  9  10  8  5 
Solution:
class interval  frequency  Cumulative frequency 
0 – 5  5  5 
5 – 10  3  8 
10 – 15  9  17 
15 – 20  10  27 
20 – 25  8  35 
25 – 30  5  40 
N = 40 
Here N/2 = 20, so the median class is (15 – 20).
L = 15,
N = 40,
f_{c} = 17,
f_{m} = 10,
i = 5
Median = 15 + [(20 – 17)/10] x 5
= 15 + [3/10] x 5
= 15 + 3/2
= 16.5
 Find the mode for the following data:
(i). 4, 3, 1, 5, 3, 7, 9, 6
Solution:
Mode = 3
(ii)22, 36, 18, 22, 20, 34, 22, 42, 46, 42
Solution:
Mode = 22
 Find the mode for the following data:
x  5  10  12  15  20  30  40 
f  4  8  11  13  16  12  9 
Solution:
Here the maximum frequency is 16. Therefore, the mode is 20, the value of x corresponding to the maximum frequency.