Frequency

1.1. Frequency

Frequency is the number of times an event or observation occurs in a given study or experiment. It is the count of that observation or incident, and frequencies are usually presented in a table or graph. For instance, if you are asked, "How often do you exercise?" and you answer "once or twice a week", your answer tells us how often the event "exercise" occurs for a single respondent. Now suppose a group of people is asked the same question, and their answers are recorded and sorted into categories or values, which serve as the observations. We then write down, against each category, the number of people who chose it; these counts are called frequencies, and the whole table becomes the frequency distribution table of the exercisers.

1.2. Frequency Distribution

A frequency distribution is a table or graph that shows the frequencies of all possible values of an event or observation in a given study or experiment. It describes the pattern the frequencies follow. In essence, the frequency distribution is a function that gives the frequency with which each value occurs, where a function is a rule relating one variable to another. A frequency distribution can be either ungrouped (simple) or grouped. In an ungrouped frequency distribution, a frequency is given for each discrete individual value. In a grouped frequency distribution, a frequency is given for a group of values; we refer to these groups as class intervals. Let's look at a hypothetical study of a group of people who exercise. We collected only their age and weekly exercise frequency, from which we generated two frequency distributions: one grouped and the other simple.

Example of grouped Frequency Distribution

Times a week they exercise	Number of People (frequency)
0-1	5
1-2	4
2-3	6
>3	2

Interpretation: 5 people exercise 0-1 times a week, 4 people do it 1-2 times a week, 6 people exercise 2-3 times a week and 2 people do it more than 3 times.

Example of simple Frequency Distribution

Age	Number of People (frequency)
13	1
16	7
18	8
40	1

Interpretation: in our dataset, we have 1 person aged 13, 7 people aged 16, 8 people aged 18 and 1 person aged 40.

A frequency distribution helps us structure unstructured data, and arrange already structured data better, so that it is readily usable for analysis.
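As a rough illustration, the simple frequency distribution above can be rebuilt by tallying raw observations. The sketch below assumes a hypothetical list of raw ages, made up to match the table, and uses Python's standard `collections.Counter` to do the counting.

```python
from collections import Counter

# Hypothetical raw ages, invented so that the tallies match the table above.
ages = [13] + [16] * 7 + [18] * 8 + [40]

# Counter records how many times each distinct value occurs.
age_frequency = Counter(ages)

# Print the simple frequency distribution, one value per line.
for age in sorted(age_frequency):
    print(age, age_frequency[age])
```

The total of all frequencies equals the number of people surveyed, which is a quick check that no observation was lost while tallying.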

1.3. Class Interval

In a grouped frequency distribution, the observations are first sorted into smaller groups based on a shared characteristic, and a frequency is recorded against each group. These groups take the form of intervals. The two extremities of an interval are called its class limits: the "upper class limit" is the highest value in the interval and the "lower class limit" is the lowest. The range of values between the two class limits is the class interval. Class intervals are discontinuous, meaning that the upper limit of one interval does not equal the lower limit of the next. Class intervals may or may not have the same width.

Times a week they exercise (Class Interval)	Number of People (frequency)
0-1	5
2-3	4
4-5	6
>6	2

There are two types of classes: open-ended and closed-ended.

1.3.1. Open-ended class: When one end of the interval is missing, that is, either the upper class limit or the lower class limit is missing, the class is called an open-ended class. Example: <0 or >6.

1.3.2. Closed-ended class: When both ends of the class are present, i.e., both the upper and the lower class limits, the class is called a closed-ended class. Example: 0-1, 2-3, 4-5.

1.4. Class Boundary

Class boundaries are the lowest and highest values that a class can take once the classes are made continuous. The lower class boundary is the lowest such value and the upper class boundary is the highest. Because class boundaries are continuous, the upper boundary of one class is equal to the lower boundary of the next.

Let d denote the difference between the upper limit of the previous class and the lower limit of the next class. Then the upper boundary of the previous class is

Upper limit + (1/2) * d

and the lower boundary of the next class is

Lower limit - (1/2) * d

For this, let us again consider the exercise example, but this time the intervals are based on the number of times people exercise in a month.

Times a month they exercise (Class-Interval)	Number of People (frequency)
0-5	19
6-10	10
11-15	20
16-20	12
21-25	10
26-30	5

CONVERTING CLASS-LIMITS TO CLASS-BOUNDARIES

Times a month they exercise (Class-Boundary)	Number of People (frequency)
-0.5-5.5	19
5.5-10.5	10
10.5-15.5	20
15.5-20.5	12
20.5-25.5	10
25.5-30.5	5

Here, the difference between the upper class limit of one class and the lower class limit of the next class is 1. So, for each class, the upper class boundary is the upper class limit + (1/2) * 1 and the lower class boundary is the lower class limit - (1/2) * 1.
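The conversion can be sketched in a few lines of Python. This is only an illustration: the function name `to_boundaries` is made up, and it assumes the gap d is the same between every pair of neighbouring classes, as it is in this example. Applying the formula to the first class gives a lower boundary of 0 - 0.5 = -0.5.

```python
def to_boundaries(class_limits):
    """Convert (lower limit, upper limit) pairs into class boundaries.

    Assumes the gap d between neighbouring classes is constant.
    """
    # d = lower limit of the second class - upper limit of the first class
    d = class_limits[1][0] - class_limits[0][1]
    return [(lower - d / 2, upper + d / 2) for lower, upper in class_limits]


limits = [(0, 5), (6, 10), (11, 15), (16, 20), (21, 25), (26, 30)]
boundaries = to_boundaries(limits)

for (lower, upper), (lb, ub) in zip(limits, boundaries):
    print(f"{lower}-{upper} -> {lb}-{ub}")
```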

1.5. Mid Value

The mid-value of a class interval is the value that divides the interval into two equal parts.

Mid-Value = (Lower Class Limit + Upper Class Limit)/2

= (Lower Class Boundary + Upper Class Boundary)/2

Times a month they exercise (Class-Interval)	Mid-value
0-5	2.5
6-10	8
11-15	13
16-20	18
21-25	23
26-30	28

Times a month they exercise (Class-Boundary)	Mid-value
-0.5-5.5	2.5
5.5-10.5	8
10.5-15.5	13
15.5-20.5	18
20.5-25.5	23
25.5-30.5	28

Mid-values calculated from the class limits and from the class boundaries of the same frequency distribution are always the same.
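A quick way to check this claim is to compute the mid-values both ways. The sketch below reuses the class limits from the example and assumes a gap of d = 1 between classes, so each boundary lies 0.5 beyond the corresponding limit; the variable names are illustrative only.

```python
limits = [(0, 5), (6, 10), (11, 15), (16, 20), (21, 25), (26, 30)]

# With d = 1, each boundary lies half a unit beyond the corresponding limit.
boundaries = [(lower - 0.5, upper + 0.5) for lower, upper in limits]

mid_from_limits = [(lower + upper) / 2 for lower, upper in limits]
mid_from_boundaries = [(lower + upper) / 2 for lower, upper in boundaries]

print(mid_from_limits)
print(mid_from_boundaries)
```

The two lists agree because moving the lower end down by d/2 and the upper end up by d/2 leaves their average unchanged.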

1.6. Width of the class

A class's width is the difference between its upper and lower boundaries, not its limits.

Width of Class = Upper Class Boundary - Lower Class Boundary

Times a month they exercise (Class-Boundary)	Width
-0.5-5.5	6
5.5-10.5	5
10.5-15.5	5
15.5-20.5	5
20.5-25.5	5
25.5-30.5	5
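To see why boundaries rather than limits are used, take the class with limits 6-10 and boundaries 5.5-10.5. The small sketch below (the helper name `class_width` is made up for illustration) compares the two differences.

```python
def class_width(lower_boundary, upper_boundary):
    """Width of a class, measured between its boundaries."""
    return upper_boundary - lower_boundary


width_from_boundaries = class_width(5.5, 10.5)  # boundaries of the class 6-10
difference_of_limits = 10 - 6                   # understates the width

print(width_from_boundaries, difference_of_limits)
```

The class 6-10 covers five whole values (6, 7, 8, 9, 10), which matches the width computed from the boundaries, whereas the difference of the limits falls one short.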

1.7. Frequency Density of the class

The class frequency divided by the class width is known as the frequency density of a class.

Frequency Density of a class = Frequency of the class / Width of the class

Times a month they exercise (Class-Boundary)	Frequency	Width	Frequency density
-0.5-5.5	70	6	11.67
5.5-10.5	60	5	12
10.5-15.5	50	5	10
15.5-20.5	40	5	8
20.5-25.5	30	5	6
25.5-30.5	20	5	4

We compute the frequency density when the classes have varying widths, so that the frequencies of different classes can be compared fairly. Suppose we are studying a nation's population and want to determine whether it is distributed evenly across the nation, or which state has a high population density and needs population management. Suppose Madhya Pradesh has a larger population than Manipur. Is it therefore accurate to say that Manipur has better population control than Madhya Pradesh? Not necessarily: Manipur's population may be lower simply because it covers a far smaller area than Madhya Pradesh. How do we detect this? If we draw conclusions based on frequency alone, we might make the wrong recommendation. Instead, we need to calculate the frequency density, which here answers the question: how dense is the population in a particular area?

i.e., population density of a state = Population of the state / Area of the state.

A state with a high population density needs to control its population growth, whereas one with a low density may not need to.
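The same reasoning can be sketched for classes of unequal width. The classes and frequencies below are entirely hypothetical, chosen only to show that the class with the highest frequency need not have the highest frequency density.

```python
# Hypothetical classes: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 20), (10, 30, 30), (30, 35, 15)]


def frequency_density(lower, upper, frequency):
    """Frequency per unit of class width."""
    return frequency / (upper - lower)


densities = [frequency_density(lo, hi, f) for lo, hi, f in classes]

for (lo, hi, f), density in zip(classes, densities):
    print(f"{lo}-{hi}: frequency={f}, width={hi - lo}, density={density}")
```

Here the class 10-30 has the largest frequency, but the narrow class 30-35 is the most densely populated, just as a small state can be denser than a large one.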

1.8. Diagrammatic Representation of Frequency

1.8.1. Histogram

Histograms are straightforward two-dimensional graphs that show the complete dataset as adjoining vertical rectangular bars (referred to as bins). The width of each bin indicates the range of data it covers (class width), and the height of each bin indicates the number of observations within that range (frequency); when the class widths are unequal, the height represents the frequency density instead.

1.8.2. Bar-Charts

Bar charts are two-dimensional graphs that display the complete dataset as vertical rectangular bars that, unlike the bins of a histogram, are not joined together. The bars have equal widths, are equally spaced, and have lengths that correspond to the frequencies they represent.

1.8.3. Frequency Curves

‘Frequency curves are smooth hand drawn curves through all the data points (frequencies) which showcases the shape of frequency distribution.’[13]

1.8.4. Frequency Polygon

‘Frequency Polygon is a smooth free-hand line curve drawn by joining the mid points of the intervals. They are drawn by joining the midpoints of the upper horizontal lines of each histogram.’[14]

1.9. Relative Frequency

‘Relative frequency is the ratio of individual frequency to total frequencies.[15]’ Suppose we want to find the proportion or percentage of all people who fall in a particular class, for example those who exercise 6-10 times a month; for this we use relative frequencies.

Class-Boundary	Frequencies	Relative Frequencies
-0.5-5.5	19	19/76
5.5-10.5	10	10/76
10.5-15.5	20	20/76
15.5-20.5	12	12/76
20.5-25.5	10	10/76
25.5-30.5	5	5/76

Here, total frequency = N = 76
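The relative frequencies in the table can be computed directly from the class frequencies. The following is a minimal sketch; the variable names are illustrative, and the percentages are rounded to one decimal place.

```python
frequencies = [19, 10, 20, 12, 10, 5]

# N, the total frequency
total = sum(frequencies)

# Relative frequency of each class = class frequency / total frequency
relative = [f / total for f in frequencies]

# The same values expressed as percentages
percentages = [round(100 * r, 1) for r in relative]

print(total)
print(percentages)
```

The relative frequencies always add up to 1 (or 100 percent), which is what makes them suitable for showing each class's share of the whole.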

These relative frequencies, expressed as percentages, can be represented diagrammatically as pie charts.

1.9.1. Pie-Charts

‘Pie-Charts are mostly used to depict the percentage of each sector respective to the total. It can be used to depict the composition of the variable under consideration. Each slice represents the proportion of the corresponding group to the total.’[16]

1.10. Cumulative Frequency

‘Cumulative frequency is the total frequencies less than or greater than the particular class.

If a table consists of total frequencies less than the specified class is called less than cumulative frequency distribution.

If a table consists of total frequencies more than the specified class is called more than cumulative frequency distribution.’[17]

Example: Suppose we are to find –

the number of people who exercise no more than 15 times a month.

the number of people who exercise at least 16 times a month.

Class-Interval	Frequency
0-5	19
6-10	10
11-15	20
16-20	12
21-25	10
26-30	5

Class-Interval	Less than Cumulative Frequency
0-5	19
6-10	29
11-15	49
16-20	61
21-25	71
26-30	76

In the less than O-give, we put the less than cumulative frequencies against the class limits, and each cumulative frequency is obtained by summing all the frequencies up to and including that of the respective class.

Suppose CF5 denotes the cumulative frequency of the 5th class then,

CF5= f1+f2+f3+f4+f5[18]

i.e., frequency of 1st class + frequency of 2nd class + frequency of 3rd class + frequency of 4th class + frequency of 5th class.

Class-Interval	More than Cumulative Frequency
0-5	76
6-10	57
11-15	47
16-20	27
21-25	15
26-30	5

In the more than O-give, we put the more than cumulative frequencies against the class limits, and each cumulative frequency is obtained by summing all the frequencies from that of the respective class through to the last class.

Suppose CF5 denotes the more than cumulative frequency of the 5th class; then,

CF5= N-f1-f2-f3-f4[19]

i.e., Total frequency - frequency of 1st class - frequency of 2nd class - frequency of 3rd class - frequency of 4th class.

CFn = N in the case of less than cumulative frequency, where

n = total number of classes, CFn is the cumulative frequency of the nth (last) class and

N = total frequency.

CF1 = N in the case of more than cumulative frequency, where CF1 is the cumulative frequency of the 1st class and

N = total frequency.
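Both cumulative frequency columns can be reproduced with running totals. The sketch below uses `itertools.accumulate` from Python's standard library; the variable names are illustrative only.

```python
from itertools import accumulate

frequencies = [19, 10, 20, 12, 10, 5]
total = sum(frequencies)  # N

# Less than cumulative frequency: running total from the first class onward.
less_than = list(accumulate(frequencies))

# More than cumulative frequency: running total from the last class backward,
# then flipped back into the original class order.
more_than = list(accumulate(reversed(frequencies)))[::-1]

print(less_than)
print(more_than)
```

The last less than value and the first more than value both equal N. For the 5th class, the more than value is N minus the frequencies of the first four classes.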

1.11. O-give Curves

Cumulative frequency is usually shown using O-give curves.

‘O-give curves are curves drawn taking cumulative frequencies in y-axis and class limits on x-axis. When we draw O-give curve for less than cumulative frequencies it is called less than O-give. In less than O-give the upper-class limits are taken along x-axis. When O-give curve is drawn taking more than cumulative frequency in y-axis and lower-class limits along x-axis, it is called more than o-give. The more-than O-give curves are downward sloping. It moves from top to bottom. The less-than O-give curves are upward sloping. It moves from bottom to top.’[20]

1.12. Summary Explanation on usage of frequency with an example:

Data analysis is the process of examining data to gain insights from it. Suppose we have collected data on fast-food consumption among people of different age groups, with the data distributed over age groups of width 10. If we wish to find the number of people in the 15-25 age group who consume fast food, we use the frequency distribution. But if the class widths are unequal, for example because we collect the data by generation, the frequencies may differ simply because the classes differ in width. So, if we wish to study fast-food consumption among Gen Z, i.e., the 10-25 age group, while avoiding the error caused by the difference in width, we use frequency density. If we want to find the percentage of Gen Z respondents consuming fast food relative to all respondents, we use relative frequency. Finally, if we wish to find the number of people below age 40, we use the less than cumulative frequency, and if we wish to find the number of people above age 12, we use the more than cumulative frequency.

1.13. Change Of Origin and Scale

‘Suppose we are dealing with simple frequency distribution or grouped frequency where the values in class intervals are very large. Then the mid-value would also be a large number which might make the further calculations quite difficult. In that case we can reduce the mid-value by changing the origin and scale. The origin and scale must remain constant for the entire table.’[21]

Let x denote the mid-value, a the origin and h the scale. Then the new value after changing the origin and scale is (x-a)/h. We can change the origin, the scale, or both. Here a can be any arbitrary value, but usually one of the mid-values is taken, and h is a common factor of all the mid-values (or of the mid-values after the change of origin), which makes the calculations easier.

Class-limits (C.L)	mid-value (x)	mid-value(x’) (After change of origin and scale)	Frequency(f)
0-10	5	-2	11
10-20	15	-1	12
20-30	25	0	10
30-40	35	1	5
40-50	45	2	6

Here, a is taken as 25, which is the middle value of the mid-value column. All the values (x-a) are then multiples of 10, so we divide them by 10 to get the transformed values shown above. When a is taken as the middle value, as here, the sum of the transformed mid-values is equal to 0.[22]
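The transformation in the table can be sketched as follows, using a = 25 and h = 10 from the example. The variable names are illustrative; the last step shows that the original mid-values can be recovered as a + h * x'.

```python
mid_values = [5, 15, 25, 35, 45]
a = 25  # origin: the middle value of the mid-value column
h = 10  # scale: common factor of all the (x - a) values

# Change of origin and scale: x' = (x - a) / h
transformed = [(x - a) / h for x in mid_values]

# Reversing the transformation gives back the original mid-values.
recovered = [a + h * x_new for x_new in transformed]

print(transformed)
print(recovered)
```

Working with -2, -1, 0, 1, 2 instead of 5, 15, 25, 35, 45 keeps later arithmetic small, and nothing is lost because the transformation can always be undone.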
