Data Classification



‘Data are information collected about individuals, objects, places, etc. Any sort of information can be called data, whether it takes the form of numbers, words, or sentences. Collected data are not always available in a usable form, especially when the collector and the analyst are different people, so they need to be classified before any further steps can be carried out. Data classification is a method in which heterogeneous data are put into homogeneous groups based on some common characteristic. These groups are known as classes. For example, data may be grouped by gender or by age group. Data are classified so that their similarities and differences can be easily detected. A vast amount of data can be analyzed more easily once it is classified into smaller groups, which also makes comparison easier.

There are four major bases on which data classification is done:

a.       Geographical Classification: In this classification method, data are classified based on their geographical location. For example, if we want to study the population growth rate in India, we can collect data and classify them by state. The heads of classification can be states, countries, towns, wards, etc.


b.      Chronological Classification: In this classification method, data are classified based on time. For example, if we want to study the population growth rate in a particular region, we collect data and classify them by year. Here, data are arranged in either ascending or descending order of time. The heads of classification can be years, quarters, months, weeks, days, etc.


c.       Qualitative Classification: In this classification method, data are classified based on certain qualities or attributes, such as marital status or literacy.


d.      Quantitative Classification: In this classification method, data are classified based on certain quantitative characteristics, such as age, weight, or height.
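
The four bases above can be sketched in a few lines of Python. This is only an illustrative sketch: the records, the field names (state, year, literate, age), the helper functions classify and age_group, and the 20-year class width are all assumptions made up for the example, not real data or a standard method.

```python
from collections import defaultdict

# Hypothetical survey records, invented purely for illustration.
records = [
    {"state": "Kerala", "year": 2011, "literate": True, "age": 34},
    {"state": "Bihar", "year": 2001, "literate": False, "age": 52},
    {"state": "Kerala", "year": 2001, "literate": True, "age": 19},
    {"state": "Bihar", "year": 2011, "literate": True, "age": 27},
]


def classify(rows, key_func):
    """Put rows into classes; key_func returns the class label of one row."""
    classes = defaultdict(list)
    for row in rows:
        classes[key_func(row)].append(row)
    return dict(classes)


def age_group(row, width=20):
    """Return a class-interval label such as '20-39' (width is an assumption)."""
    lower = (row["age"] // width) * width
    return f"{lower}-{lower + width - 1}"


# a. Geographical: classes are states.
geographical = classify(records, lambda r: r["state"])
# b. Chronological: classes are years, arranged in ascending order.
chronological = dict(sorted(classify(records, lambda r: r["year"]).items()))
# c. Qualitative: classes come from an attribute (literacy).
qualitative = classify(records, lambda r: "literate" if r["literate"] else "illiterate")
# d. Quantitative: classes are numeric intervals (age groups).
quantitative = classify(records, age_group)

for name, classes in [
    ("Geographical", geographical),
    ("Chronological", chronological),
    ("Qualitative", qualitative),
    ("Quantitative", quantitative),
]:
    # Print the number of records that fall into each class.
    print(name, {label: len(rows) for label, rows in classes.items()})
```

The same classify helper serves all four bases; only the function that assigns a class label to each record changes.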


Classification can be further divided into sub-classes:

5.1. Simple Classification: When classification is done based on one attribute, it is called simple classification. For example: literate and illiterate.


5.2. Manifold Classification: When classification is done based on more than one attribute, it is called manifold classification. For example, in addition to literacy, we may further classify the data by gender.’
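
The difference between simple and manifold classification can be sketched by counting the same records first on one attribute and then on two. The people list and the literacy helper below are hypothetical, invented only to illustrate the idea.

```python
from collections import Counter

# Hypothetical individuals, invented for illustration.
people = [
    {"gender": "female", "literate": True},
    {"gender": "male", "literate": False},
    {"gender": "female", "literate": False},
    {"gender": "male", "literate": True},
    {"gender": "female", "literate": True},
]


def literacy(person):
    """Return the literacy class of one person."""
    return "literate" if person["literate"] else "illiterate"


# Simple classification: one attribute (literacy).
simple = Counter(literacy(p) for p in people)

# Manifold classification: more than one attribute (literacy, then gender).
manifold = Counter((literacy(p), p["gender"]) for p in people)

print(dict(simple))
print(dict(manifold))
```

Here simple has one class per literacy status, while manifold has one class per combination of literacy status and gender.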



Once the data are classified and arranged, we have structured data to move forward with. Classified data need to be presented in tabular form to make them visually understandable. A table is an arrangement of data in rows (horizontal lines) and columns (vertical lines).



The different parts of a table are:

1)      Title: It is a brief description of what the table contains.

2)      Stub: It contains the description of the rows.

3)      Caption: It contains the description of the columns.

4)      Body: It contains the data.

5)      Footnote: It contains additional information about the data.

6)      Source: It states the source from which the data have been collected.
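
The six parts listed above can be made concrete with a small sketch that assembles a plain-text table. The function render_table, its parameter names, and all the values passed to it are hypothetical and invented for illustration; this is one possible layout, not a prescribed format.

```python
def render_table(title, stub, caption, body, footnote, source):
    """Return a plain-text table; each argument corresponds to one table part."""
    # Caption: column descriptions, with an empty cell above the stub column.
    header = [""] + caption
    # Stub labels describe the rows; body holds the data values.
    rows = [[label] + [str(v) for v in values] for label, values in zip(stub, body)]
    # Width of each column is the longest cell found in that column.
    widths = [max(len(line[i]) for line in [header] + rows) for i in range(len(header))]

    def fmt(cells):
        return " | ".join(c.ljust(w) for c, w in zip(cells, widths)).rstrip()

    lines = [title, fmt(header), "-+-".join("-" * w for w in widths)]
    lines += [fmt(row) for row in rows]
    lines += [f"Footnote: {footnote}", f"Source: {source}"]
    return "\n".join(lines)


table_text = render_table(
    title="Literacy by gender (hypothetical counts)",
    stub=["Female", "Male"],
    caption=["Literate", "Illiterate"],
    body=[[2, 1], [1, 1]],
    footnote="Counts are invented for illustration.",
    source="Hypothetical survey",
)
print(table_text)
```

Each argument of the function maps to one part of the table, so leaving one out makes it obvious which part is missing.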






