Types of Data

 

‘Data are the oil that runs the machine called society. Data are information collected from various sources and stored in different formats and locations. They are broadly classified into two types: variables and attributes.

 

3.1. VARIABLES AND ATTRIBUTES

‘Statistical information collected from a group of individuals or objects is broadly of two types: quantitative and qualitative. Quantitative data are numerical, measurable, and expressed in quantifiable units. They are comparatively easy to understand and classify. Examples: scores, salaries, height, weight, counts.

Qualitative data are non-numerical and are usually neither measurable nor expressed in quantifiable units. They are comparatively difficult to understand and classify. Examples: beauty, grades.

Quantitative data are called variables, while qualitative data are called attributes.

 

3.1.1. Quantitative data are of two types:

3.1.1.1. Discrete: Discrete data are countable, distinct values. They take only whole-number values.

Example: the number of people aged 20. A count of people can never be a fraction.

 

3.1.1.2. Continuous: Continuous data can take any value within a range, including fractions. They are measured rather than counted.

Example: age. Age can be 20 years, 20.5 years (that is, 20 years 6 months), and so on.

 

Continuous variables can further be divided by measurement scale into ratio and interval.

 

3.1.1.2.1. Ratio: Ratio data have an absolute zero point (a point of origin) that means "none of the quantity". Because of this true zero, ratios between values are meaningful.

For example: weight. Because 0 kg means no weight at all, 80 kg is twice as heavy as 40 kg. The ratio of people who prefer tea to coffee is likewise measured from a true zero.

 

3.1.1.2.2. Interval: Interval data have equal, meaningful differences between values but no true zero. Differences can be compared, but ratios cannot.

For example: temperature in degrees Celsius. The difference between 10 °C and 20 °C is the same as between 20 °C and 30 °C, but 20 °C is not twice as warm as 10 °C, because 0 °C does not mean an absence of temperature.

 

3.1.2. Qualitative data are of two types:

3.1.2.1. Nominal: The word nominal denotes a name. Nominal data are category labels with no inherent order, so one category cannot be said to be less than or greater than another. They simply assign each observation to a category.

Example: gender, marital status.

 

3.1.2.2. Ordinal: Ordinal data can be numbers or labels, but they follow a natural order and can be arranged in ascending or descending order. The gaps between ranks are not necessarily equal, so arithmetic operations such as addition or averaging are not meaningful on ordinal data.

Example: grades, Likert-scale responses.’[1]
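To make the idea of ordinal data concrete, here is a minimal Python sketch. The Likert labels and the sample responses are hypothetical, chosen only for illustration. It shows that ordinal labels can be sorted once their order is stated, and that a median (which depends only on order) can be read off without any arithmetic on the labels.

```python
# Hypothetical Likert labels, listed from lowest to highest.
LIKERT_ORDER = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]

# Map each label to its position so that Python knows the intended order.
RANK = {label: position for position, label in enumerate(LIKERT_ORDER)}

# Hypothetical survey responses.
responses = ["Agree", "Neutral", "Strongly agree", "Disagree", "Agree"]

# Ordinal data can be arranged in ascending order of rank.
sorted_responses = sorted(responses, key=RANK.__getitem__)

# With an odd number of responses, the median is simply the middle item.
median_response = sorted_responses[len(sorted_responses) // 2]

print(sorted_responses)
print(median_response)
```

Nominal labels such as gender have no such ranking, so the only sensible operations on them are counting and comparing for equality.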

Identifying the type of data is one of the most important steps in analysis, because the type determines which summaries and comparisons are meaningful.

 Example: Suppose a survey was conducted on a group of individuals, and observations were recorded on a few chosen variables and attributes. Identify the variables and attributes.

Observations:

Age: 15, 12.3, 16.4, 14, 13.1, 14, 16.3, 12, 10, 17, 18.2

Grade: 9, 5, 9, 8, 6, 7, 10, 6, 5, 12, 12

Gender: F, M, M, F, M, F, F, M, F, F, F

Religion: Hindu, Hindu, Muslim, Christian, Hindu, Muslim, Hindu, Sikh, Muslim, Christian, Hindu

Number of awards received: 5, 4, 3, 0, 1, 2, 3, 2, 8, 10

Solution:

Variables:

Discrete: Number of awards received; Continuous: Age

Attributes:

Nominal: Gender, Religion; Ordinal: Grade
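Once each field has been identified, the type guides which summary to compute. The following Python sketch is one possible way to do this, not a prescribed method. The `summarize` helper and its `kind` labels are hypothetical names introduced here. The observations are copied from the example.

```python
from collections import Counter
from statistics import mean, median_low

# Observations copied from the survey example.
age = [15, 12.3, 16.4, 14, 13.1, 14, 16.3, 12, 10, 17, 18.2]
grade = [9, 5, 9, 8, 6, 7, 10, 6, 5, 12, 12]
gender = ["F", "M", "M", "F", "M", "F", "F", "M", "F", "F", "F"]


def summarize(values, kind):
    """Return a summary that is meaningful for the given data type.

    `kind` is a label supplied by the analyst; it is not inferred from the values.
    """
    if kind == "nominal":
        # Only counting is meaningful: report the most frequent category.
        return Counter(values).most_common(1)[0][0]
    if kind == "ordinal":
        # Order is meaningful: report a median that is an observed value.
        return median_low(values)
    if kind in ("discrete", "continuous"):
        # Arithmetic is meaningful: report the mean, rounded for display.
        return round(mean(values), 2)
    raise ValueError(f"unknown data type: {kind}")


age_summary = summarize(age, "continuous")
grade_summary = summarize(grade, "ordinal")
gender_summary = summarize(gender, "nominal")
print(age_summary, grade_summary, gender_summary)
```

Note that Grade is stored as integers, so nothing in the values themselves marks it as ordinal. The analyst has to supply that judgment.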

 

3.2. Data can be further classified into three types based on their arrangement.


3.2.1. ‘Structured Data: Structured data are arranged in rows and columns. They are organized so that they can be easily read and understood. Example: an Excel spreadsheet.


3.2.2. Unstructured Data: Unstructured data are not arranged in any particular order or organized format. Example: free text, images, data from social media.

 

3.2.3. Semi-Structured Data: Semi-structured data combine features of structured and unstructured data. For example, an image on its own is unstructured, but a digital image that carries structured information such as date, time, and location is semi-structured.’[2]
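The three arrangements can be illustrated with a short Python sketch using only the standard library. The file name, date, location, and comment text below are hypothetical values made up for the illustration. The two CSV rows reuse the first two survey observations.

```python
import csv
import io
import json

# Structured: fixed columns, one value per cell, as in a spreadsheet.
csv_text = "age,gender,awards\n15,F,5\n12.3,M,4\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: tagged fields, but records may nest or omit keys.
# (Hypothetical photo metadata.)
json_text = '{"file": "photo.jpg", "taken": {"date": "2021-01-05", "location": "Delhi"}}'
photo = json.loads(json_text)

# Unstructured: free text with no predefined fields. (Hypothetical comment.)
comment = "Loved the event, would attend again!"

print(rows[0]["age"])          # a cell looked up by row and column name
print(photo["taken"]["date"])  # a value reached through nested tags
print(len(comment.split()))    # free text offers only generic operations
```

The structured rows can be queried by column name directly, the semi-structured record needs its tags to be followed, and the free text has to be processed before any field can be extracted from it.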

 

Survey data can be a mixture of all three categories and of the data types described earlier, so we need to know how to classify the data and then arrange them in an understandable form, for example a table.
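As a small illustration of arranging data in tabular form, the Python sketch below turns per-field lists into one row per respondent. It uses only the first three observations of four survey fields, and the column width of 8 characters is an arbitrary choice.

```python
# First three observations of each survey field, copied from the example.
columns = {
    "Age": [15, 12.3, 16.4],
    "Grade": [9, 5, 9],
    "Gender": ["F", "M", "M"],
    "Religion": ["Hindu", "Hindu", "Muslim"],
}

# zip(*...) regroups the per-field lists into one tuple per respondent.
rows = list(zip(*columns.values()))

# Left-align every cell to a fixed width so that the columns line up.
header = " | ".join(f"{name:<8}" for name in columns)
body = [" | ".join(f"{str(value):<8}" for value in row) for row in rows]
table = "\n".join([header] + body)

print(table)
```

Each row now describes one respondent and each column one variable or attribute, which is the layout most analysis tools expect.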



[1] https://www.mygreatlearning.com/blog/types-of-data/

[2] https://www.geeksforgeeks.org/difference-between-structured-semi-structured-and-unstructured-data/
