Posts

Showing posts from November, 2023

Data Classification

Data are information collected about individuals, objects, places, and so on. Any sort of information is called data, and it can take the form of numbers, words, sentences, etc. Collected information is not always available in a usable form, especially when the collector and the analyst are different people. The data need to be classified first before further steps can be carried out. Data classification is a method in which heterogeneous data are put into homogeneous groups based on some common characteristics. These groups are known as classes, for example gender or age group. Data are classified so that their similarities and differences can be easily detected. A vast amount of data can be analyzed more easily once they are classified into smaller groups, which makes comparison easier. There are four major bases on which data classification is done: a. Geographical Classification: In this method, data are classified based on their geographical locations. For example, we want to stud
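The idea of putting heterogeneous records into homogeneous classes can be sketched in a few lines of Python. This is only an illustration: the records, the `age_group` boundaries, and the `classify` helper are all assumptions made up for the example, not part of the original post.

```python
from collections import defaultdict

# Hypothetical records, invented purely for illustration.
records = [
    {"name": "A", "gender": "F", "age": 23},
    {"name": "B", "gender": "M", "age": 45},
    {"name": "C", "gender": "F", "age": 67},
    {"name": "D", "gender": "M", "age": 15},
    {"name": "E", "gender": "F", "age": 38},
]


def age_group(age):
    """Map an age to a class label (the boundaries are an arbitrary choice)."""
    if age < 18:
        return "0-17"
    if age < 40:
        return "18-39"
    if age < 60:
        return "40-59"
    return "60+"


def classify(rows, key_func):
    """Group record names by the class label that key_func assigns to each row."""
    classes = defaultdict(list)
    for row in rows:
        classes[key_func(row)].append(row["name"])
    return dict(classes)


# The same records classified on two different characteristics.
by_gender = classify(records, lambda row: row["gender"])
by_age = classify(records, lambda row: age_group(row["age"]))

print(by_gender)
print(by_age)
```

Once the records sit in classes, comparing groups (for example, counting members per class) becomes a simple loop over the resulting dictionary.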

Types of Data

Data are the oil that runs the machine named society. They are information collected from various sources and stored in different formats in different locations. There are various types of data, which are broadly classified into two types: variables and attributes.

3.1. VARIABLES AND ATTRIBUTES

Statistical information collected from a group of individuals or objects is basically of two types: quantitative and qualitative. Quantitative data are numerical data that are measurable and expressed in quantifiable units. They are easier to understand and classify. Examples: scores, salaries, height, weight, counts, etc. Qualitative data are non-numerical data that are usually neither measurable nor expressed in quantifiable units. They are comparatively difficult to understand and classify. Examples: beauty, grades, etc. Quantitative data are called variables, while qualitative data are called attributes.

3.1.1. Quantitative data are of two types: 3.1.1.1. Di
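The distinction between variables (quantitative) and attributes (qualitative) can be illustrated with a small Python sketch. The sample record and the `split_variables_attributes` helper are hypothetical, and the rule used here (numeric values count as variables, everything else as attributes) is a simplifying assumption: a number used only as a label, such as a postal code, would be misclassified by it.

```python
from numbers import Number

# Hypothetical observation about one individual, invented for illustration.
sample = {
    "height_cm": 172.5,
    "salary": 52000,
    "score": 88,
    "grade": "B",
    "eye_colour": "brown",
}


def split_variables_attributes(record):
    """Separate numeric fields (variables) from non-numeric ones (attributes)."""
    variables, attributes = {}, {}
    for field, value in record.items():
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(value, Number) and not isinstance(value, bool):
            variables[field] = value
        else:
            attributes[field] = value
    return variables, attributes


variables, attributes = split_variables_attributes(sample)
print("Variables:", variables)
print("Attributes:", attributes)
```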

Data Collection

As Charles Kettering says, “a problem well-stated is a problem half solved.” [1] Before moving to any further step, we need to state the problem in clear terms: ‘What is the problem?’, ‘What do we need to find out?’, ‘What is our objective?’. These things need to be very clear before we move to the next step. Without a clearly set goal, we would move like a ship without radar. A data analyst is rarely given a problem statement; instead, they are given a question and asked to find an answer to it. For example: ‘Why are we losing customers?’, ‘How can we increase sales?’, ‘How can we reduce crime?’, ‘How can we achieve our goals?’, ‘How can we increase literacy?’, ‘Why aren’t our plans working?’. All these questions have hidden problem statements, such as ‘we are losing customers’, ‘we are not able to increase sales’, ‘crime is increasing or is very high’, ‘we are unable to achieve our goals’, ‘the literacy rate is very low’, ‘plans aren’t working as intended’. These statements are like saying that ‘X has fe