Data analysis or data analytics is defined as, “a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.”
On the scale of knowledge, cost and time the Data Analysis has six different types ranging from least to most complex; Descriptive, Exploratory, Inferential, Predictive, Casual and Mechanistic.
Descriptive (least amount of effort): The discipline of quantitatively describing the main features of a collection of data. In essence, it describes a set of data.
Exploratory: An approach to analyzing data sets to find previously unknown relationships.– Exploratory models are good for discovering new connections
Inferential: Aims to test theories about the nature of the world in general (or some part of it) based on samples of “subjects” taken from the world (or some part of it). That is, use a relatively small sample of data to say something about a bigger population.– Inference is commonly the goal of statistical models – Inference involves estimating both the quantity you care about and your uncertainty about your estimate – Inference depends heavily on both the population and the sampling scheme
Predictive: The various types of methods that analyze current and historical facts to make predictions about future events. In essence, it is to use the data on some objects to predict values for another object.
– The models predicts, but it does not mean that the independent variables cause – Accurate prediction depends heavily on measuring the right variables – Although there are better and worse prediction models, more data and a simple model works really well – Prediction is very hard, especially about the future references
Causal: To find out what happens to one variable when you change another. – Implementation usually requires randomized studies – There are approaches to inferring causation in non-randomized studies – Causal models are said to be the “gold standard” for data analysis
Mechanistic (most amount of effort): Understand the exact changes in variables that lead to changes in other variables for individual objects. – Incredibly hard to infer, except in simple situations – Usually modeled by a deterministic set of equations (physical/engineering science) – Generally the random component of the data is measurement error – If the equations are known but the parameters are not, they may be inferred with data analysis