Friday, March 4, 2016

Cheat sheet: Exploratory data analysis

Here is a short version of exploratory data analysis:

1. Variable Identification (categorical, continuous, etc)
2. Univariate Analysis
    a. Categorical variable: Frequency of occurrence (count). Bar chart for visualization
    b. Continuous variable: Mean, median, mode, min and max. Histogram for visualization
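
As a rough illustration of the univariate step, here is a minimal sketch using only the Python standard library. The `colors` and `ages` lists are made-up sample data, not from any real dataset; with a real table you would more likely reach for a data-frame library.

```python
from collections import Counter
import statistics

# Hypothetical sample data, invented purely for illustration.
colors = ["red", "blue", "red", "green", "red", "blue"]
ages = [23, 35, 35, 41, 29, 35, 52]

# Categorical variable: frequency of each category.
color_counts = Counter(colors)

# Continuous variable: central tendency and range.
age_summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "min": min(ages),
    "max": max(ages),
}

print(color_counts)
print(age_summary)
```

The counts are what a bar chart would plot, and the summary numbers are what you would sanity-check against a histogram.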


3. Bi-variate Analysis
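    a. Continuous & Continuous: Scatter plot to see the relationship, then correlation to measure its strength.
       Correlation varies between -1 and +1:
       -1: perfect negative linear correlation
       +1: perfect positive linear correlation
        0: no linear correlation (the variables can still be related in a non-linear way)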
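
To make the -1 / 0 / +1 cases concrete, here is a small sketch that computes the Pearson correlation coefficient by hand. The function name `pearson_r` and the sample lists are my own, chosen for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # points on a rising line
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # points on a falling line
r_curve = pearson_r(x, [4, 1, 0, 1, 4])  # symmetric curve, no linear trend

print(r_up, r_down, r_curve)
```

The third case is worth noticing: y is completely determined by x there, yet the coefficient comes out at zero, which is why the scatter plot should be looked at alongside the number.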

    b. Categorical & Categorical:
        i. Two-way table: Use count and count % as metrics
        ii. Stacked column chart: Visual form of the two-way table
        iii. Chi-square test: Need to read more on this, but it returns a probability (p-value):
            Probability close to 0: the two categorical variables are likely dependent
            Probability close to 1: the data are consistent with the two variables being independent
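
Here is a minimal sketch of the chi-square test on a two-way table, restricted to the 2x2 case so the probability can be computed with the standard library alone. The helper `chi_square_2x2` and both tables of counts are hypothetical, and no continuity correction is applied, so a statistics package may report slightly different numbers.

```python
import math

def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, where the chi-square tail
    # probability reduces to the complementary error function.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts, invented for illustration.
stat_dep, p_dep = chi_square_2x2([[30, 10], [10, 30]])  # lopsided table
stat_ind, p_ind = chi_square_2x2([[20, 20], [20, 20]])  # perfectly even table

print(stat_dep, p_dep)
print(stat_ind, p_ind)
```

The lopsided table gives a probability very close to 0, and the even table gives a probability of 1, matching the two ends of the scale described above.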
    c. Categorical & Continuous:
        i. Z-test / t-test: Assesses whether the means of two groups are statistically different.
        ii. ANOVA: Assesses whether the means of more than two groups are statistically different.
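
For the ANOVA item, here is a rough sketch of how the F statistic is built from between-group and within-group variation. The function `one_way_anova_f` and the three score lists are made up for illustration. It stops at the F statistic; turning that into a probability needs the F distribution, which the standard library does not provide.

```python
def one_way_anova_f(*groups):
    """F statistic for a one-way ANOVA over two or more groups."""
    k = len(groups)                   # number of groups
    n = sum(len(g) for g in groups)   # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of each observation around its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical scores for three groups, invented for illustration.
group_a = [1, 2, 3]
group_b = [2, 3, 4]
group_c = [5, 6, 7]
f_stat = one_way_anova_f(group_a, group_b, group_c)

print(f_stat)
```

A large F means the group means are spread out much more than the noise inside each group would explain. With exactly two groups, F is the square of the equal-variance two-sample t statistic, which is how the t-test and ANOVA items connect.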


..To be continued...

