Stats Tests

When, ## Why, ### and How. to use different tests in R and python

** I am currently editing**

Tests of normality

*1. Summary /:The Shapiro–Wilk test tests the normality of a given distribution. Shapiro Wilk effectively compares two estimates of variance;

The null-hypothesis of this test is that the population is normally distributed. Thus, if the p-value is less than the chosen alpha level α, then the null hypothesis is rejected and there is evidence that the data tested are not from a normally distributed population; in other words, the data are not normal.

In R

$ shapiro.test(ItemName) Sample output In PYTHON

from scipy import stats np.random.seed(12345678) x = stats.norm.rvs(loc=5, scale=3, size=100) stats.shapiro(x_array_like) output (0.9772805571556091, 0.08144091814756393) This returns a float, which is the p-value for the hypothesis test.

Chi Squared

PCA

Tests For Variance

varTest

Tests comparing Variance

When to use this test Comparing two variances is useful in several cases, including: When you want to perform a two samples t-test to check the equality of the variances of the two samples When you want to compare the variability of a new measurement method to an old one. Does the new method reduce the variability of the measure?

Method 1 var.test(values ~ groups, data, alternative = "two.sided") Method 2 var.test(x, y, alternative = "two.sided") Preleminary test to check F-test assumptions

   **F TEST**

F-test is very sensitive to departure from the normal assumption. You need to check whether the data is normally distributed before using the F-test.

Shapiro-Wilk test can be used to test whether the normal assumption holds. It’s also possible to use Q-Q plot (quantile-quantile plot) to graphically evaluate the normality of a variable. Q-Q plot draws the correlation between a given sample and the normal distribution.

If there is doubt about normality, the better choice is to use Levene’s test or Fligner-Killeen test, which are less sensitive to departure from normal assumption Compute F-test

F-test

res.ftest <- var.test(len ~ supp, data = my_data) res.ftest

Example output The p-value of F-test is p = 0.2331433 which is greater than the significance level 0.05. In conclusion, there is no significant difference between the two variances

Stats Tests

When, ## Why, ### and How. to use different tests in R and python

Tests of normality

Tests For Variance

Tests comparing Variance

Method 1 var.test(values ~ groups, data, alternative = "two.sided") **Method 2 ** var.test(x, y, alternative = "two.sided") Preleminary test to check F-test assumptions

F-test

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

Method 1 var.test(values ~ groups, data, alternative = "two.sided") Method 2 var.test(x, y, alternative = "two.sided") Preleminary test to check F-test assumptions