<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Hypothesis-testing</title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Favicons -->
<link href="assets/img/Favicon-1.png" rel="icon">
<link href="assets/img/Favicon-1.png" rel="apple-touch-icon">
<!-- Google Fonts -->
<link href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,700i|Raleway:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,500i,600,600i,700,700i" rel="stylesheet">
<!-- Vendor CSS Files -->
<link href="assets/vendor/aos/aos.css" rel="stylesheet">
<link href="assets/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="assets/vendor/bootstrap-icons/bootstrap-icons.css" rel="stylesheet">
<link href="assets/vendor/boxicons/css/boxicons.min.css" rel="stylesheet">
<link href="assets/vendor/glightbox/css/glightbox.min.css" rel="stylesheet">
<link href="assets/vendor/swiper/swiper-bundle.min.css" rel="stylesheet">
<!-- Creating a python code section-->
<link rel="stylesheet" href="assets/css/prism.css">
<script src="assets/js/prism.js"></script>
<!-- Template Main CSS File -->
<link href="assets/css/style.css" rel="stylesheet">
<!-- To set the icon, visit https://fontawesome.com/account-->
<script src="https://kit.fontawesome.com/5d25c1efd3.js" crossorigin="anonymous"></script>
<!-- end of icon-->
<script type="text/javascript" async
src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.7/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<!-- =======================================================
* Template Name: iPortfolio
* Updated: Sep 18 2023 with Bootstrap v5.3.2
* Template URL: https://bootstrapmade.com/iportfolio-bootstrap-portfolio-websites-template/
* Author: BootstrapMade.com
* License: https://bootstrapmade.com/license/
======================================================== -->
</head>
<body>
<!-- ======= Mobile nav toggle button ======= -->
<i class="bi bi-list mobile-nav-toggle d-xl-none"></i>
<!-- ======= Header ======= -->
<header id="header">
<div class="d-flex flex-column">
<div class="profile">
<img src="assets/img/myphoto.jpeg" alt="" class="img-fluid rounded-circle">
<h1 class="text-light"><a href="index.html">Arun</a></h1>
<div class="social-links mt-3 text-center">
<a href="https://www.linkedin.com/in/arunp77/" class="linkedin"><i class="bx bxl-linkedin"></i></a>
<a href="https://github.com/arunp77" class="github"><i class="bx bxl-github"></i></a>
<a href="https://twitter.com/arunp77_" class="twitter"><i class="bx bxl-twitter"></i></a>
<a href="https://www.instagram.com/arunp77/" class="instagram"><i class="bx bxl-instagram"></i></a>
<a href="https://arunp77.medium.com/" class="medium"><i class="bx bxl-medium"></i></a>
</div>
</div>
<nav id="navbar" class="nav-menu navbar">
<ul>
<li><a href="index.html#hero" class="nav-link scrollto active"><i class="bx bx-home"></i> <span>Home</span></a></li>
<li><a href="index.html#about" class="nav-link scrollto"><i class="bx bx-user"></i> <span>About</span></a></li>
<li><a href="index.html#resume" class="nav-link scrollto"><i class="bx bx-file-blank"></i> <span>Resume</span></a></li>
<li><a href="index.html#portfolio" class="nav-link scrollto"><i class="bx bx-book-content"></i> <span>Portfolio</span></a></li>
<li><a href="index.html#skills-and-tools" class="nav-link scrollto"><i class="bx bx-wrench"></i> <span>Skills and Tools</span></a></li>
<li><a href="index.html#language" class="nav-link scrollto"><i class="bi bi-menu-up"></i> <span>Languages</span></a></li>
<li><a href="index.html#awards" class="nav-link scrollto"><i class="bi bi-award-fill"></i> <span>Awards</span></a></li>
<li><a href="index.html#professionalcourses" class="nav-link scrollto"><i class="bx bx-book-alt"></i> <span>Professional Certification</span></a></li>
<li><a href="index.html#publications" class="nav-link scrollto"><i class="bx bx-news"></i> <span>Publications</span></a></li>
<li><a href="index.html#extra-curricular" class="nav-link scrollto"><i class="bx bx-rocket"></i> <span>Extra-Curricular Activities</span></a></li>
<!-- <li><a href="#contact" class="nav-link scrollto"><i class="bx bx-envelope"></i> <span>Contact</span></a></li> -->
</ul>
</nav><!-- .nav-menu -->
</div>
</header><!-- End Header -->
<main id="main">
<!-- ======= Breadcrumbs ======= -->
<section class="breadcrumbs">
<div class="container">
<div class="d-flex justify-content-between align-items-center">
<h2>Hypothesis-testing</h2>
<ol>
<li><a href="machine-learning.html" class="clickable-box">Content section</a></li>
<li><a href="index.html#portfolio" class="clickable-box">Portfolio section</a></li>
</ol>
</div>
</div>
</section><!-- End Breadcrumbs -->
<section class="inner-page">
<div class="container">
<h2>Definition of Hypothesis testing</h2>
<p>Hypothesis testing is a formal procedure of inferential statistics. Its goal is to compare populations or assess relationships between variables using samples.
Hypotheses, or predictions, are evaluated with statistical tests, which also account for sampling error so that valid inferences can be made.</p>
<p>Statistical tests can be parametric or non-parametric.</p>
<h4>1. Parametric tests</h4>
<p>Parametric tests are generally more statistically powerful: when their assumptions hold, they are more likely to detect an effect if one exists. Their assumptions include the following:</p>
<ul>
<li>the population that the sample comes from follows a normal distribution of scores</li>
<li>the sample size is large enough to represent the population</li>
<li>the variances, a measure of variability, of each group being compared are similar</li>
</ul>
<p><b>Examples:</b> t-test, ANOVA, regression analysis, Pearson correlation coefficient.</p>
<p><b>A short description on each of these examples:</b></p>
<ol>
<li><b><a href="t_test.html" target="_blank">t-test</a>: </b> Used to compare the means of two groups when the data are normally distributed and the variances of the two groups are equal.</li>
<li><b><a href="ANOVA.html" target="_blank">ANOVA</a>: </b> Used to compare the means of three or more groups when the data are normally distributed and the variances of the groups are equal.</li>
<li><b>Regression analysis:</b> Used to model the relationship between two or more variables when the residuals are approximately normally distributed and the other assumptions of the model are met.</li>
<li><b>Pearson correlation coefficient: </b>Used to measure the strength and direction of the linear relationship between two continuous variables when the data are normally distributed.</li>
</ol>
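<p>As an illustration of how one of these parametric statistics is built, here is a minimal sketch of the one-way ANOVA F statistic using only the Python standard library. The function name <code>one_way_anova_f</code> and the data are made up for this example; it computes the statistic only and leaves the p-value (which requires the F distribution) to a statistics package.</p>

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data: three small groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f_stat, df_between, df_within)
```

<p>A large F means the group means differ by more than the within-group scatter would suggest.</p>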
<h4>2. <a href="non_parametic_test.html" target="_blank">Non-parametric tests</a></h4>
<p>When your data violate any of the parametric assumptions listed above, non-parametric tests are more suitable. Non-parametric tests are often called “distribution-free tests” because they do not assume that the population follows a particular distribution, such as the normal distribution.</p>
<p><b>Example:</b> Wilcoxon signed-rank test, Mann-Whitney U test, Kruskal-Wallis test, Spearman correlation coefficient</p>
<ul>
<li><b>Wilcoxon signed-rank test:</b> Used to compare the medians of two related samples when the data are not normally distributed.</li>
<li><b>Mann-Whitney U test:</b> Used to compare the medians of two independent groups when the data are not normally distributed.</li>
<li><b>Kruskal-Wallis test:</b> Used to compare the medians of three or more groups when the data are not normally distributed.</li>
<li><b>Spearman correlation coefficient:</b> Used to measure the strength and direction of the monotonic relationship between two continuous variables when the data are not normally distributed.</li>
</ul>
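<p>To show why such tests do not depend on the shape of the distribution, the following sketch computes the Mann-Whitney U statistic by directly counting pairs. The function name <code>mann_whitney_u</code> and the samples are hypothetical; only the ordering of the values matters, so a monotonic transformation of the data leaves U unchanged.</p>

```python
from math import exp


def mann_whitney_u(sample_a, sample_b):
    """Count the pairs (a, b) where a exceeds b; ties count as one half."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical measurements from two independent groups.
sample_a = [1.1, 2.3, 3.0]
sample_b = [2.5, 3.7, 4.2, 5.0]

u_ab = mann_whitney_u(sample_a, sample_b)

# exp() changes the values but not their order, so U is the same.
u_transformed = mann_whitney_u(
    [exp(x) for x in sample_a], [exp(x) for x in sample_b]
)
print(u_ab, u_transformed)
```

<p>Turning U into a p-value needs its null distribution, which a statistics package would normally supply.</p>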
<h3>Different forms of statistical tests</h3>
<p><b>1. Comparison tests:</b> Comparison tests assess whether there are differences in the means, medians or rankings of scores of two or more groups.
To decide which test suits your aim, consider whether your data meet the conditions necessary for parametric tests, the number of samples, and the levels of measurement of your variables.</p>
<table>
<thead>
<tr>
<th>Comparison test </th>
<th>Parametric? </th>
<th>What’s being compared? </th>
<th>Samples</th>
</tr>
</thead>
<tbody>
<tr>
<td>t-test</td>
<td>Yes</td>
<td>Means</td>
<td>2 samples</td>
</tr>
<tr>
<td>ANOVA</td>
<td>Yes</td>
<td>Means</td>
<td>3+ samples</td>
</tr>
<tr>
<td>Mood’s median</td>
<td>No</td>
<td>Medians</td>
<td>2+ samples</td>
</tr>
<tr>
<td>Wilcoxon signed-rank</td>
<td>No</td>
<td>Distributions</td>
<td>2 samples</td>
</tr>
<tr>
<td>Wilcoxon rank-sum (Mann-Whitney U)</td>
<td>No</td>
<td>Sums of rankings</td>
<td>2 samples</td>
</tr>
<tr>
<td>Kruskal-Wallis H</td>
<td>No</td>
<td>Mean rankings</td>
<td>3+ samples</td>
</tr>
</tbody>
</table>
<p><b>2. Correlation tests: </b>Correlation tests determine the extent to which two variables are associated. Pearson’s r is the more statistically powerful test when its assumptions hold, but Spearman’s r is appropriate for ordinal variables, and for interval and ratio variables when the data don’t follow a normal distribution. Of the tests listed here, the chi-square test of independence is the only one that can be used with nominal variables.</p>
<table>
<thead>
<tr>
<th>Correlation test</th>
<th>Parametric? </th>
<th>Variables</th>
</tr>
</thead>
<tbody>
<tr>
<td>Pearson’s r</td>
<td>Yes</td>
<td>Interval/ratio variables</td>
</tr>
<tr>
<td>Spearman’s r</td>
<td>No</td>
<td>Ordinal/interval/ratio variables</td>
</tr>
<tr>
<td>Chi square test of independence</td>
<td>No</td>
<td>Nominal/ordinal variables</td>
</tr>
</tbody>
</table>
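<p>The difference between the two correlation coefficients can be seen in a small sketch. The helper names <code>pearson_r</code>, <code>ranks</code> and <code>spearman_rho</code> are assumptions made for this example, and the data are invented: y grows monotonically with x but not linearly, so the rank-based coefficient is exactly 1 while Pearson’s r is slightly lower.</p>

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i != len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 != len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but non-linear

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 4), round(rho, 4))
```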
<p><b>3. Regression tests: </b>Regression tests assess whether changes in predictor variables are associated with changes in an outcome variable; on their own they do not establish causation. You can decide which regression test to use based on the number and types of variables you have as predictors and outcomes. Most of the commonly used regression tests are parametric. If your data are not normally distributed, you can try a data transformation,
such as taking the square root or logarithm of each value, which can bring the data closer to a normal distribution.</p>
<table>
<thead>
<tr>
<th>Regression test </th>
<th>Predictor </th>
<th>Outcome</th>
</tr>
</thead>
<tbody>
<tr>
<td>Simple linear regression</td>
<td>1 interval/ratio variable</td>
<td>1 interval/ratio variable</td>
</tr>
<tr>
<td>Multiple linear regression</td>
<td>2+ interval/ratio variable(s)</td>
<td>1 interval/ratio variable</td>
</tr>
<tr>
<td>Logistic regression</td>
<td>1+ any variable(s)</td>
<td>1 binary variable</td>
</tr>
<tr>
<td>Nominal regression</td>
<td>1+ any variable(s)</td>
<td>1 nominal variable</td>
</tr>
<tr>
<td>Ordinal regression</td>
<td>1+ any variable(s)</td>
<td>1 ordinal variable</td>
</tr>
</tbody>
</table>
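<p>For the first row of the table, a minimal least-squares sketch is shown here. The function name <code>fit_line</code> and the data are hypothetical; the example only estimates the slope and intercept and does not perform the significance test on the slope.</p>

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Invented data lying exactly on the line y = 2x + 1.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

slope, intercept = fit_line(x, y)
print(slope, intercept)
```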
<p><b>Degrees of freedom:</b> Degrees of freedom, often represented by ν (nu) or df, is the number of independent pieces of information used to calculate a statistic.
It’s calculated as the sample size minus the number of restrictions. For example, a one-sample t-test estimates one mean from n observations, so df = n − 1.</p>
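<p>A short sketch of the idea, with made-up numbers: once the sample mean is fixed, the deviations from it must sum to zero, so only n − 1 of them are free to vary. That is why the sample variance divides by n − 1.</p>

```python
from statistics import mean, variance

sample = [4.0, 7.0, 9.0, 12.0]  # hypothetical observations
n = len(sample)
df = n - 1  # one restriction: the mean is estimated from the same data

deviations = [x - mean(sample) for x in sample]
# The deviations sum to zero, so the last one is determined by the others.
total_deviation = sum(deviations)

sample_var = sum(d ** 2 for d in deviations) / df
print(df, total_deviation, sample_var, variance(sample))
```

<p>The hand-computed value agrees with <code>statistics.variance</code>, which uses the same n − 1 divisor.</p>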
<div class="content-section">
<h2 id="step-by-step-guide-to-hypothesis-testing">
Step-by-step guide to hypothesis testing
</h2>
<ol>
<li>
<strong>Formulate the null and alternative hypotheses</strong>
<p>
The null hypothesis (denoted <em>H₀</em>) is the hypothesis that there is no significant difference or relationship.
The alternative hypothesis (denoted <em>Hₐ</em>) states that a significant difference or relationship exists.
</p>
<ul>
<li>
Example: In a clinical trial, the null hypothesis might state that there is no difference between a new drug and a placebo,
while the alternative hypothesis states that the new drug is more effective.
</li>
<li>
<a href="#null-and-alternative-hypothesis" class="btn-link">Go to detailed section</a>
</li>
</ul>
</li>
<li>
<strong>Choose a significance level</strong>
<p>
The significance level (denoted α) represents the probability of rejecting the null hypothesis when it is actually true.
A commonly used value is 0.05.
</p>
<ul>
<li>
<a href="#significance-level" class="btn-link">Go to detailed section</a>
</li>
</ul>
</li>
<li>
<strong>Select an appropriate statistical test</strong>
<p>
The choice of statistical test depends on the type of data, number of groups, and assumptions of the test.
For example, a t-test compares two means, while ANOVA compares more than two.
</p>
<ul>
<li>
<a href="#selecting-a-appropriate-statistical-test" class="btn-link">Go to detailed section</a>
</li>
</ul>
</li>
<li>
<strong>Calculate the test statistic</strong>
<p>
The test statistic measures how far the observed sample result deviates from the null hypothesis.
</p>
<p><strong>t-test formula:</strong></p>
<p>
$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$
</p>
<p>
where <em>𝑥̄</em> is the sample mean, <em>μ</em> the population mean under the null hypothesis, <em>s</em> the sample standard deviation,
and <em>n</em> the sample size.
</p>
<p><strong>z-test formula:</strong></p>
<p>
$$ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} $$
</p>
<p>
The z-test is used when the population standard deviation is known or the sample size is large.
</p>
</li>
<li>
<strong>Calculate the p-value</strong>
<p>
The p-value is the probability of observing a test statistic at least as extreme as the one calculated,
assuming the null hypothesis is true.
</p>
<ul>
<li>
A small p-value indicates strong evidence against the null hypothesis.
</li>
<li>
If p &lt; α, the null hypothesis is rejected.
</li>
<li>
If p ≥ α, the null hypothesis is not rejected.
</li>
</ul>
<p>
The p-value depends on the sampling distribution (t, z, chi-square, or F) used in the test.
</p>
</li>
<li>
<strong>Interpret the results</strong>
<p>
If the p-value is less than the significance level, the result is statistically significant and the null hypothesis is rejected.
Otherwise, there is insufficient evidence to reject the null hypothesis.
</p>
</li>
</ol>
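<p>Steps 4 to 6 can be sketched for a one-sample z-test with the standard library. The function name <code>one_sample_z_test</code> and all numbers are assumptions chosen for illustration; the sketch assumes a known population standard deviation and a two-tailed alternative.</p>

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z-test; returns (z, p_value, reject_null)."""
    # Step 4: test statistic.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 5: two-tailed p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 6: reject the null hypothesis when the p-value is below alpha.
    reject_null = alpha > p_value
    return z, p_value, reject_null


# Hypothetical study: 100 observations, sample mean 52, null mean 50, sigma 10.
z, p_value, reject_null = one_sample_z_test(
    sample_mean=52.0, mu0=50.0, sigma=10.0, n=100
)
print(z, round(p_value, 4), reject_null)
```

<p>With these numbers z is 2 and the p-value is about 0.0455, so the null hypothesis would be rejected at α = 0.05 but not at α = 0.01.</p>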
<div class="example-section">
<h4>Example of a step-by-step analysis</h4>
<p>To illustrate this process, consider the following scenario.</p>
<p>
Suppose we want to test whether there is a difference in the average height of men and women.
We collect a sample of 100 men and 100 women and measure their heights.
The goal is to determine whether the observed difference in sample means is statistically significant.
</p>
<ul>
<li>
<strong>Null hypothesis (H₀):</strong>
μ₁ = μ₂
<br>
(There is no difference in height between men and women.)
</li>
<li>
<strong>Alternative hypothesis (Hₐ):</strong>
μ₁ ≠ μ₂
<br>
(There is a difference in height between men and women.)
</li>
<li>
<strong>Significance level:</strong>
α = 0.05
</li>
<li>
<strong>Statistical test:</strong>
A two-sample t-test is used to compare the means of the two groups.
</li>
<li>
<strong>Test statistic:</strong>
<p>
The test statistic for a two-sample t-test is:
</p>
<p>
$$t= \frac{\bar{x}_1-\bar{x}_2}{s\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$$
</p>
<p>
where x̄₁ and x̄₂ are the sample means, s is the pooled standard deviation,
and n₁ and n₂ are the sample sizes.
</p>
<p>
Suppose the sample mean height for men is 175 cm and for women is 162 cm.
Let the sample standard deviations be s₁ = 6 cm and s₂ = 5 cm.
</p>
<p>
The pooled standard deviation is:
</p>
<p>
$$
s= \sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}} = \sqrt{\frac{99 \cdot 36 + 99 \cdot 25}{198}} \approx 5.523
$$
</p>
<p>
Substituting into the test statistic:
</p>
<p>
t = (175 − 162) / (5.523 × √(1/100 + 1/100)) ≈ 16.64
</p>
</li>
<li>
<strong>P-value:</strong>
<p>
The p-value is the probability of observing a t-statistic at least as extreme as 16.64,
assuming the null hypothesis is true.
Using a t-distribution with 198 degrees of freedom, the p-value is far less than 0.05.
</p>
</li>
<li>
<strong>Conclusion:</strong>
<p>
Since the p-value is smaller than the significance level (0.05),
we reject the null hypothesis and conclude that there is a statistically
significant difference in average height between men and women.
</p>
</li>
</ul>
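<p>The arithmetic of this example can be checked with a few lines of Python. This is a sketch from the summary statistics only, using the standard pooled-variance standard error s·√(1/n₁ + 1/n₂); the variable names are chosen for this example, and the critical value 1.972 is an approximate table value for a two-tailed test at α = 0.05 with 198 degrees of freedom.</p>

```python
from math import sqrt

# Summary statistics from the height example.
n1, n2 = 100, 100
mean1, mean2 = 175.0, 162.0
s1, s2 = 6.0, 5.0

# Pooled standard deviation (assumes equal population variances).
pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Standard error of the difference in means, then the t statistic.
standard_error = pooled_sd * sqrt(1 / n1 + 1 / n2)
t_stat = (mean1 - mean2) / standard_error
df = n1 + n2 - 2

# Approximate two-tailed critical value for df = 198 at alpha = 0.05.
t_crit = 1.972
significant = abs(t_stat) > t_crit

print(round(pooled_sd, 3), round(t_stat, 3), df, significant)
```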
</div>
<hr>
<h2 id="null-and-alternative-hypothesis">Null and Alternative Hypothesis</h2>
<p>
The null hypothesis (<strong>H₀</strong>) states that there is no effect or no difference,
while the alternative hypothesis (<strong>Hₐ</strong>) states that a meaningful effect or difference exists.
</p>
<ul>
<li>The null hypothesis is assumed true unless evidence suggests otherwise.</li>
<li>The hypotheses must be mutually exclusive and collectively exhaustive.</li>
</ul>
<p><strong>Example:</strong></p>
<ul>
<li>H₀ ⇒ The new drug has no effect on blood pressure.</li>
<li>Hₐ ⇒ The new drug has a significant effect on blood pressure.</li>
</ul>
<p>Alternatively, suppose a researcher is interested in whether there is a difference in job satisfaction between men and women.
They could formulate the null and alternative hypotheses as follows:</p>
<ul>
<li>\(H_0\) ⇒ There is no significant difference in job satisfaction between men and women.</li>
<li>\(H_a\) ⇒ There is a significant difference in job satisfaction between men and women.</li>
</ul>
<p>
The null and alternative hypotheses can be one-tailed or two-tailed, depending on the direction of the expected difference or relationship between the variables.
</p>
<p>A one-tailed hypothesis predicts the direction of the effect (e.g., the new drug will lower blood pressure), while a two-tailed hypothesis does not predict the
direction of the effect (e.g., there is a difference in job satisfaction between men and women).</p>
<p>In summary, formulating the null and alternative hypotheses is a critical step in hypothesis testing because it defines the research question and the direction of the analysis.
How the two hypotheses are worded, and whether the test is one-tailed or two-tailed, depends on the research question and on the expected relationship or difference
between the variables being studied.
</p>
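<p>The practical effect of choosing a one-tailed or two-tailed hypothesis shows up in the p-value. The sketch below uses a z statistic and the standard normal distribution; the function name <code>p_values_from_z</code> and the value z = −1.8 are assumptions made for illustration.</p>

```python
from statistics import NormalDist


def p_values_from_z(z):
    """Return the p-value of a z statistic under each type of alternative."""
    cdf = NormalDist().cdf
    return {
        "two_tailed": 2 * (1 - cdf(abs(z))),  # H_a: a difference in either direction
        "upper_tailed": 1 - cdf(z),           # H_a: the effect is positive
        "lower_tailed": cdf(z),               # H_a: the effect is negative
    }


p_values = p_values_from_z(-1.8)
for name, value in p_values.items():
    print(name, round(value, 4))
```

<p>Here the lower-tailed p-value is about 0.036 while the two-tailed p-value is about 0.072, so the same data would be significant at α = 0.05 only under the directional hypothesis. The direction should therefore be chosen before looking at the data.</p>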
<hr>
<h2 id="significance-level">Significance Level (α)</h2>
<p>
The significance level (α) represents the probability of committing a Type I error —
rejecting a true null hypothesis.
</p>
<ul>
<li>Common values: 0.05 or 0.01</li>
<li>Lower α reduces false positives but may increase false negatives</li>
<li>Choice depends on context, risk, and required confidence</li>
</ul>
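<p>The meaning of α as a Type I error rate can be illustrated with a small simulation. In this sketch (function name, sample size, number of trials and seed are all arbitrary choices), every sample is drawn from a population where the null hypothesis is true, so each rejection is a false positive; the fraction of rejections should come out close to α.</p>

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Simulate two-tailed z-tests on data where the null hypothesis holds."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Null hypothesis is true: population mean 0, known sigma 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(rate)
```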
<hr>
<h2 id="selecting-a-appropriate-statistical-test">Selecting an Appropriate Statistical Test</h2>
<p>
Choosing the correct statistical test depends on multiple factors:
</p>
<ul>
<li><strong>Type of data:</strong> Continuous, categorical, or ordinal</li>
<li><strong>Sample size:</strong> Small samples may require non-parametric tests</li>
<li><strong>Number of groups:</strong> Two groups vs multiple groups</li>
<li><strong>Assumptions:</strong> Normality, equal variance, independence</li>
<li><strong>Research question:</strong> Difference, relationship, or association</li>
</ul>
<p>
Selecting the correct test ensures valid conclusions and reliable statistical inference.
</p>
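<p>For comparison tests, part of this decision can be written down as a simple lookup. The helper below is a hypothetical sketch, not a complete decision procedure: it covers only the number of groups, whether the parametric assumptions hold, and whether two samples are paired, and it returns the test names used on this page.</p>

```python
def suggest_comparison_test(n_groups, parametric, paired=False):
    """Suggest a comparison test name from a few coarse properties of the data."""
    if 2 > n_groups:
        raise ValueError("need at least two groups to compare")
    if n_groups == 2:
        if parametric:
            return "t-test"
        # Two related samples vs two independent samples.
        return "Wilcoxon signed-rank" if paired else "Mann-Whitney U"
    # Three or more groups.
    return "ANOVA" if parametric else "Kruskal-Wallis H"


print(suggest_comparison_test(2, parametric=True))
print(suggest_comparison_test(2, parametric=False, paired=True))
print(suggest_comparison_test(4, parametric=False))
```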
</div>
<div class="box">
<div class="project-card modern">
<div class="project-header">
<span class="project-tag">Statistics • Python</span>
<h3>Hypothesis Testing in Python</h3>
</div>
<p class="project-description">
A structured, example-driven project designed to build a deep understanding of
hypothesis testing — from statistical intuition to real-world implementation in Python.
The focus is on clarity, assumptions, and correct interpretation of results.
</p>
<ul class="project-features">
<li>Null vs alternative hypotheses</li>
<li>t-tests, z-tests, chi-square tests</li>
<li>Confidence intervals &amp; effect size</li>
<li>Assumptions, pitfalls, and best practices</li>
</ul>
<div class="project-footer">
<a href="https://github.com/arunp77/statistics-and-atmospheric-data/tree/main/projects/hypothesis_testing"
target="_blank" class="project-link">
View on GitHub →
</a>
</div>
</div>
</div>
</div>
</section>
</main><!-- End #main -->
<!-- ======= Footer ======= -->
<footer id="footer">
<div class="container">
<div class="copyright">
© Copyright <strong><span>Arun</span></strong>
</div>
</div>
</footer><!-- End Footer -->
<a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="assets/vendor/purecounter/purecounter_vanilla.js"></script>
<script src="assets/vendor/aos/aos.js"></script>
<script src="assets/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="assets/vendor/glightbox/js/glightbox.min.js"></script>
<script src="assets/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="assets/vendor/swiper/swiper-bundle.min.js"></script>
<script src="assets/vendor/typed.js/typed.umd.js"></script>
<script src="assets/vendor/waypoints/noframework.waypoints.js"></script>
<script src="assets/vendor/php-email-form/validate.js"></script>
<!-- Template Main JS File -->
<script src="assets/js/main.js"></script>
</body>
</html>