
Commit ecddb6f

committed
Notes 16-26 for 334
1 parent a775b79 commit ecddb6f

18 files changed

Lines changed: 1793 additions & 11 deletions

notes/courses/CSCI-UA-310/R-1.md

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
1+
---
2+
title: Recitation 1 - Mathematical Background
3+
date: 2026-01-23
4+
---
5+
6+
## Roadmap
7+
8+
This recitation serves as a **mathematical toolbox** review. We cover the essential discrete mathematics concepts required for analyzing algorithm correctness and runtime.
9+
10+
1. **Induction**: The standard template for proofs.
11+
2. **Summations**: Arithmetic and Geometric progressions.
12+
3. **Logarithms**: Identities and approximations.
13+
4. **Asymptotic Bounds**: Proving limits.
14+
15+
---
16+
17+
## 1. Mathematical Induction
18+
19+
Induction is the primary tool for proving the correctness of loop invariants and properties of recursive algorithms.
20+
21+
### 1.1 The Template
22+
23+
To prove a statement $P(n)$ is true for all integers $n \ge k$:
24+
25+
1. **Base Case**: Prove that $P(k)$ is true.
26+
2. **Inductive Hypothesis (IH)**: Assume $P(n)$ is true for some arbitrary $n \ge k$.
27+
3. **Inductive Step**: Show that $P(n) \implies P(n+1)$.
28+
4. **Conclusion**: By the principle of mathematical induction, $P(n)$ is true for all $n \ge k$.
29+
30+
### 1.2 Example: Sum of Cubes
31+
32+
**Claim**: $\sum_{i=1}^n i^3 = \left( \frac{n(n+1)}{2} \right)^2$.
33+
34+
**Proof**:
35+
36+
* **Base Case ($n=1$)**:
37+
LHS: $1^3 = 1$.
38+
RHS: $\left( \frac{1(2)}{2} \right)^2 = 1^2 = 1$.
39+
LHS = RHS.
40+
* **Inductive Hypothesis**: Assume $\sum_{i=1}^k i^3 = \left( \frac{k(k+1)}{2} \right)^2$ for some arbitrary $k \ge 1$.
41+
* **Inductive Step**: Consider $n = k+1$.
42+
$$\sum_{i=1}^{k+1} i^3 = \left( \sum_{i=1}^k i^3 \right) + (k+1)^3$$
43+
Substitute IH:
44+
$$= \left( \frac{k(k+1)}{2} \right)^2 + (k+1)^3$$
45+
$$= \frac{k^2(k+1)^2}{4} + \frac{4(k+1)^3}{4}$$
46+
$$= \frac{(k+1)^2}{4} \left( k^2 + 4(k+1) \right)$$
47+
$$= \frac{(k+1)^2}{4} (k^2 + 4k + 4)$$
48+
$$= \frac{(k+1)^2 (k+2)^2}{4} = \left( \frac{(k+1)(k+2)}{2} \right)^2$$
49+
This matches the formula for $n=k+1$.
50+
* **Conclusion**: The claim holds for all $n \ge 1$.
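A proof by induction covers every $n$, but a quick numerical check is a cheap way to catch an algebra slip before writing the proof up. The sketch below compares both sides of the claim for small $n$; the function names `sum_of_cubes` and `closed_form` are illustrative and not part of the course material.

```python
def sum_of_cubes(n: int) -> int:
    """Left-hand side: add 1^3 + 2^3 + ... + n^3 directly."""
    return sum(i**3 for i in range(1, n + 1))


def closed_form(n: int) -> int:
    """Right-hand side: (n(n+1)/2)^2. n(n+1) is always even, so // is exact."""
    return (n * (n + 1) // 2) ** 2


# Spot-check the claim for the first 200 values of n.
for n in range(1, 201):
    assert sum_of_cubes(n) == closed_form(n)
```

Passing this check is evidence, not proof: only the induction argument establishes the claim for all $n \ge 1$.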
51+
52+
---
53+
54+
## 2. Summations
55+
56+
You will frequently encounter these series in runtime analysis.
57+
58+
### 2.1 Arithmetic Progression
59+
60+
An arithmetic series increases by a constant amount $d$.
61+
$$\sum_{i=0}^{n-1} (a + id) = \frac{n}{2}(a + a_{last})$$
62+
where $a_{last} = a + (n-1)d$ is the last of the $n$ terms.
63+
**Standard Case**: Sum of integers $1$ to $n$:
64+
$$\sum_{i=1}^n i = \frac{n(n+1)}{2} = \Theta(n^2)$$
65+
*Usage*: Appears in Insertion Sort worst-case analysis (nested loops).
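To see where the arithmetic series shows up, the sketch below counts element shifts in a textbook Insertion Sort. On a reverse-sorted input, the element at 0-based position $j$ is shifted past all $j$ elements before it, so the total is $0 + 1 + \dots + (n-1) = n(n-1)/2$. The function name `insertion_sort_shifts` is an illustrative choice.

```python
def insertion_sort_shifts(values):
    """Sort a copy of `values`; return (sorted list, number of element shifts)."""
    a = list(values)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift every larger element one slot to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            shifts += 1
            i -= 1
        a[i + 1] = key
    return a, shifts


n = 10
_, worst = insertion_sort_shifts(range(n, 0, -1))  # reverse-sorted input
assert worst == n * (n - 1) // 2
```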
66+
67+
### 2.2 Geometric Progression
68+
69+
A geometric series changes by a constant ratio $r$. For $r \ne 1$:
70+
$$\sum_{i=0}^n r^i = \frac{r^{n+1} - 1}{r - 1}$$
71+
72+
* If $0 < r < 1$ (decreasing series): The sum is bounded above by its limit $\frac{1}{1-r}$ as $n \to \infty$, so it is $\Theta(1)$.
73+
* If $r > 1$ (increasing series): The sum is dominated by the last term, so it is $\Theta(r^n)$.
74+
75+
*Usage*: Appears in recursion trees where the work per level grows or shrinks geometrically.
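The two regimes can be checked numerically. This is a minimal sketch with illustrative function names; it compares the direct sum against the closed form and shows the bounded and last-term-dominated cases.

```python
def geometric_sum(r, n):
    """Direct evaluation of r^0 + r^1 + ... + r^n."""
    return sum(r**i for i in range(n + 1))


def geometric_closed_form(r, n):
    """Closed form (r^(n+1) - 1) / (r - 1), valid for r != 1."""
    return (r ** (n + 1) - 1) / (r - 1)


# The closed form agrees with the direct sum.
assert geometric_sum(2, 10) == geometric_closed_form(2, 10)

# r < 1: the sum never exceeds 1 / (1 - r) = 2.
assert geometric_sum(0.5, 50) < 2

# r > 1: the sum is within a constant factor of the last term r^n.
ratio = geometric_sum(3, 20) / 3**20
assert 1 < ratio < 1.5
```

For $r > 1$ the ratio of the sum to the last term approaches $\frac{r}{r-1}$, which is why the constant factor disappears inside $\Theta(r^n)$.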
76+
77+
### 2.3 Harmonic Series
78+
79+
$$H_n = \sum_{i=1}^n \frac{1}{i} = \ln n + \gamma + O\!\left(\frac{1}{n}\right)$$
80+
where $\gamma \approx 0.5772$ is the Euler-Mascheroni constant.
81+
$$H_n = \Theta(\log n)$$
82+
*Usage*: Appears in Randomized QuickSort analysis.
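The quality of the logarithmic approximation is easy to observe. The sketch below hard-codes $\gamma$ to double precision (Python's `math` module has no constant for it) and compares $H_n$ with $\ln n + \gamma$; the name `harmonic` is illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hard-coded


def harmonic(n: int) -> float:
    """H_n = 1 + 1/2 + ... + 1/n, summed directly."""
    return sum(1.0 / i for i in range(1, n + 1))


for n in (10, 100, 1000):
    gap = harmonic(n) - (math.log(n) + EULER_GAMMA)
    # The gap is positive and shrinks roughly like 1 / (2n).
    assert 0 < gap < 1.0 / n
```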
83+
84+
---
85+
86+
## 3. Logarithms
87+
88+
In computer science, $\log n$ usually denotes $\log_2 n$ (binary logarithm).
89+
90+
### Key Identities
91+
92+
1. $a = b^{\log_b a}$
93+
2. $\log_c(ab) = \log_c a + \log_c b$
94+
3. $\log_b a = \frac{\log_c a}{\log_c b}$ (Change of base)
95+
4. $a^{\log_b n} = n^{\log_b a}$ (Important for Master Theorem)
96+
97+
**Stirling's Approximation**:
98+
$$n! \approx \sqrt{2\pi n} \left(\frac{n}{e}\right)^n$$
99+
$$\log(n!) = \Theta(n \log n)$$
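Both statements can be sanity-checked without computing huge factorials, because `math.lgamma(n + 1)` returns $\ln(n!)$ directly. The following is a sketch with illustrative function names.

```python
import math


def log2_factorial(n: int) -> float:
    """log2(n!) computed via the log-gamma function: lgamma(n + 1) = ln(n!)."""
    return math.lgamma(n + 1) / math.log(2)


def stirling_log2(n: int) -> float:
    """log2 of Stirling's approximation sqrt(2*pi*n) * (n/e)^n."""
    ln_value = 0.5 * math.log(2 * math.pi * n) + n * (math.log(n) - 1)
    return ln_value / math.log(2)


# Stirling's approximation is already very close for moderate n.
assert abs(log2_factorial(100) - stirling_log2(100)) < 0.01

# log(n!) / (n log n) stays between fixed constants, consistent with Theta(n log n).
n = 10**6
ratio = log2_factorial(n) / (n * math.log2(n))
assert 0.9 < ratio < 1.0
```

The ratio tends to $1$ slowly, since $\ln(n!) = n \ln n - n + O(\log n)$ and the $-n$ term is only a factor of $\ln n$ smaller than the leading term.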
100+
101+
---
102+
103+
## References
104+
105+
* **CLRS**: Appendix A (Summations), Appendix B (Sets, Etc.), Chapter 3 (Growth of Functions).

notes/courses/CSCI-UA-310/R-2.md

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
1+
---
2+
title: Recitation 2 - Exact Analysis & Recurrences
3+
date: 2026-01-30
4+
---
5+
6+
## Roadmap
7+
8+
In this recitation, we perform a precise runtime analysis of a simple algorithm (`FindMax`) to understand the difference between exact instruction counting and asymptotic analysis. We also solve a common recurrence found in Binary Search.
9+
10+
1. **Exact Analysis**: `FindMax` algorithm.
11+
2. **Comparison**: $\log n$ vs $n$.
12+
3. **Recurrence**: $T(n) = T(n/2) + c$.
13+
14+
---
15+
16+
## 1. Runtime of FindMax
17+
18+
### 1.1 The Algorithm
19+
20+
```text
21+
FindMax(A)
22+
1. max = A[1]
23+
2. for i = 2 to A.length
24+
3. if A[i] > max
25+
4. max = A[i]
26+
5. return max
27+
28+
```
29+
30+
### 1.2 Exact Cost Analysis
31+
32+
We assign a cost $c_k$ to line $k$.
33+
34+
* Line 1: Executed 1 time. Cost $c_1$.
35+
* Line 2: The loop header is tested for $i = 2, 3, \dots, n+1$; the final test fails and exits the loop. Executed $n$ times. Cost $c_2 n$.
36+
* Line 3: The body runs $n-1$ times. Cost $c_3(n-1)$.
37+
* Line 4: The assignment runs between $0$ and $n-1$ times, depending on the data. Worst case: $n-1$ times (strictly increasing array). Best case: $0$ times (the first element is already the maximum, e.g. a strictly decreasing array). Worst-case cost $c_4(n-1)$.
38+
* Line 5: Executed 1 time. Cost $c_5$.
39+
40+
**Total Worst-Case Time**:
41+
$$ T(n) = c_1 + c_2 n + c_3(n-1) + c_4(n-1) + c_5 $$
42+
$$ T(n) = (c_2 + c_3 + c_4)n + (c_1 - c_3 - c_4 + c_5) $$
43+
This is of the form $an + b$, which is linear, $\Theta(n)$.
44+
Asymptotic analysis allows us to skip the detailed $c_k$ accounting and jump straight to the linear structure.
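The per-line counts above can be confirmed by instrumenting the algorithm. The sketch below is a 0-indexed Python translation of `FindMax`, written with an explicit `while` loop so that the final failing loop test is counted; the function name `find_max_counted` and the `counts` dictionary are illustrative additions.

```python
def find_max_counted(a):
    """Return (maximum, counts), where counts[k] is how often pseudocode line k runs."""
    counts = {1: 0, 2: 0, 3: 0, 4: 0, 5: 0}
    n = len(a)

    best = a[0]              # line 1: max = A[1]
    counts[1] += 1

    i = 1                    # 0-based index of A[2]
    while True:
        counts[2] += 1       # line 2: loop test, including the final failing test
        if i >= n:
            break
        counts[3] += 1       # line 3: comparison
        if a[i] > best:
            best = a[i]      # line 4: assignment
            counts[4] += 1
        i += 1

    counts[5] += 1           # line 5: return
    return best, counts


_, worst = find_max_counted([1, 2, 3, 4, 5])   # strictly increasing
_, best = find_max_counted([5, 4, 3, 2, 1])    # strictly decreasing
assert worst[4] == 4 and best[4] == 0
```

Only the count for line 4 differs between the two inputs; lines 2 and 3 run $n$ and $n-1$ times regardless of the data.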
45+
46+
---
47+
48+
## 2. Comparing Growth: $\log n$ vs $n$
49+
50+
**Claim**: $\log n < n$ for all $n \ge 1$.
51+
52+
**Proof (Induction)**:
53+
54+
* **Base Case**: $n=1$. $\log 1 = 0 < 1$. True.
55+
* **Hypothesis**: Assume $\log k < k$ for some arbitrary $k \ge 1$.
56+
* **Step**: Prove for $n = k+1$.
57+
Since $k \ge 1$, we have $k + 1 \le 2k$, so $\log(k+1) \le \log(2k) = \log k + 1$.
58+
By IH, $\log k < k$.
59+
Therefore, $\log(k+1) \le \log k + 1 < k + 1$.
60+
61+
---
62+
63+
## 3. The Binary Search Recurrence
64+
65+
Consider the recurrence:
66+
$$ T(n) = T(n/2) + c $$
67+
This arises in algorithms like **Binary Search**, where we do constant work to discard half the input.
68+
69+
### 3.1 Recursion Tree
70+
71+
* Level 0: Cost $c$. Problem size $n$.
72+
* Level 1: Cost $c$. Problem size $n/2$.
73+
* Level 2: Cost $c$. Problem size $n/4$.
74+
* ...
75+
* Level $h$: Cost $c$. Problem size $1$.
76+
77+
Height $h = \log_2 n$ (taking $n$ to be a power of 2), so the tree has $h + 1$ levels.
78+
Total Cost = (Number of levels) $\times$ (cost per level)
79+
$$ T(n) \approx c \log_2 n = \Theta(\log n) $$
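The level count can be observed directly in an iterative Binary Search. The sketch below counts loop iterations, each of which corresponds to one level of the recursion tree; the function name `binary_search_steps` is illustrative, and `n.bit_length()` is used because it equals $\lfloor \log_2 n \rfloor + 1$ for $n \ge 1$.

```python
def binary_search_steps(sorted_values, target):
    """Return (index or -1, number of loop iterations) for a standard binary search."""
    lo, hi = 0, len(sorted_values) - 1
    steps = 0
    while lo <= hi:
        steps += 1                      # constant work per level
        mid = (lo + hi) // 2
        if sorted_values[mid] == target:
            return mid, steps
        if sorted_values[mid] < target:
            lo = mid + 1                # discard the left half
        else:
            hi = mid - 1                # discard the right half
    return -1, steps


data = list(range(1024))
n = len(data)
# No search, successful or not, uses more than floor(log2 n) + 1 iterations.
for target in range(-1, n + 1):
    _, steps = binary_search_steps(data, target)
    assert steps <= n.bit_length()
```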
80+
81+
### 3.2 Substitution Method
82+
83+
Guess $T(n) \le d \log n$.
84+
$$
85+
\begin{align*}
86+
T(n) &= T(n/2) + c \\\\
87+
&\le d \log(n/2) + c \\\\
88+
&= d(\log n - 1) + c \\\\
89+
&= d \log n - d + c
90+
\end{align*}
91+
$$
92+
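We need $-d + c \le 0$, so $d \ge c$.
93+
The inductive step holds for any such $d$. Because $\log 1 = 0$, the base case must be taken at $n = 2$ rather than $n = 1$, with $d$ also large enough that $T(2) \le d$. This gives $T(n) = O(\log n)$.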
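The bound from the substitution argument can be tested against the recurrence itself. The sketch below assumes the base case $T(1) = c$ and uses floor division for $n/2$; neither detail is fixed by the notes. Under that assumption $T(2) = 2c$, so the constant $d = 2c$ satisfies both $d \ge c$ and the base case at $n = 2$.

```python
import math


def T(n: int, c: float = 1.0) -> float:
    """Evaluate T(n) = T(n // 2) + c with the assumed base case T(1) = c."""
    return c if n <= 1 else T(n // 2, c) + c


c = 1.0
d = 2 * c  # large enough for the inductive step (d >= c) and the base case T(2) <= d
for n in range(2, 5000):
    assert T(n, c) <= d * math.log2(n)
```

At $n = 1$ the bound $d \log n = 0$ cannot hold for any $d$, which is why the check starts at $n = 2$.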
94+
95+
---
96+
97+
## References
98+
99+
* **CLRS**: Chapter 2 (Analysis of Algorithms), Chapter 4 (Recurrences).

notes/courses/CSCI-UA-310/R-3.md

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
1+
---
2+
title: Recitation 3 - Merge Sort Variants & QuickSort
3+
date: 2026-02-06
4+
---
5+
6+
## Roadmap
7+
8+
We explore variations of standard algorithms to deepen our understanding of recurrences and worst-case scenarios.
9+
10+
1. **Unequal Split Merge Sort**: Analyzing a 1/3-2/3 split.
11+
2. **QuickSort Worst Case**: Constructing an input that triggers $\Theta(n^2)$ behavior.
12+
13+
---
14+
15+
## 1. Merge Sort: Unequal Split
16+
17+
Standard Merge Sort splits $n$ into $n/2$ and $n/2$. What if we split into $n/3$ and $2n/3$?
18+
19+
### 1.1 The Recurrence
20+
21+
The algorithm still sorts two subarrays and merges them in linear time.
22+
$$T(n) = T(n/3) + T(2n/3) + cn$$
23+
24+
### 1.2 Recursion Tree Analysis
25+
26+
* **Work per level**:
27+
Level 0: $cn$.
28+
Level 1: $c(n/3) + c(2n/3) = cn$.
29+
Level $k$: Sum of costs is $cn$.
30+
* **Depth**:
31+
The tree is unbalanced.
32+
* **Minimum Depth**: Along the $n/3$ branch. $n/3^h = 1 \implies h = \log_3 n$.
33+
* **Maximum Depth**: Along the $2n/3$ branch. $n(2/3)^h = 1 \implies h = \log_{3/2} n$.
34+
35+
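Every level down to depth $\log_3 n$ is full and costs exactly $cn$, which gives $\Omega(n \log n)$. No level costs more than $cn$, and the max depth is logarithmic ($\log_{3/2} n \approx 1.7 \log_2 n$), which gives $O(n \log n)$. The total time is therefore still: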
36+
$$T(n) = \Theta(n \log n)$$
37+
The base of the log affects the constant factor, but not the asymptotic class.
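The depth bounds can be observed on a working implementation. The sketch below is one possible Merge Sort with a 1/3-2/3 split that also reports the deepest recursion level reached; the names `merge` and `merge_sort_thirds` and the rounding rule `cut = max(1, n // 3)` are illustrative choices, not taken from the notes.

```python
import math


def merge(left, right):
    """Merge two sorted lists in linear time."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def merge_sort_thirds(values, depth=0):
    """Return (sorted list, deepest recursion level reached)."""
    n = len(values)
    if n <= 1:
        return list(values), depth
    cut = max(1, n // 3)  # about n/3 on the left, 2n/3 on the right
    left, depth_left = merge_sort_thirds(values[:cut], depth + 1)
    right, depth_right = merge_sort_thirds(values[cut:], depth + 1)
    return merge(left, right), max(depth_left, depth_right)


data = list(range(1000, 0, -1))
result, depth = merge_sort_thirds(data)
assert result == sorted(data)
# The deepest branch sits between log_3 n and roughly log_{3/2} n (rounding adds slack).
assert math.log(1000, 3) <= depth <= math.log(1000, 1.5) + 2
```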
38+
39+
---
40+
41+
## 2. QuickSort Worst-Case Construction
42+
43+
We want to construct an input of size $n$ that causes a deterministic QuickSort (pivot = last element) to run in $\Omega(n^2)$.
44+
45+
### 2.1 The Goal
46+
47+
We need the pivot to always be the maximum (or minimum) of the current subarray. For a subarray of size $k$, this produces partitions of size $k-1$ and $0$.
48+
49+
### 2.2 Construction Strategy
50+
51+
Let's work backwards from the desired execution trace.
52+
53+
* **Step 1**: We want the pivot (last element) to be the largest, say $n$.
54+
Array: $[ \dots, n ]$. Partition leaves $[ \dots ]$ and empty.
55+
* **Step 2**: In the remaining array of size $n-1$, we want the last element to be the largest remaining, $n-1$.
56+
Array: $[ \dots, n-1, n ]$.
57+
58+
**Resulting Input**: Sorted array $[1, 2, 3, \dots, n]$.
59+
Trace:
60+
61+
1. Pivot $n$. Partition: $[1, \dots, n-1]$ vs [].
62+
2. Pivot $n-1$. Partition: $[1, \dots, n-2]$ vs [].
63+
3. ...
64+
65+
**Reverse Sorted Input**: $[n, n-1, \dots, 1]$ with pivot = last ($1$).
66+
67+
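1. Pivot $1$. Partition: [] vs. the remaining $n-1$ elements $\{2, \dots, n\}$.
68+
2. Each later pivot is again the minimum or maximum of its subarray (which of the two depends on how the partition routine reorders the elements), so every partition again has sizes $0$ and $k-1$.
69+
This also yields $\Theta(n^2)$.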
70+
71+
To protect against this, we use **Randomized QuickSort**. Because the pivot is chosen at random, no fixed "killer input" can force the worst case; the expected running time is $O(n \log n)$ on every input.
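The quadratic behavior on the two constructed inputs can be confirmed by counting comparisons. The sketch below assumes a Lomuto-style partition with the last element as pivot (the notes fix only the pivot choice, not the partition scheme) and uses an explicit stack so that the $n$-deep worst case does not hit Python's recursion limit. The name `quicksort_comparisons` is illustrative.

```python
def quicksort_comparisons(values):
    """Sort a copy with last-element-pivot QuickSort; return the comparison count."""
    a = list(values)
    comparisons = 0
    stack = [(0, len(a) - 1)]  # subarrays still to be sorted
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = a[hi]
        i = lo - 1
        for j in range(lo, hi):          # Lomuto partition (assumed scheme)
            comparisons += 1
            if a[j] <= pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[hi] = a[hi], a[i + 1]
        stack.append((lo, i))            # elements <= pivot
        stack.append((i + 2, hi))        # elements > pivot
    assert a == sorted(values)
    return comparisons


n = 100
# A subarray of size k costs k - 1 comparisons, and sizes shrink by one each time.
assert quicksort_comparisons(range(1, n + 1)) == n * (n - 1) // 2
assert quicksort_comparisons(range(n, 0, -1)) == n * (n - 1) // 2
```

Running the same function on a shuffled copy of the input typically gives a far smaller count, which is the effect Randomized QuickSort relies on.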
72+
73+
---
74+
75+
## References
76+
77+
* **CLRS**: Chapter 4 (Recurrences), Chapter 7 (Quicksort).

notes/courses/MATH-UA-334/13-hypothesis-testing.md

Lines changed: 4 additions & 7 deletions
@@ -59,13 +59,10 @@ Suppose we have a sample $X\_1, \dots, X\_n \sim \mathcal{N}(\mu, \sigma^2)$ wit
5959

6060
Whenever we make a decision using a statistical test, we risk making one of two distinct types of errors:
6161

62-
1. **$\alpha$ - Type I Error (False Positive):** We incorrectly reject the null hypothesis $H\_0$ when it is actually true.
63-
- The probability of committing a Type I Error is called the **Significance Level**, denoted by $\alpha$.
64-
- $\alpha = \prob(\text{Output } H\_1 \mid H\_0 \text{ is true})$.
65-
2. **$\beta$ - Type II Error (False Negative):** We incorrectly accept the null hypothesis $H\_0$ when the alternative $H\_1$ is actually true.
66-
- The probability of a Type II error is denoted by $\beta$.
67-
- The **Power** of the test is defined as $1 - \beta$, which is the probability of correctly rejecting $H\_0$ when $H\_1$ is true.
68-
- $1 - \beta = \prob(\text{Output } H\_1 \mid H\_1 \text{ is true})$.
62+
| Truth \ Output | $H\_0$ | $H\_1$ |
63+
| :--- | :--- | :--- |
64+
| **$H\_0$ is true** | Correct decision; $\prob(\text{Output } H\_0 \mid H\_0 \text{ is true}) = 1 - \alpha$ | **Type I Error** (False Positive); $\alpha = \prob(\text{Output } H\_1 \mid H\_0 \text{ is true})$ |
65+
| **$H\_1$ is true** | **Type II Error** (False Negative); $\beta = \prob(\text{Output } H\_0 \mid H\_1 \text{ is true})$ | Correct decision (**Power**); $1 - \beta = \prob(\text{Output } H\_1 \mid H\_1 \text{ is true})$ |
6966

7067
In rigorous statistical practice, it is mathematically impossible to simultaneously minimize both $\alpha$ and $\beta$ for a fixed sample size $n$. The standard frequentist paradigm dictates that we fix the significance level $\alpha$ at a pre-determined, strictly controlled threshold (such as $0.05$ or $0.01$) and then actively seek the specific test that maximizes the statistical power $1 - \beta$.
7168

File renamed without changes.

notes/courses/MATH-UA-334/15-generalized-lr.md

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
11
---
2-
title: Generalized LR Tests
2+
title: Generalized LR Test
33
date: 2026-03-23
44
---
55

@@ -221,4 +221,4 @@ Here, the full space has $n$ parameters and the null space has $1$ parameter. By
221221
## References
222222

223223
1. Rice, J. A. (2007). *Mathematical Statistics and Data Analysis* (3rd ed.). Thomson Brooks/Cole.
224-
2. Han, Y. (2026). Lecture 15: Generalized LR Test.
224+
2. Han, Y. (2026). Lecture 15: Generalized LR Test.
