Commit 6bae62d

author
aryan
committed
polished content & added some svg images
1 parent 4288484 commit 6bae62d

52 files changed

Lines changed: 894 additions & 345 deletions

content/exercises/graded-assignments/statistics-1/W1GA1.md

Lines changed: 105 additions & 113 deletions
Large diffs are not rendered by default.

content/exercises/graded-assignments/statistics-1/W2GA1.md

Lines changed: 1 addition & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -1,18 +1,10 @@
11
---
2-
title: Week 2 Graded Assignment Solution
2+
title: Week 2 Graded Assignment
33
weight: 2
4-
tags:
5-
- statistics
64
categories:
75
- Statistics Graded Assignment
8-
series:
9-
- Statistics Graded Assignment
10-
excludeSearch: false
11-
width: wide
126
---
137

14-
Here are all the questions and their solutions from the PDF **Statistics for Data Science-1, Week-2 Graded Assignment Solution**[^1]:
15-
168
---
179

1810
## 1. Which of the following statements is/are incorrect?
Lines changed: 164 additions & 119 deletions
@@ -1,191 +1,236 @@
11
---
2-
title: Week 3 Graded Assignment Solution
2+
title: Week 3 Graded Assignment
33
weight: 3
4-
tags:
5-
- statistics
64
categories:
75
- Statistics Graded Assignment
8-
series:
9-
- Statistics Graded Assignment
10-
excludeSearch: false
11-
width: wide
126
---
137

14-
Here are all the questions and their solutions from the PDF **Statistics for Data Science-1, Week-2 Graded Assignment Solution**[^1]:
15-
168
---
179

18-
## 1. Which of the following statements is/are incorrect?
10+
**1. The numbers a, b, c, d have frequencies (x + 6), (x + 2), (x − 3) and x respectively. If their mean is m, find the value of x. (Enter the value as the next highest integer.)**
1911

20-
**Options:**
21-
(a) To represent the share of a particular category, bar chart is the most appropriate graphical representation.
22-
(b) The multiplication of the total number of observations and relative frequency of a particular observation should be equal to the frequency of that observation.
23-
(c) Mean can be defined for a categorical variable.
24-
(d) Mode of a categorical variable is the widest slice in a pie chart.
25-
26-
**Answer:** a, c
2712
**Solution:**
28-
To show the share of a particular category, a pie chart is the most appropriate graphical representation. Thus, option (a) is incorrect.
29-
Relative frequency for the ith observation is \$ Rf_i = f_i / N \$, so \$ f_i = Rf_i \times N \$. Thus, option (b) is correct.
30-
Mean cannot be defined for categorical data as meaningful mathematical operations are not possible. Thus, option (c) is incorrect.
31-
In a pie chart, the widest slice corresponds to the mode (highest frequency). Thus, option (d) is correct.
32-
Therefore, options (a) and (c) are correct (as the question asks for incorrect statements).
3313

34-
---
14+
$$
15+
\frac{a(x + 6) + b(x + 2) + c(x − 3) + dx}{(x + 6) + (x + 2) + (x − 3) + x} = m
16+
$$
3517

36-
## 2. If the exam is for a total of 500 marks, then what is the aggregate distribution of marks in Physics, Maths and Biology?
18+
$$
19+
\frac{ax + 6a + bx + 2b + cx − 3c + dx}{4x + 5} = m
20+
$$
3721

38-
(Refer to Figure 2.1.G, which shows: Physics 35%, Maths 18%, Biology 10%)
22+
$$
23+
ax + bx + cx + dx + 6a + 2b − 3c = m(4x + 5) = (4m)x + 5m
24+
$$
3925

40-
**Answer:** 315
41-
**Solution:**
42-
Physics: \$ 500 \times 0.35 = 175 \$
43-
Maths: \$ 500 \times 0.18 = 90 \$
44-
Biology: \$ 500 \times 0.10 = 50 \$
45-
Aggregate: \$ 175 + 90 + 50 = 315 \$
26+
$$
27+
(a + b + c + d − 4m)x = 5m − 6a − 2b + 3c
28+
$$
4629

47-
---
30+
$$
31+
x = \frac{5m − 6a − 2b + 3c}{a + b + c + d − 4m}
32+
$$
4833

49-
## 3. Choose the correct statement(s):
34+
Suppose we substitute a = 2, b = 7, c = 9, d = 17 and m = 6.88:
5035

51-
**Options:**
52-
(a) The pie chart is misleading because it does not obey the area principle.
53-
(b) The pie chart has round off errors.
54-
(c) The pie chart is not a misleading graph.
55-
(d) The slices of pie chart adds up to 100%.
36+
$$
37+
x = \frac{(5 \times 6.88) − (6 \times 2) − (2 \times 7) + (3 \times 9)}{2 + 7 + 9 + 17 − (4 \times 6.88)} = 4.73
38+
$$
5639

57-
**Answer:** c, d
58-
**Solution:**
59-
The pie chart obeys the area principle and the slices add up to 100%. Thus, options (c) and (d) are correct.
40+
Rounding up to the next highest integer, x = 5[^1].
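
The closed form for x can be checked numerically. The following is a minimal sketch, assuming the substituted values a = 2, b = 7, c = 9, d = 17 and m = 6.88 from the solution; the function name `solve_x` is hypothetical and not part of the assignment.

```python
import math


def solve_x(a, b, c, d, m):
    """Solve (a + b + c + d - 4m) x = 5m - 6a - 2b + 3c for x."""
    numerator = 5 * m - 6 * a - 2 * b + 3 * c
    denominator = a + b + c + d - 4 * m
    return numerator / denominator


# Values substituted in the worked solution.
x = solve_x(2, 7, 9, 17, 6.88)

# The question asks for the next highest integer, so round up.
x_answer = math.ceil(x)

print(round(x, 2), x_answer)
```

Rounding up with `math.ceil` mirrors the "next highest integer" instruction in the question.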
6041

6142
---
6243

63-
## 4. What is the combined relative frequency of the academy A, B and D?
44+
**2. What is the mean of the original dataset? (Correct up to 2 decimal places)**
6445

65-
(Refer to Table 2.1.G: Academy C has 50 players, E has 75 players; total 200 players.)
66-
67-
**Answer:** 0.375 (Range: 0.370, 0.380)
6846
**Solution:**
69-
Relative frequency for C: \$ 50/200 = 0.25 \$
70-
Relative frequency for E: \$ 75/200 = 0.375 \$
71-
Combined relative frequency for A, B, D: \$ 1 - (0.25 + 0.375) = 0.375 \$
47+
Let the sum of all observations be $T$ for the noted dataset and $T'$ for the original dataset.
7248

73-
---
49+
$$
50+
\text{Mean} = \frac{T}{N} = m \implies T = m \times N
51+
$$
7452

75-
## 5. Median of the given data is:
53+
$$
54+
T' = T - p + x
55+
$$
7656

77-
**Options:**
78-
(a) Academy C
79-
(b) Academy E
80-
(c) Academy D
81-
(d) Median is not defined for the given data
82-
(e) Insufficient data
57+
$$
58+
\text{Mean for original dataset} = \frac{T'}{N}
59+
$$
8360

84-
**Answer:** d
85-
**Solution:**
86-
The data is nominal and cannot be ordered, so median is not defined.
61+
Suppose N = 8, m = 13, s = 8, x = 18, p = 13:
8762

88-
---
63+
$$
64+
T = 13 \times 8 = 104
65+
$$
8966

90-
## 6. Mode of the given data is:
67+
$$
68+
T' = 104 - 13 + 18 = 109
69+
$$
9170

92-
**Options:**
93-
(a) Academy C
94-
(b) Academy E
95-
(c) Academy D
96-
(d) Mode is not defined for the given data
97-
(e) Insufficient data
71+
$$
72+
\text{Mean for original dataset} = \frac{109}{8} = 13.625
73+
$$
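
The correction of the mean can be sketched in a few lines. This assumes, as the relation T' = T − p + x suggests, that p is the value recorded in the noted dataset and x is the corresponding value in the original dataset; the function and parameter names are hypothetical.

```python
def corrected_mean(n, noted_mean, noted_value, original_value):
    """Mean of the original dataset after swapping one noted value back."""
    total_noted = noted_mean * n                                   # T = m * N
    total_original = total_noted - noted_value + original_value   # T' = T - p + x
    return total_original / n


# Example values from the solution: N = 8, m = 13, p = 13, x = 18.
mean_original = corrected_mean(8, 13, 13, 18)
print(mean_original)
```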
9874

99-
**Answer:** b
100-
**Solution:**
101-
Academy E has the highest frequency (75), so it is the mode.
75+
[^1]
10276

10377
---
10478

105-
## 7. Which of the following graphical representations is appropriate for the number of players in each academy for the given data in Table 2.1.G?
106-
107-
**Options:**
108-
(a) Bar chart
109-
(b) Pie chart
110-
(c) Pareto chart
111-
(d) Both bar chart and pareto chart
79+
**3. What is the sample variance of the original dataset? (Correct up to 2 decimal places)**
11280

113-
**Answer:** d
11481
**Solution:**
115-
Bar chart and Pareto chart are both appropriate for showing counts. Pie chart is for proportions.
82+
Sample variance,
11683

117-
---
84+
$$
85+
s^2 = \frac{\sum(x_i - \bar{x})^2}{N-1}
86+
$$
11887

119-
## 8. The data of number of students sharing the same rank is collected. Which of the following is/are suitable to represent the collected data?
88+
Let $\sum x_i^2$ be $A$ for the noted dataset and $B$ for the original dataset.
12089

121-
**Options:**
122-
(a) (plot with missing baseline)
123-
(b) (plot with correct baseline and order)
124-
(c) (plot with incorrect order of categories)
90+
$$
91+
B = A - p^2 + x^2
92+
$$
12593

126-
**Answer:** b
127-
**Solution:**
128-
Option (b) correctly preserves the order and is not misleading.
94+
where,
95+
96+
$$
97+
A = \left(s^2 + \frac{N m^2}{N-1}\right) \times (N-1) = (N-1)s^2 + N m^2
98+
$$
99+
100+
$$
101+
\text{Sample variance for the original dataset} = \frac{B}{N-1} - \frac{(T')^2}{N(N-1)}
102+
$$
103+
104+
Suppose N = 8, m = 13, s = 8, x = 18, p = 13:
105+
106+
$$
107+
A = \left(8^2 + \frac{8 \times 13^2}{7}\right) \times 7 = 1800
108+
$$
109+
110+
$$
111+
B = 1800 - 13^2 + 18^2 = 1955
112+
$$
113+
114+
$$
115+
\text{Sample variance} = \frac{1955}{7} - \frac{109^2}{8 \times 7} = 67.125
116+
$$
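
The same computation can be sketched in code. It assumes that s is the sample standard deviation of the noted dataset, that p is the noted value and x the original value, and that the sum of squares is recovered as A = (N − 1)s² + Nm²; all names are hypothetical.

```python
def corrected_sample_variance(n, noted_mean, noted_sd, noted_value, original_value):
    """Sample variance of the original dataset from summary statistics."""
    sum_noted = noted_mean * n                                           # T
    sumsq_noted = (n - 1) * noted_sd ** 2 + n * noted_mean ** 2          # A
    sum_original = sum_noted - noted_value + original_value             # T'
    sumsq_original = sumsq_noted - noted_value ** 2 + original_value ** 2  # B
    return sumsq_original / (n - 1) - sum_original ** 2 / (n * (n - 1))


# Example values from the solution: N = 8, m = 13, s = 8, p = 13, x = 18.
variance_original = corrected_sample_variance(8, 13, 8, 13, 18)
print(round(variance_original, 3))
```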
117+
118+
[^1]
129119

130120
---
131121

132-
## 9. Choose the correct statement about categorical data:
122+
**4. Let the data $x_1, x_2, ..., x_n$ represent the retail prices in rupees of a certain commodity in n randomly selected shops in a particular city. What will be the sample variance in the retail prices, if c rupees is added to all the retail prices? (Correct up to 2 decimal places)**
123+
124+
**Solution:**
125+
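If $c$ rupees is added to every retail price, the new prices are $y_i = x_i + c$. The mean shifts to $\bar{y} = \bar{x} + c$, so each deviation $y_i - \bar{y} = x_i - \bar{x}$ is unchanged.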
126+
127+
$$
128+
\text{New variance} = \text{Old variance}
129+
$$
133130

134-
**Options:**
135-
(a) Categorical data have measurement units.
136-
(b) Categorical data can take numerical values, but no meaningful mathematical operations can be performed on it.
137-
(c) Categorical data is quantitative in nature.
138-
(d) All of the above
131+
Example: n = 6, observations = 46, 34, 82, 37, 83, 66
132+
133+
$$
134+
\text{Mean} = \frac{46 + 34 + 82 + 37 + 83 + 66}{6} = 58
135+
$$
136+
137+
$$
138+
\text{Sample variance} = \frac{(46-58)^2 + (34-58)^2 + (82-58)^2 + (37-58)^2 + (83-58)^2 + (66-58)^2}{5} = 485.2
139+
$$
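
A quick numerical check of the shift invariance, as a sketch only: it uses the six observations from the example and an arbitrary shift c = 10, which is a hypothetical value chosen for illustration.

```python
from statistics import variance

prices = [46, 34, 82, 37, 83, 66]

# Hypothetical shift; any constant leaves the sample variance unchanged.
c = 10
shifted_prices = [price + c for price in prices]

# statistics.variance uses the n - 1 denominator (sample variance).
var_old = variance(prices)
var_new = variance(shifted_prices)

print(var_old, var_new)
```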
140+
141+
[^1]
142+
143+
---
144+
145+
**5. Suppose we have n observations $x_1, x_2, ..., x_n$. Calculate the 10th, 50th and 100th percentiles.**
139146

140-
**Answer:** b
141147
**Solution:**
142-
Categorical data can be coded numerically, but no meaningful mathematical operations can be performed.
148+
To find the sample 100p percentile of a dataset of size n:
149+
150+
1. Arrange the data in ascending order.
151+
2. If np is not an integer, take the smallest integer greater than np. The data value in that position is the sample 100p percentile.
152+
3. If np is an integer, take the average of the values in positions np and np+1.
153+
154+
Example: n = 7, observations = 31, 36, 25, 34, 115, 108, 88
155+
Ascending order: 25, 31, 34, 36, 88, 108, 115
156+
157+
- 10th percentile: np = 0.7 → 1st observation = 25
158+
- 50th percentile: np = 3.5 → 4th observation = 36
159+
- 100th percentile: np = 7 (an integer, but there is no 8th position) → last observation = 115[^1]
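
The three-step rule above can be sketched as a small function. To avoid floating-point trouble when testing whether np is an integer, this sketch takes the percentile as a whole number k (so p = k/100) and works with n × k directly. The handling of k = 0 and k = 100 is an assumption that follows the worked example, where the 100th percentile is taken as the last observation; the function name is hypothetical.

```python
def sample_percentile(data, k):
    """Sample k-th percentile (k an integer from 0 to 100) by the rule above."""
    ordered = sorted(data)                   # step 1: ascending order
    n = len(ordered)
    whole, remainder = divmod(n * k, 100)    # np = whole + remainder / 100
    if remainder != 0:
        # np is not an integer: 1-based position whole + 1, i.e. index whole.
        return ordered[whole]
    if whole == 0:
        return ordered[0]                    # assumption: 0th percentile is the minimum
    if whole == n:
        return ordered[-1]                   # no position n + 1, use the last value
    # np is an integer: average the values in positions np and np + 1.
    return (ordered[whole - 1] + ordered[whole]) / 2


observations = [31, 36, 25, 34, 115, 108, 88]
p10 = sample_percentile(observations, 10)
p50 = sample_percentile(observations, 50)
p100 = sample_percentile(observations, 100)
print(p10, p50, p100)
```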
143160

144161
---
145162

146-
## 10. How many students have secured B grade?
163+
**6. Calculate the Inter Quartile Range (IQR) of the data.**
164+
165+
**Solution:**
166+
IQR = Q3 − Q1
167+
168+
- Q1: p = 0.25, np = 1.75 → 2nd observation, Q1 = 31
169+
- Q3: p = 0.75, np = 5.25 → 6th observation, Q3 = 108
147170

148-
(Refer to Figure 2.2.G: B grade 32.5% of 80 students.)
171+
IQR = 108 − 31 = 77[^1]
172+
173+
---
174+
175+
**7. How many outliers are there?**
149176

150-
**Answer:** 26
151177
**Solution:**
152-
\$ 80 \times 0.325 = 26 \$
178+
An observation is an outlier if it is less than Q1 − 1.5 × IQR or greater than Q3 + 1.5 × IQR.
179+
180+
- Q1 = 31, Q3 = 108, IQR = 77
181+
- Lower bound: 31 − (1.5 × 77) = −84.5
182+
- Upper bound: 108 + (1.5 × 77) = 223.5
183+
184+
No observation lies outside these bounds. Hence, there are no outliers[^1].
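
The quartiles, IQR and outlier bounds can be reproduced with a short sketch. It assumes the same seven observations as the percentile example (consistent with np = 1.75 and np = 5.25) and only handles the case used here, where np is not an integer; the helper name is hypothetical.

```python
def quartile_non_integer(ordered, k):
    """k-th percentile of sorted data when n*k/100 is not an integer."""
    whole, remainder = divmod(len(ordered) * k, 100)
    if remainder == 0:
        raise ValueError("np is an integer; average two positions instead")
    return ordered[whole]  # 1-based position whole + 1


ordered = sorted([31, 36, 25, 34, 115, 108, 88])

q1 = quartile_non_integer(ordered, 25)
q3 = quartile_non_integer(ordered, 75)
iqr = q3 - q1

lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# Observations strictly outside the bounds count as outliers.
outliers = [value for value in ordered if value < lower_bound or value > upper_bound]

print(q1, q3, iqr, lower_bound, upper_bound, outliers)
```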
153185

154186
---
155187

156-
## 11. What is the ratio of the students secured C grade to the students secured A grade?
188+
**8. In a deck, there are cards numbered 1 to n such that the number of cards of a given number is the same as the number on the card. Which of the following statement(s) is/are true about the mean and mode of the numbers on this deck of cards?**
189+
190+
a. Mode is n.
191+
b. Mean is $\frac{2n + 1}{3}$.
192+
c. Mode is n − 1.
193+
d. Mean is n.
194+
e. Mean is $\frac{n + 1}{2}$.
195+
f. Mode is not defined for this data.
157196

158-
(Figure 2.2.G: C grade 22.5%, A grade 25% of 80 students.)
197+
**Answer:** a, b
159198

160-
**Answer:** 0.9
161199
**Solution:**
162-
C grade: \$ 80 \times 0.225 = 18 \$
163-
A grade: \$ 80 \times 0.25 = 20 \$
164-
Ratio: \$ 18/20 = 0.9 \$
200+
Each number $x_i = i$ occurs with frequency $f_i = i$, for $i = 1, 2, ..., n$.
165201

166-
---
202+
- Mode = n (the number with the highest frequency)
203+
- Total observations = $1 + 2 + ... + n = \frac{n(n+1)}{2}$
204+
- Sum = $1^2 + 2^2 + ... + n^2 = \frac{n(n+1)(2n+1)}{6}$
205+
- Mean = $\frac{n(n+1)(2n+1)/6}{n(n+1)/2} = \frac{2n+1}{3}$
167206

168-
This is the complete set of questions and solutions from the PDF[^1].
207+
Example for n = 42: Mode = 42, Mean = 28.33[^1].
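
As a sanity check, the deck can be built explicitly and its mean and mode compared with the formulas. This is a sketch for the example value n = 42; `build_deck` is a hypothetical helper.

```python
from statistics import mean, mode


def build_deck(n):
    """Deck in which the number i appears on exactly i cards, for i = 1..n."""
    return [number for number in range(1, n + 1) for _ in range(number)]


n = 42
deck = build_deck(n)

deck_size = len(deck)    # expected n(n + 1) / 2
deck_mode = mode(deck)   # expected n
deck_mean = mean(deck)   # expected (2n + 1) / 3

print(deck_size, deck_mode, round(deck_mean, 2))
```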
169208

170-
<div style="text-align: center">⁂</div>
209+
---
171210

172-
[^1]: Week_2_Graded_Solution.pdf
211+
**9. Figure 3.1.G shows a stem and leaf plot of the ratings (out of 100) of an actor’s performance in different movies. What is the Inter Quartile Range (IQR)? (Correct up to 1 decimal place)**
173212

174-
[^2]: https://www.scribd.com/document/687483981/Week-2-Graded-Assignment-Solution
213+
**Solution:**
214+
n = 10
175215

176-
[^3]: https://www.scribd.com/document/768404514/IIT-Madras-Week-2-Graded-Assignments
216+
- Q1: np = 2.5 → 3rd observation = 72
217+
- Q3: np = 7.5 → 8th observation = 87
218+
- IQR = 87 − 72 = 15[^1]
177219

178-
[^4]: https://www.studocu.com/in/document/indian-institute-of-technology-madras/programming-and-data-science/week-2-graded-solution-bs-ds/82822211
220+
---
179221

180-
[^5]: https://gradedassignments.github.io/iit-madras-graded-assignments/
222+
**10. What is the median rating, if x points are added to all of his ratings and then converted to y points? (Correct up to 2 decimal places)**
181223

182-
[^6]: https://www.youtube.com/watch?v=aI1a91rzTrs
224+
**Solution:**
225+
Median of the original data (10 observations) = mean of the 5th and 6th observations = (75 + 78)/2 = 76.5
183226

184-
[^7]: https://groups.google.com/a/nptel.iitm.ac.in/g/ma1001-discuss/c/_lVR3xXnj5M
227+
- If x points added: median = 76.5 + x
228+
- If the ratings are then rescaled from 100 points to y points: median = $(76.5 + x) \times \frac{y}{100}$
185229

186-
[^8]: https://iitmdatascience.com/term2
230+
Example: x = 3, y = 40
231+
Median = (76.5 + 3) × 0.4 = 31.8[^1]
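
The effect of the shift and rescale on the median can be illustrated with a sketch. The stem and leaf plot is not reproduced here, so the ten ratings below are hypothetical: they were chosen only to agree with the values quoted in the solutions (3rd = 72, 5th = 75, 6th = 78, 8th = 87) and are not the actual data from Figure 3.1.G.

```python
from statistics import median

# Hypothetical ratings, consistent with the quoted 3rd, 5th, 6th and 8th values.
ratings = [60, 68, 72, 74, 75, 78, 80, 87, 90, 95]

x, y = 3, 40  # example values from the solution

# Transform every rating, then take the median of the transformed data.
transformed = [(rating + x) * y / 100 for rating in ratings]
median_direct = median(transformed)

# Apply the same transformation to the original median instead.
median_formula = (median(ratings) + x) * y / 100

print(round(median_direct, 2), round(median_formula, 2))
```

Both routes agree because adding a constant and multiplying by a positive constant preserve the order of the observations.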
187232

188-
[^9]: https://www.studocu.com/in/document/indian-institute-of-technology-madras/iitm-online-degree-data-science-and-programming/week-2-graded-assignment/105815343
233+
<div style="text-align: center">⁂</div>
189234

190-
[^10]: https://www.youtube.com/watch?v=6EPGq4-zDV8
235+
[^1]: Week_3_Graded_Solution.pdf
191236

content/exercises/graded-assignments/statistics-1/W4GA1.md

Lines changed: 1 addition & 9 deletions
@@ -1,18 +1,10 @@
11
---
2-
title: Week 4 Graded Assignment Solution
2+
title: Week 4 Graded Assignment
33
weight: 4
4-
tags:
5-
- statistics
64
categories:
75
- Statistics Graded Assignment
8-
series:
9-
- Statistics Graded Assignment
10-
excludeSearch: false
11-
width: wide
126
---
137

14-
Here are all the questions and their solutions from the attached PDF **Week_4_Graded_Solution.pdf**[^1]:
15-
168
---
179

1810
## **Questions 1–6: Sales Data Analysis**
