Assignment 3 #3

Open
Yutong2002 wants to merge 6 commits into main from assignment-3

@Yutong2002
Owner

What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)

I added a full implementation of clustering and resampling methods for the Wine dataset. This includes loading the data and visualizing the relationships between features, applying a scaler, implementing K-means clustering, and performing bootstrap resampling to estimate a confidence interval for the mean color intensity.
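
As a rough illustration of why the scaler step comes before K-means, the sketch below standardizes two features and compares Euclidean distances before and after. The numbers are made-up toy values on very different scales (not rows from the actual Wine data), and `standardize` and `nearest_to_first` are hypothetical helper names; in the notebook, scikit-learn's `StandardScaler` would normally fill this role.

```python
import math
import statistics

# Toy rows: feature_a is on a scale of ~1000, feature_b on a scale of ~1.
# These values are invented for illustration, not taken from the Wine data.
rows = [
    (1000.0, 1.0),
    (1010.0, 3.0),
    (1100.0, 1.1),
    (1200.0, 2.0),
]

def standardize(data):
    """Rescale each column to mean 0 and standard deviation 1 (z-scores)."""
    columns = list(zip(*data))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stds))
        for row in data
    ]

def nearest_to_first(data):
    """Return the index of the row closest (Euclidean distance) to row 0."""
    return min(range(1, len(data)), key=lambda i: math.dist(data[0], data[i]))

raw_neighbor = nearest_to_first(rows)
scaled_neighbor = nearest_to_first(standardize(rows))
print("nearest to row 0, raw features:   ", raw_neighbor)
print("nearest to row 0, scaled features:", scaled_neighbor)
```

On this toy data the nearest neighbor of row 0 changes from row 1 to row 2 once both features are standardized, because the large-scale feature no longer dominates the distance.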

What did you learn from the changes you have made?

I learned how sensitive clustering algorithms are to data scaling and how the choice of k affects the resulting cluster structure. I also learned how bootstrapping can be used to estimate sampling variability without assuming a particular parametric distribution.
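
One common way to look at how k affects the cluster structure is the elbow method: fit K-means for several values of k and watch where the within-cluster sum of squares (inertia) stops dropping quickly. The sketch below is a minimal pure-Python version on made-up one-dimensional data; `kmeans_inertia` is a hypothetical helper, and the notebook would more likely read the `inertia_` attribute of scikit-learn's `KMeans`.

```python
import random

# Made-up 1-D data with three well-separated groups (not the Wine data).
points = [1.0, 1.2, 0.8, 1.1, 5.0, 5.3, 4.8, 5.1, 9.0, 9.2, 8.9, 9.1]

def kmeans_inertia(data, k, n_init=10, n_iter=100, seed=0):
    """Run Lloyd's algorithm n_init times and return the lowest inertia found."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_init):
        centers = rng.sample(data, k)  # random starting centers
        for _ in range(n_iter):
            # Assignment step: group each point with its nearest center.
            clusters = [[] for _ in range(k)]
            for p in data:
                j = min(range(k), key=lambda c: (p - centers[c]) ** 2)
                clusters[j].append(p)
            # Update step: move each center to the mean of its cluster
            # (an empty cluster keeps its previous center).
            updated = [
                sum(c) / len(c) if c else centers[j]
                for j, c in enumerate(clusters)
            ]
            if updated == centers:
                break
            centers = updated
        # Inertia: squared distance from each point to its nearest center.
        inertia = sum(min((p - c) ** 2 for c in centers) for p in data)
        best = min(best, inertia)
    return best

inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
for k, value in inertias.items():
    print(f"k={k}: inertia={value:.2f}")
```

For data with three separated groups like this, the inertia would be expected to fall sharply up to k = 3 and then level off; that bend is the "elbow" used to pick k.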

Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?

I considered using PCA to visualize the clusters with clearer separation and comparing the result with the K-means clustering.

Were there any challenges? If so, what issue(s) did you face? How did you overcome it?

N/A

How were these changes tested?

All scripts were tested in a Jupyter notebook to confirm that the results are generated correctly.

A reference to a related issue in your repository (if applicable)

N/A

Checklist

  • I can confirm that my changes are working as intended

@PatelVishakh left a comment

Assignment 3 Complete! Great work.

Recommended changes:

Q2) Although there is a visible correlation, it would not by itself help us differentiate between species. Detecting whether clusters occur in our data, however, can be used as an indicator of different species.
Q4)i) Should describe the elbow method too.
Q5)i) By taking many random samples from our data (with replacement), we see how the mean varies. This helps us know whether our mean is stable or changes a lot.
Q5)iii) Need to mention the confidence interval and its interpretation.
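
The bootstrap points in Q5 can be sketched as follows: resample the data with replacement many times, record the mean of each resample, and read a 95% percentile interval off the sorted bootstrap means. The values below are made-up stand-ins for color intensity, `bootstrap_mean_ci` is a hypothetical helper, and the percentile interval is only one of several bootstrap interval constructions, so this is not necessarily what the notebook does.

```python
import random
import statistics

# Made-up stand-in values for color intensity (not the actual Wine column).
sample = [5.6, 4.4, 5.7, 7.8, 4.3, 6.8, 5.3, 5.1, 5.2, 7.2,
          3.3, 2.8, 3.9, 2.1, 9.0, 10.5, 8.4, 4.9, 6.1, 3.1]

def bootstrap_mean_ci(data, n_resamples=10_000, alpha=0.05, seed=123):
    """Percentile bootstrap confidence interval for the mean of data."""
    rng = random.Random(seed)
    n = len(data)
    means = sorted(
        statistics.fmean(rng.choices(data, k=n))  # resample with replacement
        for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the bootstrap means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

lower, upper = bootstrap_mean_ci(sample)
print(f"sample mean: {statistics.fmean(sample):.2f}")
print(f"95% bootstrap CI for the mean: ({lower:.2f}, {upper:.2f})")
```

One way to read the result: if the whole sampling and resampling procedure were repeated, roughly 95% of intervals built this way would be expected to cover the true mean. A narrow interval suggests the estimated mean is stable, while a wide one suggests it changes a lot from sample to sample.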

2 participants