Welcome to my Python for Machine Learning repository.
This repository documents my hands-on progression from fundamental Python programming to machine-learning-ready code, using Jupyter Notebooks, practice sets, and modular mini-projects.
Core Python programming concepts:
- Control flow (if-else statements, for loops, while loops, nested loops, loop control mechanisms)
- Data structures (lists, tuples, sets, frozensets, dictionaries, strings)
- String operations and practice exercises
- Functions and arguments
- Lambda functions, list comprehensions, and dictionary comprehensions
- Sequence sum patterns
- Modules and operators
- Decorators
- Namespace and scope management
- Classification of Python errors
- Implementation of try, except, else, and finally blocks
- Creating and handling custom exceptions
- Text file operations (read, write, append) and context management using the `with` statement
- Binary file operations
- Serialization and deserialization
- Pickling and unpickling Python objects
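
As a quick illustration of the exception-handling and pickling topics above, here is a minimal sketch. The names `InsufficientFundsError`, `withdraw`, `safe_withdraw`, and `pickle_round_trip` are hypothetical and chosen only for this example; they are not taken from the notebooks.

```python
import pickle
import tempfile
from pathlib import Path


class InsufficientFundsError(Exception):
    """Hypothetical custom exception, defined only for this example."""


def withdraw(balance, amount):
    """Return the new balance, raising if the amount exceeds the balance."""
    if amount > balance:
        raise InsufficientFundsError(f"cannot withdraw {amount} from {balance}")
    return balance - amount


def safe_withdraw(balance, amount):
    """Record which of the except/else/finally blocks actually run."""
    steps = []
    try:
        balance = withdraw(balance, amount)
    except InsufficientFundsError:
        steps.append("except")   # runs only when withdraw() raises
    else:
        steps.append("else")     # runs only when no exception was raised
    finally:
        steps.append("finally")  # always runs
    return balance, steps


def pickle_round_trip(obj):
    """Pickle obj to a binary file, then unpickle it again."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "data.pkl"
        with open(path, "wb") as fh:   # the with statement closes the file
            pickle.dump(obj, fh)
        with open(path, "rb") as fh:
            return pickle.load(fh)


ok_result = safe_withdraw(100, 30)
failed_result = safe_withdraw(100, 500)
restored = pickle_round_trip({"name": "demo", "scores": [1, 2, 3]})
```

The `else` branch only runs on success, while `finally` runs in both cases, which is why both results end with `"finally"`.
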
Object-Oriented Programming principles:
- Classes and objects (Part 1 & 2)
- Reference variables and user-defined data types
- Inheritance hierarchies
- Encapsulation and data hiding
- Abstraction
- Polymorphism
- Object aggregation and the `super()` function
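
The sketch below shows how inheritance, method overriding, aggregation, and `super()` can fit together. The classes `Engine`, `Vehicle`, and `Truck` are invented for illustration and do not come from the notebooks.

```python
class Engine:
    """A small standalone object that other classes can hold a reference to."""

    def __init__(self, horsepower):
        self.horsepower = horsepower


class Vehicle:
    def __init__(self, name, engine):
        self.name = name
        # Aggregation: the Engine is created elsewhere and passed in.
        self._engine = engine  # leading underscore marks it as internal by convention

    def describe(self):
        return f"{self.name} ({self._engine.horsepower} hp)"


class Truck(Vehicle):
    def __init__(self, name, engine, payload_kg):
        super().__init__(name, engine)  # reuse the parent initializer
        self.payload_kg = payload_kg

    def describe(self):
        # Polymorphism: same method name, extended behavior.
        return f"{super().describe()}, payload {self.payload_kg} kg"


engine = Engine(150)
fleet = [Vehicle("Car", engine), Truck("Hauler", engine, 2000)]
descriptions = [vehicle.describe() for vehicle in fleet]
```

Both objects share one `Engine` instance, which is the difference between aggregation and having each vehicle build its own engine internally.
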
Practice notebooks dedicated to:
- Fundamental logic and level-1 problem solving
- List and dictionary manipulation exercises
- List comprehension practice
- Decorator practicals
- OOP practice
- Exception handling practicals
Functional mini-projects developed during the learning phase:
- Standard Calculator and Calculator V2
- ATM System simulation
- Library Management project
- DinosaursPedia
- Google Account Creation & Login simulation
Fundamental array computing:
- Array creation and attributes
- Basic indexing and slicing
- Iteration and array reshaping
- Array stacking and splitting
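
A short sketch of the array basics listed above, using small made-up arrays rather than data from the notebooks:

```python
import numpy as np

# Creation and attributes
a = np.arange(12)          # integers 0 through 11
m = a.reshape(3, 4)        # 3 rows, 4 columns
shape, ndim = m.shape, m.ndim

# Basic indexing and slicing
second_row = m[1]          # the row holding 4, 5, 6, 7
block = m[:2, 1:3]         # first two rows, columns 1 and 2

# Iteration: looping over a 2-D array yields one row at a time
row_sums = [int(row.sum()) for row in m]

# Stacking and splitting
stacked = np.vstack([m, m])        # rows of m repeated twice
left, right = np.hsplit(m, 2)      # two halves, each with 2 columns
```
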
Dedicated notebooks for reinforcing core NumPy concepts.
In-depth exploration of advanced array operations and mathematical computing:
- Advanced Indexing: Techniques for complex array selection and multi-dimensional slicing.
- Array Broadcasting: Operational rules, implementation examples, and computational error resolution.
- Handling Missing Values: Identification, filtering, and management of NaN/null data points within numerical arrays.
- Plotting Graphs: Integrating array data with visualization operations.
- Set Functions: Advanced operations including union, intersection, and unique value extraction on arrays.
- Extra Methods (Part 1 & 2): Comprehensive coverage of specialized NumPy utility functions for extended statistical and mathematical operations.
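
To make broadcasting, missing-value handling, and set functions concrete, here is a minimal sketch. The arrays are invented for the example.

```python
import numpy as np

# Broadcasting: a (2, 3) matrix minus a (3,) vector of column means.
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])
col_means = matrix.mean(axis=0)
centered = matrix - col_means      # the vector is applied to every row

# Missing values: locate NaN entries, filter them out, or skip them.
values = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
nan_mask = np.isnan(values)
clean = values[~nan_mask]
mean_ignoring_nan = np.nanmean(values)

# Set functions on 1-D arrays.
x = np.array([1, 2, 2, 3])
y = np.array([2, 3, 4])
unique_x = np.unique(x)
union_xy = np.union1d(x, y)
common_xy = np.intersect1d(x, y)
```

Broadcasting works here because the trailing dimension of `matrix` (3) matches the length of `col_means`; mismatched trailing dimensions are a common source of shape errors.
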
Comprehensive coverage of the Pandas library for data manipulation and analysis, organized into sub-folders:
Series in Pandas/
- Creation and structural understanding (Part 1 & 2)
- Indexing and slicing
- Math methods
- Plotting with built-in plot methods
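
A minimal sketch of Series creation, indexing, and math methods. The subject names and marks are made up for illustration.

```python
import pandas as pd

# Creation with a custom index
marks = pd.Series([50, 80, 65], index=["math", "physics", "chem"], name="marks")

# Indexing and slicing
physics = marks["physics"]              # label-based lookup
first_two = marks.iloc[:2]              # position-based slice, end excluded
by_label = marks.loc["math":"physics"]  # label-based slice, end included

# Boolean filtering and math methods
above_60 = marks[marks > 60]
average = marks.mean()
best_subject = marks.idxmax()
```

Note that `iloc` slices exclude the end position while `loc` slices include the end label, so both slices above return the same two rows.
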
DataFrames in Pandas/
- Introduction, creation, and structural understanding
- Filtering data and adding columns
- Editing the index and using Python functionality within Pandas
- Selecting columns, rows, and combined selections
- Math methods and statistical operations
- GroupBy: Aggregation, transformation, and group-level operations (Part 1 & 2), including hands-on exercises
- Important DataFrame methods (reference notebook)
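
The GroupBy operations mentioned above (aggregation and transformation) can be sketched as follows. The `team` and `runs` columns and their values are invented; they are not drawn from the repository's datasets.

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["A", "A", "B", "B", "B"],
    "runs": [10, 30, 20, 40, 60],
})

# Aggregation: one value per group
totals = df.groupby("team")["runs"].sum()
summary = df.groupby("team")["runs"].agg(["mean", "max"])

# Transformation: one value per original row, aligned to df's index
df["team_mean"] = df.groupby("team")["runs"].transform("mean")

# Filtering with the new column: rows that beat their own team's mean
above_team_mean = df[df["runs"] > df["team_mean"]]
```

Aggregation shrinks the result to one row per group, whereas `transform` keeps the original shape, which is what makes the row-wise comparison in the last line possible.
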
Practice in Pandas/
- Real-world dataset practice (YouTube channel analytics)
A curated collection of real-world CSV datasets used across Pandas and NumPy notebooks:
| Dataset | Description |
|---|---|
| `batsman_runs_ipl.csv` | IPL batsman run statistics |
| `bollywood.csv` | Bollywood movie data |
| `deliveries.csv` | IPL ball-by-ball delivery data |
| `diabetes.csv` | Diabetes patient health metrics |
| `global_top2000.csv` | Global top 2000 companies |
| `imdb-top-1000.csv` | IMDB top 1000 movies |
| `ipl-matches.csv` | IPL match results |
| `kohli_ipl.csv` | Virat Kohli IPL performance stats |
| `movies.csv` | General movies dataset |
| `subs.csv` | Subscriber data |

| Tool | Purpose |
|---|---|
| Python 3 | Core programming language |
| Jupyter Notebook | Interactive development environment |
| NumPy | Numerical computing |
| Pandas | Data manipulation and analysis |
| Matplotlib | Data visualization |
| Git & GitHub | Version control and hosting |
This repository aims to build a solid foundation in Python programming for Data Science and Machine Learning, bridging the gap between language syntax and real-world analytical projects.
Maintained by Ayush Suthar