From e58db77f5646d36019c7517d5630ffac95447e9f Mon Sep 17 00:00:00 2001
From: Arpit Kumar <137612747+amazingak1@users.noreply.github.com>
Date: Mon, 13 Jan 2025 00:00:53 +0530
Subject: [PATCH 1/4] Create multilayer-perceptron.md

---
 .../neural-networks/multilayer-perceptron.md | 133 ++++++++++++++++++
 1 file changed, 133 insertions(+)
 create mode 100644 docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md

diff --git a/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md b/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md
new file mode 100644
index 00000000..5efa6b1d
--- /dev/null
+++ b/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md
@@ -0,0 +1,133 @@
+![image](https://github.com/user-attachments/assets/018c5462-5977-415f-8600-65f5560722fd)

# Multilayer Perceptron (MLP)

---

## **What is a Multilayer Perceptron (MLP)?**

A **Multilayer Perceptron (MLP)** is a type of **artificial neural network (ANN)** that consists of multiple layers of neurons, designed to learn and map relationships between input data and output predictions. It is a foundational building block of deep learning.

### **Key Characteristics of MLP**:
- **Fully Connected Layers**: Each neuron in one layer is connected to every neuron in the next layer.
- **Non-linear Activation Functions**: Introduce non-linearity to help the model learn complex patterns.
- **Supervised Learning**: Typically trained using labeled data with **backpropagation** and optimization algorithms like **Stochastic Gradient Descent (SGD)** or **Adam**.

---

## **Architecture of MLP**

An MLP consists of three main types of layers:

1. **Input Layer**:
   - Accepts the input features (e.g., pixels of an image, numerical data).
   - Each neuron corresponds to one input feature.

2. **Hidden Layers**:
   - Perform intermediate computations to learn the patterns and relationships in data.
   - Can have one or more layers depending on the complexity of the problem.

3. **Output Layer**:
   - Produces the final prediction.
   - The number of neurons corresponds to the number of output classes (for classification tasks) or a single neuron for regression tasks.

### **Flow of Data in MLP**:
1. **Linear transformation**: \( z = W \cdot x + b \)
   - \( W \): Weight matrix
   - \( x \): Input
   - \( b \): Bias
2. **Non-linear activation**: \( a = f(z) \), where \( f \) is an activation function (e.g., ReLU, sigmoid, or tanh).

---

## **Applications of Multilayer Perceptron**

### **Classification**:
- Handwritten digit recognition (e.g., MNIST dataset).
- Sentiment analysis of text.
- Image classification for small datasets.

### **Regression**:
- Predicting house prices based on features like area, location, etc.
- Forecasting time-series data like stock prices or weather.

### **Healthcare**:
- Disease diagnosis based on patient records.
- Predicting patient outcomes in hospitals.

### **Finance**:
- Fraud detection in credit card transactions.
- Risk assessment and loan approval.

### **Speech and Audio**:
- Voice recognition.
- Music genre classification.

---

## **Key Concepts in MLP**

### **1. Activation Functions**:
- Introduce non-linearity to the model, enabling it to learn complex patterns.
- Commonly used (see the short sketch below):
  - **ReLU (Rectified Linear Unit)**: \( f(x) = \max(0, x) \)
  - **Sigmoid**: \( f(x) = \frac{1}{1 + e^{-x}} \)
  - **Tanh**: \( f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \)
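The three formulas above map directly to a few lines of code. The following is a minimal, stand-alone NumPy sketch for illustration only; the function names and the sample vector are arbitrary choices made for this example and are not part of the Keras code shown later in this document.

```python
import numpy as np

# Element-wise activations applied to a pre-activation vector z = W.x + b
def relu(z):
    return np.maximum(0.0, z)        # max(0, x)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # 1 / (1 + e^-x)

def tanh(z):
    return np.tanh(z)                # (e^x - e^-x) / (e^x + e^-x)

z = np.array([-2.0, 0.0, 3.0])       # example pre-activations
print(relu(z))                       # [0. 0. 3.]
print(sigmoid(z))                    # approx. [0.119 0.5   0.953]
print(tanh(z))                       # approx. [-0.964  0.     0.995]
```

Sigmoid and tanh squash values into (0, 1) and (-1, 1) respectively, while ReLU simply zeroes out negative inputs, which is one reason it is a common default for hidden layers.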
### **2. Loss Functions**:
- Measures the difference between predicted and actual values.
- Common examples:
  - **Mean Squared Error (MSE)**: Used for regression.
  - **Categorical Crossentropy**: Used for classification.

### **3. Backpropagation**:
- A technique used to compute gradients for updating weights.
- Consists of:
  1. **Forward pass**: Calculate the output.
  2. **Backward pass**: Compute gradients using the chain rule.
  3. **Weight update**: Optimize weights using an optimizer.

### **4. Optimizers**:
- Algorithms that adjust weights to minimize the loss function.
- Examples: **SGD**, **Adam**, **RMSprop**.

---

## **Advantages of MLP**
- Can model non-linear relationships between inputs and outputs.
- Versatile for solving both classification and regression problems.
- Ability to approximate any continuous function (Universal Approximation Theorem).

---

## **Limitations of MLP**
- Computationally expensive for large datasets.
- Prone to overfitting if not regularized properly.
- Less effective for image or sequential data without specialized architectures (e.g., CNNs, RNNs).

---

## **Code Example: Implementing MLP Using Keras**

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Build the MLP
model = Sequential([
    Dense(128, activation='relu', input_shape=(20,)), # Input layer (20 features)
    Dense(64, activation='relu'),                     # Hidden layer
    Dense(1, activation='sigmoid')                    # Output layer (binary classification)
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Summary of the model
model.summary()
```
---

# **Applications in Real-world Projects**
* Use MLP for datasets where data is in tabular or vector format (e.g., CSV files).
* Fine-tune the architecture by adjusting the number of neurons and layers based on your dataset.

From 7768f5ac8718522e95aac839d6642b89b16a4aff Mon Sep 17 00:00:00 2001
From: Arpit Kumar <137612747+amazingak1@users.noreply.github.com>
Date: Mon, 13 Jan 2025 00:12:32 +0530
Subject: [PATCH 2/4] Update multilayer-perceptron.md

---
 .../neural-networks/multilayer-perceptron.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md b/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md
index 5efa6b1d..62412ed7 100644
--- a/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md
+++ b/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md
@@ -6,7 +6,7 @@

 ## **What is a Multilayer Perceptron (MLP)?**

-A **Multilayer Perceptron (MLP)** is a type of **artificial neural network (ANN)** that consists of multiple layers of neurons, designed to learn and map relationships between input data and output predictions. It is a foundational building block of deep learning.
+A **Multilayer Perceptron (MLP)** is a type of **artificial neural network (ANN)** that consists of multiple layers of neurons, designed to learn and map relationships between input data and output predictions. It is a foundational building block for deep learning.

 ### **Key Characteristics of MLP**:
 - **Fully Connected Layers**: Each neuron in one layer is connected to every neuron in the next layer.
@@ -21,7 +21,7 @@ An MLP consists of three main types of layers:

 1. **Input Layer**:
    - Accepts the input features (e.g., pixels of an image, numerical data).
-   - Each neuron corresponds to one input feature.
+   - Every neuron corresponds to one input feature.

 2. **Hidden Layers**:
    - Perform intermediate computations to learn the patterns and relationships in data.
@@ -49,7 +49,7 @@ An MLP consists of three main types of layers:

 ### **Regression**:
 - Predicting house prices based on features like area, location, etc.
-- Forecasting time-series data like stock prices or weather.
+- Forecasting time-series data like stock prices, weather, etc.

 ### **Healthcare**:
 - Disease diagnosis based on patient records.
@@ -68,14 +68,14 @@ An MLP consists of three main types of layers:
 ## **Key Concepts in MLP**

 ### **1. Activation Functions**:
-- Introduce non-linearity to the model, enabling it to learn complex patterns.
+- Introduce non-linearity into the model, enabling it to learn complex patterns.
 - Commonly used:
   - **ReLU (Rectified Linear Unit)**: \( f(x) = \max(0, x) \)
   - **Sigmoid**: \( f(x) = \frac{1}{1 + e^{-x}} \)
   - **Tanh**: \( f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \)

 ### **2. Loss Functions**:
-- Measures the difference between predicted and actual values.
+- It measures the difference between predicted and actual values.
 - Common examples:
   - **Mean Squared Error (MSE)**: Used for regression.
   - **Categorical Crossentropy**: Used for classification.
@@ -88,7 +88,7 @@ An MLP consists of three main types of layers:
 3. **Weight update**: Optimize weights using an optimizer.

 ### **4. Optimizers**:
-- Algorithms that adjust weights to minimize the loss function.
+- Algorithms that adjust weights to minimize the loss function and improve the model.
 - Examples: **SGD**, **Adam**, **RMSprop**.

 ---

From d1280a66403ec346dd4b2179538b08f48b8972ce Mon Sep 17 00:00:00 2001
From: Arpit Kumar <137612747+amazingak1@users.noreply.github.com>
Date: Mon, 13 Jan 2025 14:43:56 +0530
Subject: [PATCH 3/4] Update neural-networks

---
 .../recurrent-neural-networks.md\nl" | 129 ++++++++++++++++++
 1 file changed, 129 insertions(+)
 create mode 100644 "docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md\nl"

diff --git "a/docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md\nl" "b/docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md\nl"
new file mode 100644
index 00000000..db8954e4
--- /dev/null
+++ "b/docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md\nl"
@@ -0,0 +1,129 @@
+# Recurrent Neural Networks (RNN)

---

## **What is a Recurrent Neural Network (RNN)?**

A **Recurrent Neural Network (RNN)** is a type of artificial neural network designed for modeling **sequential data**. Unlike traditional feedforward networks, RNNs have the capability to remember information from previous time steps, making them well-suited for tasks involving temporal or sequential relationships.

### **Key Characteristics of RNN**:
- **Sequential Processing**: Processes inputs sequentially, one step at a time.
- **Memory Capability**: Uses hidden states to store information about previous steps.
- **Shared Weights**: The same weights are applied across all time steps, reducing complexity.

---

## **Architecture of RNN**

### **Components of RNN**:
1. **Input Layer**:
   - Accepts sequential input data (e.g., time-series data, text, or audio signals).

2. **Hidden Layer with Recurrence**:
   - Maintains a **hidden state** h_t, which is updated at each time step based on the input and the previous hidden state.
   - Formula:
\[
  h_t = f(W_h \cdot h_{t-1} + W_x \cdot x_t + b)
\]
   Where:
   - h_t: Current hidden state.
+ - h_{t-1}: Previous hidden state. + - x_t: Input at time step t. + - W_h, W_x: Weight matrices. + - b: Bias. + - f: Activation function (e.g., tanh or ReLU). + +3. **Output Layer**: + - Produces output based on the current hidden state. + - Formula: +\[ + y_t = g(W_y \cdot h_t + c) +\] + Where: + - y_t: Output at time step t. + - W_y: Output weight matrix. + - c: Output bias. + - g: Activation function (e.g., softmax or sigmoid). + +--- + +## **Types of RNNs** + +### **1. Vanilla RNN**: +- Standard RNN that processes sequential data using the hidden state. +- Struggles with long-term dependencies due to **vanishing gradient problems**. + +### **2. Long Short-Term Memory (LSTM)**: +- A specialized type of RNN that can learn long-term dependencies by using **gates** to control the flow of information. +- Components: + - **Forget Gate**: Decides what to forget. + - **Input Gate**: Decides what to store. + - **Output Gate**: Controls the output. + +### **3. Gated Recurrent Unit (GRU)**: +- A simplified version of LSTM that combines the forget and input gates into a single **update gate**. + +--- + +## **Applications of RNN** + +### **1. Natural Language Processing (NLP)**: +- Text generation (e.g., predictive typing, chatbots). +- Sentiment analysis. +- Language translation. + +### **2. Time-Series Analysis**: +- Stock price prediction. +- Weather forecasting. +- Energy demand forecasting. + +### **3. Speech and Audio Processing**: +- Speech-to-text transcription. +- Music generation. + +### **4. Video Analysis**: +- Video captioning. +- Action recognition. + +--- + +## **Advantages of RNN** +- Can handle sequential and time-dependent data. +- Shared weights reduce model complexity. +- Effective for tasks with context dependencies, such as language modeling. + +--- + +## **Limitations of RNN** +- **Vanishing Gradient Problem**: + - Makes it difficult to learn long-term dependencies. +- Computationally expensive for long sequences. +- Struggles with parallelization compared to other architectures like CNNs. + +--- + +## **Code Example: Implementing RNN Using Keras** + +```python +from tensorflow.keras.models import Sequential +from tensorflow.keras.layers import SimpleRNN, Dense + +# Build the RNN model +model = Sequential([ + SimpleRNN(128, activation='tanh', input_shape=(10, 1)), # 10 timesteps, 1 feature + Dense(1, activation='sigmoid') # Output layer (binary classification) +]) + +# Compile the model +model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy']) + +# Summary of the model +model.summary() +``` +--- + +# **Applications in Real-world Projects** + +Use RNN for tasks involving sequential data where past information impacts the future. + +Prefer LSTM or GRU over vanilla RNN for learning long-term dependencies. 
\ No newline at end of file

From 10553dec820a88b7208928e5fc0273864f6f2e3b Mon Sep 17 00:00:00 2001
From: Arpit Kumar <137612747+amazingak1@users.noreply.github.com>
Date: Mon, 13 Jan 2025 20:08:33 +0530
Subject: [PATCH 4/4] Create recurrent-neural-networks.md

---
 .../recurrent-neural-networks.md | 128 ++++++++++++++++++
 1 file changed, 128 insertions(+)
 create mode 100644 docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md

diff --git a/docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md b/docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md
new file mode 100644
index 00000000..221ab80f
--- /dev/null
+++ b/docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md
@@ -0,0 +1,128 @@
+# Recurrent Neural Networks (RNN)

---

## **What is a Recurrent Neural Network (RNN)?**

A **Recurrent Neural Network (RNN)** is a type of artificial neural network designed for modeling **sequential data**. Unlike traditional feedforward networks, RNNs can retain information from previous time steps, making them well-suited for tasks involving temporal or sequential relationships.

### **Key Characteristics of RNN**:
- **Sequential Processing**: Processes inputs sequentially, one step at a time.
- **Memory Capability**: Uses hidden states to store information about previous steps.
- **Shared Weights**: The same weights are applied across all time steps, reducing complexity.

---

## **Architecture of RNN**

### **Components of RNN**:
1. **Input Layer**:
   - Accepts sequential input data (e.g., time-series data, text, or audio signals).

2. **Hidden Layer with Recurrence**:
   - Maintains a **hidden state** \( h_t \), which is updated at each time step based on the input and the previous hidden state.
   - Formula:
     \[
     h_t = f(W_h \cdot h_{t-1} + W_x \cdot x_t + b)
     \]
     Where:
     - \( h_t \): Current hidden state.
     - \( h_{t-1} \): Previous hidden state.
     - \( x_t \): Input at time step \( t \).
     - \( W_h, W_x \): Weight matrices.
     - \( b \): Bias.
     - \( f \): Activation function (e.g., tanh or ReLU).

3. **Output Layer**:
   - Produces output based on the current hidden state.
   - Formula:
     \[
     y_t = g(W_y \cdot h_t + c)
     \]
     Where:
     - \( y_t \): Output at time step \( t \).
     - \( W_y \): Output weight matrix.
     - \( c \): Output bias.
     - \( g \): Activation function (e.g., softmax or sigmoid).

   (Both formulas are illustrated in the short code sketch that follows.)
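As a concrete illustration of the two formulas above, here is a minimal NumPy sketch of a single recurrent step. The dimensions, the random weight initialization, and the choice of tanh and softmax are assumptions made for this example only, not values taken from this document.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim = 3, 4, 2      # arbitrary example sizes

W_x = rng.normal(size=(hidden_dim, input_dim))   # input-to-hidden weights
W_h = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden weights
W_y = rng.normal(size=(output_dim, hidden_dim))  # hidden-to-output weights
b = np.zeros(hidden_dim)                         # hidden bias
c = np.zeros(output_dim)                         # output bias

def rnn_step(x_t, h_prev):
    # h_t = f(W_h . h_{t-1} + W_x . x_t + b), with f = tanh
    h_t = np.tanh(W_h @ h_prev + W_x @ x_t + b)
    # y_t = g(W_y . h_t + c), with g = softmax
    scores = W_y @ h_t + c
    y_t = np.exp(scores - scores.max()) / np.exp(scores - scores.max()).sum()
    return h_t, y_t

# The same weights are reused at every time step; only the hidden state changes.
h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):      # a sequence of 5 time steps
    h, y = rnn_step(x_t, h)
print(h, y)
```

In a framework such as Keras, this loop over time steps is what a recurrent layer performs internally.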
---

## **Types of RNNs**

### **1. Vanilla RNN**:
- Standard RNN that processes sequential data using the hidden state.
- Struggles with long-term dependencies due to the **vanishing gradient problem**.

### **2. Long Short-Term Memory (LSTM)**:
- A specialized type of RNN that can learn long-term dependencies by using **gates** to control the flow of information.
- Components:
  - **Forget Gate**: Decides what to forget.
  - **Input Gate**: Decides what to store.
  - **Output Gate**: Controls the output.

### **3. Gated Recurrent Unit (GRU)**:
- A simplified version of LSTM that combines the forget and input gates into a single **update gate**.

---

## **Applications of RNN**

### **1. Natural Language Processing (NLP)**:
- Text generation (e.g., predictive typing, chatbots).
- Sentiment analysis.
- Language translation.

### **2. Time-Series Analysis**:
- Stock price prediction.
- Weather forecasting.
- Energy demand forecasting.

### **3. Speech and Audio Processing**:
- Speech-to-text transcription.
- Music generation.

### **4. Video Analysis**:
- Video captioning.
- Action recognition.

---

## **Advantages of RNN**
- Can handle sequential and time-dependent data.
- Shared weights reduce model complexity.
- Effective for tasks with context dependencies, such as language modeling.

---

## **Limitations of RNN**
- **Vanishing Gradient Problem**:
  - Makes it difficult to learn long-term dependencies.
- Computationally expensive for long sequences.
- Harder to parallelize than architectures such as CNNs, because time steps must be processed in order.

---

## **Code Example: Implementing RNN Using Keras**

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

# Build the RNN model
model = Sequential([
    SimpleRNN(128, activation='tanh', input_shape=(10, 1)), # 10 timesteps, 1 feature
    Dense(1, activation='sigmoid')                          # Output layer (binary classification)
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Summary of the model
model.summary()
```

---

# **Applications in Real-world Projects**
* Use RNN for tasks involving sequential data where past information impacts the future.
* Prefer LSTM or GRU over vanilla RNN for learning long-term dependencies (see the sketch below).
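To make the last recommendation concrete, here is an illustrative sketch of the same model as above with the `SimpleRNN` layer swapped for an `LSTM`. The layer size, input shape, and training settings are simply carried over from the example above rather than tuned values; `GRU` from `tensorflow.keras.layers` would be a drop-in alternative.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Same architecture as the SimpleRNN example, with a gated recurrent layer instead
model = Sequential([
    LSTM(128, activation='tanh', input_shape=(10, 1)),  # 10 timesteps, 1 feature
    Dense(1, activation='sigmoid')                      # Output layer (binary classification)
])

# Compile and inspect the model exactly as before
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
```

Because the gates let gradients flow across many time steps, this version is usually the better starting point when sequences are long.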