diff --git a/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md b/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md
new file mode 100644
index 00000000..62412ed7
--- /dev/null
+++ b/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md
@@ -0,0 +1,133 @@
+![image](https://github.com/user-attachments/assets/018c5462-5977-415f-8600-65f5560722fd)
+
+# Multilayer Perceptron (MLP)
+
+---
+
+## **What is a Multilayer Perceptron (MLP)?**
+
+A **Multilayer Perceptron (MLP)** is a type of **artificial neural network (ANN)** that consists of multiple layers of neurons, designed to learn and map relationships between input data and output predictions. It is a foundational building block of deep learning.
+
+### **Key Characteristics of MLP**:
+- **Fully Connected Layers**: Each neuron in one layer is connected to every neuron in the next layer.
+- **Non-linear Activation Functions**: Introduce non-linearity so the model can learn complex patterns.
+- **Supervised Learning**: Typically trained on labeled data with **backpropagation** and optimization algorithms such as **Stochastic Gradient Descent (SGD)** or **Adam**.
+
+---
+
+## **Architecture of MLP**
+
+An MLP consists of three main types of layers:
+
+1. **Input Layer**:
+   - Accepts the input features (e.g., pixels of an image, numerical data).
+   - Each neuron corresponds to one input feature.
+
+2. **Hidden Layers**:
+   - Perform intermediate computations to learn the patterns and relationships in the data.
+   - An MLP can have one or more hidden layers, depending on the complexity of the problem.
+
+3. **Output Layer**:
+   - Produces the final prediction.
+   - The number of neurons corresponds to the number of output classes for classification tasks, or is a single neuron for regression tasks.
+
+### **Flow of Data in MLP**:
+1. **Linear transformation**: \( z = W \cdot x + b \)
+   - \( W \): Weight matrix
+   - \( x \): Input
+   - \( b \): Bias
+2. **Non-linear activation**: \( a = f(z) \), where \( f \) is an activation function (e.g., ReLU, sigmoid, or tanh).
+
+---
+
+## **Applications of Multilayer Perceptron**
+
+### **Classification**:
+- Handwritten digit recognition (e.g., the MNIST dataset).
+- Sentiment analysis of text.
+- Image classification for small datasets.
+
+### **Regression**:
+- Predicting house prices based on features such as area and location.
+- Forecasting time-series data such as stock prices or weather.
+
+### **Healthcare**:
+- Disease diagnosis based on patient records.
+- Predicting patient outcomes in hospitals.
+
+### **Finance**:
+- Fraud detection in credit card transactions.
+- Risk assessment and loan approval.
+
+### **Speech and Audio**:
+- Voice recognition.
+- Music genre classification.
+
+---
+
+## **Key Concepts in MLP**
+
+### **1. Activation Functions**:
+- Introduce non-linearity into the model, enabling it to learn complex patterns.
+- Commonly used:
+  - **ReLU (Rectified Linear Unit)**: \( f(x) = \max(0, x) \)
+  - **Sigmoid**: \( f(x) = \frac{1}{1 + e^{-x}} \)
+  - **Tanh**: \( f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \)
+
+### **2. Loss Functions**:
+- Measure the difference between predicted and actual values.
+- Common examples:
+  - **Mean Squared Error (MSE)**: Used for regression.
+  - **Categorical Crossentropy**: Used for classification.
+
+### **3. Backpropagation**:
+- A technique used to compute gradients for updating the weights.
+- Consists of (see the sketch after this list):
+  1. **Forward pass**: Calculate the output.
+  2. **Backward pass**: Compute gradients using the chain rule.
+  3. **Weight update**: Adjust the weights using an optimizer.
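+
+The forward-pass and backpropagation steps above can be illustrated with a minimal NumPy sketch. This is an added illustration rather than part of the Keras workflow shown later; the layer sizes, sigmoid activation, MSE loss, and learning rate are arbitrary assumptions.
+
+```python
+import numpy as np
+
+# Tiny MLP: 3 inputs -> 4 hidden units -> 1 output (illustrative sizes).
+rng = np.random.default_rng(0)
+x = rng.normal(size=(3, 1))          # input column vector
+y = np.array([[1.0]])                # target
+
+W1, b1 = rng.normal(size=(4, 3)), np.zeros((4, 1))
+W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))
+sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
+
+# Forward pass: linear transformation z = W·x + b followed by activation a = f(z).
+z1 = W1 @ x + b1
+a1 = sigmoid(z1)
+z2 = W2 @ a1 + b2
+a2 = sigmoid(z2)
+loss = np.mean((a2 - y) ** 2)        # MSE loss
+
+# Backward pass: gradients via the chain rule.
+dz2 = 2 * (a2 - y) * a2 * (1 - a2)
+dW2, db2 = dz2 @ a1.T, dz2
+dz1 = (W2.T @ dz2) * a1 * (1 - a1)
+dW1, db1 = dz1 @ x.T, dz1
+
+# Weight update: one step of plain gradient descent (the "optimizer").
+lr = 0.1
+W1 -= lr * dW1; b1 -= lr * db1
+W2 -= lr * dW2; b2 -= lr * db2
+print(f"loss before update: {loss:.4f}")
+```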
+
+### **4. Optimizers**:
+- Algorithms that adjust the weights to minimize the loss function and improve the model.
+- Examples: **SGD**, **Adam**, **RMSprop**.
+
+---
+
+## **Advantages of MLP**
+- Can model non-linear relationships between inputs and outputs.
+- Versatile for solving both classification and regression problems.
+- Can approximate any continuous function (Universal Approximation Theorem).
+
+---
+
+## **Limitations of MLP**
+- Computationally expensive for large datasets.
+- Prone to overfitting if not regularized properly.
+- Less effective for image or sequential data without specialized architectures (e.g., CNNs, RNNs).
+
+---
+
+## **Code Example: Implementing MLP Using Keras**
+
+```python
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import Dense
+
+# Build the MLP
+model = Sequential([
+    Dense(128, activation='relu', input_shape=(20,)),  # First hidden layer (expects 20 input features)
+    Dense(64, activation='relu'),                      # Second hidden layer
+    Dense(1, activation='sigmoid')                     # Output layer (binary classification)
+])
+
+# Compile the model
+model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
+
+# Summary of the model
+model.summary()
+```
+
+---
+
+# **Applications in Real-world Projects**
+* Use an MLP for datasets in tabular or vector format (e.g., CSV files).
+* Fine-tune the architecture by adjusting the number of neurons and layers based on your dataset.
diff --git a/docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md b/docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md
new file mode 100644
index 00000000..221ab80f
--- /dev/null
+++ b/docs/algorithms/deep-learning/neural-networks/recurrent-neural-networks.md
@@ -0,0 +1,128 @@
+# Recurrent Neural Networks (RNN)
+
+---
+
+## **What is a Recurrent Neural Network (RNN)?**
+
+A **Recurrent Neural Network (RNN)** is a type of artificial neural network designed for modeling **sequential data**. Unlike traditional feedforward networks, RNNs can retain information from previous time steps, making them well-suited for tasks involving temporal or sequential relationships.
+
+### **Key Characteristics of RNN**:
+- **Sequential Processing**: Processes inputs sequentially, one step at a time.
+- **Memory Capability**: Uses hidden states to store information about previous steps.
+- **Shared Weights**: The same weights are applied across all time steps, reducing complexity.
+
+---
+
+## **Architecture of RNN**
+
+### **Components of RNN**:
+1. **Input Layer**:
+   - Accepts sequential input data (e.g., time-series data, text, or audio signals).
+
+2. **Hidden Layer with Recurrence**:
+   - Maintains a **hidden state** \( h_t \), which is updated at each time step based on the input and the previous hidden state (see the sketch after this list).
+   - Formula:
+     \[
+     h_t = f(W_h \cdot h_{t-1} + W_x \cdot x_t + b)
+     \]
+     Where:
+     - \( h_t \): Current hidden state.
+     - \( h_{t-1} \): Previous hidden state.
+     - \( x_t \): Input at time step \( t \).
+     - \( W_h, W_x \): Weight matrices.
+     - \( b \): Bias.
+     - \( f \): Activation function (e.g., tanh or ReLU).
+
+3. **Output Layer**:
+   - Produces output based on the current hidden state.
+   - Formula:
+     \[
+     y_t = g(W_y \cdot h_t + c)
+     \]
+     Where:
+     - \( y_t \): Output at time step \( t \).
+     - \( W_y \): Output weight matrix.
+     - \( c \): Output bias.
+     - \( g \): Activation function (e.g., softmax or sigmoid).
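+
+The recurrence and output formulas above can be made concrete with a small NumPy sketch. This is an added illustration; the hidden size, sequence length, random weights, and the choice of tanh/sigmoid are assumptions for demonstration only.
+
+```python
+import numpy as np
+
+# Toy RNN cell: 1 input feature, 8 hidden units, 1 output, over 10 time steps.
+rng = np.random.default_rng(0)
+W_x = rng.normal(scale=0.1, size=(8, 1))   # input-to-hidden weights
+W_h = rng.normal(scale=0.1, size=(8, 8))   # hidden-to-hidden weights (shared across steps)
+W_y = rng.normal(scale=0.1, size=(1, 8))   # hidden-to-output weights
+b, c = np.zeros((8, 1)), np.zeros((1, 1))
+
+x_seq = rng.normal(size=(10, 1, 1))        # sequence of 10 inputs
+h = np.zeros((8, 1))                       # initial hidden state
+
+for x_t in x_seq:
+    # h_t = f(W_h · h_{t-1} + W_x · x_t + b), with f = tanh
+    h = np.tanh(W_h @ h + W_x @ x_t + b)
+    # y_t = g(W_y · h_t + c), with g = sigmoid here
+    y_t = 1.0 / (1.0 + np.exp(-(W_y @ h + c)))
+
+print("final output:", y_t.item())
+```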
+
+---
+
+## **Types of RNNs**
+
+### **1. Vanilla RNN**:
+- Standard RNN that processes sequential data using the hidden state.
+- Struggles with long-term dependencies due to the **vanishing gradient problem**.
+
+### **2. Long Short-Term Memory (LSTM)**:
+- A specialized type of RNN that can learn long-term dependencies by using **gates** to control the flow of information.
+- Components:
+  - **Forget Gate**: Decides what to forget.
+  - **Input Gate**: Decides what to store.
+  - **Output Gate**: Controls the output.
+
+### **3. Gated Recurrent Unit (GRU)**:
+- A simplified version of the LSTM that combines the forget and input gates into a single **update gate**.
+
+---
+
+## **Applications of RNN**
+
+### **1. Natural Language Processing (NLP)**:
+- Text generation (e.g., predictive typing, chatbots).
+- Sentiment analysis.
+- Language translation.
+
+### **2. Time-Series Analysis**:
+- Stock price prediction.
+- Weather forecasting.
+- Energy demand forecasting.
+
+### **3. Speech and Audio Processing**:
+- Speech-to-text transcription.
+- Music generation.
+
+### **4. Video Analysis**:
+- Video captioning.
+- Action recognition.
+
+---
+
+## **Advantages of RNN**
+- Can handle sequential and time-dependent data.
+- Shared weights reduce model complexity.
+- Effective for tasks with context dependencies, such as language modeling.
+
+---
+
+## **Limitations of RNN**
+- **Vanishing Gradient Problem**:
+  - Makes it difficult to learn long-term dependencies.
+- Computationally expensive for long sequences.
+- Struggles with parallelization compared to other architectures such as CNNs.
+
+---
+
+## **Code Example: Implementing RNN Using Keras**
+
+```python
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import SimpleRNN, Dense
+
+# Build the RNN model
+model = Sequential([
+    SimpleRNN(128, activation='tanh', input_shape=(10, 1)),  # 10 timesteps, 1 feature
+    Dense(1, activation='sigmoid')                           # Output layer (binary classification)
+])
+
+# Compile the model
+model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
+
+# Summary of the model
+model.summary()
+```
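+
+As a usage sketch (not part of the original example), the `model` compiled above could be trained on dummy data shaped `(samples, timesteps, features)`; the array sizes, epoch count, and batch size below are placeholder assumptions.
+
+```python
+import numpy as np
+
+# Dummy dataset: 200 sequences of 10 timesteps with 1 feature each, binary labels.
+X_train = np.random.rand(200, 10, 1)
+y_train = np.random.randint(0, 2, size=(200,))
+
+# Train the `model` compiled in the block above.
+model.fit(X_train, y_train, epochs=5, batch_size=32, validation_split=0.2)
+```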
+
+---
+
+# **Applications in Real-world Projects**
+* Use RNNs for tasks involving sequential data where past information influences future predictions.
+* Prefer LSTM or GRU over a vanilla RNN for learning long-term dependencies.
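+
+Following the last recommendation, the `SimpleRNN` layer in the example above can be swapped for an `LSTM` (or `GRU`) layer with no other changes. A minimal sketch, assuming the same toy input shape:
+
+```python
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import LSTM, GRU, Dense
+
+# Same architecture as the SimpleRNN example, but with an LSTM cell
+# (use GRU instead of LSTM for the lighter-weight gated variant).
+model = Sequential([
+    LSTM(128, activation='tanh', input_shape=(10, 1)),
+    Dense(1, activation='sigmoid')
+])
+
+model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
+model.summary()
+```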