diff --git a/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md b/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md new file mode 100644 index 00000000..62412ed7 --- /dev/null +++ b/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md @@ -0,0 +1,133 @@ +![image](https://github.com/user-attachments/assets/018c5462-5977-415f-8600-65f5560722fd) + +# Multilayer Perceptron (MLP) + +--- + +## **What is a Multilayer Perceptron (MLP)?** + +A **Multilayer Perceptron (MLP)** is a type of **artificial neural network (ANN)** that consists of multiple layers of neurons, designed to learn and map relationships between input data and output predictions. It is foundational building block for Deep Learning. + +### **Key Characteristics of MLP**: +- **Fully Connected Layers**: Each neuron in one layer is connected to every neuron in the next layer. +- **Non-linear Activation Functions**: Introduces non-linearity to help the model learn complex patterns. +- **Supervised Learning**: Typically trained using labeled data with **backpropagation** and optimization algorithms like **Stochastic Gradient Descent (SGD)** or **Adam**. + +--- + +## **Architecture of MLP** + +An MLP consists of three main types of layers: + +1. **Input Layer**: + - Accepts the input features (e.g., pixels of an image, numerical data). + - Every neuron corresponds to one input feature . + +2. **Hidden Layers**: + - Perform intermediate computations to learn the patterns and relationships in data. + - Can have one or more layers depending on the complexity of the problem. + +3. **Output Layer**: + - Produces the final prediction. + - The number of neurons corresponds to the number of output classes (for classification tasks) or a single neuron for regression tasks. + +### **Flow of Data in MLP**: +1. **Linear transformation**: \( z = W \cdot x + b \) + - \( W \): Weight matrix + - \( x \): Input + - \( b \): Bias +2. **Non-linear activation**: \( a = f(z) \), where \( f \) is an activation function (e.g., ReLU, sigmoid, or tanh). + +--- + +## **Applications of Multilayer Perceptron** + +### **Classification**: +- Handwritten digit recognition (e.g., MNIST dataset). +- Sentiment analysis of text. +- Image classification for small datasets. + +### **Regression**: +- Predicting house prices based on features like area, location, etc. +- Forecasting time series data like stock prices or weather, etc. + +### **Healthcare**: +- Disease diagnosis based on patient records. +- Predicting patient outcomes in hospitals. + +### **Finance**: +- Fraud detection in credit card transactions. +- Risk assessment and loan approval. + +### **Speech and Audio**: +- Voice recognition. +- Music genre classification. + +--- + +## **Key Concepts in MLP** + +### **1. Activation Functions**: +- Introduced non linearity to model, enabling it to learn complex patterns. +- Commonly used: + - **ReLU (Rectified Linear Unit)**: \( f(x) = \max(0, x) \) + - **Sigmoid**: \( f(x) = \frac{1}{1 + e^{-x}} \) + - **Tanh**: \( f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \) + +### **2. Loss Functions**: +- It measures the difference between predicted and actual values. +- Common examples: + - **Mean Squared Error (MSE)**: Used for regression. + - **Categorical Crossentropy**: Used for classification. + +### **3. Backpropagation**: +- A technique used to compute gradients for updating weights. +- Consists of: + 1. **Forward pass**: Calculate the output. + 2. **Backward pass**: Compute gradients using the chain rule. + 3. **Weight update**: Optimize weights using an optimizer. + +### **4. Optimizers**: +- Algorithms that adjusts weights to minimize the loss function to improve model. +- Examples: **SGD**, **Adam**, **RMSprop**. + +--- + +## **Advantages of MLP** +- Can model non-linear relationships between inputs and outputs. +- Versatile for solving both classification and regression problems. +- Ability to approximate any continuous function (Universal Approximation Theorem). + +--- + +## **Limitations of MLP** +- Computationally expensive for large datasets. +- Prone to overfitting if not regularized properly. +- Less effective for image or sequential data without specialized architectures (e.g., CNNs, RNNs). + +--- + +## **Code Example: Implementing MLP Using Keras** + +```python +from tensorflow.keras.models import Sequential +from tensorflow.keras.layers import Dense + +# Build the MLP +model = Sequential([ + Dense(128, activation='relu', input_shape=(20,)), # Input layer (20 features) + Dense(64, activation='relu'), # Hidden layer + Dense(1, activation='sigmoid') # Output layer (binary classification) +]) + +# Compile the model +model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy']) + +# Summary of the model +model.summary() +``` +--- + +# **Applications in Real-world Projects** +* Use MLP for datasets where data is in tabular or vector format (e.g., CSV files). +* Fine-tune the architecture by adjusting the number of neurons and layers based on your dataset.