diff --git a/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md b/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md
new file mode 100644
index 00000000..62412ed7
--- /dev/null
+++ b/docs/algorithms/deep-learning/neural-networks/multilayer-perceptron.md
@@ -0,0 +1,133 @@
+![image](https://github.com/user-attachments/assets/018c5462-5977-415f-8600-65f5560722fd)
+
+# Multilayer Perceptron (MLP)
+
+---
+
+## **What is a Multilayer Perceptron (MLP)?**
+
+A **Multilayer Perceptron (MLP)** is a type of **artificial neural network (ANN)** that consists of multiple layers of neurons, designed to learn a mapping from input data to output predictions. It is a foundational building block of deep learning.
+
+### **Key Characteristics of MLP**:
+- **Fully Connected Layers**: Each neuron in one layer is connected to every neuron in the next layer.
+- **Non-linear Activation Functions**: Introduce non-linearity so the model can learn complex patterns.
+- **Supervised Learning**: Typically trained using labeled data with **backpropagation** and optimization algorithms like **Stochastic Gradient Descent (SGD)** or **Adam**.
+
+---
+
+## **Architecture of MLP**
+
+An MLP consists of three main types of layers:
+
+1. **Input Layer**:
+   - Accepts the input features (e.g., pixels of an image, numerical data).
+   - Each neuron corresponds to one input feature.
+
+2. **Hidden Layers**:
+   - Perform intermediate computations to learn the patterns and relationships in data.
+   - Can have one or more layers depending on the complexity of the problem.
+
+3. **Output Layer**:
+   - Produces the final prediction.
+   - Uses one neuron per class for multi-class classification, and typically a single neuron for binary classification or single-value regression.
+
+### **Flow of Data in MLP**:
+1. **Linear transformation**: \( z = W \cdot x + b \)  
+   - \( W \): Weight matrix  
+   - \( x \): Input  
+   - \( b \): Bias  
+2. **Non-linear activation**: \( a = f(z) \), where \( f \) is an activation function (e.g., ReLU, sigmoid, or tanh).
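+
+The two steps above can be sketched for a single fully connected layer in plain Python. This is a minimal illustration rather than a library implementation: the helper names `dense_forward` and `relu` and the example weights are assumptions made up for the sketch.
+
+```python
+def relu(z):
+    """ReLU applied element-wise to a list of pre-activations."""
+    return [max(0.0, v) for v in z]
+
+
+def dense_forward(W, x, b, activation=relu):
+    """Compute a = f(W . x + b) for one fully connected layer.
+
+    W is a list of rows (one row of weights per neuron), x is the input
+    vector, and b holds one bias per neuron.
+    """
+    z = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
+         for row, b_i in zip(W, b)]
+    return activation(z)
+
+
+# Hypothetical layer with 3 inputs and 2 neurons.
+W = [[0.5, -1.0, 2.0],
+     [1.0, 1.0, -1.0]]
+b = [0.0, 0.5]
+x = [1.0, 2.0, 3.0]
+
+a = dense_forward(W, x, b)
+print(a)  # [4.5, 0.5]
+```
+
+Stacking several such calls, each feeding its output into the next, gives the forward pass of a full MLP.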
+
+---
+
+## **Applications of Multilayer Perceptron**
+
+### **Classification**:
+- Handwritten digit recognition (e.g., MNIST dataset).
+- Sentiment analysis of text.
+- Image classification for small datasets.
+
+### **Regression**:
+- Predicting house prices from features such as area and location.
+- Forecasting time series such as stock prices or weather.
+
+### **Healthcare**:
+- Disease diagnosis based on patient records.
+- Predicting patient outcomes in hospitals.
+
+### **Finance**:
+- Fraud detection in credit card transactions.
+- Risk assessment and loan approval.
+
+### **Speech and Audio**:
+- Voice recognition.
+- Music genre classification.
+
+---
+
+## **Key Concepts in MLP**
+
+### **1. Activation Functions**:
+- Introduce non-linearity into the model, enabling it to learn complex patterns.
+- Commonly used:
+  - **ReLU (Rectified Linear Unit)**: \( f(x) = \max(0, x) \)
+  - **Sigmoid**: \( f(x) = \frac{1}{1 + e^{-x}} \)
+  - **Tanh**: \( f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \)
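+
+The three formulas above translate almost directly into Python. The sketch below uses only the standard library and operates on single numbers; the function names are illustrative choices, and real frameworks ship vectorized, numerically hardened versions.
+
+```python
+import math
+
+
+def relu(x):
+    """f(x) = max(0, x)"""
+    return max(0.0, x)
+
+
+def sigmoid(x):
+    """f(x) = 1 / (1 + e^-x); overflows for very large negative x."""
+    return 1.0 / (1.0 + math.exp(-x))
+
+
+def tanh(x):
+    """Written out to mirror the formula; math.tanh(x) is the safer choice."""
+    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))
+
+
+# Compare the three functions on a negative, zero, and positive input.
+for x in (-2.0, 0.0, 2.0):
+    print(x, relu(x), round(sigmoid(x), 4), round(tanh(x), 4))
+```
+
+ReLU clips negative inputs to zero, sigmoid squashes values into (0, 1), and tanh squashes them into (-1, 1).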
+
+### **2. Loss Functions**:
+- Measure the difference between predicted and actual values.
+- Common examples:
+  - **Mean Squared Error (MSE)**: Used for regression.
+  - **Categorical Crossentropy**: Used for multi-class classification (binary crossentropy is the two-class counterpart).
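+
+Both losses are short enough to write out by hand. The following is a minimal sketch in plain Python: the function names and the `eps` clipping value are assumptions for illustration, and the crossentropy version handles a single one-hot sample rather than a batch.
+
+```python
+import math
+
+
+def mean_squared_error(y_true, y_pred):
+    """Average of the squared differences, for regression targets."""
+    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
+
+
+def categorical_crossentropy(y_true, y_pred, eps=1e-12):
+    """Crossentropy for one sample.
+
+    y_true is a one-hot vector and y_pred a vector of class probabilities;
+    probabilities are clipped at eps to avoid log(0).
+    """
+    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))
+
+
+print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))             # 2.5
+print(categorical_crossentropy([0, 1, 0], [0.2, 0.7, 0.1]))   # -log(0.7), about 0.357
+```
+
+In both cases a perfect prediction gives a loss of zero, and the loss grows as predictions move away from the targets.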
+
+### **3. Backpropagation**:
+- A technique for computing the gradient of the loss with respect to every weight.
+- In a training step it is combined with a forward pass and a weight update:
+  1. **Forward pass**: Calculate the output.
+  2. **Backward pass**: Compute gradients using the chain rule.
+  3. **Weight update**: Optimize weights using an optimizer.
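+
+The three steps above can be traced by hand on a deliberately tiny network: one input, one sigmoid hidden neuron, one linear output, and a squared-error loss. Everything in this sketch (the name `train_step`, the starting weights, the learning rate, and the single training example) is an assumption chosen to keep the chain rule visible; it is not how a framework implements backpropagation internally.
+
+```python
+import math
+
+
+def sigmoid(z):
+    return 1.0 / (1.0 + math.exp(-z))
+
+
+def train_step(params, x, y, lr=0.1):
+    """One forward pass, backward pass, and gradient-descent update."""
+    w1, b1, w2, b2 = params
+
+    # 1. Forward pass: compute the prediction and the loss.
+    h = sigmoid(w1 * x + b1)
+    y_hat = w2 * h + b2
+    loss = (y_hat - y) ** 2
+
+    # 2. Backward pass: chain rule from the loss back to each parameter.
+    d_y_hat = 2.0 * (y_hat - y)
+    d_w2, d_b2 = d_y_hat * h, d_y_hat
+    d_z1 = d_y_hat * w2 * h * (1.0 - h)   # sigmoid'(z) = h * (1 - h)
+    d_w1, d_b1 = d_z1 * x, d_z1
+
+    # 3. Weight update: plain gradient descent.
+    new_params = (w1 - lr * d_w1, b1 - lr * d_b1,
+                  w2 - lr * d_w2, b2 - lr * d_b2)
+    return new_params, loss
+
+
+params = (0.5, 0.0, -0.3, 0.1)   # hypothetical starting weights
+losses = []
+for _ in range(50):
+    params, loss = train_step(params, x=1.0, y=1.0)
+    losses.append(loss)
+
+print(losses[0], losses[-1])  # the loss shrinks as training proceeds
+```
+
+Frameworks such as Keras derive the backward pass automatically, so in practice only the forward computation is written by hand.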
+
+### **4. Optimizers**:
+- Algorithms that adjust the weights to minimize the loss function.
+- Examples: **SGD**, **Adam**, **RMSprop**.
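+
+To make the difference between optimizers concrete, the sketch below applies a plain SGD update and an Adam update to one scalar weight, minimizing the toy loss (w - 3)^2. The function names, the toy loss, and the learning rate of 0.1 are assumptions for illustration; the `beta1`, `beta2`, and `eps` values are the commonly cited Adam defaults.
+
+```python
+import math
+
+
+def sgd_step(w, grad, lr=0.1):
+    """Move against the gradient by a fixed step size."""
+    return w - lr * grad
+
+
+def adam_step(w, grad, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
+    """One Adam update for a scalar weight; state holds (m, v, t)."""
+    m, v, t = state
+    t += 1
+    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
+    v = beta2 * v + (1 - beta2) * grad ** 2    # running mean of squared gradients
+    m_hat = m / (1 - beta1 ** t)               # bias correction for early steps
+    v_hat = v / (1 - beta2 ** t)
+    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
+    return w, (m, v, t)
+
+
+def grad(w):
+    return 2.0 * (w - 3.0)   # gradient of the toy loss (w - 3)^2
+
+
+w_sgd, w_adam, state = 0.0, 0.0, (0.0, 0.0, 0)
+for _ in range(200):
+    w_sgd = sgd_step(w_sgd, grad(w_sgd))
+    w_adam, state = adam_step(w_adam, grad(w_adam), state)
+
+print(w_sgd, w_adam)  # both end close to the minimum at w = 3
+```
+
+SGD scales each step by the raw gradient, while Adam normalizes the step using running statistics of past gradients, which is why it is often less sensitive to the choice of learning rate.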
+
+---
+
+## **Advantages of MLP**
+- Can model non-linear relationships between inputs and outputs.
+- Versatile for solving both classification and regression problems.
+- Can approximate any continuous function on a compact (closed and bounded) input domain to arbitrary accuracy, given enough hidden neurons (Universal Approximation Theorem).
+
+---
+
+## **Limitations of MLP**
+- Training can be computationally expensive for large datasets or large networks.
+- Prone to overfitting if not regularized properly.
+- Less effective for image or sequential data without specialized architectures (e.g., CNNs, RNNs).
+
+---
+
+## **Code Example: Implementing MLP Using Keras**
+
+```python
+from tensorflow.keras import Input
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import Dense
+
+# Build the MLP
+model = Sequential([
+    Input(shape=(20,)),               # Input: 20 features per sample
+    Dense(128, activation='relu'),    # First hidden layer
+    Dense(64, activation='relu'),     # Second hidden layer
+    Dense(1, activation='sigmoid')    # Output layer (binary classification)
+])
+
+# Compile the model
+model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
+
+# Summary of the model
+model.summary()
+```
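+
+The parameter counts that `model.summary()` reports for the model above can be reproduced by hand: each fully connected layer has one weight per input-output pair plus one bias per neuron. The helper name `dense_param_count` is a hypothetical choice for this sketch.
+
+```python
+def dense_param_count(layer_sizes):
+    """Weights plus biases for each pair of consecutive fully connected layers."""
+    return [n_in * n_out + n_out
+            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
+
+
+# Layer widths of the Keras model above: 20 inputs -> 128 -> 64 -> 1.
+counts = dense_param_count([20, 128, 64, 1])
+print(counts, sum(counts))  # [2688, 8256, 65] 11009
+```
+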
+---
+
+## **Applications in Real-world Projects**
+- Use an MLP for datasets in tabular or vector format (e.g., CSV files).
+- Tune the architecture by adjusting the number of layers and the number of neurons per layer based on your dataset.