![image](https://github.com/user-attachments/assets/018c5462-5977-415f-8600-65f5560722fd)

# Multilayer Perceptron (MLP)

---

## **What is a Multilayer Perceptron (MLP)?**

A **Multilayer Perceptron (MLP)** is a type of **artificial neural network (ANN)** that consists of multiple layers of neurons and learns to map relationships between input data and output predictions. It is a foundational building block of deep learning.

### **Key Characteristics of MLP**:
- **Fully Connected Layers**: Each neuron in one layer is connected to every neuron in the next layer.
- **Non-linear Activation Functions**: Introduce non-linearity to help the model learn complex patterns.
- **Supervised Learning**: Typically trained using labeled data with **backpropagation** and optimization algorithms like **Stochastic Gradient Descent (SGD)** or **Adam**.

---

## **Architecture of MLP**

An MLP consists of three main types of layers:

1. **Input Layer**:
- Accepts the input features (e.g., pixels of an image, numerical data).
   - Each neuron corresponds to one input feature.

2. **Hidden Layers**:
- Perform intermediate computations to learn the patterns and relationships in data.
- Can have one or more layers depending on the complexity of the problem.

3. **Output Layer**:
- Produces the final prediction.
- The number of neurons corresponds to the number of output classes (for classification tasks) or a single neuron for regression tasks.

### **Flow of Data in MLP**:
1. **Linear transformation**: \( z = W \cdot x + b \)
- \( W \): Weight matrix
- \( x \): Input
- \( b \): Bias
2. **Non-linear activation**: \( a = f(z) \), where \( f \) is an activation function (e.g., ReLU, sigmoid, or tanh).
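
A minimal NumPy sketch of this forward flow for one layer; the shapes and values below are illustrative, not from the original text:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

# Illustrative shapes: 4 input features, 3 neurons in the next layer
x = np.array([0.5, -1.2, 3.0, 0.7])   # input vector
W = np.random.randn(3, 4)             # weight matrix
b = np.zeros(3)                       # bias vector

z = W @ x + b    # linear transformation: z = W·x + b
a = relu(z)      # non-linear activation: a = f(z)
print(a)
```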

---

## **Applications of Multilayer Perceptron**

### **Classification**:
- Handwritten digit recognition (e.g., MNIST dataset).
- Sentiment analysis of text.
- Image classification for small datasets.

### **Regression**:
- Predicting house prices based on features like area, location, etc.
- Forecasting time-series data such as stock prices or weather.

### **Healthcare**:
- Disease diagnosis based on patient records.
- Predicting patient outcomes in hospitals.

### **Finance**:
- Fraud detection in credit card transactions.
- Risk assessment and loan approval.

### **Speech and Audio**:
- Voice recognition.
- Music genre classification.

---

## **Key Concepts in MLP**

### **1. Activation Functions**:
- Introduce non-linearity into the model, enabling it to learn complex patterns.
- Commonly used:
- **ReLU (Rectified Linear Unit)**: \( f(x) = \max(0, x) \)
- **Sigmoid**: \( f(x) = \frac{1}{1 + e^{-x}} \)
- **Tanh**: \( f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \)
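
A short sketch of these three activations in NumPy, directly following the formulas above:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)        # f(x) = max(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))    # f(x) = 1 / (1 + e^{-x})

def tanh(x):
    return np.tanh(x)              # f(x) = (e^x - e^{-x}) / (e^x + e^{-x})

z = np.array([-2.0, 0.0, 2.0])
print(relu(z), sigmoid(z), tanh(z))
```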

### **2. Loss Functions**:
- Measure the difference between the predicted and actual values.
- Common examples:
- **Mean Squared Error (MSE)**: Used for regression.
- **Categorical Crossentropy**: Used for classification.
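
As a rough illustration, both losses can be computed by hand in NumPy; the predictions and labels below are hypothetical:

```python
import numpy as np

# Regression: Mean Squared Error
y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.8, 5.4, 2.1])
mse = np.mean((y_true - y_pred) ** 2)

# Classification: categorical crossentropy (one-hot labels, predicted probabilities)
labels = np.array([[1, 0, 0], [0, 1, 0]])
probs  = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
crossentropy = -np.mean(np.sum(labels * np.log(probs), axis=1))

print(mse, crossentropy)
```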

### **3. Backpropagation**:
- A technique used to compute gradients for updating weights.
- Consists of:
1. **Forward pass**: Calculate the output.
2. **Backward pass**: Compute gradients using the chain rule.
3. **Weight update**: Optimize weights using an optimizer.
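
A toy sketch of one training step for a single linear neuron with MSE loss, making the three steps concrete (the inputs and the learning rate of 0.1 are assumptions for illustration):

```python
import numpy as np

x, y_true = np.array([1.0, 2.0]), 1.0
W, b, lr = np.array([0.5, -0.3]), 0.0, 0.1

# 1. Forward pass: calculate the output and the loss
y_pred = W @ x + b
loss = (y_pred - y_true) ** 2

# 2. Backward pass: gradients via the chain rule
dloss_dpred = 2 * (y_pred - y_true)
dW = dloss_dpred * x
db = dloss_dpred

# 3. Weight update: plain gradient descent step
W -= lr * dW
b -= lr * db
```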

### **4. Optimizers**:
- Algorithms that adjust the weights to minimize the loss function and improve the model.
- Examples: **SGD**, **Adam**, **RMSprop**.
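
In Keras, the optimizer is typically chosen when compiling the model; a brief sketch of the three examples (the learning rates shown are illustrative defaults):

```python
from tensorflow.keras.optimizers import SGD, Adam, RMSprop

# Any of these can be passed to model.compile(optimizer=...)
sgd     = SGD(learning_rate=0.01, momentum=0.9)
adam    = Adam(learning_rate=0.001)
rmsprop = RMSprop(learning_rate=0.001)
```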

---

## **Advantages of MLP**
- Can model non-linear relationships between inputs and outputs.
- Versatile for solving both classification and regression problems.
- Ability to approximate any continuous function (Universal Approximation Theorem).

---

## **Limitations of MLP**
- Computationally expensive for large datasets.
- Prone to overfitting if not regularized properly.
- Less effective for image or sequential data without specialized architectures (e.g., CNNs, RNNs).

---

## **Code Example: Implementing MLP Using Keras**

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Build the MLP
model = Sequential([
    Dense(128, activation='relu', input_shape=(20,)),  # Input layer (20 features)
    Dense(64, activation='relu'),                       # Hidden layer
    Dense(1, activation='sigmoid')                       # Output layer (binary classification)
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Summary of the model
model.summary()
```
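
One possible way to train and evaluate this model, using random arrays as a stand-in for a real dataset:

```python
import numpy as np

# Synthetic stand-in data: 1000 samples, 20 features, binary labels
X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=(1000,))

# Train the MLP
model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)

# Evaluate
loss, accuracy = model.evaluate(X, y)
print(f"Loss: {loss:.4f}, Accuracy: {accuracy:.4f}")
```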
---

## **Applications in Real-world Projects**
* Use MLP for datasets where data is in tabular or vector format (e.g., CSV files).
* Fine-tune the architecture by adjusting the number of neurons and layers based on your dataset.
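
A sketch of this workflow on tabular data; the file name `data.csv`, the `target` column, and the layer sizes are hypothetical and should be adapted to the dataset at hand:

```python
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Hypothetical CSV with numeric feature columns and a binary "target" column
df = pd.read_csv("data.csv")
X = df.drop(columns=["target"]).values
y = df["target"].values

# Architecture sized to the data; widen or deepen it for more complex problems
model = Sequential([
    Dense(64, activation='relu', input_shape=(X.shape[1],)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=20, batch_size=32, validation_split=0.2)
```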