An AI-Powered System for Flawless Semiconductor Quality Control
Live Demo • Our Journey • Key Features • Tech Stack • Results
Try the live application: https://silicon-sentinel-qr94pmvkxeykxenptcsyb2.streamlit.app/
This version, hosted on Streamlit Community Cloud, uses our lightweight yolov8n model, which is optimized for fast and efficient performance on free hardware.
A second, more powerful version of this app using our larger `yolov8s` model is also available. You can explore the code and deployment instructions for it on the `render-deployment` branch of this repository.
Silicon Sentinel is a computer vision pipeline built to tackle one of the most critical challenges in the semiconductor industry: automated defect detection. It fine-tunes a YOLOv8 model on a hyper-realistic, custom-generated synthetic dataset to provide a scalable way of identifying microscopic flaws such as scratches, particles, and blobs on silicon wafers.
This project is a testament to the iterative nature of building real-world AI. Our model was not built in a single step but was forged through a cycle of testing, diagnosing failures, and engineering targeted solutions.
V1: The Naive Model - A Fragile Start
Our first model was trained on a simple, clean dataset. It learned to detect basic defects but failed when shown anything new, like a "blob" defect.
💡 **Lesson Learned:** A model's ability to generalize depends entirely on the diversity of its training data.
V2: The Overeager Model - A New Flaw Emerges
We rebuilt the dataset with more variety, including blobs. The model could now see all defect types, but it became "trigger-happy," hallucinating defects on perfectly clean wafers (false positives).
💡 **Lesson Learned:** An AI must be taught what a defect *is not*. Training on "negative" (clean) examples is critical to prevent false alarms.
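In YOLO-format datasets, a clean wafer is typically supplied as a background sample: an image whose label file is empty or absent. The sketch below shows one way to make that explicit. The folder layout and the helper name `add_negative_labels` are assumptions for illustration, not code from this repository.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def add_negative_labels(images_dir, labels_dir):
    """Create an empty YOLO label file for every image that has none.

    An empty label file marks the image as containing no defects.
    Returns the names of the label files that were created.
    """
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    labels_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for image in sorted(images_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        label = labels_dir / (image.stem + ".txt")
        if not label.exists():
            label.touch()  # empty file: no boxes, i.e. a clean wafer
            created.append(label.name)
    return created
```

This is only safe when every unlabeled image really is clean; an unlabeled image that contains a defect would teach the model to ignore it.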
V3: The Ultimate Sentinel - The Final, Robust Model
Our previous model was still not perfect. It confused the background with scratches, missed tiny particles, and couldn't distinguish blobs from particle clusters. This final iteration was a targeted strike against these specific failures.
- Hyper-Realistic Data: We engineered our final dataset with multiple, varied background textures, curved/wavy scratches, tiny "dust-speck" particles, and large, irregular "smudge" blobs to eliminate ambiguity.
- A Bigger Brain: We upgraded from the lightweight `YOLOv8n` to the more powerful `YOLOv8s` model to better learn subtle patterns in our complex data.
- More Patient Training: We increased the training time to 75 epochs, giving the more powerful model the time it needed to learn properly.
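The model upgrade and the longer schedule described above could be expressed with the Ultralytics Python API roughly as follows. This is a sketch, not the project's actual training script: the `data.yaml` path, the 640-pixel image size, and the helper names are assumptions, and only the `yolov8s` weights and the 75 epochs come from the text.

```python
def build_train_args(data_yaml="data.yaml", epochs=75, imgsz=640):
    """Collect training settings; 75 epochs matches the V3 description."""
    # data_yaml and imgsz are placeholder values, not taken from the repository.
    return {"data": data_yaml, "epochs": epochs, "imgsz": imgsz}


def train_v3(weights="yolov8s.pt", **overrides):
    """Fine-tune the small YOLOv8 variant (requires `pip install ultralytics`)."""
    # Imported lazily so build_train_args stays usable without the dependency.
    from ultralytics import YOLO

    model = YOLO(weights)  # load pretrained YOLOv8s weights
    return model.train(**build_train_args(**overrides))
```

Calling `train_v3()` would start training with the defaults; `train_v3(epochs=100)` overrides a single setting.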
✅ **The Result:** A reliable and intelligent model that correctly identifies a wide range of defects. The journey demonstrates a realistic workflow for tackling complex computer vision challenges.
- Hyper-Realistic Synthetic Data: A data engine that creates thousands of training examples with varied backgrounds and highly distinct defect types.
- Multi-Class Defect Recognition: Accurately identifies and classifies 3 primary defect types: `scratch`, `particle`, and `blob`.
- State-of-the-Art Accuracy: Employs fine-tuned YOLOv8 models with heavy data augmentation.
- End-to-End & Reproducible: A complete pipeline from data creation to model training, documented for easy replication.
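A synthetic data engine has to emit a label for every defect it draws. In the YOLO text format, each line holds a class index followed by the box center, width, and height, all normalized to the image size. The sketch below shows that conversion; the class ordering in `CLASS_IDS` and the function name are assumptions, since the repository's own mapping is not shown here.

```python
CLASS_IDS = {"scratch": 0, "particle": 1, "blob": 2}  # assumed ordering


def to_yolo_label(defect, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    x_center = (x_min + x_max) / 2 / width
    y_center = (y_min + y_max) / 2 / height
    box_w = (x_max - x_min) / width
    box_h = (y_max - y_min) / height
    return f"{CLASS_IDS[defect]} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"


# A 20x20-pixel particle centered on a 640x640 wafer image.
print(to_yolo_label("particle", (310, 310, 330, 330), (640, 640)))
# → 1 0.500000 0.500000 0.031250 0.031250
```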
| Python | PyTorch | YOLOv8 | OpenCV | NumPy | Colab |
|---|---|---|---|---|---|
The final model demonstrates a powerful ability to identify various defects across challenging scenarios. The examples below showcase its capability to detect complex, overlapping patterns of scratches and particles.
Prediction on "All Defects" Wafer
Prediction on "Particles" Wafer
Instructions to run this project yourself:
- Clone the Repository: `git clone https://github.com/Ritviks21/Silicon-Sentinel.git`, then `cd Silicon-Sentinel`
- Install Dependencies: `pip install -r requirements.txt`
- Train the Model: Run the provided Google Colab notebook to generate the data, split it, and train the model.
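The notebook handles the split itself; for readers who want to see the idea in isolation, a reproducible train/validation split can be sketched as below. The 80/20 ratio, the seed, and the function name are assumptions rather than the notebook's actual settings.

```python
import random


def split_dataset(names, val_fraction=0.2, seed=0):
    """Shuffle sample names reproducibly and split them into (train, val)."""
    names = sorted(names)  # fixed starting order, so the seed fully determines the result
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_fraction)
    return names[n_val:], names[:n_val]


train, val = split_dataset([f"wafer_{i:03d}" for i in range(10)])
print(len(train), len(val))
# → 8 2
```

Splitting by sample name rather than by file keeps each image together with its label file.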
Contributions are welcome! This project is a continuous effort, and there are many ways to help it grow. If you'd like to contribute, please feel free to fork the repository and submit a pull request.
- Improve the Dataset:
  - Add new, challenging defect types (e.g., "stains," "corrosion," "edge-ring").
  - Enhance the realism of the existing defect generation functions.
- Experiment with Models:
  - Train larger, more powerful models (e.g., `YOLOv8-L`, `YOLOv8-X`).
  - Experiment with different hyperparameters to improve accuracy.
- Enhance the Application:
  - Add new features to the live demo, such as a results summary or the ability to adjust the confidence threshold in the UI.
  - Improve the user interface and user experience.
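As a starting point for the results summary and confidence threshold ideas, the core logic might look like the sketch below. The detection dictionaries (`label`, `confidence`) and the function name are hypothetical; the app's real prediction objects may be shaped differently.

```python
from collections import Counter


def summarize(detections, min_confidence=0.25):
    """Keep detections at or above the threshold and count them per class."""
    kept = [d for d in detections if d["confidence"] >= min_confidence]
    return kept, Counter(d["label"] for d in kept)


detections = [
    {"label": "scratch", "confidence": 0.91},
    {"label": "particle", "confidence": 0.40},
    {"label": "particle", "confidence": 0.12},  # dropped at the default threshold
]
kept, counts = summarize(detections)
print(dict(counts))
# → {'scratch': 1, 'particle': 1}
```

In a Streamlit app, `min_confidence` could be supplied by a slider widget such as `st.slider`, so the summary updates as the user moves it.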