diff --git a/docs/ML/projects/OCR.ipynb b/docs/ML/projects/OCR.ipynb new file mode 100644 index 00000000..b95cb24f --- /dev/null +++ b/docs/ML/projects/OCR.ipynb @@ -0,0 +1,477 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "accelerator": "GPU" + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "_I13dfVEjVZ5" + }, + "source": [ + "# Optical Character Recognition using Tesseract" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ydvTlknjrAHU" + }, + "source": [ + "# By Aditya Singh\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "wpWkMPfNj66y" + }, + "source": [ + "Optical Character Recognition (OCR) is a long-standing task in computer vision. Tesseract is one of the most popular open-source OCR engines. It was originally developed at HP and is implemented in C++.\n", + "Google began sponsoring its development in 2006. The engine itself is a command-line tool, available for Windows among other platforms. Because Python is so widely used, the tool is also available from Python through pytesseract, a wrapper developed and maintained as an open-source project." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "GDZYVGrJkorH" + }, + "source": [ + "### Step1. Install pytesseract and tesseract-ocr in Google Colab" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "uWwpI-24_Nob", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "553cd595-0a23-4026-afb4-bb1139d66dec" + }, + "source": [ + "!sudo apt install tesseract-ocr" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Reading package lists... Done\n", + "Building dependency tree... Done\n", + "Reading state information... 
Done\n", + "The following additional packages will be installed:\n", + " tesseract-ocr-eng tesseract-ocr-osd\n", + "The following NEW packages will be installed:\n", + " tesseract-ocr tesseract-ocr-eng tesseract-ocr-osd\n", + "0 upgraded, 3 newly installed, 0 to remove and 49 not upgraded.\n", + "Need to get 4,816 kB of archives.\n", + "After this operation, 15.6 MB of additional disk space will be used.\n", + "Get:1 http://archive.ubuntu.com/ubuntu jammy/universe amd64 tesseract-ocr-eng all 1:4.00~git30-7274cfa-1.1 [1,591 kB]\n", + "Get:2 http://archive.ubuntu.com/ubuntu jammy/universe amd64 tesseract-ocr-osd all 1:4.00~git30-7274cfa-1.1 [2,990 kB]\n", + "Get:3 http://archive.ubuntu.com/ubuntu jammy/universe amd64 tesseract-ocr amd64 4.1.1-2.1build1 [236 kB]\n", + "Fetched 4,816 kB in 2s (2,904 kB/s)\n", + "debconf: unable to initialize frontend: Dialog\n", + "debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. at /usr/share/perl5/Debconf/FrontEnd/Dialog.pm line 78, <> line 3.)\n", + "debconf: falling back to frontend: Readline\n", + "debconf: unable to initialize frontend: Readline\n", + "debconf: (This frontend requires a controlling tty.)\n", + "debconf: falling back to frontend: Teletype\n", + "dpkg-preconfigure: unable to re-open stdin: \n", + "Selecting previously unselected package tesseract-ocr-eng.\n", + "(Reading database ... 
123614 files and directories currently installed.)\n", + "Preparing to unpack .../tesseract-ocr-eng_1%3a4.00~git30-7274cfa-1.1_all.deb ...\n", + "Unpacking tesseract-ocr-eng (1:4.00~git30-7274cfa-1.1) ...\n", + "Selecting previously unselected package tesseract-ocr-osd.\n", + "Preparing to unpack .../tesseract-ocr-osd_1%3a4.00~git30-7274cfa-1.1_all.deb ...\n", + "Unpacking tesseract-ocr-osd (1:4.00~git30-7274cfa-1.1) ...\n", + "Selecting previously unselected package tesseract-ocr.\n", + "Preparing to unpack .../tesseract-ocr_4.1.1-2.1build1_amd64.deb ...\n", + "Unpacking tesseract-ocr (4.1.1-2.1build1) ...\n", + "Setting up tesseract-ocr-eng (1:4.00~git30-7274cfa-1.1) ...\n", + "Setting up tesseract-ocr-osd (1:4.00~git30-7274cfa-1.1) ...\n", + "Setting up tesseract-ocr (4.1.1-2.1build1) ...\n", + "Processing triggers for man-db (2.10.2-1) ...\n" + ] + } + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ujL16dZ2_O-3", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "781a1fc8-5b33-4e9f-fcca-6718dea17723" + }, + "source": [ + "!pip install pytesseract" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Collecting pytesseract\n", + " Downloading pytesseract-0.3.13-py3-none-any.whl.metadata (11 kB)\n", + "Requirement already satisfied: packaging>=21.3 in /usr/local/lib/python3.10/dist-packages (from pytesseract) (24.1)\n", + "Requirement already satisfied: Pillow>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from pytesseract) (10.4.0)\n", + "Downloading pytesseract-0.3.13-py3-none-any.whl (14 kB)\n", + "Installing collected packages: pytesseract\n", + "Successfully installed pytesseract-0.3.13\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zlsiJdwnkyx7" + }, + "source": [ + "### Step2. 
Import libraries" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "QKJh7JjTAqzO" + }, + "source": [ + "import pytesseract\n", + "import shutil\n", + "import os\n", + "import random\n", + "try:\n", + "    from PIL import Image\n", + "except ImportError:\n", + "    import Image" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yjaf9bJ3k4GC" + }, + "source": [ + "### Step3. Upload an image to Colab" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "8zmc-K_nAyg1", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 73 + }, + "outputId": "b60c904c-7496-4e8f-e964-80dff99843ef" + }, + "source": [ + "from google.colab import files\n", + "\n", + "uploaded = files.upload()" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "text/html": [ + "\n", + " \n", + " \n", + " Upload widget is only available when the cell has been executed in the\n", + " current browser session. Please rerun this cell to enable.\n", + " \n", + " " + ] + }, + "metadata": {} + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Saving quotes-about-life-love-yourself.jpg to quotes-about-life-love-yourself.jpg\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "MDdbltgIlF3Q" + }, + "source": [ + "### Step4. Text Extraction" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "HaM3cMUDA_Ma" + }, + "source": [ + "import pytesseract\n", + "import shutil\n", + "import os\n", + "import random\n", + "try:\n", + "    from PIL import Image\n", + "except ImportError:\n", + "    import Image\n", + "import requests\n", + "from io import BytesIO\n", + "\n", + "# Fetch the image from the URL\n", + "response = requests.get('https://becomingunbusy.com/wp-content/uploads/2019/05/quotes-about-life-love-yourself.jpg')\n", + "response.raise_for_status()  # Stop early if the download failed\n", + "\n", + "# Wrap the downloaded bytes in BytesIO so PIL can open them like a file\n", + "image = Image.open(BytesIO(response.content))\n", + "\n", + "# Run Tesseract on the image and collect the recognized text\n", + "extractedInformation = pytesseract.image_to_string(image)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "D552cuoflHvx" + }, + "source": [ + "### Step5. Print the extracted information" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Key-3vILBNUd", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "f87373b9-2b31-4f2c-9e7f-efbe451e37e9" + }, + "source": [ + "print(extractedInformation)" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "KAMEN ARIE NA I\n", + "love yourself enough to no longer\n", + "\n", + "just dream of a better life, let it be the day\n", + "\n", + "TONE Hae\n", + "\n", + " \n", + "\f\n" + ] + } + ] + } + ] +} \ No newline at end of file diff --git a/docs/ML/projects/quotes-about-life-love-yourself.jpg b/docs/ML/projects/quotes-about-life-love-yourself.jpg new file mode 100644 index 00000000..e89cdcef Binary files /dev/null and b/docs/ML/projects/quotes-about-life-love-yourself.jpg differ diff --git a/product detection/Product_Detection.ipynb b/product detection/Product_Detection.ipynb new file mode 100644 index 
00000000..64b654df --- /dev/null +++ b/product detection/Product_Detection.ipynb @@ -0,0 +1,389 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "language_info": { + "name": "python" + } + }, + "cells": [ + { + "cell_type": "code", + "source": [ + "# Install the required dependencies\n", + "!pip install opencv-python opencv-python-headless\n", + "\n", + "# Download the YOLOv3 pre-trained model weights\n", + "!wget https://pjreddie.com/media/files/yolov3.weights\n", + "!wget https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg\n", + "!wget https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names\n" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "jKona4B6ng4V", + "outputId": "b043c4c6-147e-4dda-e7d1-5b30c3833ba2" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Requirement already satisfied: opencv-python in /usr/local/lib/python3.11/dist-packages (4.10.0.84)\n", + "Requirement already satisfied: opencv-python-headless in /usr/local/lib/python3.11/dist-packages (4.10.0.84)\n", + "Requirement already satisfied: numpy>=1.21.2 in /usr/local/lib/python3.11/dist-packages (from opencv-python) (1.26.4)\n", + "--2025-01-16 06:20:16-- https://pjreddie.com/media/files/yolov3.weights\n", + "Resolving pjreddie.com (pjreddie.com)... 162.0.215.52\n", + "Connecting to pjreddie.com (pjreddie.com)|162.0.215.52|:443... connected.\n", + "HTTP request sent, awaiting response... 
200 OK\n", + "Length: 248007048 (237M) [application/octet-stream]\n", + "Saving to: ‘yolov3.weights’\n", + "\n", + "yolov3.weights 100%[===================>] 236.52M 12.3MB/s in 18s \n", + "\n", + "2025-01-16 06:20:35 (12.9 MB/s) - ‘yolov3.weights’ saved [248007048/248007048]\n", + "\n", + "--2025-01-16 06:20:35-- https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg\n", + "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...\n", + "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.\n", + "HTTP request sent, awaiting response... 200 OK\n", + "Length: 8342 (8.1K) [text/plain]\n", + "Saving to: ‘yolov3.cfg’\n", + "\n", + "yolov3.cfg 100%[===================>] 8.15K --.-KB/s in 0s \n", + "\n", + "2025-01-16 06:20:35 (58.2 MB/s) - ‘yolov3.cfg’ saved [8342/8342]\n", + "\n", + "--2025-01-16 06:20:35-- https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names\n", + "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n", + "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n", + "HTTP request sent, awaiting response... 
200 OK\n", + "Length: 625 [text/plain]\n", + "Saving to: ‘coco.names’\n", + "\n", + "coco.names 100%[===================>] 625 --.-KB/s in 0s \n", + "\n", + "2025-01-16 06:20:35 (28.8 MB/s) - ‘coco.names’ saved [625/625]\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "import cv2\n", + "import numpy as np\n", + "\n", + "# Load YOLO model\n", + "net = cv2.dnn.readNet(\"yolov3.weights\", \"yolov3.cfg\")\n", + "layer_names = net.getLayerNames()\n", + "output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]\n", + "\n", + "# Load class labels (from coco.names)\n", + "with open(\"coco.names\", \"r\") as f:\n", + "    classes = [line.strip() for line in f.readlines()]\n", + "\n", + "# Function to perform object detection\n", + "def detect_objects(image_path):\n", + "    # Load image\n", + "    img = cv2.imread(image_path)\n", + "    height, width, channels = img.shape\n", + "\n", + "    # Prepare the image for YOLO\n", + "    blob = cv2.dnn.blobFromImage(img, 0.00392, (416, 416), (0, 0, 0), True, crop=False)\n", + "    net.setInput(blob)\n", + "    outputs = net.forward(output_layers)\n", + "\n", + "    # Initialize lists for detected objects\n", + "    class_ids = []\n", + "    confidences = []\n", + "    boxes = []\n", + "\n", + "    # Iterate over all detections\n", + "    for out in outputs:\n", + "        for detection in out:\n", + "            scores = detection[5:]\n", + "            class_id = np.argmax(scores)\n", + "            confidence = scores[class_id]\n", + "\n", + "            if confidence > 0.5:  # Confidence threshold\n", + "                center_x = int(detection[0] * width)\n", + "                center_y = int(detection[1] * height)\n", + "                w = int(detection[2] * width)\n", + "                h = int(detection[3] * height)\n", + "\n", + "                # Coordinates for the bounding box\n", + "                x = int(center_x - w / 2)\n", + "                y = int(center_y - h / 2)\n", + "\n", + "                # Save the detection details\n", + "                boxes.append([x, y, w, h])\n", + "                confidences.append(float(confidence))\n", + "                class_ids.append(class_id)\n", + "\n", + "    # Apply non-maximum suppression to remove overlapping boxes\n", + "    indexes = cv2.dnn.NMSBoxes(boxes, confidences, score_threshold=0.5, nms_threshold=0.4)\n", + "\n", + "    # Draw the bounding boxes and labels\n", + "    detected_objects = []\n", + "    if len(indexes) > 0:\n", + "        for i in indexes.flatten():\n", + "            x, y, w, h = boxes[i]\n", + "            label = str(classes[class_ids[i]])\n", + "            confidence = confidences[i]\n", + "            detected_objects.append((label, confidence))\n", + "            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)\n", + "            cv2.putText(img, f\"{label} {confidence:.2f}\", (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)\n", + "\n", + "    return img, detected_objects\n", + "\n", + "# Display results\n", + "from google.colab import files\n", + "from IPython.display import Image, display\n", + "\n", + "# Upload an image\n", + "uploaded = files.upload()\n", + "\n", + "# Get image path\n", + "image_path = list(uploaded.keys())[0]\n", + "\n", + "# Run detection\n", + "result_img, detected_objects = detect_objects(image_path)\n", + "\n", + "# Save and display the result (display() is needed because this is not the last expression in the cell)\n", + "output_path = \"result.jpg\"\n", + "cv2.imwrite(output_path, result_img)\n", + "display(Image(output_path))\n", + "\n", + "# Print detected objects and confidence\n", + "print(\"Detected Objects and Confidence Scores:\")\n", + "for label, confidence in detected_objects:\n", + "    print(f\"{label}: {confidence:.2f}\")\n" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 108 + }, + "id": "P1krlvdqnv19", + "outputId": "7a7b00d2-e6c8-49cc-bbc3-13a003e9850c" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "text/html": [ + "\n", + " \n", + " \n", + " Upload widget is only available when the cell has been executed in the\n", + " current browser session. 
Please rerun this cell to enable.\n", + " \n", + " " + ] + }, + "metadata": {} + }, + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Saving Screenshot 2024-12-24 112646.png to Screenshot 2024-12-24 112646.png\n", + "Detected Objects and Confidence Scores:\n", + "bird: 0.98\n" + ] + } + ] + } + ] +} \ No newline at end of file diff --git a/product detection/Readme.md b/product detection/Readme.md new file mode 100644 index 00000000..a5db9e20 --- /dev/null +++ b/product detection/Readme.md @@ -0,0 +1,28 @@ +# Product Detection + +## Overview +This repository provides a solution for product detection using machine learning and computer vision techniques. The project leverages deep learning models to detect and classify products from images or video streams. It is suitable for applications in inventory management, retail analytics, and e-commerce. + +## Features +- **Product Detection**: Identifies and locates products within images and videos. +- **Product Classification**: Classifies detected products based on pre-trained models. +- **Real-Time Processing**: Supports real-time video processing for dynamic environments. +- **Customizable**: Allows fine-tuning to specific product datasets and use cases. + +## Requirements +- Python 3.x +- OpenCV +- TensorFlow or PyTorch (depending on model used) +- Other dependencies listed in `requirements.txt` + +## Installation + +Follow these steps to get started with the project. + +1. Clone the repository to your local machine: + ```bash + git clone https://github.com/username/product-detection.git + cd product-detection + ``` + +2. Install the dependencies with `pip install -r requirements.txt`.