diff --git a/README.md b/README.md
index 0919465f..e1d1e618 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,7 @@
 DeepCamera
-
-Edge AI for Smart Camera Systems
+DeepCamera - Open-Source AI Camera Skills Platform
+
+Transform any camera into an intelligent monitoring system with state-of-the-art AI capabilities
+
+DeepCamera's open-source skills give your cameras AI - VLM scene analysis, object detection, person re-identification, all running locally with models like Qwen, DeepSeek, SmolVLM, and LLaVA. Built on proven facial recognition, RE-ID, fall detection, and CCTV/NVR surveillance monitoring, the skill catalog extends these machine learning capabilities with modern AI. All inference runs locally for maximum privacy.
@@ -25,60 +24,9 @@
 ---
-
-### 🛡️ Introducing [SharpAI Aegis](https://www.sharpai.org) - Desktop App for DeepCamera
-
-**Use DeepCamera's AI skills through a desktop app with LLM-powered setup, agent chat, and smart alerts - connected to your mobile via Discord / Telegram / Slack.**
-
-[SharpAI Aegis](https://www.sharpai.org) is the desktop companion for DeepCamera. It uses LLM to automatically set up your environment, configure camera skills, and manage the full AI pipeline - no manual Docker or CLI required. It also adds an intelligent agent layer: persistent memory, agentic chat with your cameras, AI video generation, voice (TTS), and conversational messaging via Discord / Telegram / Slack.
-
-[**📦 Download SharpAI Aegis**](https://www.sharpai.org)
-
-Browse & Run VLMs Locally from HuggingFace
-SharpAI Aegis - Browse and download vision language models from HuggingFace
-Download and run SmolVLM2, Qwen-VL, LFM2.5, LLaVA, MiniCPM-V directly on your machine. Even a Mac M1 Mini 8GB works.
-
-Chat with your AI Security Agent
-SharpAI Aegis - Ask your agent what happened and get real answers
-Ask "anyone entering the room?" - Aegis searches your footage and gives you a real answer with timestamps and clips.
-
-
----
-
----
-
-## 🎯 Overview
-
-DeepCamera is an **open-source AI skill platform** that transforms any camera into an intelligent monitoring system. It provides a growing catalog of pluggable AI skills - from real-time object detection and person re-identification to VLM scene analysis, interactive segmentation, and smart home automation.
-
-Each skill is a self-contained module with its own model, parameters, and communication protocol. Skills are installed, configured, and orchestrated through [SharpAI Aegis](https://www.sharpai.org) - the desktop companion that adds LLM-powered setup, agent chat, and smart alerts.
-
-Building on DeepCamera's proven open-source facial recognition, person re-identification (RE-ID), fall detection, and CCTV/NVR surveillance monitoring, the skill catalog extends these machine learning capabilities with modern AI - from VLM scene understanding to SAM2 segmentation and DINOv3 visual grounding. All inference runs locally on your device for maximum privacy.
-
-### Core Capabilities
-
-- 🔍 **Detection** - YOLO object detection, DINOv3 open-vocabulary grounding, person re-identification (ReID)
-- 🧠 **Analysis** - VLM scene understanding of recorded clips, SAM2 interactive segmentation
-- 🎨 **Transformation** - Depth Anything v2 real-time depth maps
-- 🏷️ **Annotation** - AI-assisted dataset creation with COCO export
-- 📷 **Camera Providers** - Eufy, Reolink, Tapo (RTSP/ONVIF)
-- 📺 **Streaming** - Multi-camera RTSP → WebRTC via go2rtc
-- 💬 **Channels** - Matrix, LINE, Signal messaging for the Clawdbot agent
-- ⚡ **Automation** - MQTT, webhooks, Home Assistant triggers
-- 🏠 **Integrations** - Bidirectional Home Assistant bridge
-
 ## 🧩 Skill Catalog
 
-Every skill lives in [`skills/`](skills/) with a `SKILL.md` manifest, `requirements.txt`, and working Python script. See the [Skill Development Guide](docs/skill-development.md) to build your own.
+Each skill is a self-contained module with its own model, parameters, and [communication protocol](docs/skill-development.md). See the [Skill Development Guide](docs/skill-development.md) and [Platform Parameters](docs/skill-params.md) to build your own.
 
 | Category | Skill | What It Does |
 |----------|-------|--------------|
@@ -105,389 +53,112 @@ Every skill lives in [`skills/`](skills/) with a `SKILL.md` manifest, `requireme
 - [ ] **Custom skill packaging** - community-contributed skills via GitHub
 - [ ] **GPU-optimized containers** - one-click Docker deployment per skill
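The platform contract behind the `+` paragraph above is specified in `docs/skill-params.md`, added later in this diff. As a minimal sketch - an illustration, not part of this commit - a skill process reads user parameters from `AEGIS_SKILL_PARAMS` and speaks JSON lines on stdout, the same protocol `run-benchmark.cjs` uses:

```javascript
// Hypothetical minimal skill following docs/skill-params.md: user params
// arrive as JSON in AEGIS_SKILL_PARAMS, machine-readable events go to
// stdout as JSON lines, human-readable logs go to stderr.
let params = {};
try { params = JSON.parse(process.env.AEGIS_SKILL_PARAMS || '{}'); } catch { }

const emit = (obj) => process.stdout.write(JSON.stringify(obj) + '\n');
const log = (msg) => process.stderr.write(msg + '\n');

log(`starting skill ${process.env.AEGIS_SKILL_ID || '(standalone)'}`);
emit({ event: 'ready' });
```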
-
-## 🚀 Applications
-
-### 1. Person Recognition for Intruder Detection
-Advanced intruder detection using self-supervised person recognition (REID) technology. [Source code](https://github.com/SharpAI/DeepCamera/blob/master/src/yolov7_reid/src/detector_cpu.py)
-
-**Key Technologies:**
-- Yolov7 Tiny (COCO pretrained) for person detection
-- FastReID ResNet50 for feature extraction
-- Milvus vector database for self-supervised learning
-- Integration with Home-Assistant for smart home automation
-
-```bash
-pip3 install sharpai-hub
-sharpai-cli yolov7_reid start
-```
-
-### 2. Local Facial Recognition
-Secure, locally-deployed facial recognition system for intruder detection. All data stays on your device.
-```bash
-sharpai-cli local_deepcamera start
-```
-
-### 3. Cloud-Based Facial Recognition
-Free cloud-powered facial recognition system:
-```bash
-sharpai-cli login
-sharpai-cli device register
-sharpai-cli deepcamera start
-```
-
-### 4. Screen Monitor for Child Safety
-Monitor laptop screens using AI-powered feature extraction and local storage. Perfect for ensuring online safety for kids and teens.
-```bash
-sharpai-cli screen_monitor start
-```
-
-### 5. Basic Person Detection
-Simple and efficient person detection system:
-```bash
-sharpai-cli yolov7_person_detector start
-```
-
-## 📦 Installation Guide
-
-### Prerequisites
-- Docker (Latest version)
-- Python (v3.6 - v3.10)
-- Internet connection for initial setup
-
-### Quick Start
-1. Install SharpAI-Hub:
-```bash
-pip3 install sharpai-hub
-```
-
-2. Start desired application (example using yolov7_reid):
-```bash
-sharpai-cli yolov7_reid start
-```
-
-### Important URLs
-- Docker Desktop UI: http://localhost:8000
-- Home-Assistant: http://localhost:8123
-- Labelstudio: http://localhost:8080
+## 🚀 Getting Started with [SharpAI Aegis](https://www.sharpai.org)
-
-
-📱 Supported Devices
-
-#### Edge AI Hardware
-- Nvidia Jetson
-  - Nano (ReComputer j1010)
-  - Xavier AGX
-- Single Board Computers
-  - Raspberry Pi 4GB/8GB
-- Desktop/Laptop
-  - MacOS
-  - Windows
-  - Ubuntu
-- MCU Cameras
-  - ESP32 CAM
-  - ESP32-S3-Eye
-
-#### Compatible Cameras
-- RTSP Cameras (Lorex/Amrest/DoorBell)
-- Blink Camera
-- IMOU Camera
-- Google Nest (Indoor/Outdoor)
-
+The easiest way to run DeepCamera's AI skills. Aegis connects everything - cameras, models, skills, and you.
-# Application 1: Self-supervised person recognition(REID) for intruder detection
-SharpAI yolov7_reid is an open source python application leverages AI technologies to detect intruder with traditional surveillance camera. Source code is [here](https://github.com/SharpAI/DeepCamera/blob/master/src/yolov7_reid/src/detector_cpu.py)
-It leverages Yolov7 as person detector, FastReID for person feature extraction, Milvus the local vector database for self-supervised learning to identity unseen person, Labelstudio to host image locally and for further usage such as label data and train your own classifier. It also integrates with Home-Assistant to empower smart home with AI technology.
-In Simple terms yolov7_reid is a person detector.
-
-Machine learning technologies
-  - Yolov7 Tiny, pretrained from COCO dataset
-  - FastReID ResNet50
-  - Vector Database Milvus for self-supervised learning
-
-Supported Devices
-  - Nvidia Jetson
-    - [Nano (ReComputer j1010)](https://www.seeedstudio.com/Jetson-10-1-H0-p-5335.html)
-    - Xavier AGX
-  - Single Board Computer (SBC)
-    - Raspberry Pi 4GB
-    - Raspberry Pi 8GB
-  - Intel X64
-    - MacOS
-    - Windows
-    - Ubuntu
-  - MCU Camera
-    - ESP32 CAM
-    - ESP32-S3-Eye
-  - Tested Cameras/CCTV/NVR
-    - RTSP Camera (Lorex/Amrest/DoorBell)
-    - Blink Camera
-    - IMOU Camera
-    - Google Nest (Indoor/Outdoor)
-
+- 📷 **Connect cameras in seconds** - add RTSP/ONVIF cameras, webcams, or iPhone cameras for a quick test
+- 🤖 **Built-in local LLM & VLM** - llama-server included, no separate setup needed
+- 📦 **One-click skill deployment** - install skills from the catalog with AI-assisted troubleshooting
+- 🔽 **One-click HuggingFace downloads** - browse and run Qwen, DeepSeek, SmolVLM, LLaVA, MiniCPM-V
+- 📊 **Find the best VLM for your machine** - benchmark models on your own hardware with HomeSec-Bench
+- 💬 **Talk to your guard** - via Telegram, Discord, or Slack. Ask what happened, tell it what to watch for, get AI-reasoned answers with footage.
-
+[**📦 Download SharpAI Aegis →**](https://www.sharpai.org)
-## Installation Guide
-```
-pip3 install sharpai-hub
-sharpai-cli yolov7_reid start
-```
+
+Run Local VLMs from HuggingFace - Even on Mac Mini 8GB
+SharpAI Aegis - Browse and run local VLM models for AI camera video analysis
+Download and run SmolVLM2, Qwen-VL, LLaVA, MiniCPM-V locally. Your AI security camera agent sees through these eyes.
+
+Chat with Your AI Camera Agent
+SharpAI Aegis - LLM-powered agentic security camera chat
+"Who was at the door?" - Your agent searches footage, reasons about what happened, and answers with timestamps and clips.
+
+
+## 📊 HomeSec-Bench - How Secure Is Your Local AI?
+
+**HomeSec-Bench** is a 131-test security benchmark that measures how well your local AI performs as a security guard. It tests what matters: Can it detect a person in fog? Classify a break-in vs. a delivery? Resist prompt injection? Route alerts correctly at 3 AM?
+
+Run it on your own hardware to know exactly where your setup stands.
+
+| Area | Tests | What's at Stake |
+|------|-------|-----------------|
+| Scene Understanding | 35 | Person detection in fog, rain, night IR, sun glare |
+| Security Classification | 12 | Telling a break-in from a raccoon |
+| Tool Use & Reasoning | 16 | Correct tool calls with accurate parameters |
+| Prompt Injection Resistance | 4 | Adversarial attacks that try to disable your guard |
+| Privacy Compliance | 3 | PII leak prevention, illegal surveillance refusal |
+| Alert Routing | 5 | Right message, right channel, right time |
+
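To run the suite yourself outside Aegis, the script's standalone flags (shown in the SKILL.md hunk later in this diff) are enough. A minimal invocation, assuming local endpoints on the default ports:

```bash
# Run from skills/analysis/home-security-benchmark/.
# Pass base URLs only - the script appends /v1/chat/completions itself.
node scripts/run-benchmark.cjs --gateway http://localhost:5407 --vlm http://localhost:5405 --no-open
```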
+### Results: Local vs. Cloud vs. Hybrid
+
+HomeSec-Bench benchmark results - local Qwen 4B vs cloud GPT-5.2 vs hybrid
+
+Running on a **Mac M1 Mini 8GB**: local Qwen3.5-4B scores **39/54** (72%), cloud GPT-5.2 scores **46/48** (96%), and the hybrid config reaches **53/54** (98%). All 35 VLM test images are **AI-generated** - no real footage, fully privacy-compliant.
+
+📄 [Read the Paper](docs/paper/home-security-benchmark.pdf) · 🔬 [Run It Yourself](skills/analysis/home-security-benchmark/) · 📋 [Test Scenarios](skills/analysis/home-security-benchmark/fixtures/)
+
+---
+
+## 📦 More Applications
-
-
-Prerequisites
-
-1. Docker (Latest version)
-2. Python (v3.6 to v3.10 will work fine)
-
-Step-by-step guide
-
-**Here, replace `camera.` with the camera entity ID we obtained in Step 9. If you have multiple cameras then keep adding the `entity_id` under `images_processing`.**
-
-```
-stream:
-  ll_hls: true
-  part_duration: 0.75
-  segment_duration: 6
-
-image_processing:
-  - platform: sharpai
-    source:
-      - entity_id: camera.
-    scan_interval: 1
-```
-
-If you have multiple cameras then after changing the 'entity_id' the code will become similar to this:
-
-```
-stream:
-  ll_hls: true
-  part_duration: 0.75
-  segment_duration: 6
-
-image_processing:
-  - platform: sharpai
-    source:
-      - entity_id: camera.192_168_29_44
-      - entity_id: camera.192_168_29_45
-      - entity_id: camera.192_168_29_46
-      - entity_id: camera.192_168_29_47
-    scan_interval: 1
-```
-
-12) At `home-assistant` homepage `http://localhost:8123` select `Developer Tools`. Look for and click `Check Configuration` under `Configuration Validation`. If everything went well then it must show "Configuration Valid'. Click `Restart`.Now go to the `container` tab of docker, click three vertical dots under `Actions` and press restart. Open the `Overview` tab of `home-assitant`. If you see `Image Processing` beside your cameras and below it `Sharp IP_ADDRESS_OF_YOUR_CAMERA`, then congrats. Everything is working as expected.
-
-```NOTE: Till further steps are added you can use demo video in the beginning tutorial for further help.```
-
-
-Important Links
-
+Legacy Applications (SharpAI-Hub CLI)
-The yolov7 detector is running in docker, you can access the docker desktop with http://localhost:8000
-Home-Assistant is hosted at http://localhost:8123
-Labelstudio is hosted at http://localhost:8080
-
+These applications use the `sharpai-cli` Docker-based workflow.
+For the modern experience, use [SharpAI Aegis](https://www.sharpai.org).
-# Application 2: Facial Recognition based intruder detection with local deployment
-We received feedback from community, local deployment is needed. With local deepcamera deployment, all information/images will be saved locally.
-`sharpai-cli local_deepcamera start`
-
-# Application 3: DeepCamera Facial Recognition with cloud for free
-- Register account on [SharpAI website](http://dp.sharpai.org:3000)
-- Login on device: `sharpai-cli login`
-- Register device: `sharpai-cli device register`
-- Start DeepCamera: `sharpai-cli deepcamera start`
-
-# [Application 4: Laptop Screen Monitor](https://github.com/SharpAI/laptop_monitor) for kids/teens safe
-SharpAI Screen monitor captures screen extract screen image features(embeddings) with AI model, save unseen features(embeddings) into AI vector database [Milvus](https://milvus.io/), raw images are saved to [Labelstudio](https://labelstud.io) for labelling and model training, all information/images will be only saved locally.
-
-`sharpai-cli screen_monitor start`
-
-### Access streaming screen: http://localhost:8000
-### Access labelstudio: http://localhost:8080
-
-# Application 5: Person Detector
-`sharpai-cli yolov7_person_detector start`
-
-# SharpAI-Hub AI Applications
-SharpAI community is continually working on bringing state-of-the-art computer vision application to your device.
-
-```sharpai-cli start
-```
-
-|Application|SharpAI CLI Name| OS/Device |
-|---|---|---|
-|Intruder detection with Person shape| yolov7_reid | Jetson Nano/AGX /Windows/Linux/MacOS|
-|Person Detector| yolov7_person_detector | Jetson Nano/AGX /Windows/Linux/MacOS|
-|[Laptop Screen Monitor](https://github.com/SharpAI/laptop_monitor)| screen_monitor | Windows/Linux/MacOS|
-|[Facial Recognition Intruder Detection](docs/how_to_run_intruder_detection.md) | deepcamera | Jetson Nano/Windows/Linux/MacOS|
-|[Local Facial Recognition Intruder Detection](docs/how_to_run_local_intruder_detection.md) | local_deepcamera | Windows/Linux/MacOS|
-|[Parking Lot monitor](docs/Yolo_Parking.md) | yoloparking | Jetson AGX |
-|[Fall Detection](docs/FallDetection_with_shinobi.md) | falldetection |Jetson AGX|
-
-# Tested Devices
-## Edge AI Devices / Workstation
-- [Jetson Nano (ReComputer j1010)](https://www.seeedstudio.com/Jetson-10-1-H0-p-5335.html)
-- Jetson Xavier AGX
-- MacOS 12.4
-- Windows 11
-- Ubuntu 20.04
-
-## Tested Camera:
-- DaHua / Lorex / AMCREST: URL Path: `/cam/realmonitor?channel=1&subtype=0` Port: `554`
-- Ip Camera Lite on IOS: URL Path: `/live` Port: `8554`
-- Nest Camera indoor/outdoor by Home-Assistant integration
-
-# Support
-- If you are using a camera but have no idea about the RTSP URL, please join SharpAI community for help.
-- SharpAI provides commercial support to companies which want to deploy AI Camera application to real world.
-## [Click to join sharpai slack channel](https://join.slack.com/t/sharpai/shared_invite/zt-1nt1g0dkg-navTKx6REgeq5L3eoC1Pqg)
-
-# DeepCamera Architecture
-![architecture](screenshots/DeepCamera_infrastructure.png)
+| Application | CLI Command | Platforms |
+|-------------|-------------|-----------|
+| Person Recognition (ReID) | `sharpai-cli yolov7_reid start` | Jetson/Windows/Linux/macOS |
+| Person Detector | `sharpai-cli yolov7_person_detector start` | Jetson/Windows/Linux/macOS |
+| Facial Recognition | `sharpai-cli deepcamera start` | Jetson/Windows/Linux/macOS |
+| Local Facial Recognition | `sharpai-cli local_deepcamera start` | Windows/Linux/macOS |
+| Screen Monitor | `sharpai-cli screen_monitor start` | Windows/Linux/macOS |
+| Parking Monitor | `sharpai-cli yoloparking start` | Jetson AGX |
+| Fall Detection | `sharpai-cli falldetection start` | Jetson AGX |
-# [DeepCamera Feature List](docs/DeepCamera_Features.md)
+📖 [Detailed setup guides →](docs/legacy-applications.md)
-# Commercial Version
-- Provide real time pipeline on edge device
-- E2E pipeline to support model customization
-- Cluster on the edge
-- Port to specific edge device/chipset
-- Voice application (ASR/KWS) end to end pipeline
-- ReID model
-- Behavior analysis model
-- Transformer model
-- Contrastive learning
-- [Click to join sharpai slack channel for commercial support](https://sharpai-invite-automation.herokuapp.com/)
+#### Tested Devices
+- **Edge**: Jetson Nano, Xavier AGX, Raspberry Pi 4/8GB
+- **Desktop**: macOS, Windows 11, Ubuntu 20.04
+- **MCU**: ESP32 CAM, ESP32-S3-Eye
-# FAQ
+#### Tested Cameras
+- RTSP: DaHua, Lorex, Amcrest
+- Cloud: Blink, Nest (via Home Assistant)
+- Mobile: IP Camera Lite (iOS)
+
+---
+
+🏗️ Architecture
-## 🏗️ Architecture
 ![architecture](screenshots/DeepCamera_infrastructure.png)
+[Complete Feature List →](docs/DeepCamera_Features.md)
+
+
 ## 🤝 Support & Community
 
-### Community Support
-- Join our [Slack Community](https://join.slack.com/t/sharpai/shared_invite/zt-1nt1g0dkg-navTKx6REgeq5L3eoC1Pqg) for help and discussions
-- Visit our [GitHub Issues](https://github.com/SharpAI/DeepCamera/issues) for technical support
-- Need help with camera setup? Our community is here to assist!
-
-### Commercial Support
-SharpAI offers professional support for enterprise deployments:
-- Real-time processing pipeline optimization
-- End-to-end model customization
-- Edge device clustering
-- Hardware-specific optimizations
-- Voice application pipelines (ASR/KWS)
-- Custom AI model development
-  - ReID models
-  - Behavior analysis
-  - Transformer-based solutions
-  - Contrastive learning
-
-[Contact us for commercial support](https://join.slack.com/t/sharpai/shared_invite/zt-1nt1g0dkg-navTKx6REgeq5L3eoC1Pqg)
-
-## ❓ FAQ
-
-### Installation & Setup
-- [How to install Python3](https://www.python.org/downloads)
-- [How to install pip3](https://pip.pypa.io/en/stable/installation)
-- [How to configure the web GUI](screenshots/how_to_config_on_web_gui.png)
-- [How to configure RTSP on GUI](https://github.com/SharpAI/DeepCamera/blob/master/docs/shinobi.md)
-- [Camera streaming URL formats](https://shinobi.video)
-
-### Device-Specific Setup
-#### Jetson Nano Docker-compose Installation
-```bash
-sudo apt-get install -y libhdf5-dev python3 python3-pip
-pip3 install -U pip
-sudo pip3 install docker-compose==1.27.4
-```
-
-### Additional Resources
-- [Complete Feature List](docs/DeepCamera_Features.md)
-- [How to Contribute](Contributions.md)
+- 💬 [Slack Community](https://join.slack.com/t/sharpai/shared_invite/zt-1nt1g0dkg-navTKx6REgeq5L3eoC1Pqg) - help, discussions, and camera setup assistance
+- 🐛 [GitHub Issues](https://github.com/SharpAI/DeepCamera/issues) - technical support and bug reports
+- 🏢 [Commercial Support](https://join.slack.com/t/sharpai/shared_invite/zt-1nt1g0dkg-navTKx6REgeq5L3eoC1Pqg) - pipeline optimization, custom models, edge deployment
 
 ## [Contributions](Contributions.md)
diff --git a/docs/legacy-applications.md b/docs/legacy-applications.md
new file mode 100644
index 00000000..6db48503
--- /dev/null
+++ b/docs/legacy-applications.md
@@ -0,0 +1,278 @@
+# Legacy Applications (SharpAI-Hub CLI)
+
+> **Note:** These applications use the `sharpai-cli` Docker-based workflow.
+> For the modern experience, use [SharpAI Aegis](https://www.sharpai.org) - the desktop companion for DeepCamera.
+
+---
+
+## Application 1: Self-supervised Person Recognition (REID) for Intruder Detection
+
+SharpAI yolov7_reid is an open-source Python application that leverages AI technologies to detect intruders with traditional surveillance cameras. [Source code](https://github.com/SharpAI/DeepCamera/blob/master/src/yolov7_reid/src/detector_cpu.py)
+
+It uses Yolov7 as the person detector, FastReID for person feature extraction, Milvus as the local vector database for self-supervised learning to identify unseen persons, and Labelstudio to host images locally for further usage such as labeling data and training your own classifier. It also integrates with Home-Assistant to empower smart homes with AI technology.
+
+In simple terms, yolov7_reid is a person detector.
+
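To make the self-supervised loop concrete: each detected person crop is embedded with FastReID and matched against Milvus, and unseen embeddings are registered so the same person can be identified later. The sketch below is illustrative only - it assumes a pymilvus 2.x collection named `persons` with an `embedding` vector field and an arbitrary distance threshold; the actual logic lives in `detector_cpu.py`:

```python
# Sketch only: match a FastReID embedding against Milvus, or register it.
# Collection name, field name, and threshold are assumptions for illustration.
from pymilvus import connections, Collection

connections.connect(host="localhost", port="19530")
persons = Collection("persons")

def match_or_register(embedding, threshold=0.5):
    hits = persons.search(
        data=[embedding],
        anns_field="embedding",
        param={"metric_type": "L2", "params": {"nprobe": 10}},
        limit=1,
    )
    if hits[0] and hits[0][0].distance < threshold:
        return hits[0][0].id       # previously seen person
    persons.insert([[embedding]])  # unseen: remember for next time
    return None
```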
+Machine learning technologies
+
+- Yolov7 Tiny, pretrained from COCO dataset
+- FastReID ResNet50
+- Vector Database Milvus for self-supervised learning
+
+Supported Devices
+
+- Nvidia Jetson
+  - [Nano (ReComputer j1010)](https://www.seeedstudio.com/Jetson-10-1-H0-p-5335.html)
+  - Xavier AGX
+- Single Board Computer (SBC)
+  - Raspberry Pi 4GB
+  - Raspberry Pi 8GB
+- Intel X64
+  - MacOS
+  - Windows
+  - Ubuntu
+- MCU Camera
+  - ESP32 CAM
+  - ESP32-S3-Eye
+- Tested Cameras/CCTV/NVR
+  - RTSP Camera (Lorex/Amcrest/DoorBell)
+  - Blink Camera
+  - IMOU Camera
+  - Google Nest (Indoor/Outdoor)
+
+### Installation Guide
+
+```bash
+pip3 install sharpai-hub
+sharpai-cli yolov7_reid start
+```
+
+Prerequisites
+
+1. Docker (Latest version)
+2. Python (v3.6 to v3.10 will work fine)
+
+Step-by-step guide
+
+```NOTE: Before executing any of the commands below, please start Docker.```
+```This guide installs SharpAI and runs the yolov7_reid service, but it can also be used to start other services.```
+
+1) Install SharpAI-Hub by running the following command in a Command Prompt or Terminal. Remember this as Command Prompt 1; it will be needed in later steps:
+   ```
+   pip3 install sharpai-hub
+   ```
+2) Now run the following command:
+   ```
+   sharpai-cli yolov7_reid start
+   ```
+**NOTE: On a Windows system, if the command in Step 2 fails with the error** `'sharpai-cli' is not recognized as an internal or external command, operable program or batch file.` **then the environment variable for Python is not set on your system. More on this at the end of the page in the FAQ section.**
+
+3) If you are using Windows and get the error in Step 2, you can also start yolov7_reid with one of the following commands:
+
+```
+python3 -m sharpai_hub.cli yolov7_reid start
+```
+OR
+
+```
+python -m sharpai_hub.cli yolov7_reid start
+```
+
+4) Go to the directory ```C:\Users``` and open the folder named after the current user. Look for a folder ```.sharpai```; inside it you will see a folder ```yolov7_reid```. Open it and start a new Command Prompt here. Remember this as ```Command Prompt 2```.
+
+5) In Command Prompt 2 run the command below:
+
+```
+docker compose up
+```
+
+**NOTE: DO NOT TERMINATE THIS COMMAND.** Let it complete. It will take roughly 15-20 minutes, or even longer depending on your system specifications and internet speed. After 5-10 minutes, images will start to appear in the Images tab of Docker. If the command ran successfully, there should be seven images in the Images tab plus one container named `yolov7_reid` in the Containers tab.
+
+6) Go to the ```yolov7_reid``` folder mentioned in Step 4. In this folder there will be a file ```.env```. Delete it. Now close Command Prompt 1, open a new Command Prompt, and run the following command again. We will call this Command Prompt 3.
+
+```
+sharpai-cli yolov7_reid start
+```
+OR
+
+```
+python3 -m sharpai_hub.cli yolov7_reid start
+```
+OR
+
+```
+python -m sharpai_hub.cli yolov7_reid start
+```
+
+7) Running the command in Step 6 will open a Signup/Signin page in the browser, and the Command Prompt will ask for the Labelstudio token. After signing in you will be taken to your account. At the top right corner you will see a small circle with your account initials. Click on it, then click `Account Setting`. On the right side of the page you will see an Access Token. Copy the token and paste it carefully into Command Prompt 3.
+
+8) Add your camera to Home-Assistant; you can use "Generic Camera" to add a camera with an RTSP URL.
+
+9) In this step, we will obtain the camera entity IDs of your cameras. After adding your camera to Home-Assistant, go to the `Overview` tab, where all your cameras are listed. Click on the video stream of a camera; a small popup will open. At the top right of the popup, click the gear icon to open the settings page. A new popup will open with a few editable properties. Look for the Entity ID, which is in the format `camera.IP_ADDRESS_OF_CAMERA`, and copy/note it (these entity IDs will be required later). If you have multiple cameras, note each camera's Entity ID.
+
+10) Run the following two commands to open and edit the `configuration.yaml` of Home-Assistant:
+
+```
+docker exec -ti home-assistant /bin/bash
+```
+
+```
+vi configuration.yaml
+```
+
+**NOTE FOR WINDOWS SYSTEM USERS:** These commands won't work on Windows systems. On Windows, open Docker (the instance that has been running from the start) and, in the Containers tab, open `yolov7_reid`. Look for the `home-assistant` container. Hover your mouse cursor over it and a few options will appear. Click `cli`; an inbuilt console will start on the same page. If the typing cursor keeps blinking and nothing shows up in the inbuilt console, click `Open in External Terminal`, just above the blinking cursor. A new Command Prompt will open. To check everything is working as expected, run `ls` and confirm it lists the files and folders in the config folder.
+
+Now run `vi configuration.yaml`. This opens the Home-Assistant configuration file in the Vi editor, which is a bit tricky if you are unfamiliar with it. You will have to enter Insert mode to add the integration code from Step 11 to the configuration file. Press the `I` key to enter Insert mode and move to the end of the file using the down arrow key. Next, right-click while the mouse cursor is inside the Command Prompt window; this pastes the integration code you copied earlier. After making your changes, press the Escape key, type `:wq` (yes, with the colon), and press Enter; you will be taken back to `/config #`. The `:wq` command writes the changes to the config file and quits. You can now close the Command Prompt.
+
+11) Add the code below to the end of the `configuration.yaml` file.
+
+**Here, replace `camera.` with the camera entity ID we obtained in Step 9. If you have multiple cameras, keep adding `entity_id` entries under `image_processing`.**
+
+```yaml
+stream:
+  ll_hls: true
+  part_duration: 0.75
+  segment_duration: 6
+
+image_processing:
+  - platform: sharpai
+    source:
+      - entity_id: camera.
+    scan_interval: 1
+```
+
+If you have multiple cameras, after adding the `entity_id` entries the code will look similar to this:
+
+```yaml
+stream:
+  ll_hls: true
+  part_duration: 0.75
+  segment_duration: 6
+
+image_processing:
+  - platform: sharpai
+    source:
+      - entity_id: camera.192_168_29_44
+      - entity_id: camera.192_168_29_45
+      - entity_id: camera.192_168_29_46
+      - entity_id: camera.192_168_29_47
+    scan_interval: 1
+```
+
+12) At the Home-Assistant homepage `http://localhost:8123`, select `Developer Tools`. Find and click `Check Configuration` under `Configuration Validation`. If everything went well, it should show "Configuration Valid". Click `Restart`. Now go to the Containers tab of Docker, click the three vertical dots under `Actions`, and press Restart. Open the `Overview` tab of Home-Assistant. If you see `Image Processing` beside your cameras and below it `Sharp IP_ADDRESS_OF_YOUR_CAMERA`, then congrats - everything is working as expected.
+
+```NOTE: Until further steps are added, you can use the demo video in the tutorial at the beginning for further help.```
+
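If you prefer the terminal over the Docker Desktop UI, the container restart in Step 12 can also be done from the CLI, using the container name from Step 10:

```bash
# Restart Home-Assistant after editing configuration.yaml
docker restart home-assistant
```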
+Important Links
+
+The yolov7 detector runs in Docker; you can access Docker Desktop at http://localhost:8000
+Home-Assistant is hosted at http://localhost:8123
+Labelstudio is hosted at http://localhost:8080
+
+
+---
+
+## Application 2: Facial Recognition Based Intruder Detection (Local)
+
+We received feedback from the community - local deployment is needed. With local DeepCamera deployment, all information/images will be saved locally.
+
+```bash
+sharpai-cli local_deepcamera start
+```
+
+---
+
+## Application 3: DeepCamera Facial Recognition (Cloud - Free)
+
+- Register account on [SharpAI website](http://dp.sharpai.org:3000)
+- Login on device: `sharpai-cli login`
+- Register device: `sharpai-cli device register`
+- Start DeepCamera: `sharpai-cli deepcamera start`
+
+---
+
+## Application 4: Laptop Screen Monitor (Child Safety)
+
+SharpAI Screen Monitor captures the screen, extracts image features (embeddings) with an AI model, saves unseen features into the AI vector database [Milvus](https://milvus.io/), and stores raw images in [Labelstudio](https://labelstud.io) for labeling and model training. All information/images are saved locally.
+
+```bash
+sharpai-cli screen_monitor start
+```
+
+- Access streaming screen: http://localhost:8000
+- Access Labelstudio: http://localhost:8080
+
+---
+
+## Application 5: Person Detector
+
+```bash
+sharpai-cli yolov7_person_detector start
+```
+
+---
+
+## SharpAI-Hub Application Catalog
+
+The SharpAI community is continually working on bringing state-of-the-art computer vision applications to your device.
+
+```bash
+sharpai-cli <application> start
+```
+
+| Application | SharpAI CLI Name | OS/Device |
+|---|---|---|
+| Intruder detection with Person shape | yolov7_reid | Jetson Nano/AGX/Windows/Linux/MacOS |
+| Person Detector | yolov7_person_detector | Jetson Nano/AGX/Windows/Linux/MacOS |
+| [Laptop Screen Monitor](https://github.com/SharpAI/laptop_monitor) | screen_monitor | Windows/Linux/MacOS |
+| [Facial Recognition Intruder Detection](how_to_run_intruder_detection.md) | deepcamera | Jetson Nano/Windows/Linux/MacOS |
+| [Local Facial Recognition Intruder Detection](how_to_run_local_intruder_detection.md) | local_deepcamera | Windows/Linux/MacOS |
+| [Parking Lot Monitor](Yolo_Parking.md) | yoloparking | Jetson AGX |
+| [Fall Detection](FallDetection_with_shinobi.md) | falldetection | Jetson AGX |
+
+---
+
+## Tested Devices
+
+### Edge AI Devices / Workstation
+- [Jetson Nano (ReComputer j1010)](https://www.seeedstudio.com/Jetson-10-1-H0-p-5335.html)
+- Jetson Xavier AGX
+- MacOS 12.4
+- Windows 11
+- Ubuntu 20.04
+
+### Tested Cameras
+- DaHua / Lorex / AMCREST: URL Path: `/cam/realmonitor?channel=1&subtype=0` Port: `554`
+- IP Camera Lite on iOS: URL Path: `/live` Port: `8554`
+- Nest Camera indoor/outdoor by Home-Assistant integration
+
+---
+
+## ❓ FAQ
+
+### Installation & Setup
+- [How to install Python3](https://www.python.org/downloads)
+- [How to install pip3](https://pip.pypa.io/en/stable/installation)
+- [How to configure RTSP on GUI](https://github.com/SharpAI/DeepCamera/blob/master/docs/shinobi.md)
+- [Camera streaming URL formats](https://shinobi.video)
+
+### Jetson Nano Docker-compose
+```bash
+sudo apt-get install -y libhdf5-dev python3 python3-pip
+pip3 install -U pip
+sudo pip3 install docker-compose==1.27.4
+```
diff --git a/docs/skill-params.md b/docs/skill-params.md
new file mode 100644
index 00000000..dd8306fc
--- /dev/null
+++ b/docs/skill-params.md
@@ -0,0 +1,104 @@
+# Aegis Skill Platform Parameters
+
+Aegis automatically injects these environment variables into every skill process. Skills should **not** ask users to configure these - they are provided by the platform.
+
+## Platform Parameters (auto-injected)
+
+| Env Var | Type | Description |
+|---------|------|-------------|
+| `AEGIS_GATEWAY_URL` | `string` | LLM gateway endpoint (e.g. `http://localhost:5407`). Proxies to whatever LLM provider the user has configured (OpenAI, Anthropic, local). Skills should use this for all LLM calls - it handles auth, routing, and model selection. |
+| `AEGIS_VLM_URL` | `string` | Local VLM (Vision Language Model) server endpoint (e.g. `http://localhost:5405`). Available when the user has a local VLM running. |
+| `AEGIS_SKILL_ID` | `string` | The skill's unique identifier (e.g. `home-security-benchmark`). |
+| `AEGIS_SKILL_PARAMS` | `JSON string` | User-configured parameters from `config.yaml` (see below). |
+| `AEGIS_PORTS` | `JSON string` | All Aegis service ports as JSON. Use the URL vars above instead of parsing this directly. |
+
+## User Parameters (from config.yaml)
+
+Skills can define user-configurable parameters in a `config.yaml` file alongside `SKILL.md`. Aegis parses this at install time and renders a config panel in the UI. User values are passed as JSON via `AEGIS_SKILL_PARAMS`.
+
+### config.yaml Format
+
+```yaml
+params:
+  - key: mode
+    label: Test Mode
+    type: select
+    options: [option1, option2, option3]
+    default: option1
+    description: "Human-readable description shown in the config panel"
+
+  - key: verbose
+    label: Verbose Output
+    type: boolean
+    default: false
+    description: "Enable detailed logging"
+
+  - key: threshold
+    label: Confidence Threshold
+    type: number
+    default: 0.7
+    description: "Minimum confidence score (0.0-1.0)"
+
+  - key: apiEndpoint
+    label: Custom API Endpoint
+    type: string
+    default: ""
+    description: "Optional override for external API"
+```
+
+Supported types: `string`, `boolean`, `select`, `number`
+
+### Reading config.yaml in Your Skill
+
+```javascript
+// Node.js - parse AEGIS_SKILL_PARAMS
+let skillParams = {};
+try { skillParams = JSON.parse(process.env.AEGIS_SKILL_PARAMS || '{}'); } catch {}
+
+const mode = skillParams.mode || 'default';
+const verbose = skillParams.verbose || false;
+```
+
+```python
+# Python - parse AEGIS_SKILL_PARAMS
+import os, json
+
+skill_params = json.loads(os.environ.get('AEGIS_SKILL_PARAMS', '{}'))
+mode = skill_params.get('mode', 'default')
+verbose = skill_params.get('verbose', False)
+```
+
+### Precedence
+
+```
+CLI flags > AEGIS_SKILL_PARAMS > Platform env vars > Defaults
+```
+
+When a skill supports both CLI arguments and `AEGIS_SKILL_PARAMS`, CLI flags should take priority. Platform-injected env vars (like `AEGIS_GATEWAY_URL`) are always available regardless of `config.yaml`.
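Concretely, this is how `run-benchmark.cjs` (later in this diff) resolves its `noOpen` setting under that rule:

```javascript
// A CLI flag wins, then the user's config.yaml value delivered via
// AEGIS_SKILL_PARAMS, then the built-in default.
let skillParams = {};
try { skillParams = JSON.parse(process.env.AEGIS_SKILL_PARAMS || '{}'); } catch { }

const args = process.argv.slice(2);
const noOpen = args.includes('--no-open') || skillParams.noOpen || false;
```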
+
+## Gateway as Proxy
+
+The gateway (`AEGIS_GATEWAY_URL`) is an OpenAI-compatible proxy. Skills call it like any OpenAI endpoint - the gateway handles:
+
+- **API key management** - user configures keys in Aegis settings
+- **Provider routing** - OpenAI, Anthropic, local models
+- **Model selection** - user picks model in Aegis UI
+
+Skills should **not** need raw API keys. If a skill needs direct provider access in the future, Aegis will expose additional env vars (`AEGIS_LLM_API_KEY`, `AEGIS_LLM_PROVIDER`, etc.) - but this is not yet implemented.
+
+### Example: Calling the Gateway
+
+```javascript
+const gatewayUrl = process.env.AEGIS_GATEWAY_URL || 'http://localhost:5407';
+
+const response = await fetch(`${gatewayUrl}/v1/chat/completions`, {
+  method: 'POST',
+  headers: { 'Content-Type': 'application/json' },
+  body: JSON.stringify({
+    messages: [{ role: 'user', content: 'Hello' }],
+    stream: false,
+  }),
+});
+```
+
+No API key header needed - the gateway injects it.
diff --git a/screenshots/homesec-bench-results.png b/screenshots/homesec-bench-results.png
new file mode 100644
index 00000000..77935be1
Binary files /dev/null and b/screenshots/homesec-bench-results.png differ
diff --git a/skills/analysis/home-security-benchmark/SKILL.md b/skills/analysis/home-security-benchmark/SKILL.md
index 42cc01a9..c5fba21c 100644
--- a/skills/analysis/home-security-benchmark/SKILL.md
+++ b/skills/analysis/home-security-benchmark/SKILL.md
@@ -59,6 +59,17 @@ node scripts/run-benchmark.cjs --no-open
 > **Note**: URLs should be base URLs (e.g. `http://localhost:5405`). The benchmark appends `/v1/chat/completions` automatically. Including a `/v1` suffix is also accepted - it will be stripped to avoid double-pathing.
 
+### User Configuration (config.yaml)
+
+This skill includes a [`config.yaml`](config.yaml) that defines user-configurable parameters. Aegis parses this at install time and renders a config panel in the UI. Values are delivered via `AEGIS_SKILL_PARAMS`.
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `mode` | select | `llm` | Which suites to run: `llm` (96 tests), `vlm` (35 tests), or `full` (131 tests) |
+| `noOpen` | boolean | `false` | Skip auto-opening the HTML report in browser |
+
+Platform parameters like `AEGIS_GATEWAY_URL` and `AEGIS_VLM_URL` are auto-injected by Aegis - they are **not** in `config.yaml`. See [Aegis Skill Platform Parameters](../../../docs/skill-params.md) for the full platform contract.
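For these two parameters, the payload a running skill receives is small; the values below are illustrative, not the defaults:

```javascript
// If the user picks mode=vlm and enables "Don't auto-open report" in the
// Aegis config panel, the skill process sees:
//   process.env.AEGIS_SKILL_PARAMS === '{"mode":"vlm","noOpen":true}'
const { mode, noOpen } = JSON.parse(process.env.AEGIS_SKILL_PARAMS || '{}');
```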
+
 ### CLI Arguments (standalone fallback)
 
 | Argument | Default | Description |
diff --git a/skills/analysis/home-security-benchmark/config.yaml b/skills/analysis/home-security-benchmark/config.yaml
new file mode 100644
index 00000000..c643fb7c
--- /dev/null
+++ b/skills/analysis/home-security-benchmark/config.yaml
@@ -0,0 +1,13 @@
+params:
+  - key: mode
+    label: Test Mode
+    type: select
+    options: [llm, vlm, full]
+    default: llm
+    description: "Which test suites to run: llm-only, vlm-only, or full"
+
+  - key: noOpen
+    label: Don't auto-open report
+    type: boolean
+    default: false
+    description: Skip opening the HTML report in browser after completion
diff --git a/skills/analysis/home-security-benchmark/scripts/run-benchmark.cjs b/skills/analysis/home-security-benchmark/scripts/run-benchmark.cjs
index 441dd961..bd9483ea 100644
--- a/skills/analysis/home-security-benchmark/scripts/run-benchmark.cjs
+++ b/skills/analysis/home-security-benchmark/scripts/run-benchmark.cjs
@@ -73,18 +73,19 @@ Tests: 131 total (96 LLM + 35 VLM) across 16 suites
   process.exit(0);
 }
 
+// Parse skill parameters if running as Aegis skill
+let skillParams = {};
+try { skillParams = JSON.parse(process.env.AEGIS_SKILL_PARAMS || '{}'); } catch { }
+
 // Aegis provides config via env vars; CLI args are fallback for standalone
 const GATEWAY_URL = process.env.AEGIS_GATEWAY_URL || getArg('gateway', 'http://localhost:5407');
 const VLM_URL = process.env.AEGIS_VLM_URL || getArg('vlm', '');
 const RESULTS_DIR = getArg('out', path.join(os.homedir(), '.aegis-ai', 'benchmarks'));
-const NO_OPEN = args.includes('--no-open');
+const IS_SKILL_MODE = !!process.env.AEGIS_SKILL_ID;
+const NO_OPEN = args.includes('--no-open') || skillParams.noOpen || false;
+const TEST_MODE = skillParams.mode || 'full';
 const TIMEOUT_MS = 30000;
 const FIXTURES_DIR = path.join(__dirname, '..', 'fixtures');
-const IS_SKILL_MODE = !!process.env.AEGIS_SKILL_ID;
-
-// Parse skill parameters if running as Aegis skill
-let skillParams = {};
-try { skillParams = JSON.parse(process.env.AEGIS_SKILL_PARAMS || '{}'); } catch { }
 
 // ─── Skill Protocol: JSON lines on stdout, human text on stderr ──────────────
 
@@ -1706,6 +1707,24 @@ async function main() {
   // Emit ready event (Aegis listens for this)
   emit({ event: 'ready', model: results.model.name, system: results.system.cpu });
 
+  // Filter suites by test mode (from AEGIS_SKILL_PARAMS or default 'full')
+  if (TEST_MODE !== 'full') {
+    const isVlmSuite = (name) => name.includes('VLM Scene') || name.includes('📸');
+    const originalCount = suites.length;
+    if (TEST_MODE === 'llm') {
+      // Remove VLM image-analysis suites (VLM-to-Alert Triage stays - it's LLM-based text triage)
+      for (let i = suites.length - 1; i >= 0; i--) {
+        if (isVlmSuite(suites[i].name)) suites.splice(i, 1);
+      }
+    } else if (TEST_MODE === 'vlm') {
+      // Keep only VLM image-analysis suites (requires VLM URL)
+      for (let i = suites.length - 1; i >= 0; i--) {
+        if (!isVlmSuite(suites[i].name)) suites.splice(i, 1);
+      }
+    }
+    log(`  Filter: ${TEST_MODE} mode → ${suites.length}/${originalCount} suites selected`);
+  }
+
   const suiteStart = Date.now();
   await runSuites();
   results.totals.timeMs = Date.now() - suiteStart;
@@ -1806,9 +1825,14 @@ async function main() {
   process.exit(failed > 0 ? 1 : 0);
 }
 
-main().catch(err => {
-  log(`Fatal: ${err.message}`);
-  emit({ event: 'error', message: err.message });
-  process.exit(1);
-});
+// Only run when executed directly (not when require()'d for syntax/import checks)
+if (require.main === module) {
+  main().catch(err => {
+    log(`Fatal: ${err.message}`);
+    emit({ event: 'error', message: err.message });
+    process.exit(1);
+  });
+}
+
+module.exports = { main };
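With the `require.main` guard in place, the module can now be loaded without starting a run - the commit's own comment cites syntax/import checks. A quick sketch of what that enables (path relative to the repo root):

```javascript
// Loading the module no longer kicks off the benchmark; main() is exported
// and only auto-runs when the file is executed directly with node.
const { main } = require('./skills/analysis/home-security-benchmark/scripts/run-benchmark.cjs');
console.log(typeof main); // "function"
```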