|
54 | 54 |
|
55 | 55 | --- |
56 | 56 |
|
57 | | -### 🗺️ DeepCamera Roadmap |
58 | | - |
59 | | -DeepCamera is evolving into a full AI skill platform. Planned features: |
60 | | - |
61 | | -- [ ] **Upgrade object detection** to 2026 state-of-the-art YOLO models |
62 | | -- [ ] **VLM analysis backend** — offline scene understanding of recorded clips using vision language models |
63 | | -- [ ] **AI Studios backend** — SAM2 interactive segmentation, DINOv3 grounding, depth estimation, feature extraction |
64 | | -- [ ] **Direct camera provider plugins** — Blink, Ring, Eufy, Reolink, Tapo, RTSP/ONVIF (beyond Home Assistant) |
65 | | -- [ ] **Messaging channel plugins** — Telegram, Discord, Slack, WhatsApp |
66 | | -- [ ] **Automation triggers** — MQTT, webhooks, Home Assistant events |
67 | | -- [ ] **go2rtc streaming** — RTSP to WebRTC live view |
68 | | -- [x] **Skill architecture** — pluggable `SKILL.md` interface for all capabilities |
69 | | - |
70 | 57 | --- |
71 | 58 |
|
72 | 59 | ## 🎯 Overview |
73 | 60 |
|
74 | | -DeepCamera transforms traditional surveillance cameras and CCTV/NVR systems into intelligent monitoring solutions using advanced machine learning technologies. It provides: |
| 61 | +DeepCamera is an **open-source AI skill platform** that transforms any camera into an intelligent monitoring system. It provides a growing catalog of pluggable AI skills — from real-time object detection and person re-identification to VLM scene analysis, interactive segmentation, and smart home automation. |
75 | 62 |
|
76 | | -- Open-source facial recognition for intrusion detection |
77 | | -- Fall detection capabilities |
78 | | -- Smart parking lot monitoring |
79 | | -- Local inference engine for privacy and performance |
| 63 | +Each skill is a self-contained module with its own model, parameters, and communication protocol. Skills are installed, configured, and orchestrated through [SharpAI Aegis](https://www.sharpai.org) — the desktop companion that adds LLM-powered setup, agent chat, and smart alerts. |
80 | 64 |
|
81 | | -SharpAI-hub is the cloud platform that enables rapid deployment of AI applications to your CCTV cameras and edge devices. |
| 65 | +### Core Capabilities |
82 | 66 |
|
83 | | -## ✨ Key Features |
| 67 | +- 🔍 **Detection** — YOLO object detection, DINOv3 open-vocabulary grounding, person re-identification (ReID) |
| 68 | +- 🧠 **Analysis** — VLM scene understanding of recorded clips, SAM2 interactive segmentation |
| 69 | +- 🎨 **Transformation** — Depth Anything v2 real-time depth maps |
| 70 | +- 🏷️ **Annotation** — AI-assisted dataset creation with COCO export |
| 71 | +- 📷 **Camera Providers** — Eufy, Reolink, Tapo (RTSP/ONVIF) |
| 72 | +- 📺 **Streaming** — Multi-camera RTSP → WebRTC via go2rtc |
| 73 | +- 💬 **Channels** — Matrix, LINE, Signal messaging for the Clawdbot agent |
| 74 | +- ⚡ **Automation** — MQTT, webhooks, Home Assistant triggers |
| 75 | +- 🏠 **Integrations** — Bidirectional Home Assistant bridge |
84 | 76 |
|
85 | | -### 🤖 Advanced AI Capabilities |
86 | | -- Facial Recognition |
87 | | -- Person Re-identification (RE-ID) |
88 | | -- Parking Space Management |
89 | | -- Fall Detection |
90 | | -- More features in development |
| 77 | +## 🧩 Skill Catalog |
91 | 78 |
|
92 | | -### 📊 Professional ML Pipeline |
93 | | -- Feature clustering with Milvus vector database |
94 | | -- Data labeling with Labelstudio |
95 | | -- Comprehensive model training workflow |
| 79 | +Every skill lives in [`skills/`](skills/) with a `SKILL.md` manifest, `requirements.txt`, and working Python script. See the [Skill Development Guide](docs/skill-development.md) to build your own. |
96 | 80 |
|
97 | | -### 💻 Edge AI Development |
98 | | -- Containerized AI frameworks |
99 | | -- Browser-based desktop environment |
100 | | -- No VNC client installation needed |
| 81 | +| Category | Skill | What It Does | |
| 82 | +|----------|-------|--------------| |
| 83 | +| **Detection** | [`yolo-detection-2026`](skills/detection/yolo-detection-2026/) | Real-time 80+ class object detection | |
| 84 | +| | [`dinov3-grounding`](skills/detection/dinov3-grounding/) | Open-vocabulary detection — describe what to find | |
| 85 | +| | [`person-recognition`](skills/detection/person-recognition/) | Re-identify individuals across cameras | |
| 86 | +| **Analysis** | [`vlm-scene-analysis`](skills/analysis/vlm-scene-analysis/) | Describe what happened in recorded clips | |
| 87 | +| | [`sam2-segmentation`](skills/analysis/sam2-segmentation/) | Click-to-segment with pixel-perfect masks | |
| 88 | +| **Transformation** | [`depth-estimation`](skills/transformation/depth-estimation/) | Monocular depth maps with Depth Anything v2 | |
| 89 | +| **Annotation** | [`dataset-annotation`](skills/annotation/dataset-annotation/) | AI-assisted labeling → COCO export | |
| 90 | +| **Camera Providers** | [`eufy`](skills/camera-providers/eufy/) · [`reolink`](skills/camera-providers/reolink/) · [`tapo`](skills/camera-providers/tapo/) | Direct camera integrations via RTSP | |
| 91 | +| **Streaming** | [`go2rtc-cameras`](skills/streaming/go2rtc-cameras/) | RTSP → WebRTC live view | |
| 92 | +| **Channels** | [`matrix`](skills/channels/matrix/) · [`line`](skills/channels/line/) · [`signal`](skills/channels/signal/) | Messaging channels for Clawdbot agent | |
| 93 | +| **Automation** | [`mqtt`](skills/automation/mqtt/) · [`webhook`](skills/automation/webhook/) · [`ha-trigger`](skills/automation/ha-trigger/) | Event-driven automation triggers | |
| 94 | +| **Integrations** | [`homeassistant-bridge`](skills/integrations/homeassistant-bridge/) | HA cameras in ↔ detection results out | |
101 | 95 |
|
102 | | -DeepCamera empowers your traditional surveillance cameras and CCTV/NVR with machine learning technologies. |
103 | | -It provides open source facial recognition based intrusion detection, fall detection and parking lot monitoring with the inference engine on your local device. |
| 96 | +> **Registry:** All skills are indexed in [`skills.json`](skills.json) for programmatic discovery. |
104 | 97 |
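The catalog layout described above can be walked directly from disk. The following is a minimal sketch, assuming the `skills/<category>/<skill>/SKILL.md` layout implied by the table links. The function name `discover_skills` is hypothetical and not part of the repository. It does not read `skills.json`, whose schema is not shown here.

```python
from pathlib import Path


def discover_skills(root: Path) -> dict[str, list[str]]:
    """Map each category directory under `root` to its skill names.

    Hypothetical helper: a directory counts as a skill only if it
    contains a SKILL.md manifest (layout assumed from the catalog table).
    """
    catalog: dict[str, list[str]] = {}
    # Match <root>/<category>/<skill>/SKILL.md; sorting keeps output stable.
    for manifest in sorted(root.glob("*/*/SKILL.md")):
        skill_dir = manifest.parent
        catalog.setdefault(skill_dir.parent.name, []).append(skill_dir.name)
    return catalog


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for category, skills in discover_skills(Path("skills")).items():
        print(f"{category}: {', '.join(skills)}")
```

Scanning for manifests rather than listing directories means a folder without a `SKILL.md` is skipped, which matches the README's statement that every skill ships with one.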
|
105 | | -SharpAI-hub is the cloud hosting for AI applications which help you deploy AI applications with your CCTV camera on your edge device in minutes. |
| 98 | +### 🗺️ Roadmap |
106 | 99 |
|
107 | | -<details> |
108 | | - <summary><h1>Features</h1></summary> |
109 | | - |
110 | | - ## Empower any camera with the state of the art AI |
111 | | - - facial recognition |
112 | | - - person recognition(RE-ID) |
113 | | - - parking lot management |
114 | | - - fall detection |
115 | | - - more comming |
116 | | - ## ML pipeline for AI camera/CCTV development |
117 | | - - feature clustering with vector database Milvus |
118 | | - - labelling with Labelstudio |
119 | | - ## Easy to use Edge AI development environment |
120 | | - - AI frameworks in docker |
121 | | - - desktop in docker with web vnc client, so you don't need even install vnc client |
122 | | -</details> |
| 100 | +- [x] **Skill architecture** — pluggable `SKILL.md` interface for all capabilities |
| 101 | +- [x] **Full skill catalog** — 18 skills across 9 categories with working scripts |
| 102 | +- [ ] **Skill Store UI** — browse, install, and configure skills from Aegis |
| 103 | +- [ ] **Custom skill packaging** — community-contributed skills via GitHub |
| 104 | +- [ ] **GPU-optimized containers** — one-click Docker deployment per skill |
123 | 105 |
|
124 | 106 | ## 🚀 Applications |
125 | 107 |
|
|