CORTEX AI

Enterprise-Grade AI Platform for the Future

Python · FastAPI · PyTorch · Docker · Kubernetes

Overview

CORTEX AI is a comprehensive enterprise-grade AI platform delivering production-ready components for:

  • Natural Language Processing
  • Computer Vision
  • Predictive Analytics
  • Workflow Automation

Built with cutting-edge technologies and designed for massive scale.

   ██████╗ ██████╗ ██████╗ ████████╗███████╗██╗  ██╗
  ██╔════╝██╔═══██╗██╔══██╗╚══██╔══╝██╔════╝╚██╗██╔╝
  ██║     ██║   ██║██████╔╝   ██║   █████╗   ╚███╔╝ 
  ██║     ██║   ██║██╔══██╗   ██║   ██╔══╝   ██╔██╗ 
  ╚██████╗╚██████╔╝██║  ██║   ██║   ███████╗██╔╝ ██╗
   ╚═════╝ ╚═════╝ ╚═╝  ╚═╝   ╚═╝   ╚══════╝╚═╝  ╚═╝
                         A I



Features

| NLP Engine | Vision AI | Analytics | Automation |
| --- | --- | --- | --- |
| Text Analysis | Object Detection | Forecasting | Task Orchestration |
| Sentiment Analysis | Image Classification | Anomaly Detection | Workflow Engine |
| Text Classification | Segmentation | Clustering | Decision Engine |
| Question Answering | Face Recognition | Recommendations | Event Processing |
| Text Generation | OCR | Regression | Webhooks |
| Embeddings | Enhancement | Classification | API Gateway |

Detailed Feature Breakdown

NLP Engine

  • Text Analysis: Tokenization, Named Entity Recognition (NER), Part-of-Speech (POS) tagging using spaCy
  • Sentiment Analysis: Multi-class sentiment classification using fine-tuned BERT models
  • Text Classification: Zero-shot classification with BART-large-MNLI for any label set
  • Question Answering: Extractive QA using RoBERTa trained on SQuAD 2.0
  • Text Generation: GPT-2 based text generation with temperature and top-p sampling
  • Embeddings: Sentence-BERT embeddings for semantic search and similarity
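
The temperature and top-p (nucleus) sampling mentioned for text generation can be illustrated with a minimal, framework-free sketch. This is not the platform's implementation, only the standard technique applied to a logits array:

```python
import numpy as np

def top_p_sample(logits, temperature=1.0, top_p=0.9, rng=None):
    """Sample a token id using temperature scaling and nucleus (top-p) filtering."""
    rng = rng or np.random.default_rng()
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    # Keep the smallest set of tokens whose cumulative probability reaches top_p.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1
    keep = order[:cutoff]
    kept = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=kept))
```

With a sharply peaked distribution and a small `top_p`, only the top token survives the cutoff, so sampling becomes deterministic.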

Vision AI

  • Object Detection: YOLOv8 for real-time object detection with 80+ classes
  • Image Classification: ResNet-50 pretrained on ImageNet for 1000 categories
  • Semantic Segmentation: Instance and semantic segmentation with YOLO-seg
  • Face Recognition: Haar cascade detection with face encoding for comparison
  • OCR: EasyOCR supporting 80+ languages for text extraction
  • Image Enhancement: Brightness, contrast, denoising, sharpening, histogram equalization
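
Histogram equalization, one of the enhancement operations listed above, can be sketched with plain NumPy for a single-channel uint8 image. The service itself may rely on OpenCV; this only shows the underlying idea:

```python
import numpy as np

def equalize_histogram(img):
    """Histogram equalization for a single-channel uint8 image (NumPy only)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()  # first non-zero value of the CDF
    # Remap intensities so the output CDF is approximately uniform.
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[img]
```

A low-contrast gradient occupying only intensities 0–63 gets stretched to the full 0–255 range.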

Analytics

  • Time Series Forecasting: Prophet and ARIMA for demand forecasting
  • Anomaly Detection: Isolation Forest and statistical Z-score methods
  • Clustering: K-Means with automatic optimal cluster detection
  • Recommendations: Collaborative filtering and content-based recommendations
  • Regression: Linear, Ridge, and Random Forest regression
  • Classification: XGBoost for tabular classification with feature importance
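
The statistical Z-score method listed above amounts to flagging points that lie far from the mean in standard-deviation units; a minimal sketch (not the service's actual code):

```python
import numpy as np

def zscore_anomalies(values, threshold=3.0):
    """Return the indices whose absolute Z-score exceeds the threshold."""
    x = np.asarray(values, dtype=float)
    z = (x - x.mean()) / x.std()
    return [i for i, score in enumerate(z) if abs(score) > threshold]
```

For twenty readings near 50 and a single spike of 200, only the spike exceeds three standard deviations.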

Architecture

graph TB
    subgraph Client Layer
        A[REST API] --> B[SDK]
        A --> C[WebSocket]
    end
    
    subgraph API Gateway
        D[Auth] --> E[Load Balancer]
        E --> F[Rate Limiter]
    end
    
    subgraph AI Services
        G[NLP] --> H[Model Serving]
        I[Vision] --> H
        J[Analytics] --> H
    end
    
    subgraph Infrastructure
        K[(PostgreSQL)]
        L[(MongoDB)]
        M[(Redis)]
        N[MinIO]
    end
    
    A --> D
    F --> G
    F --> I
    F --> J
    H --> K
    H --> L
    H --> M
    H --> N
    
    style A fill:#DC143C,color:#fff
    style G fill:#DC143C,color:#fff
    style I fill:#DC143C,color:#fff
    style J fill:#DC143C,color:#fff
    style H fill:#000,color:#DC143C

System Components

| Component | Technology | Purpose |
| --- | --- | --- |
| API Server | FastAPI | High-performance async REST API |
| SQL Database | PostgreSQL 15 + TimescaleDB | User data, model metadata, audit logs |
| NoSQL Database | MongoDB 6 | Document storage, embeddings, unstructured data |
| Cache | Redis 7 | Session management, caching, rate limiting |
| Object Storage | MinIO | Model artifacts, images, files |
| Message Queue | Kafka | Event streaming, async processing |
| Task Queue | Celery | Background job processing |
| Search | Elasticsearch | Full-text search, log aggregation |
| Monitoring | Prometheus + Grafana | Metrics and dashboards |

Installation

Prerequisites

  • Python 3.11 or higher
  • Docker and Docker Compose
  • 16GB RAM minimum (32GB recommended for full ML capabilities)
  • NVIDIA GPU (optional, for accelerated inference)

Option 1: Docker (Recommended)

# Clone the repository
git clone https://github.com/BLACK0X80/cortex-ai.git
cd cortex-ai

# Start all services
docker-compose up -d

# Verify services are running
docker-compose ps

# Access API docs
open http://localhost:8000/docs

Option 2: Local Development

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -e ".[dev]"

# Download spaCy model
python -m spacy download en_core_web_sm

# Start infrastructure (PostgreSQL, Redis, MongoDB, MinIO)
docker-compose up -d postgres redis mongo minio

# Initialize database
python -c "from cortex.storage.postgres.database import init_db; import asyncio; asyncio.run(init_db())"

# Run the API
uvicorn cortex.api.main:app --reload --host 0.0.0.0 --port 8000

Option 3: Kubernetes

# Add Helm repository (if using external charts)
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

# Deploy CORTEX AI
helm install cortex-ai ./helm/cortex-ai -f ./helm/cortex-ai/values.yaml

# Check deployment status
kubectl get pods -l app=cortex-api
kubectl get svc cortex-api

Quick Start

1. Start the Services

docker-compose up -d

2. Register a User

curl -X POST "http://localhost:8000/api/v1/auth/register" \
  -H "Content-Type: application/json" \
  -d '{
    "username": "testuser",
    "email": "test@example.com",
    "password": "SecurePass123!"
  }'

3. Get Access Token

curl -X POST "http://localhost:8000/api/v1/auth/login" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=testuser&password=SecurePass123!"

4. Use AI Services

# Analyze text
curl -X POST "http://localhost:8000/api/v1/nlp/analyze" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Apple Inc. was founded by Steve Jobs in California.",
    "tasks": ["tokenize", "ner", "pos"]
  }'

# Sentiment analysis
curl -X POST "http://localhost:8000/api/v1/nlp/sentiment" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "This product is absolutely amazing! Best purchase ever."
  }'

API Reference

Authentication

All API endpoints (except /auth/register and /auth/login) require JWT authentication.

# Include the token in Authorization header
-H "Authorization: Bearer YOUR_ACCESS_TOKEN"

NLP Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/v1/nlp/analyze | Text Analysis & NER |
| POST | /api/v1/nlp/sentiment | Sentiment Analysis |
| POST | /api/v1/nlp/classify | Text Classification |
| POST | /api/v1/nlp/embed | Generate Embeddings |
| POST | /api/v1/nlp/qa | Question Answering |
| POST | /api/v1/nlp/generate | Text Generation |

Vision Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/v1/vision/detect | Object Detection |
| POST | /api/v1/vision/classify | Image Classification |
| POST | /api/v1/vision/segment | Semantic Segmentation |
| POST | /api/v1/vision/face/detect | Face Detection |
| POST | /api/v1/vision/face/encode | Face Encoding |
| POST | /api/v1/vision/ocr | Text Extraction |
| POST | /api/v1/vision/enhance | Image Enhancement |

Analytics Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/v1/analytics/forecast | Time Series Forecast |
| POST | /api/v1/analytics/anomaly | Anomaly Detection |
| POST | /api/v1/analytics/classify | Tabular Classification |
| POST | /api/v1/analytics/recommend | Recommendations |
| POST | /api/v1/analytics/cluster | Clustering |
| POST | /api/v1/analytics/regression | Regression |

Model Management

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/v1/models | List Models |
| POST | /api/v1/models/register | Register Model |
| GET | /api/v1/models/{id} | Get Model Info |
| POST | /api/v1/models/{id}/upload | Upload Artifact |
| PUT | /api/v1/models/{id}/stage | Update Stage |
| POST | /api/v1/models/{id}/predict | Run Inference |

Automation

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/v1/automation/tasks | Create Task |
| GET | /api/v1/automation/tasks | List Tasks |
| POST | /api/v1/automation/tasks/{id}/run | Run Task |
| POST | /api/v1/automation/workflows | Create Workflow |
| POST | /api/v1/automation/workflows/{id}/run | Execute Workflow |
| POST | /api/v1/automation/decision | Evaluate Rules |

NLP Services

Text Analysis

Perform comprehensive text analysis including tokenization, NER, and POS tagging.

import httpx

async def analyze_text():
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "http://localhost:8000/api/v1/nlp/analyze",
            headers={"Authorization": f"Bearer {token}"},
            json={
                "text": "Google announced new AI features at their Mountain View headquarters.",
                "tasks": ["tokenize", "ner", "pos"]
            }
        )
        result = response.json()
        
        # Entities found: [{"text": "Google", "label": "ORG"}, {"text": "Mountain View", "label": "GPE"}]
        print(result["entities"])

Sentiment Analysis

Classify text sentiment with confidence scores.

response = await client.post(
    "http://localhost:8000/api/v1/nlp/sentiment",
    headers={"Authorization": f"Bearer {token}"},
    json={"text": "I absolutely love this product! It exceeded all my expectations."}
)

# Result: {"sentiment": "very_positive", "score": 1.0, "confidence": 0.94}

Zero-Shot Classification

Classify text into custom categories without training.

response = await client.post(
    "http://localhost:8000/api/v1/nlp/classify",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "text": "The new iPhone 15 has an amazing camera system with improved low-light performance.",
        "labels": ["technology", "sports", "politics", "entertainment"]
    }
)

# Result: {"predicted_label": "technology", "scores": {"technology": 0.92, "entertainment": 0.05, ...}}

Question Answering

Extract answers from context.

response = await client.post(
    "http://localhost:8000/api/v1/nlp/qa",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "question": "Who founded Microsoft?",
        "context": "Microsoft Corporation was founded by Bill Gates and Paul Allen on April 4, 1975."
    }
)

# Result: {"answer": "Bill Gates and Paul Allen", "confidence": 0.89}

Semantic Embeddings

Generate vector embeddings for semantic search.

response = await client.post(
    "http://localhost:8000/api/v1/nlp/embed",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "texts": ["Machine learning is a subset of AI", "Deep learning uses neural networks"],
        "model": "sentence-transformers/all-MiniLM-L6-v2"
    }
)

# Result: {"embeddings": [[0.123, -0.456, ...], [...]], "dimensions": 384}
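
Vectors returned by the embed endpoint are typically compared with cosine similarity for semantic search; a minimal sketch using illustrative vectors:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Parallel vectors score 1.0 and orthogonal vectors score 0.0, regardless of magnitude.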

Vision Services

Object Detection

Detect objects in images using YOLOv8.

with open("image.jpg", "rb") as f:
    response = await client.post(
        "http://localhost:8000/api/v1/vision/detect",
        headers={"Authorization": f"Bearer {token}"},
        files={"file": f},
        data={"confidence_threshold": 0.5}
    )

# Result: {"objects": [{"label": "person", "confidence": 0.95, "bbox": [100, 50, 300, 400]}], "count": 1}

Image Classification

Classify images into 1000 ImageNet categories.

with open("cat.jpg", "rb") as f:
    response = await client.post(
        "http://localhost:8000/api/v1/vision/classify",
        headers={"Authorization": f"Bearer {token}"},
        files={"file": f}
    )

# Result: {"top_class": "Egyptian_cat", "confidence": 0.87, "predictions": [...]}

OCR (Text Extraction)

Extract text from images in 80+ languages.

with open("document.png", "rb") as f:
    response = await client.post(
        "http://localhost:8000/api/v1/vision/ocr",
        headers={"Authorization": f"Bearer {token}"},
        files={"file": f},
        data={"languages": "en,ar"}
    )

# Result: {"text": "Extracted text content...", "blocks": [...], "confidence": 0.91}

Analytics Services

Time Series Forecasting

Predict future values using Prophet.

response = await client.post(
    "http://localhost:8000/api/v1/analytics/forecast",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "data": [
            {"date": "2024-01-01", "value": 100},
            {"date": "2024-01-02", "value": 120},
            {"date": "2024-01-03", "value": 115},
            # ... more data points
        ],
        "date_column": "date",
        "value_column": "value",
        "periods": 30,
        "frequency": "D"
    }
)

# Result: {"forecast": [{"date": "2024-02-01", "prediction": 135.5, "lower_bound": 120, "upper_bound": 150}...]}

Anomaly Detection

Detect outliers in data using Isolation Forest.

response = await client.post(
    "http://localhost:8000/api/v1/analytics/anomaly",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "data": [
            {"cpu": 45, "memory": 60},
            {"cpu": 50, "memory": 65},
            {"cpu": 95, "memory": 98},  # Anomaly
            {"cpu": 48, "memory": 62}
        ],
        "columns": ["cpu", "memory"],
        "contamination": 0.1
    }
)

# Result: {"anomalies": [{"cpu": 95, "memory": 98, "index": 2, "score": -0.85}], "anomaly_count": 1}

Clustering

Group similar data points using K-Means.

response = await client.post(
    "http://localhost:8000/api/v1/analytics/cluster",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "data": [
            {"age": 25, "income": 30000, "spending": 500},
            {"age": 35, "income": 60000, "spending": 1200},
            # ... more data
        ],
        "feature_columns": ["age", "income", "spending"],
        "n_clusters": 3
    }
)

# Result: {"clusters": [...], "centroids": [...], "inertia": 1234.5}

Model Registry

Register a Model

response = await client.post(
    "http://localhost:8000/api/v1/models/register",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "name": "customer-churn-predictor",
        "version": "1.0.0",
        "framework": "xgboost",
        "description": "Predicts customer churn probability",
        "metrics": {"accuracy": 0.92, "f1_score": 0.89}
    }
)

Upload Model Artifact

with open("model.pkl", "rb") as f:
    response = await client.post(
        f"http://localhost:8000/api/v1/models/{model_id}/upload",
        headers={"Authorization": f"Bearer {token}"},
        files={"file": ("model.pkl", f)}
    )

Promote to Production

response = await client.put(
    f"http://localhost:8000/api/v1/models/{model_id}/stage",
    headers={"Authorization": f"Bearer {token}"},
    params={"stage": "production"}
)

Automation & Workflows

Create a Workflow

response = await client.post(
    "http://localhost:8000/api/v1/automation/workflows",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "name": "data-pipeline",
        "steps": [
            {"name": "analyze", "type": "nlp_analysis", "parameters": {"tasks": ["ner"]}},
            {"name": "classify", "type": "nlp_analysis", "parameters": {"tasks": ["sentiment"]}}
        ]
    }
)

Decision Engine

Evaluate business rules dynamically.

response = await client.post(
    "http://localhost:8000/api/v1/automation/decision",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "rules": [
            {
                "name": "high-value-customer",
                "conditions": [
                    {"field": "purchase_amount", "operator": "gte", "value": 1000},
                    {"field": "customer_tier", "operator": "eq", "value": "gold"}
                ],
                "action": {"discount": 0.20, "priority": "high"}
            }
        ],
        "context": {
            "purchase_amount": 1500,
            "customer_tier": "gold"
        }
    }
)

# Result: {"matched_count": 1, "actions": [{"discount": 0.20, "priority": "high"}]}

Security

Authentication

CORTEX AI uses JWT (JSON Web Tokens) for authentication.

# Login to get token
response = await client.post(
    "http://localhost:8000/api/v1/auth/login",
    data={"username": "user", "password": "pass"}
)
token = response.json()["access_token"]

# Use token in subsequent requests
headers = {"Authorization": f"Bearer {token}"}

Role-Based Access Control (RBAC)

| Role | Permissions |
| --- | --- |
| admin | Full access to all resources |
| user | Read, write, API access |
| service | Read, API access, model deployment |
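
The role table above could be enforced with a simple lookup; the permission strings below are illustrative placeholders, not the platform's actual permission names:

```python
# Hypothetical role-to-permission mapping mirroring the table above.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "api", "model_deploy", "manage_users"},
    "user": {"read", "write", "api"},
    "service": {"read", "api", "model_deploy"},
}

def has_permission(role: str, permission: str) -> bool:
    """Return True if the given role grants the requested permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())
```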

API Key Management

Generate API keys for programmatic access:

from cortex.security.encryption import EncryptionService

key, key_hash = EncryptionService.generate_api_key()
# Store key_hash in database, give key to user
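
The internals of `EncryptionService` are not shown here, but the usual pattern is to hand the raw key to the user once and persist only a hash; a hypothetical sketch of that pattern:

```python
import hashlib
import hmac
import secrets

def generate_api_key():
    """Create a random API key plus the hash that is safe to persist."""
    key = secrets.token_urlsafe(32)
    key_hash = hashlib.sha256(key.encode()).hexdigest()
    return key, key_hash

def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Constant-time comparison of the presented key against the stored hash."""
    candidate = hashlib.sha256(presented_key.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```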

Monitoring

Prometheus Metrics

Access metrics at http://localhost:8000/metrics:

  • cortex_requests_total - Total request count
  • cortex_request_latency_seconds - Request latency histogram
  • cortex_model_inference_total - Model inference count
  • cortex_model_inference_latency_seconds - Inference latency
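
Counters in the Prometheus text exposition format can also be aggregated client-side; a small sketch that sums a counter across its label sets and derives average latency from a histogram's `_sum` and `_count` (the sample values are illustrative, only the metric names come from the list above):

```python
import re

SAMPLE = """\
cortex_requests_total{method="POST",endpoint="/api/v1/nlp/analyze"} 42
cortex_requests_total{method="POST",endpoint="/api/v1/nlp/sentiment"} 8
cortex_request_latency_seconds_sum 12.5
cortex_request_latency_seconds_count 50
"""

def metric_total(text, name):
    """Sum every sample of a metric across its label combinations."""
    pattern = re.compile(rf"^{re.escape(name)}(?:\{{[^}}]*\}})?\s+([0-9.eE+-]+)$", re.M)
    return sum(float(v) for v in pattern.findall(text))

# Average request latency = histogram sum / histogram count.
avg_latency = (metric_total(SAMPLE, "cortex_request_latency_seconds_sum")
               / metric_total(SAMPLE, "cortex_request_latency_seconds_count"))
```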

Health Checks

curl http://localhost:8000/health
# {"status": "healthy", "service": "CORTEX AI"}

Grafana Dashboards

Access Grafana at http://localhost:3000 (admin/admin) to view:

  • API request rates and latencies
  • Model inference performance
  • Infrastructure health

Testing

Run All Tests

# Run unit tests
pytest tests/ -v --cov=cortex --cov-report=html

# Run specific test file
pytest tests/test_nlp.py -v

# Run with markers
pytest -m "not slow" -v

Test Examples

API Tests

# tests/test_api.py
import pytest
from httpx import AsyncClient, ASGITransport
from cortex.api.main import app


@pytest.fixture
async def client():
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as ac:
        yield ac


@pytest.mark.anyio
async def test_health_check(client):
    response = await client.get("/health")
    assert response.status_code == 200
    assert response.json()["status"] == "healthy"


@pytest.mark.anyio
async def test_root_endpoint(client):
    response = await client.get("/")
    assert response.status_code == 200
    assert "service" in response.json()
    assert response.json()["service"] == "CORTEX AI"


@pytest.mark.anyio
async def test_openapi_docs(client):
    response = await client.get("/docs")
    assert response.status_code == 200


@pytest.mark.anyio
async def test_unauthorized_access(client):
    response = await client.post("/api/v1/nlp/analyze", json={"text": "test"})
    assert response.status_code == 401

NLP Service Tests

# tests/test_nlp.py
import pytest
from cortex.nlp.analysis.text_analyzer import TextAnalyzer
from cortex.nlp.sentiment.sentiment_analyzer import SentimentAnalyzer
from cortex.nlp.classification.text_classifier import TextClassifier
from cortex.nlp.embeddings.embedding_service import EmbeddingService
from cortex.nlp.qa.qa_service import QAService


class TestTextAnalyzer:
    @pytest.fixture
    def analyzer(self):
        return TextAnalyzer()

    @pytest.mark.asyncio
    async def test_tokenization(self, analyzer):
        result = await analyzer.analyze("Hello world", tasks=["tokenize"])
        assert "tokens" in result
        assert len(result["tokens"]) == 2
        assert result["tokens"][0]["text"] == "Hello"

    @pytest.mark.asyncio
    async def test_named_entity_recognition(self, analyzer):
        result = await analyzer.analyze(
            "Apple Inc. is headquartered in Cupertino, California.",
            tasks=["ner"]
        )
        assert "entities" in result
        entities = [e["text"] for e in result["entities"]]
        assert "Apple Inc." in entities or "Apple" in entities

    @pytest.mark.asyncio
    async def test_pos_tagging(self, analyzer):
        result = await analyzer.analyze("The quick brown fox", tasks=["pos"])
        assert "pos_tags" in result
        assert len(result["pos_tags"]) == 4

    @pytest.mark.asyncio
    async def test_keyword_extraction(self, analyzer):
        text = "Machine learning and artificial intelligence are transforming technology."
        keywords = await analyzer.extract_keywords(text, top_n=3)
        assert len(keywords) <= 3
        assert all("keyword" in k for k in keywords)


class TestSentimentAnalyzer:
    @pytest.fixture
    def analyzer(self):
        return SentimentAnalyzer()

    @pytest.mark.asyncio
    async def test_positive_sentiment(self, analyzer):
        result = await analyzer.analyze("This is absolutely wonderful!")
        assert result["sentiment"] in ["positive", "very_positive"]
        assert result["score"] > 0

    @pytest.mark.asyncio
    async def test_negative_sentiment(self, analyzer):
        result = await analyzer.analyze("This is terrible and disappointing.")
        assert result["sentiment"] in ["negative", "very_negative"]
        assert result["score"] < 0

    @pytest.mark.asyncio
    async def test_confidence_score(self, analyzer):
        result = await analyzer.analyze("I love this product!")
        assert 0 <= result["confidence"] <= 1


class TestTextClassifier:
    @pytest.fixture
    def classifier(self):
        return TextClassifier()

    @pytest.mark.asyncio
    async def test_zero_shot_classification(self, classifier):
        result = await classifier.classify(
            "The stock market reached new highs today.",
            labels=["finance", "sports", "technology", "health"]
        )
        assert result["predicted_label"] == "finance"
        assert "scores" in result
        assert result["scores"]["finance"] > 0.5

    @pytest.mark.asyncio
    async def test_multi_label_classification(self, classifier):
        result = await classifier.multi_label_classify(
            "AI is revolutionizing healthcare with new diagnostic tools.",
            labels=["technology", "health", "sports", "politics"],
            threshold=0.3
        )
        assert "technology" in result["predicted_labels"]
        assert "health" in result["predicted_labels"]


class TestEmbeddingService:
    @pytest.fixture
    def service(self):
        return EmbeddingService()

    @pytest.mark.asyncio
    async def test_single_embedding(self, service):
        embedding = await service.encode_single("Hello world")
        assert isinstance(embedding, list)
        assert len(embedding) > 0

    @pytest.mark.asyncio
    async def test_batch_embedding(self, service):
        texts = ["First sentence", "Second sentence", "Third sentence"]
        embeddings = await service.encode(texts)
        assert len(embeddings) == 3
        assert all(len(e) == len(embeddings[0]) for e in embeddings)

    @pytest.mark.asyncio
    async def test_similarity_calculation(self, service):
        similarity = await service.similarity(
            "The cat sat on the mat",
            "A cat is sitting on a mat"
        )
        assert 0.5 < similarity <= 1.0

    @pytest.mark.asyncio
    async def test_similar_document_search(self, service):
        documents = [
            "Python is a programming language",
            "Java is used for enterprise applications",
            "Machine learning uses statistical methods",
            "Cats are popular pets"
        ]
        results = await service.find_similar("coding in Python", documents, top_k=2)
        assert len(results) == 2
        assert "Python" in results[0]["document"]


class TestQAService:
    @pytest.fixture
    def service(self):
        return QAService()

    @pytest.mark.asyncio
    async def test_simple_qa(self, service):
        result = await service.answer(
            question="What is the capital of France?",
            context="Paris is the capital and largest city of France."
        )
        assert "Paris" in result["answer"]
        assert result["confidence"] > 0.5

    @pytest.mark.asyncio
    async def test_qa_with_dates(self, service):
        result = await service.answer(
            question="When was the company founded?",
            context="TechCorp was founded in 2010 by John Smith in Silicon Valley."
        )
        assert "2010" in result["answer"]

Analytics Tests

# tests/test_analytics.py
import pytest
import numpy as np
from cortex.analytics.anomaly.detector import AnomalyDetector
from cortex.analytics.clustering.clusterer import Clusterer
from cortex.analytics.recommendation.recommender import Recommender
from cortex.analytics.regression.regressor import Regressor


class TestAnomalyDetector:
    @pytest.fixture
    def detector(self):
        return AnomalyDetector()

    @pytest.mark.asyncio
    async def test_isolation_forest_detection(self, detector):
        data = [{"value": i} for i in range(20)] + [{"value": 100}]
        result = await detector.detect(data, ["value"], contamination=0.1)
        assert len(result["anomalies"]) >= 1
        anomaly_values = [a["value"] for a in result["anomalies"]]
        assert 100 in anomaly_values

    @pytest.mark.asyncio
    async def test_statistical_detection(self, detector):
        data = [{"value": 50 + np.random.randn()} for _ in range(50)]
        data.append({"value": 200})  # Clear outlier
        result = await detector.detect_statistical(data, "value", threshold=3.0)
        assert len(result["anomalies"]) >= 1

    @pytest.mark.asyncio
    async def test_multivariate_detection(self, detector):
        data = [
            {"x": 10, "y": 10},
            {"x": 12, "y": 11},
            {"x": 11, "y": 12},
            {"x": 100, "y": 100},  # Anomaly
            {"x": 10, "y": 11}
        ]
        result = await detector.detect(data, ["x", "y"], contamination=0.2)
        assert result["anomalies"][0]["x"] == 100


class TestClusterer:
    @pytest.fixture
    def clusterer(self):
        return Clusterer()

    @pytest.mark.asyncio
    async def test_kmeans_clustering(self, clusterer):
        data = [
            {"x": 1, "y": 1}, {"x": 1.5, "y": 1.5}, {"x": 2, "y": 1},
            {"x": 10, "y": 10}, {"x": 10.5, "y": 10.5}, {"x": 11, "y": 10}
        ]
        result = await clusterer.cluster(data, ["x", "y"], n_clusters=2)
        assert len(result["centroids"]) == 2
        assert "inertia" in result

    @pytest.mark.asyncio
    async def test_optimal_cluster_finding(self, clusterer):
        data = [{"x": i % 3, "y": i % 3 + np.random.randn() * 0.1} for i in range(30)]
        result = await clusterer.find_optimal_clusters(data, ["x", "y"], max_clusters=5)
        assert 2 <= result["optimal_clusters"] <= 5
        assert len(result["silhouette_scores"]) == 4


class TestRecommender:
    @pytest.fixture
    def recommender(self):
        return Recommender()

    @pytest.mark.asyncio
    async def test_collaborative_filtering(self, recommender):
        items = [{"id": f"item_{i}"} for i in range(10)]
        interactions = [
            {"user_id": "user_1", "item_id": "item_0", "rating": 5},
            {"user_id": "user_1", "item_id": "item_1", "rating": 4},
            {"user_id": "user_2", "item_id": "item_0", "rating": 5},
            {"user_id": "user_2", "item_id": "item_2", "rating": 4},
            {"user_id": "user_3", "item_id": "item_0", "rating": 4},
            {"user_id": "user_3", "item_id": "item_3", "rating": 5}
        ]
        result = await recommender.recommend("user_1", items, interactions, n_recommendations=3)
        assert len(result["recommendations"]) <= 3
        assert "item_0" not in result["recommendations"]  # Already interacted

    @pytest.mark.asyncio
    async def test_content_based_filtering(self, recommender):
        items = [
            {"id": "a", "genre": 1, "year": 2020, "rating": 4.5},
            {"id": "b", "genre": 1, "year": 2021, "rating": 4.2},
            {"id": "c", "genre": 2, "year": 2019, "rating": 3.8},
            {"id": "d", "genre": 1, "year": 2020, "rating": 4.4}
        ]
        result = await recommender.content_based("a", items, ["genre", "year", "rating"], n_recommendations=2)
        assert "d" in result["recommendations"]  # Should be similar to 'a'


class TestRegressor:
    @pytest.fixture
    def regressor(self):
        return Regressor()

    @pytest.mark.asyncio
    async def test_linear_regression(self, regressor):
        data = [{"x": i, "y": 2 * i + 1 + np.random.randn() * 0.1} for i in range(100)]
        result = await regressor.train_and_predict(data, "y", ["x"], model_type="linear")
        assert result["r2_score"] > 0.9
        assert "mse" in result

    @pytest.mark.asyncio
    async def test_random_forest_regression(self, regressor):
        data = [{"x1": i, "x2": i * 2, "y": i * 3 + 5} for i in range(50)]
        result = await regressor.train_and_predict(data, "y", ["x1", "x2"], model_type="random_forest")
        assert result["r2_score"] > 0.8
        assert "feature_importance" in result

Vision Tests

# tests/test_vision.py
import pytest
import cv2
import numpy as np
from cortex.vision.detection.object_detector import ObjectDetector
from cortex.vision.classification.image_classifier import ImageClassifier
from cortex.vision.ocr.ocr_service import OCRService
from cortex.vision.face.face_service import FaceService
from cortex.vision.enhancement.enhancer import ImageEnhancer


def create_test_image(width=224, height=224):
    img = np.random.randint(0, 255, (height, width, 3), dtype=np.uint8)
    _, buffer = cv2.imencode('.jpg', img)
    return buffer.tobytes()


class TestObjectDetector:
    @pytest.fixture
    def detector(self):
        return ObjectDetector()

    @pytest.mark.asyncio
    async def test_detection_returns_objects(self, detector):
        image_data = create_test_image()
        result = await detector.detect(image_data, confidence_threshold=0.1)
        assert "objects" in result
        assert isinstance(result["objects"], list)

    @pytest.mark.asyncio
    async def test_count_objects(self, detector):
        image_data = create_test_image()
        result = await detector.count_objects(image_data)
        assert "total" in result
        assert "by_class" in result


class TestImageClassifier:
    @pytest.fixture
    def classifier(self):
        return ImageClassifier()

    @pytest.mark.asyncio
    async def test_classification_returns_predictions(self, classifier):
        image_data = create_test_image()
        result = await classifier.classify(image_data, top_k=5)
        assert "predictions" in result
        assert len(result["predictions"]) == 5
        assert "top_class" in result
        assert 0 <= result["confidence"] <= 1


class TestImageEnhancer:
    @pytest.fixture
    def enhancer(self):
        return ImageEnhancer()

    @pytest.mark.asyncio
    async def test_auto_enhancement(self, enhancer):
        image_data = create_test_image()
        result = await enhancer.enhance(image_data, "auto")
        assert "image_base64" in result
        assert "enhancements" in result
        assert len(result["enhancements"]) > 0

    @pytest.mark.asyncio
    async def test_brightness_enhancement(self, enhancer):
        image_data = create_test_image()
        result = await enhancer.enhance(image_data, "brightness")
        assert "brightness" in result["enhancements"]

    @pytest.mark.asyncio
    async def test_denoise_enhancement(self, enhancer):
        image_data = create_test_image()
        result = await enhancer.enhance(image_data, "denoise")
        assert "denoise" in result["enhancements"]


class TestFaceService:
    @pytest.fixture
    def service(self):
        return FaceService()

    @pytest.mark.asyncio
    async def test_face_detection(self, service):
        image_data = create_test_image(640, 480)
        result = await service.detect(image_data)
        assert "faces" in result
        assert isinstance(result["faces"], list)

    @pytest.mark.asyncio
    async def test_face_encoding(self, service):
        image_data = create_test_image(640, 480)
        result = await service.encode(image_data)
        assert "encodings" in result

Integration Tests

# tests/test_integration.py
import pytest
import pytest_asyncio
from httpx import AsyncClient, ASGITransport
from cortex.api.main import app


@pytest_asyncio.fixture
async def authenticated_client():
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as client:
        # Register user
        await client.post("/api/v1/auth/register", json={
            "username": "testuser",
            "email": "test@test.com",
            "password": "TestPass123!"
        })
        
        # Login
        response = await client.post(
            "/api/v1/auth/login",
            data={"username": "testuser", "password": "TestPass123!"}
        )
        token = response.json()["access_token"]
        
        client.headers["Authorization"] = f"Bearer {token}"
        yield client


@pytest.mark.asyncio
async def test_full_nlp_workflow(authenticated_client):
    # Analyze text
    response = await authenticated_client.post(
        "/api/v1/nlp/analyze",
        json={"text": "Google was founded by Larry Page.", "tasks": ["ner"]}
    )
    assert response.status_code == 200
    assert len(response.json()["entities"]) > 0

    # Get sentiment
    response = await authenticated_client.post(
        "/api/v1/nlp/sentiment",
        json={"text": "This is amazing!"}
    )
    assert response.status_code == 200
    assert response.json()["score"] > 0

    # Generate embeddings
    response = await authenticated_client.post(
        "/api/v1/nlp/embed",
        json={"texts": ["Hello world"]}
    )
    assert response.status_code == 200
    assert len(response.json()["embeddings"]) == 1


@pytest.mark.asyncio
async def test_analytics_workflow(authenticated_client):
    # Forecast
    data = [{"date": f"2024-01-{i:02d}", "value": 100 + i * 2} for i in range(1, 31)]
    response = await authenticated_client.post(
        "/api/v1/analytics/forecast",
        json={
            "data": data,
            "date_column": "date",
            "value_column": "value",
            "periods": 7
        }
    )
    assert response.status_code == 200
    assert len(response.json()["forecast"]) == 7

    # Anomaly detection
    response = await authenticated_client.post(
        "/api/v1/analytics/anomaly",
        json={
            "data": [{"x": i} for i in range(10)] + [{"x": 100}],
            "columns": ["x"],
            "contamination": 0.1
        }
    )
    assert response.status_code == 200
    assert response.json()["anomaly_count"] >= 1

Test Coverage Report

# Generate HTML coverage report
pytest tests/ --cov=cortex --cov-report=html

# View report
open htmlcov/index.html

Deployment

Docker Compose (Development)

docker-compose up -d

Kubernetes (Production)

# Apply manifests
kubectl apply -f k8s/

# Or use Helm
helm install cortex-ai ./helm/cortex-ai \
  --set image.tag=v1.0.0 \
  --set ingress.host=api.yourdomain.com
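The `--set` flags above can equally be kept in a values file (the keys shown mirror those flags; verify them against the chart's actual schema):

```yaml
# values.override.yaml
image:
  tag: v1.0.0
ingress:
  host: api.yourdomain.com
```

then install with `helm install cortex-ai ./helm/cortex-ai -f values.override.yaml`.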

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `DATABASE_URL` | PostgreSQL connection string | `postgresql+asyncpg://...` |
| `MONGODB_URL` | MongoDB connection string | `mongodb://localhost:27017` |
| `REDIS_URL` | Redis connection string | `redis://localhost:6379` |
| `JWT_SECRET_KEY` | Secret for JWT signing | (required) |
| `JWT_EXPIRATION_MINUTES` | Token expiration time | `30` |
| `LOG_LEVEL` | Logging level | `INFO` |
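For local development these can be collected in a `.env` file. The values below are placeholders only (database name and credentials are assumptions; `JWT_SECRET_KEY` must be replaced with your own secret):

```env
DATABASE_URL=postgresql+asyncpg://cortex:cortex@localhost:5432/cortex
MONGODB_URL=mongodb://localhost:27017
REDIS_URL=redis://localhost:6379
JWT_SECRET_KEY=change-me
JWT_EXPIRATION_MINUTES=30
LOG_LEVEL=INFO
```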

Performance

| Metric | Value |
|--------|-------|
| API Latency (p50) | 15 ms |
| API Latency (p99) | 85 ms |
| Throughput | 10,000 req/s |
| NLP Inference | 50-100 ms |
| Vision Inference | 100-200 ms |
| Model Load Time | 2-5 s |
| Container Memory | 2-4 GB |
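The latency figures are percentiles over per-request timings. As a reminder of how they are derived, here is a minimal nearest-rank percentile sketch (the sample latencies are made up for illustration):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    k = math.ceil(p / 100 * len(ordered)) - 1
    return ordered[max(k, 0)]

latencies = [12, 15, 14, 18, 85, 16, 13, 15, 17, 90]
p50 = percentile(latencies, 50)  # -> 15
p99 = percentile(latencies, 99)  # -> 90
```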

Scaling Recommendations

| Users | API Replicas | Workers | PostgreSQL | Redis |
|-------|--------------|---------|------------|-------|
| < 1K | 2 | 2 | 2 vCPU, 4 GB | 1 GB |
| < 10K | 4 | 4 | 4 vCPU, 8 GB | 2 GB |
| < 100K | 10 | 10 | 8 vCPU, 16 GB | 4 GB |
| 100K+ | 20+ | 20+ | 16+ vCPU | 8+ GB |
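Rather than pinning replica counts by hand, they can be driven by a HorizontalPodAutoscaler. A sketch for the API deployment (the deployment name `cortex-api` and the 70% CPU target are assumptions; match them to your manifests):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cortex-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cortex-api
  minReplicas: 4
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```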

Tech Stack

Python · FastAPI · PyTorch · TensorFlow · Docker · Kubernetes · PostgreSQL · MongoDB · Redis · Kafka · Prometheus · Grafana

Project Structure

CORTEX AI/
├── cortex/
│   ├── api/
│   │   ├── main.py              # FastAPI application
│   │   ├── routes/              # API endpoints
│   │   ├── schemas/             # Pydantic models
│   │   └── middleware/          # Custom middleware
│   ├── ml/
│   │   ├── model_registry/      # Model versioning
│   │   └── serving/             # Inference engine
│   ├── nlp/
│   │   ├── analysis/            # Text analysis (spaCy)
│   │   ├── sentiment/           # Sentiment (BERT)
│   │   ├── classification/      # Zero-shot (BART)
│   │   ├── qa/                  # QA (RoBERTa)
│   │   ├── generation/          # Generation (GPT-2)
│   │   └── embeddings/          # Embeddings (SBERT)
│   ├── vision/
│   │   ├── detection/           # Object detection (YOLO)
│   │   ├── classification/      # Image classification
│   │   ├── segmentation/        # Segmentation
│   │   ├── face/                # Face recognition
│   │   ├── ocr/                 # Text extraction
│   │   └── enhancement/         # Image enhancement
│   ├── analytics/
│   │   ├── forecasting/         # Time series (Prophet)
│   │   ├── anomaly/             # Anomaly detection
│   │   ├── classification/      # Tabular (XGBoost)
│   │   ├── clustering/          # K-Means
│   │   ├── recommendation/      # Recommendations
│   │   └── regression/          # Regression
│   ├── automation/
│   │   ├── celery_app.py        # Celery configuration
│   │   ├── tasks/               # Background tasks
│   │   ├── workflows/           # Workflow engine
│   │   └── decision/            # Rules engine
│   ├── storage/
│   │   ├── postgres/            # SQL database
│   │   ├── mongo/               # Document store
│   │   ├── redis/               # Cache
│   │   └── minio/               # Object storage
│   ├── security/
│   │   ├── auth/                # JWT authentication
│   │   ├── authorization/       # RBAC
│   │   └── encryption/          # Crypto utilities
│   ├── monitoring/
│   │   ├── metrics/             # Prometheus
│   │   ├── logging/             # Structured logs
│   │   └── health/              # Health checks
│   └── integrations/            # External APIs
├── k8s/                         # Kubernetes manifests
├── helm/                        # Helm charts
├── tests/                       # Test suite
├── .github/workflows/           # CI/CD
├── docker-compose.yml           # Local development
├── Dockerfile                   # Container build
└── pyproject.toml               # Dependencies

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

# Fork and clone
git clone https://github.com/BLACK0X80/CORTEX-AI
cd CORTEX-AI

# Create branch
git checkout -b feature/your-feature

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Submit PR

License

This project is licensed under the MIT License - see the LICENSE file for details.




Built with passion by BLACK0X80
