|
CORTEX AI is a comprehensive enterprise-grade AI platform delivering production-ready components for:
Built with cutting-edge technologies and designed for massive scale. |
|
- Features
- Architecture
- Installation
- Quick Start
- API Reference
- NLP Services
- Vision Services
- Analytics Services
- Model Registry
- Automation & Workflows
- Security
- Monitoring
- Testing
- Deployment
- Performance
- Contributing
| NLP Engine | Vision AI | Analytics | Automation |
|---|---|---|---|
| Text Analysis | Object Detection | Forecasting | Task Orchestration |
| Sentiment Analysis | Image Classification | Anomaly Detection | Workflow Engine |
| Text Classification | Segmentation | Clustering | Decision Engine |
| Question Answering | Face Recognition | Recommendations | Event Processing |
| Text Generation | OCR | Regression | Webhooks |
| Embeddings | Enhancement | Classification | API Gateway |
- Text Analysis: Tokenization, Named Entity Recognition (NER), Part-of-Speech (POS) tagging using spaCy
- Sentiment Analysis: Multi-class sentiment classification using fine-tuned BERT models
- Text Classification: Zero-shot classification with BART-large-MNLI for any label set
- Question Answering: Extractive QA using RoBERTa trained on SQuAD 2.0
- Text Generation: GPT-2 based text generation with temperature and top-p sampling
- Embeddings: Sentence-BERT embeddings for semantic search and similarity
- Object Detection: YOLOv8 for real-time object detection with 80+ classes
- Image Classification: ResNet-50 pretrained on ImageNet for 1000 categories
- Semantic Segmentation: Instance and semantic segmentation with YOLO-seg
- Face Recognition: Haar cascade detection with face encoding for comparison
- OCR: EasyOCR supporting 80+ languages for text extraction
- Image Enhancement: Brightness, contrast, denoising, sharpening, histogram equalization
- Time Series Forecasting: Prophet and ARIMA for demand forecasting
- Anomaly Detection: Isolation Forest and statistical Z-score methods
- Clustering: K-Means with automatic optimal cluster detection
- Recommendations: Collaborative filtering and content-based recommendations
- Regression: Linear, Ridge, and Random Forest regression
- Classification: XGBoost for tabular classification with feature importance
graph TB
subgraph Client Layer
A[REST API] --> B[SDK]
A --> C[WebSocket]
end
subgraph API Gateway
D[Auth] --> E[Load Balancer]
E --> F[Rate Limiter]
end
subgraph AI Services
G[NLP] --> H[Model Serving]
I[Vision] --> H
J[Analytics] --> H
end
subgraph Infrastructure
K[(PostgreSQL)]
L[(MongoDB)]
M[(Redis)]
N[MinIO]
end
A --> D
F --> G
F --> I
F --> J
H --> K
H --> L
H --> M
H --> N
style A fill:#DC143C,color:#fff
style G fill:#DC143C,color:#fff
style I fill:#DC143C,color:#fff
style J fill:#DC143C,color:#fff
style H fill:#000,color:#DC143C
| Component | Technology | Purpose |
|---|---|---|
| API Server | FastAPI | High-performance async REST API |
| SQL Database | PostgreSQL 15 + TimescaleDB | User data, models metadata, audit logs |
| NoSQL Database | MongoDB 6 | Document storage, embeddings, unstructured data |
| Cache | Redis 7 | Session management, caching, rate limiting |
| Object Storage | MinIO | Model artifacts, images, files |
| Message Queue | Kafka | Event streaming, async processing |
| Task Queue | Celery | Background job processing |
| Search | Elasticsearch | Full-text search, log aggregation |
| Monitoring | Prometheus + Grafana | Metrics and dashboards |
- Python 3.11 or higher
- Docker and Docker Compose
- 16GB RAM minimum (32GB recommended for full ML capabilities)
- NVIDIA GPU (optional, for accelerated inference)
# Clone the repository
git clone https://github.com/BLACK0X80/cortex-ai.git
cd cortex-ai
# Start all services
docker-compose up -d
# Verify services are running
docker-compose ps
# Access API docs
open http://localhost:8000/docs# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -e ".[dev]"
# Download spaCy model
python -m spacy download en_core_web_sm
# Start infrastructure (PostgreSQL, Redis, MongoDB, MinIO)
docker-compose up -d postgres redis mongo minio
# Initialize database
python -c "from cortex.storage.postgres.database import init_db; import asyncio; asyncio.run(init_db())"
# Run the API
uvicorn cortex.api.main:app --reload --host 0.0.0.0 --port 8000# Add Helm repository (if using external charts)
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
# Deploy CORTEX AI
helm install cortex-ai ./helm/cortex-ai -f ./helm/cortex-ai/values.yaml
# Check deployment status
kubectl get pods -l app=cortex-api
kubectl get svc cortex-apidocker-compose up -dcurl -X POST "http://localhost:8000/api/v1/auth/register" \
-H "Content-Type: application/json" \
-d '{
"username": "testuser",
"email": "test@example.com",
"password": "SecurePass123!"
}'curl -X POST "http://localhost:8000/api/v1/auth/login" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "username=testuser&password=SecurePass123!"# Analyze text
curl -X POST "http://localhost:8000/api/v1/nlp/analyze" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"text": "Apple Inc. was founded by Steve Jobs in California.",
"tasks": ["tokenize", "ner", "pos"]
}'
# Sentiment analysis
curl -X POST "http://localhost:8000/api/v1/nlp/sentiment" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"text": "This product is absolutely amazing! Best purchase ever."
}'All API endpoints (except /auth/register and /auth/login) require JWT authentication.
# Include the token in Authorization header
-H "Authorization: Bearer YOUR_ACCESS_TOKEN"| Method | Endpoint | Description |
|---|---|---|
POST |
/api/v1/nlp/analyze |
Text Analysis & NER |
POST |
/api/v1/nlp/sentiment |
Sentiment Analysis |
POST |
/api/v1/nlp/classify |
Text Classification |
POST |
/api/v1/nlp/embed |
Generate Embeddings |
POST |
/api/v1/nlp/qa |
Question Answering |
POST |
/api/v1/nlp/generate |
Text Generation |
| Method | Endpoint | Description |
|---|---|---|
POST |
/api/v1/vision/detect |
Object Detection |
POST |
/api/v1/vision/classify |
Image Classification |
POST |
/api/v1/vision/segment |
Semantic Segmentation |
POST |
/api/v1/vision/face/detect |
Face Detection |
POST |
/api/v1/vision/face/encode |
Face Encoding |
POST |
/api/v1/vision/ocr |
Text Extraction |
POST |
/api/v1/vision/enhance |
Image Enhancement |
| Method | Endpoint | Description |
|---|---|---|
POST |
/api/v1/analytics/forecast |
Time Series Forecast |
POST |
/api/v1/analytics/anomaly |
Anomaly Detection |
POST |
/api/v1/analytics/classify |
Tabular Classification |
POST |
/api/v1/analytics/recommend |
Recommendations |
POST |
/api/v1/analytics/cluster |
Clustering |
POST |
/api/v1/analytics/regression |
Regression |
| Method | Endpoint | Description |
|---|---|---|
GET |
/api/v1/models |
List Models |
POST |
/api/v1/models/register |
Register Model |
GET |
/api/v1/models/{id} |
Get Model Info |
POST |
/api/v1/models/{id}/upload |
Upload Artifact |
PUT |
/api/v1/models/{id}/stage |
Update Stage |
POST |
/api/v1/models/{id}/predict |
Run Inference |
| Method | Endpoint | Description |
|---|---|---|
POST |
/api/v1/automation/tasks |
Create Task |
GET |
/api/v1/automation/tasks |
List Tasks |
POST |
/api/v1/automation/tasks/{id}/run |
Run Task |
POST |
/api/v1/automation/workflows |
Create Workflow |
POST |
/api/v1/automation/workflows/{id}/run |
Execute Workflow |
POST |
/api/v1/automation/decision |
Evaluate Rules |
Perform comprehensive text analysis including tokenization, NER, and POS tagging.
import httpx
async def analyze_text():
async with httpx.AsyncClient() as client:
response = await client.post(
"http://localhost:8000/api/v1/nlp/analyze",
headers={"Authorization": f"Bearer {token}"},
json={
"text": "Google announced new AI features at their Mountain View headquarters.",
"tasks": ["tokenize", "ner", "pos"]
}
)
result = response.json()
# Entities found: [{"text": "Google", "label": "ORG"}, {"text": "Mountain View", "label": "GPE"}]
print(result["entities"])Classify text sentiment with confidence scores.
response = await client.post(
"http://localhost:8000/api/v1/nlp/sentiment",
headers={"Authorization": f"Bearer {token}"},
json={"text": "I absolutely love this product! It exceeded all my expectations."}
)
# Result: {"sentiment": "very_positive", "score": 1.0, "confidence": 0.94}Classify text into custom categories without training.
response = await client.post(
"http://localhost:8000/api/v1/nlp/classify",
headers={"Authorization": f"Bearer {token}"},
json={
"text": "The new iPhone 15 has an amazing camera system with improved low-light performance.",
"labels": ["technology", "sports", "politics", "entertainment"]
}
)
# Result: {"predicted_label": "technology", "scores": {"technology": 0.92, "entertainment": 0.05, ...}}Extract answers from context.
response = await client.post(
"http://localhost:8000/api/v1/nlp/qa",
headers={"Authorization": f"Bearer {token}"},
json={
"question": "Who founded Microsoft?",
"context": "Microsoft Corporation was founded by Bill Gates and Paul Allen on April 4, 1975."
}
)
# Result: {"answer": "Bill Gates and Paul Allen", "confidence": 0.89}Generate vector embeddings for semantic search.
response = await client.post(
"http://localhost:8000/api/v1/nlp/embed",
headers={"Authorization": f"Bearer {token}"},
json={
"texts": ["Machine learning is a subset of AI", "Deep learning uses neural networks"],
"model": "sentence-transformers/all-MiniLM-L6-v2"
}
)
# Result: {"embeddings": [[0.123, -0.456, ...], [...]], "dimensions": 384}Detect objects in images using YOLOv8.
with open("image.jpg", "rb") as f:
response = await client.post(
"http://localhost:8000/api/v1/vision/detect",
headers={"Authorization": f"Bearer {token}"},
files={"file": f},
data={"confidence_threshold": 0.5}
)
# Result: {"objects": [{"label": "person", "confidence": 0.95, "bbox": [100, 50, 300, 400]}], "count": 1}Classify images into 1000 ImageNet categories.
with open("cat.jpg", "rb") as f:
response = await client.post(
"http://localhost:8000/api/v1/vision/classify",
headers={"Authorization": f"Bearer {token}"},
files={"file": f}
)
# Result: {"top_class": "Egyptian_cat", "confidence": 0.87, "predictions": [...]}Extract text from images in 80+ languages.
with open("document.png", "rb") as f:
response = await client.post(
"http://localhost:8000/api/v1/vision/ocr",
headers={"Authorization": f"Bearer {token}"},
files={"file": f},
data={"languages": "en,ar"}
)
# Result: {"text": "Extracted text content...", "blocks": [...], "confidence": 0.91}Predict future values using Prophet.
response = await client.post(
"http://localhost:8000/api/v1/analytics/forecast",
headers={"Authorization": f"Bearer {token}"},
json={
"data": [
{"date": "2024-01-01", "value": 100},
{"date": "2024-01-02", "value": 120},
{"date": "2024-01-03", "value": 115},
# ... more data points
],
"date_column": "date",
"value_column": "value",
"periods": 30,
"frequency": "D"
}
)
# Result: {"forecast": [{"date": "2024-02-01", "prediction": 135.5, "lower_bound": 120, "upper_bound": 150}...]}Detect outliers in data using Isolation Forest.
response = await client.post(
"http://localhost:8000/api/v1/analytics/anomaly",
headers={"Authorization": f"Bearer {token}"},
json={
"data": [
{"cpu": 45, "memory": 60},
{"cpu": 50, "memory": 65},
{"cpu": 95, "memory": 98}, # Anomaly
{"cpu": 48, "memory": 62}
],
"columns": ["cpu", "memory"],
"contamination": 0.1
}
)
# Result: {"anomalies": [{"cpu": 95, "memory": 98, "index": 2, "score": -0.85}], "anomaly_count": 1}Group similar data points using K-Means.
response = await client.post(
"http://localhost:8000/api/v1/analytics/cluster",
headers={"Authorization": f"Bearer {token}"},
json={
"data": [
{"age": 25, "income": 30000, "spending": 500},
{"age": 35, "income": 60000, "spending": 1200},
# ... more data
],
"feature_columns": ["age", "income", "spending"],
"n_clusters": 3
}
)
# Result: {"clusters": [...], "centroids": [...], "inertia": 1234.5}response = await client.post(
"http://localhost:8000/api/v1/models/register",
headers={"Authorization": f"Bearer {token}"},
json={
"name": "customer-churn-predictor",
"version": "1.0.0",
"framework": "xgboost",
"description": "Predicts customer churn probability",
"metrics": {"accuracy": 0.92, "f1_score": 0.89}
}
)with open("model.pkl", "rb") as f:
response = await client.post(
f"http://localhost:8000/api/v1/models/{model_id}/upload",
headers={"Authorization": f"Bearer {token}"},
files={"file": ("model.pkl", f)}
)response = await client.put(
f"http://localhost:8000/api/v1/models/{model_id}/stage",
headers={"Authorization": f"Bearer {token}"},
params={"stage": "production"}
)response = await client.post(
"http://localhost:8000/api/v1/automation/workflows",
headers={"Authorization": f"Bearer {token}"},
json={
"name": "data-pipeline",
"steps": [
{"name": "analyze", "type": "nlp_analysis", "parameters": {"tasks": ["ner"]}},
{"name": "classify", "type": "nlp_analysis", "parameters": {"tasks": ["sentiment"]}}
]
}
)Evaluate business rules dynamically.
response = await client.post(
"http://localhost:8000/api/v1/automation/decision",
headers={"Authorization": f"Bearer {token}"},
json={
"rules": [
{
"name": "high-value-customer",
"conditions": [
{"field": "purchase_amount", "operator": "gte", "value": 1000},
{"field": "customer_tier", "operator": "eq", "value": "gold"}
],
"action": {"discount": 0.20, "priority": "high"}
}
],
"context": {
"purchase_amount": 1500,
"customer_tier": "gold"
}
}
)
# Result: {"matched_count": 1, "actions": [{"discount": 0.20, "priority": "high"}]}CORTEX AI uses JWT (JSON Web Tokens) for authentication.
# Login to get token
response = await client.post(
"http://localhost:8000/api/v1/auth/login",
data={"username": "user", "password": "pass"}
)
token = response.json()["access_token"]
# Use token in subsequent requests
headers = {"Authorization": f"Bearer {token}"}| Role | Permissions |
|---|---|
admin |
Full access to all resources |
user |
Read, write, API access |
service |
Read, API access, model deployment |
Generate API keys for programmatic access:
from cortex.security.encryption import EncryptionService
key, key_hash = EncryptionService.generate_api_key()
# Store key_hash in database, give key to userAccess metrics at http://localhost:8000/metrics:
cortex_requests_total- Total request countcortex_request_latency_seconds- Request latency histogramcortex_model_inference_total- Model inference countcortex_model_inference_latency_seconds- Inference latency
curl http://localhost:8000/health
# {"status": "healthy", "service": "CORTEX AI"}Access Grafana at http://localhost:3000 (admin/admin) to view:
- API request rates and latencies
- Model inference performance
- Infrastructure health
# Run unit tests
pytest tests/ -v --cov=cortex --cov-report=html
# Run specific test file
pytest tests/test_nlp.py -v
# Run with markers
pytest -m "not slow" -v# tests/test_api.py
import pytest
from httpx import AsyncClient, ASGITransport
from cortex.api.main import app
@pytest.fixture
async def client():
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as ac:
yield ac
@pytest.mark.anyio
async def test_health_check(client):
response = await client.get("/health")
assert response.status_code == 200
assert response.json()["status"] == "healthy"
@pytest.mark.anyio
async def test_root_endpoint(client):
response = await client.get("/")
assert response.status_code == 200
assert "service" in response.json()
assert response.json()["service"] == "CORTEX AI"
@pytest.mark.anyio
async def test_openapi_docs(client):
response = await client.get("/docs")
assert response.status_code == 200
@pytest.mark.anyio
async def test_unauthorized_access(client):
response = await client.post("/api/v1/nlp/analyze", json={"text": "test"})
assert response.status_code == 401# tests/test_nlp.py
import pytest
from cortex.nlp.analysis.text_analyzer import TextAnalyzer
from cortex.nlp.sentiment.sentiment_analyzer import SentimentAnalyzer
from cortex.nlp.classification.text_classifier import TextClassifier
from cortex.nlp.embeddings.embedding_service import EmbeddingService
from cortex.nlp.qa.qa_service import QAService
class TestTextAnalyzer:
@pytest.fixture
def analyzer(self):
return TextAnalyzer()
@pytest.mark.asyncio
async def test_tokenization(self, analyzer):
result = await analyzer.analyze("Hello world", tasks=["tokenize"])
assert "tokens" in result
assert len(result["tokens"]) == 2
assert result["tokens"][0]["text"] == "Hello"
@pytest.mark.asyncio
async def test_named_entity_recognition(self, analyzer):
result = await analyzer.analyze(
"Apple Inc. is headquartered in Cupertino, California.",
tasks=["ner"]
)
assert "entities" in result
entities = [e["text"] for e in result["entities"]]
assert "Apple Inc." in entities or "Apple" in entities
@pytest.mark.asyncio
async def test_pos_tagging(self, analyzer):
result = await analyzer.analyze("The quick brown fox", tasks=["pos"])
assert "pos_tags" in result
assert len(result["pos_tags"]) == 4
@pytest.mark.asyncio
async def test_keyword_extraction(self, analyzer):
text = "Machine learning and artificial intelligence are transforming technology."
keywords = await analyzer.extract_keywords(text, top_n=3)
assert len(keywords) <= 3
assert all("keyword" in k for k in keywords)
class TestSentimentAnalyzer:
@pytest.fixture
def analyzer(self):
return SentimentAnalyzer()
@pytest.mark.asyncio
async def test_positive_sentiment(self, analyzer):
result = await analyzer.analyze("This is absolutely wonderful!")
assert result["sentiment"] in ["positive", "very_positive"]
assert result["score"] > 0
@pytest.mark.asyncio
async def test_negative_sentiment(self, analyzer):
result = await analyzer.analyze("This is terrible and disappointing.")
assert result["sentiment"] in ["negative", "very_negative"]
assert result["score"] < 0
@pytest.mark.asyncio
async def test_confidence_score(self, analyzer):
result = await analyzer.analyze("I love this product!")
assert 0 <= result["confidence"] <= 1
class TestTextClassifier:
@pytest.fixture
def classifier(self):
return TextClassifier()
@pytest.mark.asyncio
async def test_zero_shot_classification(self, classifier):
result = await classifier.classify(
"The stock market reached new highs today.",
labels=["finance", "sports", "technology", "health"]
)
assert result["predicted_label"] == "finance"
assert "scores" in result
assert result["scores"]["finance"] > 0.5
@pytest.mark.asyncio
async def test_multi_label_classification(self, classifier):
result = await classifier.multi_label_classify(
"AI is revolutionizing healthcare with new diagnostic tools.",
labels=["technology", "health", "sports", "politics"],
threshold=0.3
)
assert "technology" in result["predicted_labels"]
assert "health" in result["predicted_labels"]
class TestEmbeddingService:
@pytest.fixture
def service(self):
return EmbeddingService()
@pytest.mark.asyncio
async def test_single_embedding(self, service):
embedding = await service.encode_single("Hello world")
assert isinstance(embedding, list)
assert len(embedding) > 0
@pytest.mark.asyncio
async def test_batch_embedding(self, service):
texts = ["First sentence", "Second sentence", "Third sentence"]
embeddings = await service.encode(texts)
assert len(embeddings) == 3
assert all(len(e) == len(embeddings[0]) for e in embeddings)
@pytest.mark.asyncio
async def test_similarity_calculation(self, service):
similarity = await service.similarity(
"The cat sat on the mat",
"A cat is sitting on a mat"
)
assert 0.5 < similarity <= 1.0
@pytest.mark.asyncio
async def test_similar_document_search(self, service):
documents = [
"Python is a programming language",
"Java is used for enterprise applications",
"Machine learning uses statistical methods",
"Cats are popular pets"
]
results = await service.find_similar("coding in Python", documents, top_k=2)
assert len(results) == 2
assert "Python" in results[0]["document"]
class TestQAService:
@pytest.fixture
def service(self):
return QAService()
@pytest.mark.asyncio
async def test_simple_qa(self, service):
result = await service.answer(
question="What is the capital of France?",
context="Paris is the capital and largest city of France."
)
assert "Paris" in result["answer"]
assert result["confidence"] > 0.5
@pytest.mark.asyncio
async def test_qa_with_dates(self, service):
result = await service.answer(
question="When was the company founded?",
context="TechCorp was founded in 2010 by John Smith in Silicon Valley."
)
assert "2010" in result["answer"]# tests/test_analytics.py
import pytest
import numpy as np
from cortex.analytics.anomaly.detector import AnomalyDetector
from cortex.analytics.clustering.clusterer import Clusterer
from cortex.analytics.recommendation.recommender import Recommender
from cortex.analytics.regression.regressor import Regressor
class TestAnomalyDetector:
@pytest.fixture
def detector(self):
return AnomalyDetector()
@pytest.mark.asyncio
async def test_isolation_forest_detection(self, detector):
data = [{"value": i} for i in range(20)] + [{"value": 100}]
result = await detector.detect(data, ["value"], contamination=0.1)
assert len(result["anomalies"]) >= 1
anomaly_values = [a["value"] for a in result["anomalies"]]
assert 100 in anomaly_values
@pytest.mark.asyncio
async def test_statistical_detection(self, detector):
data = [{"value": 50 + np.random.randn()} for _ in range(50)]
data.append({"value": 200}) # Clear outlier
result = await detector.detect_statistical(data, "value", threshold=3.0)
assert len(result["anomalies"]) >= 1
@pytest.mark.asyncio
async def test_multivariate_detection(self, detector):
data = [
{"x": 10, "y": 10},
{"x": 12, "y": 11},
{"x": 11, "y": 12},
{"x": 100, "y": 100}, # Anomaly
{"x": 10, "y": 11}
]
result = await detector.detect(data, ["x", "y"], contamination=0.2)
assert result["anomalies"][0]["x"] == 100
class TestClusterer:
@pytest.fixture
def clusterer(self):
return Clusterer()
@pytest.mark.asyncio
async def test_kmeans_clustering(self, clusterer):
data = [
{"x": 1, "y": 1}, {"x": 1.5, "y": 1.5}, {"x": 2, "y": 1},
{"x": 10, "y": 10}, {"x": 10.5, "y": 10.5}, {"x": 11, "y": 10}
]
result = await clusterer.cluster(data, ["x", "y"], n_clusters=2)
assert len(result["centroids"]) == 2
assert "inertia" in result
@pytest.mark.asyncio
async def test_optimal_cluster_finding(self, clusterer):
data = [{"x": i % 3, "y": i % 3 + np.random.randn() * 0.1} for i in range(30)]
result = await clusterer.find_optimal_clusters(data, ["x", "y"], max_clusters=5)
assert 2 <= result["optimal_clusters"] <= 5
assert len(result["silhouette_scores"]) == 4
class TestRecommender:
@pytest.fixture
def recommender(self):
return Recommender()
@pytest.mark.asyncio
async def test_collaborative_filtering(self, recommender):
items = [{"id": f"item_{i}"} for i in range(10)]
interactions = [
{"user_id": "user_1", "item_id": "item_0", "rating": 5},
{"user_id": "user_1", "item_id": "item_1", "rating": 4},
{"user_id": "user_2", "item_id": "item_0", "rating": 5},
{"user_id": "user_2", "item_id": "item_2", "rating": 4},
{"user_id": "user_3", "item_id": "item_0", "rating": 4},
{"user_id": "user_3", "item_id": "item_3", "rating": 5}
]
result = await recommender.recommend("user_1", items, interactions, n_recommendations=3)
assert len(result["recommendations"]) <= 3
assert "item_0" not in result["recommendations"] # Already interacted
@pytest.mark.asyncio
async def test_content_based_filtering(self, recommender):
items = [
{"id": "a", "genre": 1, "year": 2020, "rating": 4.5},
{"id": "b", "genre": 1, "year": 2021, "rating": 4.2},
{"id": "c", "genre": 2, "year": 2019, "rating": 3.8},
{"id": "d", "genre": 1, "year": 2020, "rating": 4.4}
]
result = await recommender.content_based("a", items, ["genre", "year", "rating"], n_recommendations=2)
assert "d" in result["recommendations"] # Should be similar to 'a'
class TestRegressor:
@pytest.fixture
def regressor(self):
return Regressor()
@pytest.mark.asyncio
async def test_linear_regression(self, regressor):
data = [{"x": i, "y": 2 * i + 1 + np.random.randn() * 0.1} for i in range(100)]
result = await regressor.train_and_predict(data, "y", ["x"], model_type="linear")
assert result["r2_score"] > 0.9
assert "mse" in result
@pytest.mark.asyncio
async def test_random_forest_regression(self, regressor):
data = [{"x1": i, "x2": i * 2, "y": i * 3 + 5} for i in range(50)]
result = await regressor.train_and_predict(data, "y", ["x1", "x2"], model_type="random_forest")
assert result["r2_score"] > 0.8
assert "feature_importance" in result# tests/test_vision.py
import pytest
import cv2
import numpy as np
from cortex.vision.detection.object_detector import ObjectDetector
from cortex.vision.classification.image_classifier import ImageClassifier
from cortex.vision.ocr.ocr_service import OCRService
from cortex.vision.face.face_service import FaceService
from cortex.vision.enhancement.enhancer import ImageEnhancer
def create_test_image(width=224, height=224):
img = np.random.randint(0, 255, (height, width, 3), dtype=np.uint8)
_, buffer = cv2.imencode('.jpg', img)
return buffer.tobytes()
class TestObjectDetector:
@pytest.fixture
def detector(self):
return ObjectDetector()
@pytest.mark.asyncio
async def test_detection_returns_objects(self, detector):
image_data = create_test_image()
result = await detector.detect(image_data, confidence_threshold=0.1)
assert "objects" in result
assert isinstance(result["objects"], list)
@pytest.mark.asyncio
async def test_count_objects(self, detector):
image_data = create_test_image()
result = await detector.count_objects(image_data)
assert "total" in result
assert "by_class" in result
class TestImageClassifier:
@pytest.fixture
def classifier(self):
return ImageClassifier()
@pytest.mark.asyncio
async def test_classification_returns_predictions(self, classifier):
image_data = create_test_image()
result = await classifier.classify(image_data, top_k=5)
assert "predictions" in result
assert len(result["predictions"]) == 5
assert "top_class" in result
assert 0 <= result["confidence"] <= 1
class TestImageEnhancer:
@pytest.fixture
def enhancer(self):
return ImageEnhancer()
@pytest.mark.asyncio
async def test_auto_enhancement(self, enhancer):
image_data = create_test_image()
result = await enhancer.enhance(image_data, "auto")
assert "image_base64" in result
assert "enhancements" in result
assert len(result["enhancements"]) > 0
@pytest.mark.asyncio
async def test_brightness_enhancement(self, enhancer):
image_data = create_test_image()
result = await enhancer.enhance(image_data, "brightness")
assert "brightness" in result["enhancements"]
@pytest.mark.asyncio
async def test_denoise_enhancement(self, enhancer):
image_data = create_test_image()
result = await enhancer.enhance(image_data, "denoise")
assert "denoise" in result["enhancements"]
class TestFaceService:
@pytest.fixture
def service(self):
return FaceService()
@pytest.mark.asyncio
async def test_face_detection(self, service):
image_data = create_test_image(640, 480)
result = await service.detect(image_data)
assert "faces" in result
assert isinstance(result["faces"], list)
@pytest.mark.asyncio
async def test_face_encoding(self, service):
image_data = create_test_image(640, 480)
result = await service.encode(image_data)
assert "encodings" in result# tests/test_integration.py
import pytest
from httpx import AsyncClient, ASGITransport
from cortex.api.main import app
@pytest.fixture
async def authenticated_client():
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as client:
# Register user
await client.post("/api/v1/auth/register", json={
"username": "testuser",
"email": "test@test.com",
"password": "TestPass123!"
})
# Login
response = await client.post(
"/api/v1/auth/login",
data={"username": "testuser", "password": "TestPass123!"}
)
token = response.json()["access_token"]
client.headers["Authorization"] = f"Bearer {token}"
yield client
@pytest.mark.anyio
async def test_full_nlp_workflow(authenticated_client):
# Analyze text
response = await authenticated_client.post(
"/api/v1/nlp/analyze",
json={"text": "Google was founded by Larry Page.", "tasks": ["ner"]}
)
assert response.status_code == 200
assert len(response.json()["entities"]) > 0
# Get sentiment
response = await authenticated_client.post(
"/api/v1/nlp/sentiment",
json={"text": "This is amazing!"}
)
assert response.status_code == 200
assert response.json()["score"] > 0
# Generate embeddings
response = await authenticated_client.post(
"/api/v1/nlp/embed",
json={"texts": ["Hello world"]}
)
assert response.status_code == 200
assert len(response.json()["embeddings"]) == 1
@pytest.mark.anyio
async def test_analytics_workflow(authenticated_client):
# Forecast
data = [{"date": f"2024-01-{i:02d}", "value": 100 + i * 2} for i in range(1, 31)]
response = await authenticated_client.post(
"/api/v1/analytics/forecast",
json={
"data": data,
"date_column": "date",
"value_column": "value",
"periods": 7
}
)
assert response.status_code == 200
assert len(response.json()["forecast"]) == 7
# Anomaly detection
response = await authenticated_client.post(
"/api/v1/analytics/anomaly",
json={
"data": [{"x": i} for i in range(10)] + [{"x": 100}],
"columns": ["x"],
"contamination": 0.1
}
)
assert response.status_code == 200
assert response.json()["anomaly_count"] >= 1# Generate HTML coverage report
pytest tests/ --cov=cortex --cov-report=html
# View report
open htmlcov/index.htmldocker-compose up -d# Apply manifests
kubectl apply -f k8s/
# Or use Helm
helm install cortex-ai ./helm/cortex-ai \
--set image.tag=v1.0.0 \
--set ingress.host=api.yourdomain.com| Variable | Description | Default |
|---|---|---|
DATABASE_URL |
PostgreSQL connection string | postgresql+asyncpg://... |
MONGODB_URL |
MongoDB connection string | mongodb://localhost:27017 |
REDIS_URL |
Redis connection string | redis://localhost:6379 |
JWT_SECRET_KEY |
Secret for JWT signing | (required) |
JWT_EXPIRATION_MINUTES |
Token expiration time | 30 |
LOG_LEVEL |
Logging level | INFO |
| Metric | Value |
|---|---|
| API Latency (p50) | 15ms |
| API Latency (p99) | 85ms |
| Throughput | 10,000 req/s |
| NLP Inference | 50-100ms |
| Vision Inference | 100-200ms |
| Model Load Time | 2-5s |
| Container Memory | 2-4 GB |
| Users | API Replicas | Workers | PostgreSQL | Redis |
|---|---|---|---|---|
| < 1K | 2 | 2 | 2 vCPU, 4GB | 1 GB |
| < 10K | 4 | 4 | 4 vCPU, 8GB | 2 GB |
| < 100K | 10 | 10 | 8 vCPU, 16GB | 4 GB |
| 100K+ | 20+ | 20+ | 16+ vCPU | 8+ GB |
CORTEX AI/
├── cortex/
│ ├── api/
│ │ ├── main.py # FastAPI application
│ │ ├── routes/ # API endpoints
│ │ ├── schemas/ # Pydantic models
│ │ └── middleware/ # Custom middleware
│ ├── ml/
│ │ ├── model_registry/ # Model versioning
│ │ └── serving/ # Inference engine
│ ├── nlp/
│ │ ├── analysis/ # Text analysis (spaCy)
│ │ ├── sentiment/ # Sentiment (BERT)
│ │ ├── classification/ # Zero-shot (BART)
│ │ ├── qa/ # QA (RoBERTa)
│ │ ├── generation/ # Generation (GPT-2)
│ │ └── embeddings/ # Embeddings (SBERT)
│ ├── vision/
│ │ ├── detection/ # Object detection (YOLO)
│ │ ├── classification/ # Image classification
│ │ ├── segmentation/ # Segmentation
│ │ ├── face/ # Face recognition
│ │ ├── ocr/ # Text extraction
│ │ └── enhancement/ # Image enhancement
│ ├── analytics/
│ │ ├── forecasting/ # Time series (Prophet)
│ │ ├── anomaly/ # Anomaly detection
│ │ ├── classification/ # Tabular (XGBoost)
│ │ ├── clustering/ # K-Means
│ │ ├── recommendation/ # Recommendations
│ │ └── regression/ # Regression
│ ├── automation/
│ │ ├── celery_app.py # Celery configuration
│ │ ├── tasks/ # Background tasks
│ │ ├── workflows/ # Workflow engine
│ │ └── decision/ # Rules engine
│ ├── storage/
│ │ ├── postgres/ # SQL database
│ │ ├── mongo/ # Document store
│ │ ├── redis/ # Cache
│ │ └── minio/ # Object storage
│ ├── security/
│ │ ├── auth/ # JWT authentication
│ │ ├── authorization/ # RBAC
│ │ └── encryption/ # Crypto utilities
│ ├── monitoring/
│ │ ├── metrics/ # Prometheus
│ │ ├── logging/ # Structured logs
│ │ └── health/ # Health checks
│ └── integrations/ # External APIs
├── k8s/ # Kubernetes manifests
├── helm/ # Helm charts
├── tests/ # Test suite
├── .github/workflows/ # CI/CD
├── docker-compose.yml # Local development
├── Dockerfile # Container build
└── pyproject.toml # Dependencies
We welcome contributions! Please see our Contributing Guide for details.
# Fork and clone
git clone https://github.com/BLACK0X80/CORTEX-AI
cd CORTEX-AI
# Create branch
git checkout -b feature/your-feature
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/ -v
# Submit PR

