AryannAgrawall/Playbook-Security-Agent

SentinelScope

Offline, provenance-aware security playbook generation for manual pentesters

SentinelScope is a lightweight research assistant that ingests HAR files, HTML snapshots, and reconnaissance notes to craft reproducible security playbooks. Instead of driving a live browser, it reasons over captured evidence, enriches it with curated OWASP/CAPEC/CISA knowledge, and emits deterministic markdown + JSON outputs you can hand to a human tester.

Key Features

  • Offline Artifact Ingestion: Consolidates HAR, HTML, and reconnaissance notes into a single structured context bundle
  • Agentic Planning Loop: An observe → plan → critique → refine pipeline outputs reproducible, stepwise test ideas
  • Security-Focused RAG: OWASP WSTG, CAPEC, PentestMonkey, and curated CISA KEV snippets keep plans grounded in real techniques
  • Deterministic Output Bundle: Saves markdown + JSON playbooks plus the parsed context snapshot for auditability
  • Manual Tester Ergonomics: Payloads emphasize curl/httpie/ffuf workflows with a professional, research-grade tone
  • Minimal Dependencies: Pure-Python pipeline—no browser automation, no traffic interception, and no network capture
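The planning loop above can be pictured as a small driver that feeds a draft plan through repeated critique rounds. The function and callback names below are illustrative only; the README does not expose the planner's real interfaces:

```python
from typing import Callable

def agentic_loop(
    context: dict,
    plan: Callable[[dict], list[str]],
    critique: Callable[[list[str]], str],
    refine: Callable[[list[str], str], list[str]],
    rounds: int = 2,
) -> list[str]:
    """Observe -> plan -> critique -> refine: iterate a draft through critique rounds."""
    draft = plan(context)            # observe + plan: first draft from the parsed artifacts
    for _ in range(rounds):
        feedback = critique(draft)   # LLM critique of the current draft
        draft = refine(draft, feedback)  # fold the feedback back into the plan
    return draft
```

Because the loop is deterministic given its callbacks, the same artifacts and model settings reproduce the same playbook, which is what makes the output bundle auditable.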

Quick Start

Note: If using the src/ layout, run via python src/output/playbook_generator.py ...

Prerequisites

  • Python 3.10+
  • Ollama installed locally (see https://ollama.com) with at least one model such as llama3.1
  • (Optional) Local copies of HAR / HTML artifacts you want to analyze

Installation

# Clone the repository
git clone https://github.com/AryannAgrawall/sentinelscope
cd sentinelscope

# Install dependencies (pure Python, no browser tooling required)
pip install -r requirements.txt

# (Optional) Provide custom hints via environment variables or a .env file
# For example: export BASE_URL='https://offline.test'

Local LLM Setup (Ollama)

All critique/refine prompts run through a local Ollama runtime. Start the daemon and pull the default model before generating playbooks:

ollama serve   # run this in a separate terminal session
ollama pull llama3.1
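If you want to sanity-check the runtime from Python before a full run, a minimal sketch follows. The endpoint and payload shape match Ollama's documented REST API (POST /api/generate with a JSON body); the helper names are illustrative and not part of SentinelScope:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local endpoint

def build_generate_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}

def ollama_generate(model: str, prompt: str) -> str:
    """Send a one-shot prompt to the local Ollama daemon and return the response text."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(build_generate_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

If the call raises a connection error, the daemon is not running; start it with ollama serve first.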

Basic Usage

python playbook_generator.py \
  --input artifacts/demo.har artifacts/home.html notes.txt \
  --title "Checkout Flow" \
  --plans 6

Windows (PowerShell) Run Guide

Requirements: Python 3.10+, PowerShell, and a local virtual environment.

# From the repository root
py -3.10 -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
python playbook_generator.py --input path\to\artifact.har --title "Demo" --plans 4

Once the venv is activated, a single python playbook_generator.py ... command is all you need.

Local LLM Inference

SentinelScope delegates every critique/refine step to a locally hosted Ollama runtime. This keeps sensitive reconnaissance data on your workstation, eliminates recurring API bills, and avoids noisy rate limits.

  • Default model: llama3.1
  • Supported alternates: deepseek-r1, mistral
  • Switch models via --llm-name (example: --llm-name deepseek-r1)
  • No API keys, SaaS plans, or quota management are needed—everything runs offline once your chosen model has been downloaded.

Local inference is ideal for security research and educational AI agents because:

  1. Data privacy: Parsed HAR/HTML artifacts never leave your threat lab.
  2. Deterministic pricing: You pay once (hardware + time) rather than per token.
  3. Repeatable workflows: Offline models remove vendor outages, key rotation, and throttling surprises.
  4. Hands-on learning: Experimenting with prompt engineering and agent tweaks is frictionless without cloud guardrails.

Common Flags

| Option | Description | Default | Example |
| --- | --- | --- | --- |
| --input | One or more artifact paths or directories (HAR/HTML/txt) | (required) | --input dumps/site.har notes/ |
| --title | Friendly name applied to the playbook bundle | Security Playbook | --title "Checkout Flow" |
| --plans | Number of targeted plans to synthesize | 5 | --plans 8 |
| --base-url | Assumed origin for relative links inside HTML snapshots | http://offline.test | --base-url https://target.local |
| --output-dir | Destination directory for generated bundles | playbooks | --output-dir reports/offline |
| --disable-rag | Skip loading the curated security knowledge base | disabled | --disable-rag |
| --llm-name | Local Ollama model leveraged for critique/refine loops | llama3.1 | --llm-name mistral:7b |
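For orientation, the flags above could be declared with argparse roughly as follows. This is a sketch inferred from the table, not the project's actual source:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Declare the CLI surface described in the flags table (illustrative sketch)."""
    parser = argparse.ArgumentParser(prog="playbook_generator.py")
    parser.add_argument("--input", nargs="+", required=True,
                        help="artifact paths or directories (HAR/HTML/txt)")
    parser.add_argument("--title", default="Security Playbook")
    parser.add_argument("--plans", type=int, default=5)
    parser.add_argument("--base-url", default="http://offline.test")
    parser.add_argument("--output-dir", default="playbooks")
    parser.add_argument("--disable-rag", action="store_true")
    parser.add_argument("--llm-name", default="llama3.1")
    return parser

args = build_parser().parse_args(["--input", "dumps/site.har", "--plans", "8"])
# args.plans == 8; args.title falls back to "Security Playbook"
```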

Example Workflows

# 1) Single HAR capture only
python playbook_generator.py --input captures/login.har --title "Login Portal" --plans 4

# 2) Mix of HTML + manual notes, keep fewer plans
python playbook_generator.py --input snapshots/*.html recon/notes.txt --plans 3 --title "Marketing Microsite"

# 3) RAG-disabled dry run for deterministic baselines
python playbook_generator.py --input dumps/api.har --plans 2 --disable-rag

Advanced Features

Artifact Intelligence

  • Normalized Context Bundle: URLs, forms, headers, JavaScript clues, and recon notes are merged into a single JSON payload for planning.
  • Signal De-duplication: The ingestor collapses duplicate requests and noisy parameters so the LLM focuses on unique surfaces.
  • Base URL Reconstruction: Relative paths inside archived HTML are resolved via the --base-url hint for consistent payload suggestions.
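The base-URL resolution step can be illustrated with the standard library; the helper below is hypothetical, and the real ingestor's internals may differ:

```python
from urllib.parse import urljoin

def resolve_links(base_url: str, hrefs: list[str]) -> list[str]:
    """Resolve relative links from an archived HTML snapshot against the --base-url hint."""
    # urljoin leaves absolute URLs untouched and resolves relative ones per RFC 3986.
    return [urljoin(base_url, href) for href in hrefs]
```

For example, with --base-url http://offline.test, a snapshot link of /login resolves to http://offline.test/login, so suggested payloads point at a consistent origin.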

Knowledge Fusion

  • Curated Offline Dataset: PentestMonkey payloads, CAPEC attack patterns, OWASP WSTG chapters, and CISA KEV highlights load locally.
  • Contextual CVE Matching: The knowledge fetcher scores KEV entries against detected technologies to spotlight relevant vulns.
  • Deterministic Summaries: Knowledge snippets are truncated and cached so repeated runs stay reproducible.
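As a rough idea of what contextual KEV scoring could look like, here is a toy keyword-overlap heuristic; the project's actual scoring algorithm is not documented here:

```python
def score_kev_entry(entry_keywords: set[str], detected_tech: set[str]) -> float:
    """Score a KEV entry by keyword overlap with fingerprinted technologies (toy heuristic)."""
    if not entry_keywords:
        return 0.0
    # Case-insensitive overlap, normalized by the entry's keyword count.
    norm_entry = {k.lower() for k in entry_keywords}
    norm_tech = {t.lower() for t in detected_tech}
    return len(norm_entry & norm_tech) / len(norm_entry)
```

Entries scoring above some threshold would be surfaced in the playbook as contextually relevant vulns.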

Reasoned Playbooks

  • Agentic Loop: Each plan iterates through observe → hypothesis → refine to keep guidance actionable.
  • Payload Recipes: Output emphasizes curl/httpie/ffuf commands in code fences that can be replayed without automation.
  • Validation + Hardening: Every plan documents success indicators and suggested defensive follow-ups.

Architecture

SentinelScope's offline generator is composed of five deterministic building blocks:

  • Artifact Ingestor: Parses HAR/HTML/txt assets, fingerprints technologies, and emits normalized context data.
  • Knowledge Fetcher: Loads the offline dataset (PentestMonkey, CAPEC, OWASP WSTG, CISA KEV) and scores contextual CVEs.
  • Planner: Runs the agentic observe → plan → refine loop to create targeted manual testing ideas.
  • LLM Reasoner: Expands each plan into a polished section with goals, steps, payloads, validation, and hardening tips.
  • Playbook Reporter: Writes timestamp + slugged bundles (for example 20251226_102530_checkout-flow.*) under the chosen output directory.
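The reporter's bundle naming (for example 20251226_102530_checkout-flow.*) can be reproduced with a small helper. The exact format is assumed from that example filename:

```python
import re
from datetime import datetime

def bundle_stem(title: str, now: datetime) -> str:
    """Build a 'YYYYMMDD_HHMMSS_slug' stem for the playbook bundle files."""
    # Slugify: lowercase, collapse non-alphanumeric runs to single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{now.strftime('%Y%m%d_%H%M%S')}_{slug}"
```

Timestamped stems keep repeated runs from overwriting each other while the slug keeps bundles human-searchable.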

Example Report

Reports are generated in both text and markdown formats, containing:

  • Executive summary
  • Detailed findings with severity ratings
  • Technical details and reproduction steps
  • Evidence and impact analysis
  • Remediation recommendations

Future Modifications

  • Add support for additional Ollama-compatible model templates
  • Integrate vision API capabilities for visual analysis
  • Run against HackerOne reports to find the first LLM-powered vulnerability in the wild
  • Implement more sophisticated planning algorithms
  • Add better execution strategies and error handling
  • Support for custom LLM model deployment
  • Add collaborative testing capabilities
  • Improve subdomain enumeration techniques
  • Add API security testing capabilities
