
Invela Assessment Automation

A production-ready retrieval-augmented generation (RAG) system for automatically answering Third-Party Risk Management (TPRM) assessment questions using AWS Bedrock and vector retrieval.

Features

  • Multi-Model Comparison: Test multiple LLMs (Claude 3 Opus, Sonnet, Haiku) and compare results
  • Consistency Analysis: Run multiple iterations to measure answer reliability
  • RAG Pipeline: Semantic chunking, Titan Embeddings, FAISS vector store (retrieval step sketched after this list)
  • Confidence Scoring: Structured scoring with rubric-based assessment
  • Citation Tracking: Every answer includes document citations
  • Production Ready: Type hints, comprehensive logging, error handling
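
For intuition, the retrieval half of the pipeline looks roughly like this. This is a minimal sketch with illustrative names, not the project's actual API; real code would compute the embeddings with Titan via Bedrock rather than the stand-in random vectors used here:

import faiss  # vector similarity search
import numpy as np

chunks = ["...chunk text...", "...more chunk text..."]  # parallel to the rows below
chunk_vectors = np.random.rand(len(chunks), 1536).astype("float32")  # stand-in embeddings

# Normalize so inner product equals cosine similarity, then build a flat index
faiss.normalize_L2(chunk_vectors)
index = faiss.IndexFlatIP(chunk_vectors.shape[1])
index.add(chunk_vectors)

def retrieve(query_vector: np.ndarray, top_k: int = 10) -> list[str]:
    """Return the top_k chunks most similar to the query embedding."""
    q = query_vector.astype("float32").reshape(1, -1)
    faiss.normalize_L2(q)
    scores, ids = index.search(q, top_k)
    return [chunks[i] for i in ids[0]]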

Security Considerations

This codebase follows security best practices (a short code sketch follows the list):

  • No credentials in code - Uses AWS credential chain
  • Safe serialization - JSON instead of pickle for data persistence
  • Input validation - Path traversal prevention, input sanitization
  • YAML safe loading - Prevents arbitrary code execution
  • Whitelisted file types - Only processes allowed extensions
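
A condensed illustration of these patterns. This is hypothetical code; names like ALLOWED_EXTENSIONS and validate_doc_path are ours for illustration, not necessarily the project's:

import json
from pathlib import Path

import yaml

ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".json"}  # whitelist

def load_config(path: str) -> dict:
    # yaml.safe_load cannot construct arbitrary Python objects
    with open(path) as f:
        return yaml.safe_load(f)

def validate_doc_path(base_dir: Path, candidate: Path) -> Path:
    resolved = candidate.resolve()
    # Path traversal prevention: the file must live under base_dir
    if base_dir.resolve() not in resolved.parents:
        raise ValueError(f"Path escapes document directory: {candidate}")
    if resolved.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {resolved.suffix}")
    return resolved

def save_results(path: Path, results: dict) -> None:
    # JSON, not pickle: safe to load back, human-readable
    path.write_text(json.dumps(results, indent=2))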

Quick Start

1. Install Dependencies

cd invela_assessment
pip3 install -r requirements.txt

2. Configure AWS Credentials

# Option 1: AWS CLI
aws configure

# Option 2: Environment variables
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
export AWS_DEFAULT_REGION=us-east-1
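
Whichever option you choose, boto3 resolves credentials through its standard chain (environment variables, ~/.aws/credentials, instance/role metadata), so nothing is hard-coded. A quick sanity check:

import boto3

# boto3 walks the standard credential chain; get_caller_identity
# fails fast if credentials are missing or invalid.
print(boto3.client("sts").get_caller_identity()["Arn"])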

3. Add Your Documents

Place company documents in data/company_docs/:

data/company_docs/
├── SOC2_Report_2024.pdf
├── Privacy_Policy.docx
├── InfoSec_Policy.pdf
└── ...

Supported formats: PDF, DOCX, TXT, MD, JSON

4. Configure Models

Edit config.yaml to specify which models to test:

bedrock:
  models:
    - name: "Claude 3 Sonnet"
      model_id: "anthropic.claude-3-sonnet-20240229-v1:0"
      temperature: 0.0  # Keep at 0 for consistency
    - name: "Claude 3 Haiku"
      model_id: "anthropic.claude-3-haiku-20240307-v1:0"
      temperature: 0.0

5. Run Assessment

# Test with specific questions first
python3 run_assessment.py --questions 1 22 26 --runs 2

# Full run with all models
python3 run_assessment.py --runs 3

Output

Results are saved to results/:

results/
├── claude_3_sonnet_run1_20241201_143022.json
├── claude_3_sonnet_run2_20241201_143156.json
├── claude_3_haiku_run1_20241201_143312.json
├── model_comparison_20241201_143500.json
└── assessment.log

Sample Output

======================================================================
MULTI-MODEL ASSESSMENT COMPARISON SUMMARY
======================================================================

Models tested: 2
Questions: 142

--- Model Rankings (by Avg Confidence) ---
  #1: Claude 3 Sonnet - 0.742
  #2: Claude 3 Haiku - 0.698

--- Per-Model Details ---

Claude 3 Sonnet:
  Runs: 2
  Avg Confidence: 0.7420
  Variance across runs: 0.0012
    Run 1: conf=0.741, complete=98, partial=31, not_found=13
    Run 2: conf=0.743, complete=98, partial=31, not_found=13

Project Structure

invela_assessment/
├── config.yaml              # Main configuration
├── run_assessment.py        # CLI entry point
├── requirements.txt         # Python dependencies
├── .gitignore              # Git ignore rules
├── src/
│   ├── __init__.py         # Package exports
│   ├── config.py           # Configuration management
│   ├── bedrock_client.py   # AWS Bedrock LLM + Embeddings
│   ├── document_processor.py # Document loading & chunking
│   ├── vector_store.py     # FAISS vector storage
│   ├── rag_engine.py       # RAG orchestration
│   ├── analysis.py         # Multi-run analysis
│   └── runner.py           # Main runner
├── data/
│   ├── company_docs/       # Your documents (gitignored)
│   ├── vector_index/       # Generated index (gitignored)
│   └── invela_accreditation_questions.json
└── results/                 # Output (gitignored)

Configuration Reference

Key Parameters

Parameter      Default  Description
temperature    0.0      Keep at 0 for deterministic outputs
chunk_size     512      Tokens per chunk
chunk_overlap  50       Overlap between chunks (tokens)
top_k          10       Documents to retrieve
rerank_top_n   5        Documents kept after reranking
num_runs       3        Runs per model for consistency analysis
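
To make chunk_size and chunk_overlap concrete, a sliding-window chunker over tokens behaves like this (an illustrative sketch, not the project's semantic chunker):

def window_chunks(tokens: list[str], chunk_size: int = 512, chunk_overlap: int = 50):
    """Yield fixed-size windows; consecutive windows share chunk_overlap tokens."""
    step = chunk_size - chunk_overlap  # 462 new tokens per chunk
    for start in range(0, len(tokens), step):
        yield tokens[start:start + chunk_size]

# A 5,000-token document yields 11 overlapping windows; retrieval then
# takes the top_k nearest chunks and keeps rerank_top_n after reranking.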

Available Models (On-Demand)

These models work without inference profiles:

# Claude 3 Family
- anthropic.claude-3-opus-20240229-v1:0
- anthropic.claude-3-sonnet-20240229-v1:0
- anthropic.claude-3-haiku-20240307-v1:0
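
Any of these IDs can be called directly through the Bedrock runtime using the Anthropic messages request shape. A minimal example; the prompt text is illustrative:

import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")
body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [{"role": "user", "content": "Summarize our SOC 2 scope."}],
}
response = client.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    body=json.dumps(body),
)
print(json.loads(response["body"].read())["content"][0]["text"])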

Programmatic Usage

from src import AssessmentRunner

# Create runner
runner = AssessmentRunner(config_path="config.yaml")

# Initialize (loads questions, builds vector store)
runner.initialize()

# Run single model
results = runner.run_model(runner.config.bedrock.models[0], num_runs=2)

# Run all models and compare
report = runner.run_all_models(num_runs=3)
runner.print_summary(report)

Troubleshooting

"No credentials found"

aws configure  # Set up AWS CLI
# Or set environment variables

"Model requires inference profile"

Newer models (Claude 3.5, 4.x) require inference profiles. Use Claude 3 models, or set up inference profiles in the AWS Console.
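
If you do configure an inference profile, and assuming the config accepts profile IDs in the model_id field (an assumption, not confirmed by this README), the profile ID replaces the bare model ID; cross-region profiles typically prefix the model ID with a region group, for example:

bedrock:
  models:
    - name: "Claude 3.5 Sonnet (via inference profile)"
      model_id: "us.anthropic.claude-3-5-sonnet-20240620-v1:0"  # profile ID, not a bare model ID
      temperature: 0.0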

Low consistency scores

  • Add more relevant documentation
  • Check comparison_report.json for specific issues
  • Questions whose topics are absent from your documents will naturally have low consistency

Development

Code Quality

The codebase follows:

  • PEP 8 style guide
  • Type hints throughout
  • Comprehensive docstrings
  • Defensive error handling

Testing

# Test with subset of questions
python3 run_assessment.py --questions 1 2 3 --runs 1

# Force rebuild index after adding docs
python3 run_assessment.py --reindex

License

Proprietary - Invela Inc.

Version History

  • v2.0.0 - Multi-model support, improved security, JSON serialization
  • v1.0.0 - Initial release
