AgentSlurm User Guide

Overview

AgentSlurm analyzes SLURM job scripts and gives actionable feedback for improving the performance and efficiency of HPC jobs, with a particular focus on Lustre filesystem I/O. It reviews your script automatically and can optionally use an LLM for deeper analysis.

Getting Started

Installation

Clone the repository:

git clone https://github.com/basillicus/agentSlurm.git
cd agentSlurm

Create and activate a conda environment (recommended):

conda create -n agentslurm-env python=3.9   # Or your preferred Python version
conda activate agentslurm-env

Install AgentSlurm and its dependencies in editable mode:
```
pip install -e .
```
Verify installation:
```
agentslurm --help
```

Running the Analyzer

Basic usage:

agentslurm /path/to/your/slurm_script.slurm

With user profile specification:

agentslurm /path/to/your/slurm_script.slurm --profile Medium

Available user profiles:

- Basic: Simple explanations for newcomers to HPC
- Medium: Balanced explanations for regular users (default)
- Advanced: Technical details for experienced HPC users

Command Line Options

Basic Options

- script_path: Path to the SLURM script to analyze (required, positional)
- --profile: User experience level (Basic, Medium, Advanced) [default: Medium]
- --output-file: Path to save the analysis report in Markdown format
- --focus-on: Comma-separated list of categories to focus on (e.g., LUSTRE,PERFORMANCE)
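The basic options can be combined in a single call. The following is a minimal sketch, not part of AgentSlurm itself: `analyze_all_profiles` is a hypothetical helper name, and the `reports` directory and report file names are assumptions chosen for illustration. It writes one Markdown report per user profile so the three explanation levels can be compared side by side.

```shell
# Sketch: write one Markdown report per user profile for a single script.
# analyze_all_profiles is a hypothetical helper, not an AgentSlurm command.
analyze_all_profiles() {
    local script="$1"
    local outdir="${2:-reports}"   # assumed output directory name
    local profile
    mkdir -p "$outdir" || return 1
    for profile in Basic Medium Advanced; do
        # Restrict the analysis to Lustre findings; drop --focus-on to see everything.
        agentslurm "$script" \
            --profile "$profile" \
            --focus-on LUSTRE \
            --output-file "$outdir/report_${profile}.md" || return 1
    done
}
```

Called as `analyze_all_profiles /path/to/your/slurm_script.slurm`, this would leave `report_Basic.md`, `report_Medium.md`, and `report_Advanced.md` in `reports/`.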

LLM Options

- --use-llm: Enable LLM for deeper analysis
- --llm-provider: LLM provider to use (openai, anthropic, ollama, huggingface) [default: openai]
- --llm-model: Model to use [default: gpt-3.5-turbo]
- --api-key: API key for the LLM provider (not needed for Ollama). If omitted, AgentSlurm falls back to the provider's environment variable: OPENAI_API_KEY for openai, ANTHROPIC_API_KEY for anthropic, or HF_TOKEN for huggingface
- --base-url: Base URL for the LLM provider (for Ollama or custom Hugging Face endpoints)
- --export-rules: Export learned rules to this file (only with LLM analysis)
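Because each provider reads its key from a different environment variable, a small wrapper can choose the provider from whatever is set. This is a sketch under assumptions: `pick_llm_provider` is a hypothetical helper, and the priority order (OpenAI, then Anthropic, then Hugging Face, then a local Ollama server) is an arbitrary choice, not AgentSlurm behavior.

```shell
# Sketch: print the name of the first provider whose key variable is set.
# pick_llm_provider is a hypothetical helper; the priority order is an assumption.
pick_llm_provider() {
    if [ -n "${OPENAI_API_KEY:-}" ]; then
        echo openai
    elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
        echo anthropic
    elif [ -n "${HF_TOKEN:-}" ]; then
        echo huggingface
    else
        # No key found: fall back to a local Ollama server, which needs none.
        echo ollama
    fi
}
```

A call might then look like `agentslurm script.slurm --use-llm --llm-provider "$(pick_llm_provider)" --llm-model <model>`. Pass `--llm-model` explicitly, since the default model (gpt-3.5-turbo) only makes sense for OpenAI.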

Advanced Usage with LLM Integration

OpenAI Integration

# Basic OpenAI analysis with API key
agentslurm /path/to/your/script.slurm --use-llm --llm-provider openai --llm-model gpt-3.5-turbo --api-key YOUR_API_KEY

# Or using environment variable (set OPENAI_API_KEY=your_key in your environment)
agentslurm /path/to/your/script.slurm --use-llm --llm-provider openai --llm-model gpt-3.5-turbo

Anthropic Integration

# Using Claude models with API key
agentslurm /path/to/your/script.slurm --use-llm --llm-provider anthropic --llm-model claude-3-sonnet --api-key YOUR_API_KEY

# Or using environment variable (set ANTHROPIC_API_KEY=your_key in your environment)
agentslurm /path/to/your/script.slurm --use-llm --llm-provider anthropic --llm-model claude-3-sonnet

Ollama Integration (Local Models)

# Using locally running models (make sure Ollama is installed and running: `ollama serve`)
# No API key needed for local models
agentslurm /path/to/your/script.slurm --use-llm --llm-provider ollama --llm-model llama2

# With custom Ollama server URL
agentslurm /path/to/your/script.slurm --use-llm --llm-provider ollama --llm-model mistral --base-url http://localhost:11434/v1
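
Since the Ollama provider depends on a running local server, it can help to confirm the server is reachable before starting an analysis. The sketch below assumes `curl` is installed; `ollama_ready` is a hypothetical helper name, and it queries Ollama's `/api/tags` endpoint, which lists locally available models.

```shell
# Sketch: succeed only if an Ollama server answers at the given address.
# ollama_ready is a hypothetical helper; the default host is Ollama's standard port.
ollama_ready() {
    local host="${1:-http://localhost:11434}"
    # --fail turns HTTP errors into a non-zero exit status; the response body is discarded.
    curl --silent --fail --max-time 5 "$host/api/tags" > /dev/null
}
```

For example: `ollama_ready && agentslurm /path/to/your/script.slurm --use-llm --llm-provider ollama --llm-model llama2`.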

Hugging Face Integration

# Using Hugging Face models with API key
agentslurm /path/to/your/script.slurm --use-llm --llm-provider huggingface --llm-model microsoft/DialoGPT-medium --api-key YOUR_HF_API_KEY

# With custom Hugging Face endpoint
agentslurm /path/to/your/script.slurm --use-llm --llm-provider huggingface --base-url https://your-endpoint.hf.space --api-key YOUR_HF_API_KEY

Exporting Learned Rules

After running LLM analysis, you can export newly learned rules:

agentslurm /path/to/your/script.slurm --use-llm --llm-provider openai --api-key YOUR_API_KEY --export-rules learned_rules.yaml

Understanding the Output

Report Format

The analyzer produces a structured report with:

- Issues Found: Problems detected in your script, each marked with a severity indicator:
  - ⚠️ Warning: Something that could cause performance issues
  - ❌ Error: Something that could cause job failure
  - ℹ️ Info: Helpful information about your script
- Analysis Summary: Key information about what was detected in your script

Example Output

Agentic Slurm Analyzer - Analysis Report
=======================================

Issues Found:
-------------
1. ⚠️ Missing Lustre Striping Configuration
   This workflow appears to process large files (detected tools like bwa, gatk, etc.) without explicit Lustre striping configuration. For large-file I/O patterns, setting an appropriate stripe count and size using 'lfs setstripe' can significantly improve performance. Consider adding 'lfs setstripe -c [n] -s [size] [directory]' where appropriate.

Analysis Summary:
-----------------
• Total findings: 1
• User profile: Medium
• Tools detected: bwa, samtools

Current Analysis Features

1. Lustre I/O Analysis

LUSTRE-001: Missing Lustre Striping for Large Files

Detection: When the script includes tools commonly used for processing large files (bwa, gatk, samtools, vasp, star, hisat2, bowtie2) but no lfs setstripe command is present.

Recommendation: Add an appropriate lfs setstripe command:

# For large files, spread across multiple OSTs
# -c sets the stripe count, -S the stripe size (older Lustre releases used -s for size)
lfs setstripe -c 4 -S 64M $OUTPUT_DIR
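
The detection rule above can be approximated with a few `grep` calls. This is a simplified sketch for illustration, not AgentSlurm's actual implementation: `needs_striping` is a hypothetical function name, and the matching is a plain whole-word search over non-comment lines.

```shell
# Sketch of the LUSTRE-001 condition: a large-file tool is used but
# no 'lfs setstripe' command appears. needs_striping is a hypothetical helper.
needs_striping() {
    local script="$1"
    local body
    # Ignore comment lines, including #SBATCH directives.
    body=$(grep -v '^[[:space:]]*#' "$script")
    # Tool list copied from the detection rule above; -w matches whole words only.
    if ! printf '%s\n' "$body" | grep -Eqiw 'bwa|gatk|samtools|vasp|star|hisat2|bowtie2'; then
        return 1
    fi
    # A striping command is already present, so there is nothing to report.
    if printf '%s\n' "$body" | grep -Eq 'lfs[[:space:]]+setstripe'; then
        return 1
    fi
    return 0
}
```

`needs_striping job.slurm && echo "consider lfs setstripe"` would flag a script that calls `bwa` without setting a stripe layout.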

LUSTRE-002: Inappropriate Wide Striping for Small Files

Detection: When the script includes tools commonly used for processing many small files (fastqc, multiqc, blastn, blastp, diamond) and an lfs setstripe command with stripe count > 1 is present.

Recommendation: Use single stripe for small files:

# For many small files, use single stripe
lfs setstripe -c 1 $OUTPUT_DIR
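
As with LUSTRE-001, the condition can be sketched in shell. The function below is a hypothetical illustration, not AgentSlurm's implementation: it looks for a small-file tool and then extracts the value given to `-c` or `--stripe-count` on any `lfs setstripe` line. Treating a count of -1 (stripe across all OSTs) as wide is an assumption of this sketch.

```shell
# Sketch of the LUSTRE-002 condition: a small-file tool is used together with
# a stripe count greater than 1. wide_striping_on_small_files is a hypothetical helper.
wide_striping_on_small_files() {
    local script="$1"
    local counts count
    # Tool list copied from the detection rule above; comment lines are ignored.
    grep -v '^[[:space:]]*#' "$script" \
        | grep -Eqiw 'fastqc|multiqc|blastn|blastp|diamond' || return 1
    # Collect the number following -c or --stripe-count on each 'lfs setstripe' line.
    counts=$(grep -E 'lfs[[:space:]]+setstripe' "$script" \
        | grep -Eo -- '(-c|--stripe-count)[[:space:]=]*-?[0-9]+' \
        | grep -Eo -- '-?[0-9]+$')
    for count in $counts; do
        # -1 asks Lustre to stripe across all available OSTs, so treat it as wide.
        if [ "$count" -eq -1 ] || [ "$count" -gt 1 ]; then
            return 0
        fi
    done
    return 1
}
```

On a real Lustre mount, `lfs getstripe $OUTPUT_DIR` shows the layout that is actually in effect for a directory.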

LLM Provider Setup

OpenAI

- Create an account at OpenAI Platform
- Generate an API key in the dashboard
- Use the API key with the --api-key option or set the OPENAI_API_KEY environment variable

Anthropic

- Create an account at Anthropic
- Generate an API key
- Use the API key with the --api-key option or set the ANTHROPIC_API_KEY environment variable

Ollama (Local Models)

- Install Ollama from ollama.ai
- Pull a model: ollama pull llama2
- Start the Ollama server: ollama serve
- Run AgentSlurm with --llm-provider ollama; no API key is needed

Hugging Face

- Create an account at Hugging Face and get an API key
- Use the API key with the --api-key option or set the HF_TOKEN environment variable

Security Considerations

- Treat API keys as sensitive information.
- Do not hardcode API keys in script files.
- API keys can be passed with --api-key or through environment variables. Prefer environment variables or secure credential management in production environments and on shared systems, because command-line arguments can show up in shell history and process listings.
- AgentSlurm processes only the SLURM script content and does not store your API keys.
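
One way to keep a key off the command line is to store it in a file that only you can read and export it just before running the analyzer. The sketch below is an illustration under assumptions: `load_openai_key` is a hypothetical helper and the default key file path is made up. Such a file could be created once inside a subshell with `umask 077` so that it is readable by its owner only.

```shell
# Sketch: export OPENAI_API_KEY from a private file instead of passing --api-key.
# load_openai_key and the default path are hypothetical, chosen for illustration.
load_openai_key() {
    local key_file="${1:-$HOME/.config/agentslurm/openai.key}"
    if [ ! -r "$key_file" ]; then
        echo "key file not readable: $key_file" >&2
        return 1
    fi
    # The key reaches AgentSlurm through the environment, so it never
    # appears in the command's argument list or in shell history.
    OPENAI_API_KEY=$(cat "$key_file")
    export OPENAI_API_KEY
}
```

After `load_openai_key`, the analyzer can be run with `--use-llm --llm-provider openai` and no `--api-key` argument.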

Best Practices

- Analyze Your Scripts: Run AgentSlurm before submitting large or important jobs
- Check Lustre I/O: Apply Lustre striping recommendations for better performance
- Use Appropriate User Profiles: Select the profile level that matches your HPC expertise
- Enable LLM Integration: For complex scripts, use LLM analysis to get deeper insights
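
The first practice, analyzing before submitting, can be folded into a small wrapper around `sbatch`. This is a sketch under assumptions: `review_and_submit` is a hypothetical helper and the report file name is made up. Because this guide does not document how AgentSlurm's exit status relates to findings, the sketch asks for explicit confirmation instead of deciding automatically.

```shell
# Sketch: analyze a job script, then submit it only after confirmation.
# review_and_submit is a hypothetical helper; the report name is an assumption.
review_and_submit() {
    local script="$1"
    local report="${script}.analysis.md"
    local answer
    # Stop if the analyzer itself could not run.
    agentslurm "$script" --output-file "$report" || return 1
    printf 'Report written to %s\n' "$report"
    printf 'Submit %s now? [y/N] ' "$script"
    read -r answer
    case "$answer" in
        y|Y) sbatch "$script" ;;
        *)   echo "Not submitted." ;;
    esac
}
```

Usage: `review_and_submit /path/to/your/slurm_script.slurm`.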

Getting More Help

If you encounter issues or have questions:

- Run your script through AgentSlurm with different user profiles
- Check your HPC center's documentation for Lustre guidelines
- Consult with system administrators for site-specific recommendations
