
TryOn VTON Agent: LangChain-Based Multi-Provider Agent#80

Merged
kailashahirwar merged 1 commit into main from agents
Dec 19, 2025

Conversation


@kailashahirwar commented Dec 19, 2025

Virtual Try-On Agent - LangChain-Based Multi-Provider Agent

🎯 Overview

This PR introduces a LangChain-based Virtual Try-On Agent that intelligently selects and uses the appropriate virtual try-on adapter based on natural language prompts. The agent provides a unified interface to multiple VTOn providers (Kling AI, Amazon Nova Canvas, and Segmind) with automatic provider selection and comprehensive error handling.

✨ Features

Core Agent Capabilities

  • Intelligent Provider Selection: Automatically chooses the best VTOn adapter based on keywords in user prompts
  • Multi-LLM Support: Compatible with OpenAI GPT, Anthropic Claude, and Google Gemini
  • Natural Language Interface: Accepts conversational prompts like "Use Kling AI to create a virtual try-on"
  • Real-time Progress Tracking: Streaming support with intermediate step visibility
  • Token-Efficient Caching: Stores full image data in memory cache to avoid LLM token limits
  • Flexible Input: Supports both file paths and URLs for images

Supported Virtual Try-On Providers

  1. Kling AI - High-quality results with asynchronous processing
  2. Amazon Nova Canvas - AWS Bedrock integration with automatic garment detection
  3. Segmind - Fast and efficient for quick iterations

CLI Interface

  • Comprehensive command-line tool (vton_agent.py)
  • Support for all three LLM providers
  • Configurable temperature, model selection, and output directory
  • Verbose mode for debugging agent reasoning
  • Automatic image format handling (PNG/Base64)

📁 Files Added/Modified

New Files

  • tryon/agents/vton/agent.py - Main VTOnAgent class implementation
  • tryon/agents/vton/tools.py - LangChain tool wrappers for each VTOn adapter
  • tryon/agents/vton/__init__.py - Module exports
  • vton_agent.py - CLI interface for the agent
  • docs/docs/agents/vton-agent.md - Comprehensive documentation

Modified Files

  • tryon/agents/__init__.py - Export VTOnAgent module
  • README.md - Added VTOn Agent section with usage examples
  • requirements.txt - Added LangChain dependencies
  • docs/sidebars.ts - Added agent documentation to sidebar

🚀 Usage

Python API

from tryon.agents.vton import VTOnAgent

# Initialize agent with OpenAI
agent = VTOnAgent(llm_provider="openai")

# Generate virtual try-on
result = agent.generate(
    person_image="person.jpg",
    garment_image="shirt.jpg",
    prompt="Create a virtual try-on using Kling AI"
)

if result["status"] == "success":
    images = result["images"]
    provider = result["provider"]
    print(f"Generated {len(images)} images using {provider}")

CLI

# Basic usage
python vton_agent.py --person person.jpg --garment shirt.jpg \
    --prompt "Use Kling AI for high-quality try-on"

# Use Anthropic Claude
python vton_agent.py --person person.jpg --garment shirt.jpg \
    --prompt "Generate with Nova Canvas" --llm-provider anthropic

# Verbose mode with custom output directory
python vton_agent.py --person person.jpg --garment shirt.jpg \
    --prompt "Try Segmind for fast results" \
    --output-dir results/ --verbose

🏗️ Architecture

The agent follows LangChain's ReAct pattern:

User Prompt → LLM Reasoning → Tool Selection → Adapter Execution → Result Formatting
  1. User provides: Person image, garment image, and natural language prompt
  2. Agent analyzes: Prompt to identify desired provider (or defaults to Kling AI)
  3. Tool executes: Selected adapter with provided images
  4. Cache stores: Full image data (avoiding token limits)
  5. Agent returns: Structured result with status, provider, and images

🔧 Technical Details

LangChain Integration

  • Uses create_agent() API for agent creation
  • Custom @tool decorators for each VTOn provider
  • Pydantic schemas for type-safe tool inputs
  • Async streaming with astream() for progress tracking
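
The tool-wrapping pattern can be illustrated without LangChain installed; below, a plain decorator and dataclass stand in for LangChain's `@tool` and the Pydantic args schema (all names are illustrative, not the actual module API):

```python
import json
from dataclasses import dataclass

# Stand-in registry; LangChain builds this for you from @tool-decorated functions.
TOOL_REGISTRY = {}

@dataclass
class VTOnToolInput:
    """Stand-in for a Pydantic args schema: typed tool inputs."""
    person_image: str
    garment_image: str

def tool(name):
    """Minimal stand-in for LangChain's @tool decorator: registers by name."""
    def decorator(fn):
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator

@tool("segmind_virtual_tryon")
def segmind_virtual_tryon(args: VTOnToolInput) -> str:
    # A real tool would call the Segmind adapter here and cache the image data.
    return json.dumps({"status": "success", "provider": "segmind"})
```

The real tools return JSON strings in the same way, so the LLM sees a compact, structured result rather than raw adapter objects.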

Cache Management

  • Global in-memory cache for tool outputs
  • MD5-based cache keys to reference full image data
  • Prevents LLM token exhaustion from large base64 images
  • Cache retrieval via get_tool_output_from_cache()
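
The caching scheme reduces to a small stdlib pattern: tools store their full payload under an MD5 key and hand the LLM only the key. A minimal sketch (the helper names here are illustrative):

```python
import hashlib
import json
from typing import Optional

# Illustrative in-memory cache: tools park large base64 payloads here and
# pass the LLM only a short MD5 key, keeping prompts within token limits.
_TOOL_OUTPUT_CACHE = {}

def cache_tool_output(payload: dict) -> str:
    """Store a tool's full output and return a compact cache key."""
    key = hashlib.md5(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    _TOOL_OUTPUT_CACHE[key] = payload
    return key

def get_tool_output_from_cache(key: str) -> Optional[dict]:
    """Retrieve the full payload (e.g. base64 images) by its cache key."""
    return _TOOL_OUTPUT_CACHE.get(key)
```

As noted under Known Issues, a cache like this grows without bound until TTL and size limits are added.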

Error Handling

  • Comprehensive try/except blocks in all tools
  • Graceful fallback from streaming to standard execution
  • Detailed error messages with provider context
  • Validation for file paths and URLs
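
The streaming fallback follows a standard shape: attempt the streaming path, and on failure run a single blocking invocation instead of aborting the request. A hypothetical stand-in (function names are not the actual API):

```python
# Sketch of the streaming-to-standard fallback: if streaming raises (e.g. the
# LLM backend does not support it), fall back to one blocking call.
def run_with_fallback(stream_fn, invoke_fn, payload):
    try:
        return list(stream_fn(payload))  # consume intermediate steps
    except NotImplementedError:
        return [invoke_fn(payload)]

def fake_stream(payload):
    raise NotImplementedError("streaming unsupported")

def fake_invoke(payload):
    return {"status": "success", "provider": payload["provider"]}

result = run_with_fallback(fake_stream, fake_invoke, {"provider": "segmind"})
```

Either path yields the same structured result, so callers never need to know which mode actually ran.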

📦 Dependencies

New dependencies added to requirements.txt:

langchain>=1.0.0
langchain-openai>=0.2.0
langchain-anthropic>=0.2.0
langchain-google-genai>=2.0.0
pydantic>=2.0.0

✅ Testing

Manual Testing Checklist

  • Agent initializes with OpenAI provider
  • Agent initializes with Anthropic provider
  • Agent initializes with Google provider
  • Kling AI provider selection via prompt
  • Nova Canvas provider selection via prompt
  • Segmind provider selection via prompt
  • Default to Kling AI when no provider specified
  • File path input handling
  • URL input handling
  • CLI argument parsing
  • Verbose mode output
  • Error handling for missing API keys
  • Cache-based image retrieval

Test Commands Used

# Test with Kling AI
python vton_agent.py --person data/female-model.jpeg --garment data/garment.png \
    --prompt "Use Kling AI" --verbose

# Test with Nova Canvas
python vton_agent.py --person data/female-model.jpeg --garment data/garment.png \
    --prompt "Try Amazon Nova Canvas" --verbose

# Test with Segmind
python vton_agent.py --person data/female-model.jpeg --garment data/garment.png \
    --prompt "Generate with Segmind" --verbose

📚 Documentation

Complete documentation is available in docs/docs/agents/vton-agent.md.

🐛 Known Issues & Future Work

Known Issues

  1. create_agent() API compatibility needs verification with latest LangChain version
  2. Default model names (e.g., gpt-5.1) may need updates based on actual model availability
  3. Global cache has no TTL or size limits (unlimited growth)

Future Enhancements

  • Add unit and integration tests
  • Implement generate_and_decode() method fully
  • Add separate async/sync methods (generate() and agenerate())
  • Implement cache TTL and size limits
  • Add input validation (image format, size, accessibility)
  • Add rate limiting and retry logic with exponential backoff
  • Add logging and metrics/telemetry
  • Support for additional VTOn providers
  • Batch processing support for multiple images
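
The planned retry logic would likely take the usual exponential-backoff shape; a stdlib sketch of that enhancement (not part of this PR):

```python
import time

def retry_with_backoff(fn, retries=3, base_delay=0.01):
    """Call fn, retrying on exception with exponentially growing delays."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the original error
            time.sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, ...

# Simulated flaky provider call: fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient provider error")
    return "ok"
```

In production the base delay would be on the order of seconds and tuned per provider.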

🔐 Environment Variables Required

# LLM Providers (at least one required)
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export GOOGLE_API_KEY="your-google-key"

# VTOn Providers (based on which one you'll use)
export KLING_API_KEY="your-kling-key"
export KLING_API_SECRET="your-kling-secret"
export AWS_ACCESS_KEY_ID="your-aws-key"
export AWS_SECRET_ACCESS_KEY="your-aws-secret"
export AWS_REGION="us-east-1"
export SEGMIND_API_KEY="your-segmind-key"

🎉 Benefits

  1. Unified Interface: Single API for multiple VTOn providers
  2. Developer-Friendly: Natural language prompts instead of remembering adapter APIs
  3. Extensible: Easy to add new providers by creating new tools
  4. Production-Ready: Error handling, streaming, and caching built-in
  5. Framework Agnostic: Works with any LangChain-compatible LLM

📝 Migration Guide

For users currently using adapters directly:

Before (Direct Adapter Usage)

from tryon.api import KlingAIVTONAdapter

adapter = KlingAIVTONAdapter()
result = adapter.generate(source_image="person.jpg", reference_image="shirt.jpg")

After (Agent Usage)

from tryon.agents.vton import VTOnAgent

agent = VTOnAgent()
result = agent.generate(
    person_image="person.jpg",
    garment_image="shirt.jpg",
    prompt="Use Kling AI"
)

Note: Direct adapter usage still works and is recommended for programmatic/batch processing where you know exactly which provider to use.

🤝 Contributing

To add a new VTOn provider to the agent:

  1. Create the adapter in tryon/api/
  2. Add a tool in tryon/agents/vton/tools.py:
    @tool("provider_name_virtual_tryon", args_schema=YourToolInput)
    def provider_virtual_tryon(person_image, garment_image, **kwargs):
        adapter = YourAdapter()
        result = adapter.generate(...)
        return json.dumps(result)
  3. Update get_vton_tools() to include your tool
  4. Update system prompt in agent.py to mention your provider
  5. Add documentation and examples

Type: Feature
Priority: High
Breaking Changes: None
Backward Compatible: Yes


@kailashahirwar kailashahirwar changed the title tryon agent for vton added TryOn VTON Agent: LangChain-Based Multi-Provider Agent Dec 19, 2025
@kailashahirwar kailashahirwar merged commit 847db92 into main Dec 19, 2025
1 check passed