---
title: Quickstart
description: Get started with Handit.ai's complete AI observability and optimization platform in under 30 minutes.
sidebarTitle: Quickstart
---

import { Callout } from "nextra/components"; import { Steps } from "nextra/components"; import { Tabs } from "nextra/components";

# Complete Handit.ai Quickstart

The Open Source Engine that Auto-Improves Your AI.
Handit evaluates every agent decision, auto-generates better prompts and datasets, A/B-tests the fix, and lets you control what goes live.

**What you'll build:** A fully observable, continuously evaluated, and automatically optimizing AI system that improves itself based on real production data.

## Overview: The Complete Journey

Here's what we'll accomplish in three phases:

### [Phase 1: AI Observability](#phase-1-ai-observability-5-minutes) ⏱️ 5 minutes

Set up comprehensive tracing to see inside your AI agents and understand what they're doing.

### [Phase 2: Quality Evaluation](#phase-2-quality-evaluation-10-minutes) ⏱️ 10 minutes

Add automated evaluation to continuously assess performance across multiple quality dimensions.

### [Phase 3: Self-Improving AI](#phase-3-self-improving-ai-15-minutes) ⏱️ 15 minutes

Enable automatic optimization that generates better prompts, tests them, and provides proven improvements.

**The Result**: Complete visibility into performance with automated optimization recommendations based on real production data.

## Prerequisites

Before we start, make sure you have:

  • A Handit.ai account (sign up at [beta.handit.ai](https://beta.handit.ai))
  • An AI agent or LLM application written in Python or JavaScript
  • API credentials for an LLM provider (e.g., OpenAI), used later for evaluation and optimization

## Phase 1: AI Observability (5 minutes)

Let's add comprehensive tracing to see exactly what your AI is doing.

### Step 1: Install the SDK

<Tabs items={["Python", "JavaScript"]} defaultIndex="0"> <Tabs.Tab>

```bash
pip install handit_ai
```

</Tabs.Tab> <Tabs.Tab>

```bash
npm i @handit.ai/handit-ai
```

</Tabs.Tab> </Tabs>

### Step 2: Get Your Integration Token

  1. Log into your Handit.ai Dashboard
  2. Go to **Settings** → **Integrations**
  3. Copy your integration token


### Step 3: Add Simplified Tracing

Now, let's add tracing to your main agent function using our simplified approach. You only need to instrument the entry point - no need to trace individual child functions.

<Tabs items={["Python", "JavaScript"]} defaultIndex="0"> <Tabs.Tab>

Simplified Python Approach - Just add the decorator to your entry point:

```python
# Auto-generated by handit-cli setup
from handit_ai import tracing, configure
import os

configure(HANDIT_API_KEY=os.getenv("HANDIT_API_KEY"))

# Tracing added to your main agent function (entry point)
@tracing(agent="customer-service-agent")
async def process_customer_request(user_message: str):
    # Your existing agent logic (unchanged)
    intent = await classify_intent(user_message)      # Not traced individually
    context = await search_knowledge(intent)          # Not traced individually
    response = await generate_response(context)       # Not traced individually
    return response
```

For FastAPI endpoints, put the decorator below the endpoint:

```python
from handit_ai import tracing, configure
import os
from fastapi import FastAPI

configure(HANDIT_API_KEY=os.getenv("HANDIT_API_KEY"))

app = FastAPI()

@app.post("/process")
@tracing(agent="customer-service-agent")
async def process_customer_request(user_message: str):
    # Your existing agent logic (unchanged)
    intent = await classify_intent(user_message)      # Not traced individually
    context = await search_knowledge(intent)          # Not traced individually
    response = await generate_response(context)       # Not traced individually
    return response
```

</Tabs.Tab> <Tabs.Tab>

Simplified JavaScript Approach - Just wrap your entry point:

```javascript
// Auto-generated by handit-cli setup
import { configure, startTracing, endTracing } from '@handit.ai/handit-ai';

configure({
  HANDIT_API_KEY: process.env.HANDIT_API_KEY
});

// Tracing added to your main agent function (entry point)
export const processCustomerRequest = async (userMessage) => {
  startTracing({ agent: "customer-service-agent" });
  try {
    // Your existing agent logic (unchanged)
    const intent = await classifyIntent(userMessage);     // Not traced individually
    const context = await searchKnowledge(intent);        // Not traced individually
    const response = await generateResponse(context);     // Not traced individually
    return response;
  } finally {
    endTracing();
  }
};
```

</Tabs.Tab> </Tabs>

**Simplified Approach:** You only need to add tracing to your entry point function; Handit.ai automatically traces the entire execution flow from there.

**Phase 1 Complete!** 🎉 You now have full observability with automatic tracing of your entire agent execution flow from the entry point.

➡️ Want to dive deeper? Check out our detailed Tracing Quickstart for advanced features and best practices.

## Phase 2: Quality Evaluation (10 minutes)

Now let's add automated evaluation to continuously assess quality across multiple dimensions.

### Step 1: Connect Evaluation Models

  1. Go to **Settings** → **Model Tokens**
  2. Add your OpenAI or other model credentials
  3. These models will act as "judges" to evaluate responses


### Step 2: Create Focused Evaluators

Create separate evaluators for each quality aspect. Critical principle: One evaluator = one quality dimension.

  1. Go to **Evaluation** → **Evaluation Suite**
  2. Click Create New Evaluator

**Example Evaluator 1: Response Completeness**

```
You are evaluating whether an AI response completely addresses the user's question.

Focus ONLY on completeness - ignore other quality aspects.

User Question: {input}
AI Response: {output}

Rate on a scale of 1-10:
1-2 = Missing major parts of the question
3-4 = Addresses some parts but incomplete
5-6 = Addresses most parts adequately
7-8 = Addresses all parts well
9-10 = Thoroughly addresses every aspect

Output format:
Score: [1-10]
Reasoning: [Brief explanation]
```

**Example Evaluator 2: Accuracy Check**

```
You are checking if an AI response contains accurate information.

Focus ONLY on factual accuracy - ignore other aspects.

User Question: {input}
AI Response: {output}

Rate on a scale of 1-10:
1-2 = Contains obvious false information
3-4 = Contains questionable claims
5-6 = Mostly accurate with minor concerns
7-8 = Accurate information
9-10 = Completely accurate and verifiable

Output format:
Score: [1-10]
Reasoning: [Brief explanation]
```
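If you ever need to consume evaluator responses in your own tooling, the `Score:` / `Reasoning:` format above is easy to parse mechanically. A minimal, illustrative sketch in plain Python (`parse_evaluator_output` is not part of the Handit.ai SDK, just an example):

```python
import re

def parse_evaluator_output(text: str) -> dict:
    """Parse a 'Score: N / Reasoning: ...' evaluator response."""
    score_match = re.search(r"Score:\s*(\d+)", text)
    reasoning_match = re.search(r"Reasoning:\s*(.+)", text, re.DOTALL)
    if not score_match:
        raise ValueError("No score found in evaluator output")
    score = int(score_match.group(1))
    if not 1 <= score <= 10:
        raise ValueError(f"Score {score} is outside the 1-10 scale")
    return {
        "score": score,
        "reasoning": reasoning_match.group(1).strip() if reasoning_match else "",
    }

result = parse_evaluator_output("Score: 8\nReasoning: Addresses all parts well.")
```

Keeping the output format strict like this is exactly why each evaluator prompt ends with an explicit `Output format:` section.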


### Step 3: Associate Evaluators to Your LLM Nodes

  1. Go to **Agent Performance**
  2. Select your LLM node (e.g., "response-generator")
  3. Click **Manage Evaluators** in the menu
  4. Add your evaluators

### Step 4: Monitor Results

View real-time evaluation results in:

  • Tracing tab: Individual evaluation scores
  • Agent Performance: Quality trends over time

**Tracing Dashboard**: individual evaluation scores

**Agent Performance Dashboard**: quality trends over time

**Phase 2 Complete!** 🎉 Continuous evaluation is now running across multiple quality dimensions with real-time insights into performance trends.

➡️ Want more sophisticated evaluators? Check out our detailed Evaluation Quickstart for advanced techniques.

## Phase 3: Self-Improving AI (15 minutes)

Finally, let's enable automatic optimization that generates better prompts and provides proven improvements.

### Step 1: Connect Optimization Models

  1. Go to **Settings** → **Model Tokens**
  2. Select optimization model tokens
  3. Self-improving AI automatically activates once configured

**Automatic Activation**: Once optimization tokens are configured, the system automatically begins analyzing evaluation data and generating optimizations. No additional setup required!

### Step 2: Deploy Optimizations

  1. Review Recommendations in Release Hub
  2. Compare Performance between current and optimized prompts
  3. Mark as Production for prompts you want to deploy
  4. Fetch via SDK in your application


Fetch Optimized Prompts:

<Tabs items={["Python", "JavaScript"]} defaultIndex="0"> <Tabs.Tab>

```python
from handit_ai import HanditClient

# Initialize client
handit = HanditClient(api_key="your-api-key")

# Fetch current production prompt
optimized_prompt = handit.fetch_optimized_prompt(
    model_id="response-generator"
)

# Use in your LLM calls
response = your_llm_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": optimized_prompt},
        {"role": "user", "content": user_query}
    ]
)
```

</Tabs.Tab> <Tabs.Tab>

```javascript
import { HanditClient } from '@handit.ai/handit-ai';

const handit = new HanditClient({ apiKey: 'your-api-key' });

// Fetch current production prompt
const optimizedPrompt = await handit.fetchOptimizedPrompt({
  modelId: 'response-generator'
});

// Use in your LLM calls
const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'system', content: optimizedPrompt },
    { role: 'user', content: userQuery }
  ]
});
```

</Tabs.Tab> </Tabs>
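A prompt fetch can fail at runtime (network issues, a missing token), so it is worth degrading gracefully to a baseline prompt rather than failing the request. A minimal sketch of that pattern, where `fetch_fn`, `get_prompt`, and `DEFAULT_PROMPT` are illustrative names standing in for the SDK call shown above:

```python
DEFAULT_PROMPT = "You are a helpful customer service assistant."

def get_prompt(fetch_fn, model_id: str, fallback: str = DEFAULT_PROMPT) -> str:
    """Return the optimized prompt, or the fallback if the fetch fails."""
    try:
        prompt = fetch_fn(model_id=model_id)
        # Guard against empty responses as well as exceptions
        return prompt if prompt else fallback
    except Exception:
        # Degrade gracefully instead of failing the user request
        return fallback

# Example with a stand-in fetcher that raises (simulating an outage)
def broken_fetch(model_id):
    raise ConnectionError("Handit.ai unreachable")

prompt = get_prompt(broken_fetch, "response-generator")
```

Caching the last successfully fetched prompt is a natural extension of this pattern if you call the fetch on every request.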

**Phase 3 Complete!** 🎉 You now have a self-improving AI that automatically detects quality issues, generates better prompts, tests them in the background, and provides proven improvements.

➡️ Want advanced optimization features? Check out our detailed Optimization Quickstart for CI/CD integration and deployment strategies.

## What You've Accomplished

Congratulations! You now have a complete AI observability and optimization system:

### ✅ Full Observability

  • Complete visibility into operations
  • Real-time monitoring of all LLM calls and tools
  • Detailed execution traces with timing and error tracking

### ✅ Continuous Evaluation

  • Automated quality assessment across multiple dimensions
  • Real-time evaluation scores and trends
  • Quality insights to identify improvement opportunities

### ✅ Self-Improving AI

  • Automatic detection of quality issues
  • AI-generated prompt optimizations
  • Background A/B testing with statistical confidence
  • Production-ready improvements delivered via SDK

## Next Steps

## Resources

**Ready to transform your AI?** Visit [beta.handit.ai](https://beta.handit.ai) to get started with the complete Handit.ai platform today.

## Troubleshooting

### Tracing Not Working?

  • Verify your API key is correct and set as an environment variable
  • Ensure you're using the tracing functions correctly (the `@tracing` decorator in Python; `startTracing`/`endTracing` in JavaScript)
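A quick way to confirm the key is actually visible to your process, with a hypothetical `check_handit_key` helper (illustrative only, not part of the SDK):

```python
import os

def check_handit_key() -> str:
    """Report whether HANDIT_API_KEY is usable from this process."""
    key = os.getenv("HANDIT_API_KEY")
    if not key:
        return "HANDIT_API_KEY is not set - export it before starting your app"
    if key != key.strip():
        return "HANDIT_API_KEY has leading/trailing whitespace - re-export it"
    return f"HANDIT_API_KEY is set ({len(key)} characters)"

print(check_handit_key())
```

Run this in the same shell or container your agent starts from; keys exported in a different session will not be inherited.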

### Evaluations Not Running?

  • Confirm model tokens are valid and have sufficient credits
  • Verify LLM nodes are receiving traffic
  • Check evaluation percentages are > 0%

### Optimizations Not Generating?

  • Ensure evaluation data shows quality issues (scores below threshold)
  • Verify optimization model tokens are configured
  • Confirm sufficient evaluation data has been collected

### Need Help?