KHemanthRaju edited this page Nov 18, 2025 · 1 revision

API Reference

Complete reference for all API endpoints in the RAG Process Visualizer.

Base URL

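All API endpoints are served by your Next.js application under the base URLs below. The endpoint paths in this reference already include the /api prefix, so POST /api/chunk in development resolves to http://localhost:3000/api/chunk: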

  • Development: http://localhost:3000/api
  • Production: https://your-domain.com/api

Endpoints

POST /api/chunk

Chunks a document into smaller pieces based on the specified chunk size.

Request Body:

{
  "text": "Your document text here...",
  "chunkSize": 200
}

Parameters:

  • text (string, required): The document text to chunk
  • chunkSize (number, optional): Maximum characters per chunk (default: 200, min: 50, max: 500)

Response:

{
  "chunks": [
    {
      "id": "chunk-0",
      "text": "First chunk text...",
      "index": 0
    },
    {
      "id": "chunk-1",
      "text": "Second chunk text...",
      "index": 1
    }
  ]
}

Example:

curl -X POST http://localhost:3000/api/chunk \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your document text...",
    "chunkSize": 200
  }'

Error Responses:

  • 400: Invalid text input
  • 500: Chunking error
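
The splitting strategy used by this route is not documented here. As a rough sketch, assuming plain fixed-width character windows and assuming that out-of-range chunkSize values are clamped to the documented 50-500 bounds (the real route may reject them, or split on word or sentence boundaries instead), the behaviour could look like the following. `chunkText` is a hypothetical helper name, not part of the project.

```typescript
interface Chunk {
  id: string;
  text: string;
  index: number;
}

// Hypothetical helper: splits text into fixed-width character windows.
// Assumption: out-of-range sizes are clamped to [50, 500] rather than rejected.
function chunkText(text: string, chunkSize: number = 200): Chunk[] {
  const size = Math.min(500, Math.max(50, Math.floor(chunkSize)));
  const chunks: Chunk[] = [];
  for (let start = 0; start < text.length; start += size) {
    const index = chunks.length;
    chunks.push({
      id: `chunk-${index}`,
      text: text.slice(start, start + size),
      index,
    });
  }
  return chunks;
}
```

With this sketch, a 450-character document and a chunkSize of 200 yields three chunks of 200, 200, and 50 characters, with ids chunk-0 through chunk-2.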

POST /api/embed

Generates vector embeddings for document chunks.

Request Body:

{
  "chunks": [
    {
      "id": "chunk-0",
      "text": "Chunk text...",
      "index": 0
    }
  ]
}

Parameters:

  • chunks (array, required): Array of chunk objects

Response:

{
  "embeddings": [
    {
      "chunkId": "chunk-0",
      "vector": [0.123, -0.456, 0.789, ...],
      "dimension": 384
    }
  ]
}

Example:

curl -X POST http://localhost:3000/api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "chunks": [
      {
        "id": "chunk-0",
        "text": "Chunk text...",
        "index": 0
      }
    ]
  }'

Note: Currently uses simulated embeddings. In production, integrate with OpenAI, Cohere, or similar services.

Error Responses:

  • 400: Invalid chunks input
  • 500: Embedding generation error
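
How the embeddings are simulated is not specified in this reference. One possible approach, shown only as a sketch, is to seed a small pseudo-random generator from a hash of the chunk text so that the same text always maps to the same 384-dimensional unit vector. The names `hashString`, `fakeEmbed`, and `DIMENSION` are hypothetical, and the vectors carry no semantic meaning.

```typescript
interface Chunk {
  id: string;
  text: string;
  index: number;
}

interface Embedding {
  chunkId: string;
  vector: number[];
  dimension: number;
}

const DIMENSION = 384;

// FNV-1a style 32-bit hash over UTF-16 code units, used only as a seed.
function hashString(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Hypothetical stand-in for a real embedding model: same text, same vector.
function fakeEmbed(chunk: Chunk): Embedding {
  let state = hashString(chunk.text) || 1; // xorshift32 needs a nonzero seed
  const vector: number[] = [];
  for (let i = 0; i < DIMENSION; i++) {
    state ^= state << 13;
    state ^= state >>> 17;
    state ^= state << 5;
    state >>>= 0; // keep the state as an unsigned 32-bit integer
    vector.push((state / 0xffffffff) * 2 - 1); // map into [-1, 1]
  }
  // Normalize to unit length so cosine similarity reduces to a dot product.
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  return {
    chunkId: chunk.id,
    vector: vector.map((v) => v / norm),
    dimension: DIMENSION,
  };
}
```

Determinism is useful for a visualizer because repeated runs over the same document produce identical results, but similarity scores between such vectors are effectively random.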

POST /api/vectordb

Stores embeddings in a vector database.

Request Body:

{
  "embeddings": [
    {
      "chunkId": "chunk-0",
      "vector": [0.123, -0.456, ...],
      "dimension": 384
    }
  ]
}

Parameters:

  • embeddings (array, required): Array of embedding objects

Response:

{
  "success": true,
  "stored": 5,
  "message": "Successfully stored 5 vectors in the database"
}

Example:

curl -X POST http://localhost:3000/api/vectordb \
  -H "Content-Type: application/json" \
  -d '{
    "embeddings": [...]
  }'

Note: Currently simulates storage. In production, integrate with Pinecone, Weaviate, Qdrant, or similar.

Error Responses:

  • 400: Invalid embeddings input
  • 500: Storage error
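
The simulated storage is not described in detail. A minimal in-memory stand-in, assuming vectors are keyed by chunkId and that a vector whose length disagrees with its declared dimension is treated as invalid, might look like this. `InMemoryVectorStore` and `StoreResult` are hypothetical names chosen for illustration; the message string mirrors the response shown above.

```typescript
interface Embedding {
  chunkId: string;
  vector: number[];
  dimension: number;
}

interface StoreResult {
  success: boolean;
  stored: number;
  message: string;
}

// Hypothetical in-memory stand-in for a vector database, keyed by chunkId.
class InMemoryVectorStore {
  private readonly vectors = new Map<string, number[]>();

  // Inserts or overwrites one entry per embedding.
  upsert(embeddings: Embedding[]): StoreResult {
    for (const e of embeddings) {
      if (e.vector.length !== e.dimension) {
        throw new Error(
          `Vector for ${e.chunkId} has length ${e.vector.length}, expected ${e.dimension}`
        );
      }
      this.vectors.set(e.chunkId, e.vector);
    }
    return {
      success: true,
      stored: embeddings.length,
      message: `Successfully stored ${embeddings.length} vectors in the database`,
    };
  }

  get size(): number {
    return this.vectors.size;
  }
}
```

Keying by chunkId makes repeated uploads of the same chunk idempotent, which is the behaviour most hosted vector databases expose as an upsert.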

POST /api/query

Processes a query and retrieves relevant chunks using similarity search.

Request Body:

{
  "query": "What is machine learning?",
  "embeddings": [
    {
      "chunkId": "chunk-0",
      "vector": [0.123, -0.456, ...],
      "dimension": 384
    }
  ],
  "chunks": [
    {
      "id": "chunk-0",
      "text": "Chunk text...",
      "index": 0
    }
  ]
}

Parameters:

  • query (string, required): User's search query
  • embeddings (array, required): All stored embeddings
  • chunks (array, required): All chunks corresponding to embeddings

Response:

{
  "query": "What is machine learning?",
  "relevantChunks": [
    {
      "id": "chunk-2",
      "text": "Relevant chunk text...",
      "index": 2
    }
  ],
  "scores": [0.95, 0.87, 0.82]
}

Algorithm:

  1. Generates embedding for the query
  2. Calculates cosine similarity with all chunk embeddings
  3. Returns top 3 most relevant chunks with scores
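
The ranking steps above can be sketched as follows. This is an illustration rather than the route's actual code: it assumes the query embedding is already available and passed in as `queryVector`, and the function names `cosineSimilarity` and `retrieve` are hypothetical.

```typescript
interface Chunk {
  id: string;
  text: string;
  index: number;
}

interface Embedding {
  chunkId: string;
  vector: number[];
  dimension: number;
}

interface QueryResult {
  query: string;
  relevantChunks: Chunk[];
  scores: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Scores every embedding against the query and keeps the topK best matches.
function retrieve(
  query: string,
  queryVector: number[],
  embeddings: Embedding[],
  chunks: Chunk[],
  topK: number = 3
): QueryResult {
  const byId = new Map(chunks.map((c): [string, Chunk] => [c.id, c]));
  const ranked = embeddings
    .map((e) => ({
      chunk: byId.get(e.chunkId),
      score: cosineSimilarity(queryVector, e.vector),
    }))
    // Drop embeddings whose chunk was not supplied.
    .filter((r): r is { chunk: Chunk; score: number } => r.chunk !== undefined)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
  return {
    query,
    relevantChunks: ranked.map((r) => r.chunk),
    scores: ranked.map((r) => r.score),
  };
}
```

Cosine similarity in general ranges from -1 to 1, so a sketch like this can return negative scores when a chunk vector points away from the query vector.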

Example:

curl -X POST http://localhost:3000/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is machine learning?",
    "embeddings": [...],
    "chunks": [...]
  }'

Error Responses:

  • 400: Invalid input
  • 500: Query processing error

POST /api/generate

Generates a response based on the query and retrieved context.

Request Body:

{
  "query": "What is machine learning?",
  "context": [
    {
      "id": "chunk-2",
      "text": "Relevant chunk text...",
      "index": 2
    }
  ]
}

Parameters:

  • query (string, required): Original user query
  • context (array, required): Retrieved relevant chunks

Response:

{
  "prompt": "What is machine learning?",
  "response": "Based on the retrieved context, machine learning is...",
  "context": [
    "Relevant chunk text...",
    "Another relevant chunk..."
  ]
}

Example:

curl -X POST http://localhost:3000/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is machine learning?",
    "context": [...]
  }'

Note: Currently uses template-based generation. In production, integrate with GPT-4, Claude, or similar LLMs.

Error Responses:

  • 400: Invalid input
  • 500: Generation error
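
The template used by the current implementation is not shown in this reference. As an illustration of what template-based generation means here, the sketch below fills a fixed string with the query and the retrieved chunk texts. The wording of the template and the name `generateFromTemplate` are assumptions; only the response shape (prompt, response, context as strings) comes from the documentation above.

```typescript
interface Chunk {
  id: string;
  text: string;
  index: number;
}

interface GenerationResult {
  prompt: string;
  response: string;
  context: string[];
}

// Hypothetical template: no model call, just string assembly.
function generateFromTemplate(query: string, context: Chunk[]): GenerationResult {
  const texts = context.map((c) => c.text);
  const response =
    texts.length === 0
      ? `No relevant context was found for "${query}".`
      : `Based on the retrieved context, here is what relates to "${query}":\n` +
        texts.map((t, i) => `${i + 1}. ${t}`).join("\n");
  return { prompt: query, response, context: texts };
}
```

Swapping this function for a call to a hosted LLM would leave the request and response shapes unchanged, which keeps the rest of the pipeline untouched.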

Data Types

Chunk

interface Chunk {
  id: string;      // Unique chunk identifier
  text: string;    // Chunk text content
  index: number;   // Chunk position in document
}

Embedding

interface Embedding {
  chunkId: string;    // Reference to chunk
  vector: number[];   // Embedding vector
  dimension: number;  // Vector dimension (384)
}

QueryResult

interface QueryResult {
  query: string;           // Original query
  relevantChunks: Chunk[]; // Top relevant chunks
  scores: number[];        // Similarity scores (0-1)
}

GenerationResult

interface GenerationResult {
  prompt: string;    // Original query
  response: string;  // Generated response
  context: string[]; // Context chunks used
}

Error Handling

All endpoints return standard HTTP status codes:

  • 200: Success
  • 400: Bad Request (invalid input)
  • 500: Internal Server Error

Error response format:

{
  "error": "Error message description"
}
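
On the client side, the status codes and error format above can be folded into a single result type. The sketch below is one way to do that, not part of the project: `ApiResult`, `toResult`, `postJson`, `FetchLike`, and `ResponseLike` are hypothetical names, and the fetch function is passed in so the helper does not depend on a particular runtime.

```typescript
interface ApiError {
  error: string;
}

type ApiResult<T> =
  | { ok: true; data: T }
  | { ok: false; status: number; message: string };

// Minimal shape of the parts of a fetch response this sketch relies on.
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<ResponseLike>;

function isApiError(body: unknown): body is ApiError {
  return (
    typeof body === "object" &&
    body !== null &&
    typeof (body as { error?: unknown }).error === "string"
  );
}

// Maps a status code and parsed body to a success or failure result.
function toResult<T>(status: number, body: unknown): ApiResult<T> {
  if (status >= 200 && status < 300) {
    return { ok: true, data: body as T };
  }
  const message = isApiError(body)
    ? body.error
    : `Request failed with status ${status}`;
  return { ok: false, status, message };
}

async function postJson<T>(
  fetchImpl: FetchLike,
  url: string,
  payload: unknown
): Promise<ApiResult<T>> {
  const res = await fetchImpl(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  const body = await res.json().catch(() => null); // tolerate non-JSON bodies
  return toResult<T>(res.status, body);
}
```

In a browser or a recent Node.js runtime, the global fetch should be usable as the first argument, for example `postJson(fetch, "/api/chunk", { text, chunkSize: 200 })`. Note that the cast to `T` is unchecked, so callers that need stronger guarantees should validate the response shape themselves.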

Rate Limiting

Currently, there are no rate limits. For production use, implement rate limiting based on your requirements.
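
As one illustration of what such a limit could look like, the sketch below implements a fixed-window counter per client. It is an assumption-laden example, not a recommendation from the project: `FixedWindowLimiter` is a hypothetical name, the client identifier (for instance an IP address or API key) is left to the caller, and the state lives in process memory, so it would not be shared across multiple server instances.

```typescript
// Hypothetical fixed-window rate limiter keyed by a client identifier.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    if (limit < 1 || windowMs <= 0) {
      throw new Error("limit must be >= 1 and windowMs must be > 0");
    }
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed; `now` is injectable for testing.
  allow(clientId: string, now: number = Date.now()): boolean {
    const w = this.windows.get(clientId);
    if (w === undefined || now - w.start >= this.windowMs) {
      // First request from this client, or the previous window has expired.
      this.windows.set(clientId, { start: now, count: 1 });
      return true;
    }
    if (w.count >= this.limit) {
      return false;
    }
    w.count += 1;
    return true;
  }
}
```

A route handler using this sketch would typically respond with HTTP 429 when `allow` returns false.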

Authentication

Currently, no authentication is required. For production:

  • Implement API key authentication
  • Add user authentication
  • Use JWT tokens

Best Practices

  1. Chunk Size: Start with a chunkSize of 150-300 characters (the accepted range is 50-500) and adjust based on retrieval quality
  2. Batch Processing: Process multiple documents sequentially
  3. Error Handling: Always handle API errors gracefully
  4. Loading States: Show loading indicators during API calls
  5. Validation: Validate inputs before sending requests
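
For the validation point, a client-side check for the /api/chunk request might look like the sketch below. It applies the parameter constraints documented for that endpoint; whether the server itself rejects out-of-range values is not stated in this reference, and `validateChunkRequest` is a hypothetical helper name.

```typescript
// Hypothetical client-side check; returns a list of problems (empty when valid).
function validateChunkRequest(input: {
  text?: unknown;
  chunkSize?: unknown;
}): string[] {
  const problems: string[] = [];
  if (typeof input.text !== "string" || input.text.trim().length === 0) {
    problems.push("text must be a non-empty string");
  }
  // chunkSize is optional; only check it when the caller supplied one.
  if (input.chunkSize !== undefined) {
    const size = input.chunkSize;
    if (
      typeof size !== "number" ||
      !Number.isFinite(size) ||
      size < 50 ||
      size > 500
    ) {
      problems.push("chunkSize must be a number between 50 and 500");
    }
  }
  return problems;
}
```

Returning every problem at once, instead of throwing on the first, lets a form display all issues to the user before any request is sent.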
