@TerrenceZhangX

This pull request introduces a new minimal FastAPI-based backend service for LLMCompass, providing an HTTP API for submitting and querying kernel simulation tasks. It includes a Dockerized deployment, an in-memory task queue with background workers, modularized simulation logic, and clear extension points for adding new synchronous simulators. The changes also include comprehensive documentation for building, running, and extending the backend.

The most important changes are:

Backend API and Task Management:

  • Added a new FastAPI application in backend_app/main.py with endpoints for health checks, listing supported operations, submitting simulation tasks, and querying task status and results. Submitted tasks are processed asynchronously in the background, and a caller can optionally wait synchronously for the result. The application uses an in-memory task store and a configurable pool of background worker coroutines.
  • Implemented a new async scheduler and dispatcher in backend_app/scheduler.py, which offloads blocking simulation work to threads and standardizes result formatting and error handling.
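
The queue-and-worker pattern described in the two items above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in backend_app/main.py or backend_app/scheduler.py: the names `TASKS`, `blocking_simulate`, `worker`, `submit`, and `demo` are hypothetical, the status strings are assumed, and the simulator is a stand-in that just sleeps.

```python
import asyncio
import time
import uuid

# All names here are hypothetical; they are not taken from the PR's source files.
TASKS: dict[str, dict] = {}  # in-memory task store: task_id -> record


def blocking_simulate(op: str, size: int) -> dict:
    """Stand-in for a synchronous simulator call that blocks the calling thread."""
    time.sleep(0.01)
    return {"op": op, "size": size}


async def worker(queue: asyncio.Queue) -> None:
    """Pull task ids off the queue and run each simulation in a thread."""
    while True:
        task_id = await queue.get()
        record = TASKS[task_id]
        record["status"] = "running"
        try:
            # to_thread keeps the event loop free while the blocking call runs.
            record["result"] = await asyncio.to_thread(
                blocking_simulate, record["op"], record["size"]
            )
            record["status"] = "done"
        except Exception as exc:
            record["status"] = "failed"
            record["error"] = f"{type(exc).__name__}: {exc}"
        finally:
            record["finished"].set()
            queue.task_done()


async def submit(queue: asyncio.Queue, op: str, size: int, wait_s=None) -> dict:
    """Enqueue a task; optionally wait up to wait_s seconds for it to finish."""
    task_id = uuid.uuid4().hex
    TASKS[task_id] = {
        "op": op,
        "size": size,
        "status": "queued",
        "finished": asyncio.Event(),
    }
    await queue.put(task_id)
    if wait_s is not None:
        try:
            await asyncio.wait_for(TASKS[task_id]["finished"].wait(), timeout=wait_s)
        except asyncio.TimeoutError:
            pass  # not finished in time; the caller can poll the store later
    return {"task_id": task_id, "status": TASKS[task_id]["status"]}


async def demo(num_workers: int = 2) -> list[dict]:
    """Start a small worker pool, submit two tasks, and drain the queue."""
    queue: asyncio.Queue = asyncio.Queue()
    workers = [asyncio.create_task(worker(queue)) for _ in range(num_workers)]
    replies = [
        await submit(queue, "matmul", 1024, wait_s=1.0),  # synchronous wait
        await submit(queue, "gelu", 4096),                # fire and forget
    ]
    await queue.join()
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return replies
```

Running `asyncio.run(demo())` returns one reply per submission. The first has already finished because the caller waited; the second is returned immediately and is completed by a worker before `queue.join()` returns.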

Simulation Logic and Extensibility:

  • Added backend_app/sim_utils.py with shared helpers for dtype mapping, tensor creation, error formatting, and supported operations listing.
  • Added backend_app/sync_simulators.py with modular, synchronous simulation implementations for matmul, bmm, layernorm, gelu, and softmax, each using the software/hardware model APIs. Includes a routing function for selecting the appropriate simulator based on the requested operation.
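
One way the routing function and shared helpers could fit together is a small registry keyed by operation name. The sketch below is an assumption about the general shape, not the contents of backend_app/sync_simulators.py or backend_app/sim_utils.py: `DTYPE_BYTES`, `register`, `route`, and `supported_ops` are hypothetical names, the parameter keys (`M`, `K`, `N`, `shape`, `dtype`) are assumed, and the simulators return simple operation and byte counts instead of calling the LLMCompass software/hardware model APIs.

```python
import math
from typing import Callable

# Hypothetical dtype table (name -> bytes per element); the real mapping may differ.
DTYPE_BYTES = {"fp16": 2, "bf16": 2, "fp32": 4, "int8": 1}

# Registry of synchronous simulators, keyed by operation name.
SIMULATORS: dict[str, Callable[[dict], dict]] = {}


def register(op: str):
    """Decorator that adds a simulator function to the registry under `op`."""
    def decorator(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        SIMULATORS[op] = fn
        return fn
    return decorator


@register("matmul")
def simulate_matmul(params: dict) -> dict:
    # Placeholder cost model: 2*M*K*N FLOPs, each operand touched once.
    m, k, n = params["M"], params["K"], params["N"]
    elem = DTYPE_BYTES[params.get("dtype", "fp16")]
    return {"flops": 2 * m * k * n, "bytes_moved": elem * (m * k + k * n + m * n)}


@register("gelu")
def simulate_gelu(params: dict) -> dict:
    # Placeholder cost model: read and write every element once.
    numel = math.prod(params["shape"])
    elem = DTYPE_BYTES[params.get("dtype", "fp16")]
    return {"elements": numel, "bytes_moved": 2 * numel * elem}


def supported_ops() -> list[str]:
    """Return the registered operation names in sorted order."""
    return sorted(SIMULATORS)


def route(op: str, params: dict) -> dict:
    """Select the simulator for `op` and wrap its output in a uniform envelope."""
    sim = SIMULATORS.get(op)
    if sim is None:
        return {"ok": False, "error": f"unsupported op '{op}'; supported: {supported_ops()}"}
    try:
        return {"ok": True, "result": sim(params)}
    except (KeyError, ValueError) as exc:
        # Missing or invalid parameters become an error payload, not a crash.
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
```

With this layout, adding a new simulator is one decorated function; `route` and `supported_ops` pick it up without further changes.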

Deployment and Documentation:

  • Updated the Dockerfile to copy the full application source, install FastAPI and dependencies, set up the conda environment, and launch the API server with Uvicorn. Removed legacy GitHub clone and shell activation logic.
  • Added a detailed backend_app/README.md with instructions for building/running the backend, API usage examples, environment variables, code structure, and guidelines for adding new simulators.

Dependency Management:

  • Updated environment.yml to include required Python and pip dependencies (e.g., torch, FastAPI, pytest) and removed unused channels and packages.

@TerrenceZhangX self-assigned this Sep 17, 2025

Copilot AI left a comment


Pull Request Overview

This pull request introduces a new FastAPI-based backend service for LLMCompass that provides HTTP API endpoints for submitting and querying kernel simulation tasks. The backend features async task processing, Docker deployment, and modular simulation logic with clear extension points.

  • Adds a FastAPI application with endpoints for health checks, supported operations, task submission, and status querying
  • Implements async task scheduling with background workers and thread-based simulation execution
  • Provides modular synchronous simulators for matmul, bmm, layernorm, gelu, and softmax operations

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/test_api_integration.py | Comprehensive integration tests covering all supported operations with artifact generation |
| environment.yml | Updated dependencies to include torch, fastapi, and pytest with version pinning |
| backend_app/sync_simulators.py | Modular synchronous simulation implementations with operation routing |
| backend_app/sim_utils.py | Shared utilities for dtype mapping, tensor creation, and error handling |
| backend_app/scheduler.py | Async scheduler for dispatching simulation tasks to threads |
| backend_app/main.py | FastAPI application with task management and background workers |
| backend_app/README.md | Comprehensive documentation for building, running, and extending the backend |
| Dockerfile | Updated container setup for FastAPI deployment |


TerrenceZhangX and others added 5 commits September 17, 2025 13:20
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.



2 participants