Backend wrapper #1
base: main
Pull Request Overview
This pull request introduces a new FastAPI-based backend service for LLMCompass that provides HTTP API endpoints for submitting and querying kernel simulation tasks. The backend features async task processing, Docker deployment, and modular simulation logic with clear extension points.
- Adds a FastAPI application with endpoints for health checks, supported operations, task submission, and status querying
- Implements async task scheduling with background workers and thread-based simulation execution
- Provides modular synchronous simulators for matmul, bmm, layernorm, gelu, and softmax operations
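The scheduling pattern summarized above (an in-memory queue, a pool of background worker coroutines, and blocking simulation work pushed onto threads) can be sketched with the standard library alone. This is a minimal illustration, not the code in this pull request: `TASKS`, `simulate_blocking`, `worker`, `submit`, and the `wait_s` parameter are hypothetical names, and the stand-in simulator only echoes its input.

```python
import asyncio
import time
import uuid

# Hypothetical in-memory task store: task_id -> task record.
TASKS: dict[str, dict] = {}


def simulate_blocking(op: str, params: dict) -> dict:
    """Stand-in for a synchronous simulator (assumption: the real ones block)."""
    time.sleep(0.01)  # pretend to do blocking work
    return {"op": op, "params": params}


async def worker(queue: asyncio.Queue) -> None:
    """Pull task ids off the queue and run each simulation in a thread."""
    while True:
        task_id = await queue.get()
        record = TASKS[task_id]
        record["status"] = "running"
        try:
            # to_thread keeps the event loop responsive while the call blocks.
            record["result"] = await asyncio.to_thread(
                simulate_blocking, record["op"], record["params"]
            )
            record["status"] = "done"
        except Exception as exc:
            record["status"] = "failed"
            record["error"] = f"{type(exc).__name__}: {exc}"
        finally:
            record["finished"].set()
            queue.task_done()


async def submit(queue: asyncio.Queue, op: str, params: dict, wait_s=None) -> dict:
    """Enqueue a task; optionally wait up to wait_s seconds for it to finish."""
    task_id = uuid.uuid4().hex
    TASKS[task_id] = {
        "op": op,
        "params": params,
        "status": "queued",
        "finished": asyncio.Event(),
    }
    await queue.put(task_id)
    if wait_s is not None:
        try:
            await asyncio.wait_for(TASKS[task_id]["finished"].wait(), timeout=wait_s)
        except asyncio.TimeoutError:
            pass  # not finished in time; the caller can poll the status later
    return {"task_id": task_id, "status": TASKS[task_id]["status"]}


async def demo(num_workers: int = 2) -> list[dict]:
    queue: asyncio.Queue = asyncio.Queue()
    workers = [asyncio.create_task(worker(queue)) for _ in range(num_workers)]
    replies = [
        await submit(queue, "matmul", {"M": 64}, wait_s=5.0),  # waits for the result
        await submit(queue, "gelu", {"N": 128}),  # returns immediately
    ]
    await queue.join()  # let the remaining queued work drain
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return replies
```

Calling `asyncio.run(demo())` submits one task with a synchronous wait and one without. In a web service the workers would more plausibly be started at application startup and the status served from a query endpoint; that wiring is left out here.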
Reviewed Changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test_api_integration.py | Comprehensive integration tests covering all supported operations with artifact generation |
| environment.yml | Updated dependencies to include torch, fastapi, and pytest with version pinning |
| backend_app/sync_simulators.py | Modular synchronous simulation implementations with operation routing |
| backend_app/sim_utils.py | Shared utilities for dtype mapping, tensor creation, and error handling |
| backend_app/scheduler.py | Async scheduler for dispatching simulation tasks to threads |
| backend_app/main.py | FastAPI application with task management and background workers |
| backend_app/README.md | Comprehensive documentation for building, running, and extending the backend |
| Dockerfile | Updated container setup for FastAPI deployment |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull Request Overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.
This pull request introduces a new minimal FastAPI-based backend service for LLMCompass, providing an HTTP API for submitting and querying kernel simulation tasks. It includes a Dockerized deployment, an in-memory task queue with background workers, modularized simulation logic, and clear extension points for adding new synchronous simulators. The changes also include comprehensive documentation for building, running, and extending the backend.
The most important changes are:
Backend API and Task Management:
- Added `backend_app/main.py`, which implements endpoints for health checks, listing supported operations, submitting simulation tasks (with async background processing and an optional synchronous wait), and querying task status and results. It uses an in-memory task store and a configurable pool of background worker coroutines to process simulation tasks.
- Added `backend_app/scheduler.py`, which offloads blocking simulation work to threads and standardizes result formatting and error handling.

Simulation Logic and Extensibility:
- Added `backend_app/sim_utils.py` with shared helpers for dtype mapping, tensor creation, error formatting, and listing supported operations.
- Added `backend_app/sync_simulators.py` with modular, synchronous simulation implementations for `matmul`, `bmm`, `layernorm`, `gelu`, and `softmax`, each using the software/hardware model APIs. It includes a routing function that selects the appropriate simulator for the requested operation.

Deployment and Documentation:
- Updated `Dockerfile` to copy the full application source, install FastAPI and dependencies, set up the conda environment, and launch the API server with Uvicorn. Removed the legacy GitHub clone and shell activation logic.
- Added `backend_app/README.md` with instructions for building and running the backend, API usage examples, environment variables, code structure, and guidelines for adding new simulators.

Dependency Management:
- Updated `environment.yml` to include the required Python and pip dependencies (e.g., torch, FastAPI, pytest) and removed unused channels and packages.
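As a rough illustration of the routing function and the extension point for new synchronous simulators described under Simulation Logic and Extensibility, one common shape is a registry keyed by operation name. The sketch below is an assumption about that shape, not the contents of `backend_app/sync_simulators.py`: `SIMULATORS`, `register`, `route`, `supported_ops`, and both stand-in simulators are hypothetical, and the stand-ins compute simple counts instead of calling the software/hardware model APIs.

```python
from typing import Callable, Dict, List

# Hypothetical registry: operation name -> synchronous simulator function.
SIMULATORS: Dict[str, Callable[[dict], dict]] = {}


def register(op: str):
    """Decorator that adds a simulator to the registry under the given op name."""
    def decorator(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        SIMULATORS[op] = fn
        return fn
    return decorator


@register("matmul")
def simulate_matmul(params: dict) -> dict:
    # Stand-in: an (M x K) @ (K x N) product costs 2*M*K*N floating-point ops.
    m, k, n = params["M"], params["K"], params["N"]
    return {"op": "matmul", "flops": 2 * m * k * n}


@register("softmax")
def simulate_softmax(params: dict) -> dict:
    # Stand-in: report only the number of elements touched.
    return {"op": "softmax", "elements": params["M"] * params["N"]}


def supported_ops() -> List[str]:
    """List the registered operation names in a stable order."""
    return sorted(SIMULATORS)


def route(op: str, params: dict) -> dict:
    """Select the simulator for op and wrap its output in a uniform envelope."""
    simulator = SIMULATORS.get(op)
    if simulator is None:
        return {
            "status": "error",
            "error": f"unsupported op {op!r}; supported: {supported_ops()}",
        }
    return {"status": "ok", "result": simulator(params)}
```

With a registry like this, adding a simulator amounts to writing one decorated function, and the supported-operations listing stays in sync automatically. For example, `route("matmul", {"M": 2, "K": 3, "N": 4})` returns an `"ok"` envelope, while an unregistered name such as `"conv2d"` returns an `"error"` envelope instead of raising.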