Use FastAPI to serve a Llama 2 model through llama.cpp.
This repository provides an optimized Docker container setup for running a FastAPI application.
- Utilizes the official Python 3.9 slim image as the base.
- Optimized installation of system packages to reduce container size.
- Pip-based Python dependency management, with the pip cache disabled to keep the image small and installs fast.
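The optimizations listed above can be sketched as a Dockerfile like the following. This is illustrative only: the system packages, `requirements.txt`, and the `app.main:app` entry point are assumptions, not necessarily what this repository actually uses.

```dockerfile
# Base: the official slim image keeps the footprint small
FROM python:3.9-slim

WORKDIR /app

# Install only the system packages needed (assumed: build tools for llama.cpp
# bindings), skip recommended extras, and clean the apt lists to shrink the layer
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential cmake \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first so the dependency layer is cached across code changes;
# --no-cache-dir keeps pip's download cache out of the image
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Serve the FastAPI app on the port the README maps (module path is an assumption)
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Ordering the `COPY requirements.txt` step before `COPY . .` is what lets Docker reuse the installed-dependencies layer when only application code changes.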
### Run with Docker

- Clone this repository:

  ```bash
  git clone https://github.com/LiuYuWei/llama-cpp-fastapi-service.git
  cd llama-cpp-fastapi-service
  ```

- Build the Docker image:

  ```bash
  docker build -t fastapi-container .
  ```

- Run the FastAPI application:

  ```bash
  docker run -p 8000:8000 fastapi-container
  ```

  Your FastAPI application should now be running at http://localhost:8000.

### Run with the Makefile

- Clone this repository:
  ```bash
  git clone https://github.com/LiuYuWei/llama-cpp-fastapi-service.git
  cd llama-cpp-fastapi-service
  ```

- To build the Docker image:

  ```bash
  make build
  ```

- To push the Docker image:

  ```bash
  make push
  ```

- To run the FastAPI application:

  ```bash
  make run
  ```

  Your FastAPI application should now be running at http://localhost:8000.

- To view the logs:

  ```bash
  make logs
  ```

- To remove the running container:

  ```bash
  make remove
  ```

If you have suggestions or changes, please submit a pull request or open an issue.
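For reference, a Makefile consistent with the targets used above might look like the sketch below. The target names come from this README; the recipes, variable names, and the registry placeholder are assumptions, not the repository's actual Makefile.

```makefile
IMAGE     ?= fastapi-container
CONTAINER ?= fastapi-container
# Assumption: replace with the registry you push to
REGISTRY  ?= your-registry.example.com

build:
	docker build -t $(IMAGE) .

push:
	docker tag $(IMAGE) $(REGISTRY)/$(IMAGE)
	docker push $(REGISTRY)/$(IMAGE)

run:
	docker run -d --name $(CONTAINER) -p 8000:8000 $(IMAGE)

logs:
	docker logs -f $(CONTAINER)

remove:
	docker rm -f $(CONTAINER)
```

Running the container detached with a fixed `--name` is what makes the `logs` and `remove` targets able to refer back to it.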