Skip to content

Fix Dockerfile for successful builds, improve /chat response quality, and add dev workflow tooling#8

Open
sriharip123 wants to merge 3 commits intogrctest:mainfrom
sriharip123:main
Open

Fix Dockerfile for successful builds, improve /chat response quality, and add dev workflow tooling#8
sriharip123 wants to merge 3 commits intogrctest:mainfrom
sriharip123:main

Conversation

@sriharip123
Copy link

This PR makes the FastAPI-BitNet project buildable and usable out of the box with Docker, fixes the /chat endpoint to return clean responses, and adds developer convenience tooling.

Changes
Dockerfile fixes (81eb61b)

Removed unavailable packages (software-properties-common, lsb-release) for the python:3.10 base image
Switched to Debian repo clang package instead of the LLVM install script
Added sed patch for const-correctness error in ggml-bitnet-mad.cpp (clang is stricter than gcc)
Switched to pre-built GGUF model from microsoft/bitnet-b1.58-2B-4T-gguf to avoid broken HF-to-GGUF conversion (BitNetForCausalLM not supported by the converter)
Added docker-compose.yml for single-command builds
Added Postman collection covering all 19 API endpoints
Chat endpoint fix (d407312)

Switched from /completion to /v1/chat/completions so llama-server applies the correct LLaMA 3 chat template automatically
Added _clean_response() post-processor to strip repetitive patterns (Question:, Input:, (no answer), etc.)
Reduced default n_predict from 256 to 128 for cleaner output
Developer workflow (33e3d31)

Added volume mounts for main.py and lib/ in docker-compose.yml so code changes don't require a full rebuild
Added --reload flag to uvicorn for automatic hot-reloading on file changes

- Fix apt package issues (remove software-properties-common, lsb-release)
- Patch const-correctness error in ggml-bitnet-mad.cpp for clang
- Use pre-built GGUF model to skip broken HF-to-GGUF conversion
- Add docker-compose.yml for easy container management
- Add Postman collection covering all 19 API endpoints
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant