This PR implements the Task 4 features: 4-bit loading in the internal evaluation script, a Hugging Face Hub publish workflow for adapters, and a Streamlit UI that performs remote inference via the Hugging Face Inference API. It adds tests, docs, and lightweight UI scaffolding so the end-to-end workflow runs without loading large models locally.
What’s included
Part A — evaluate_internal.py 4-bit support and smoke mode
New CLI flags: --load_in_4bit, --bnb_4bit_quant_type, --bnb_4bit_compute_dtype, --bnb_4bit_use_double_quant
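A minimal sketch of how these flags could be wired into the quantization config. The flag names come from the PR; the helper names (`add_quant_args`, `quant_kwargs`) and the defaults shown are assumptions, not the PR's actual code:

```python
import argparse


def add_quant_args(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    # Flag names are from the PR description; defaults here are assumptions.
    parser.add_argument("--load_in_4bit", action="store_true",
                        help="Load the base model in 4-bit via bitsandbytes")
    parser.add_argument("--bnb_4bit_quant_type", default="nf4",
                        choices=["nf4", "fp4"])
    parser.add_argument("--bnb_4bit_compute_dtype", default="bfloat16")
    parser.add_argument("--bnb_4bit_use_double_quant", action="store_true")
    return parser


def quant_kwargs(args: argparse.Namespace) -> dict:
    """Translate the CLI flags into keyword arguments for a quantization config."""
    if not args.load_in_4bit:
        return {}  # full-precision path: no quantization config needed
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": args.bnb_4bit_quant_type,
        "bnb_4bit_compute_dtype": args.bnb_4bit_compute_dtype,
        "bnb_4bit_use_double_quant": args.bnb_4bit_use_double_quant,
    }
```

In the evaluation script, a dict like this would typically feed `transformers.BitsAndBytesConfig(...)` before `from_pretrained` (with the compute-dtype string converted to the corresponding `torch` dtype first).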
Part B — Hugging Face Hub publish workflow
--repo_id (required)
--adapter_dir (default outputs/adapters)
--private (bool, default False)
--commit_message (default: Add QLoRA adapter artifacts)
--include_metrics (optional path to a metrics JSON file)
Auto-generates a README model card covering a description of the adapter, the training dataset, usage notes, safety considerations, and metrics when --include_metrics is provided.
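One way the auto-generated model card could be assembled. The `render_model_card` helper, its section layout, and the wording are illustrative assumptions, not the PR's actual implementation:

```python
import json
from pathlib import Path
from typing import Optional


def render_model_card(repo_id: str, metrics_path: Optional[str] = None) -> str:
    """Build README.md content for the adapter repo; sections are a sketch."""
    lines = [
        f"# {repo_id}",
        "",
        "QLoRA adapter for text-to-SQL, trained for the analytics copilot task.",
        "",
        "## Usage notes",
        "Load the adapter on top of the base model with PEFT before inference.",
        "",
        "## Safety",
        "Generated SQL should be reviewed before running against production data.",
    ]
    if metrics_path:  # corresponds to the optional --include_metrics flag
        metrics = json.loads(Path(metrics_path).read_text())
        lines += ["", "## Metrics"]
        lines += [f"- {name}: {value}" for name, value in sorted(metrics.items())]
    return "\n".join(lines)
```

The publish script would then write this string to `README.md` inside the adapter directory and push everything with `huggingface_hub.HfApi().upload_folder(...)` using the `--repo_id`, `--private`, and `--commit_message` values.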
Part C — Streamlit UI for remote HF Inference
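The remote-inference path in the Streamlit app likely reduces to building a prompt and sending it to the Inference API. A sketch, where the prompt wording and the `build_sql_prompt` name are assumptions:

```python
def build_sql_prompt(question: str, schema: str) -> str:
    """Compose the prompt sent to the remote endpoint; wording is illustrative."""
    return (
        "You are a text-to-SQL assistant.\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}\n"
        "SQL:"
    )


# Inside the Streamlit app, the call would go through huggingface_hub's
# InferenceClient, roughly like this (model name and token read from
# .streamlit/secrets.toml; keys shown are assumptions):
#
#   from huggingface_hub import InferenceClient
#   client = InferenceClient(model=st.secrets["HF_MODEL"],
#                            token=st.secrets["HF_TOKEN"])
#   sql = client.text_generation(build_sql_prompt(question, schema),
#                                max_new_tokens=128)
```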
Project-wide improvements
How to use (quick references)
Smoke evaluation:
python scripts/evaluate_internal.py --smoke --val_path data/processed/val.jsonl --out_dir reports/
Publish adapters to the Hub:
python scripts/publish_to_hub.py --repo_id your-username/analytics-copilot-text2sql-mistral7b-qlora --adapter_dir outputs/adapters --private
Launch the Streamlit UI (after configuring secrets):
cp .streamlit/secrets.toml.example .streamlit/secrets.toml
streamlit run app/streamlit_app.py
Notes on tests and quality gates
This PR delivers the end-to-end workflow for 4-bit evaluation, Hub publishing, and a remote-inference Streamlit UI, while keeping defaults backward compatible and logging/diagnostics robust.
Original Task: analytics-copilot-text2sql/40b8o5133snj
Author: Brejesh Balakrishnan