Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -19,6 +19,7 @@ venv/
# Environment / secrets
.env
*.env
.streamlit/secrets.toml

# Data and outputs
data/
28 changes: 28 additions & 0 deletions .streamlit/secrets.toml.example
@@ -0,0 +1,28 @@
# Example Streamlit secrets for the Analytics Copilot Text-to-SQL demo.
#
# IMPORTANT:
# - Do NOT commit real tokens or endpoint URLs to version control.
# - Create a local `.streamlit/secrets.toml` (ignored by git) and fill in
# environment-specific values there, or configure secrets directly in
# Streamlit Community Cloud.
#
# The app also supports reading these values from environment variables; when
# both are present, Streamlit secrets take precedence.

HF_TOKEN = "hf_your_access_token_here"

# Preferred: dedicated Inference Endpoint / TGI deployment with adapters.
# This should point to your HF Inference Endpoint URL, and HF_ADAPTER_ID should
# match the adapter identifier configured in the endpoint's LORA_ADAPTERS.
HF_ENDPOINT_URL = "https://your-endpoint.endpoints.huggingface.cloud"
HF_ADAPTER_ID = "your-adapter-id"

# Fallback: provider/router-based model id (no adapters).
# Use this only with models that are directly supported by HF Inference
# providers (e.g. merged text-to-SQL models), not pure adapter repos.
HF_MODEL_ID = "your-model-id"
HF_PROVIDER = "auto"

# Compatibility: older name for the endpoint base URL. The app will treat this
# as an alias for HF_ENDPOINT_URL if set.
HF_INFERENCE_BASE_URL = ""
165 changes: 151 additions & 14 deletions README.md
@@ -184,6 +184,19 @@ python scripts/evaluate_internal.py \
--out_dir reports/
```

Quick smoke test (small subset, CPU-friendly):

```bash
python scripts/evaluate_internal.py --smoke \
--val_path data/processed/val.jsonl \
--out_dir reports/
```

- On GPU machines, `--smoke` runs a tiny subset through the full pipeline.
- On CPU-only environments, `--smoke` automatically falls back to `--mock` so
that no heavy model loading is attempted while still exercising the metrics
and reporting stack.
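The device-dependent fallback described above can be sketched as a small helper (illustrative only; `resolve_eval_mode` is not the script's actual function name, and the real flag handling lives in `scripts/evaluate_internal.py`):

```python
def resolve_eval_mode(smoke: bool, mock: bool, cuda_available: bool) -> str:
    """Pick an evaluation mode mirroring the behaviour described above.

    Returns "full", "smoke", or "mock". On CPU-only machines a --smoke run
    degrades to --mock so that no model weights are loaded.
    """
    if mock:
        return "mock"
    if smoke:
        return "smoke" if cuda_available else "mock"
    return "full"


# A smoke run on a CPU-only box degrades to mock mode.
print(resolve_eval_mode(smoke=True, mock=False, cuda_available=False))  # mock
```

In the real script, `cuda_available` would come from a guarded `torch.cuda.is_available()` check.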

Outputs:

- `reports/eval_internal.json` – metrics, config, and sample predictions.
@@ -257,18 +270,137 @@ pytest -q

These commands are also wired into the CI workflow (`.github/workflows/ci.yml`).

---

## Publishing to Hugging Face Hub

After training, you can publish the QLoRA adapter artifacts to the Hugging Face Hub
for reuse and remote inference.

1. Authenticate with Hugging Face:

```bash
huggingface-cli login --token YOUR_HF_TOKEN
```

or set an environment variable:

```bash
export HF_TOKEN=YOUR_HF_TOKEN
```

2. Run the publish script, pointing it at your adapter directory and desired repo id:

```bash
python scripts/publish_to_hub.py \
--repo_id your-username/analytics-copilot-text2sql-mistral7b-qlora \
--adapter_dir outputs/adapters \
--include_metrics reports/eval_internal.json
```

The script will:

- Validate that a Hugging Face token is available.
- Create the model repo if it does not exist (public by default, or `--private`).
- Ensure a README.md model card is written into the adapter directory (with metrics
if provided).
- Upload the entire adapter folder to the Hub using `huggingface_hub.HfApi`.

You can re-run the script safely; it will perform another commit with the specified
`--commit_message`.
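The metrics portion of the generated model card might be assembled along these lines (a sketch under the assumption that the script formats numeric metrics from `reports/eval_internal.json` as a table; `render_metrics_table` is an illustrative name, not a function from the repo):

```python
def render_metrics_table(metrics: dict) -> str:
    """Format numeric evaluation metrics as a Markdown table for the model card."""
    lines = ["| Metric | Value |", "| --- | --- |"]
    for name, value in sorted(metrics.items()):
        # Skip non-numeric entries (config strings, sample predictions, etc.).
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            lines.append(f"| {name} | {value:.4f} |")
    return "\n".join(lines)


print(render_metrics_table({"exact_match": 0.62, "valid_sql_rate": 0.98}))
```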

---



## Demo – Streamlit UI (Remote Inference)

The repository includes a lightweight Streamlit UI that talks to a **remote**
model via Hugging Face Inference (no local GPU required). The app lives at
`app/streamlit_app.py` and intentionally does **not** import `torch` or
`transformers`.

### Remote Inference Note

- The Streamlit app is **UI-only**; it never loads model weights locally.
- All text-to-SQL generation is performed remotely using
`huggingface_hub.InferenceClient`.
- For small models you may be able to use Hugging Face serverless Inference,
but large models like Mistral-7B often require **Inference Endpoints** or a
dedicated provider.
- If serverless calls fail or time out, consider deploying a dedicated
Inference Endpoint or self-hosted TGI/serving stack and pointing the app at
its URL via `HF_ENDPOINT_URL` / `HF_INFERENCE_BASE_URL`.

> **Adapter repos and the HF router:** If you point the app at a pure LoRA
> adapter repository (e.g. `BrejBala/analytics-copilot-mistral7b-text2sql-adapter`)
> using `HF_MODEL_ID` without an `HF_ENDPOINT_URL`, the request goes through
> the Hugging Face **router** and most providers will respond with
> `model_not_supported`. For adapter-based inference, use a dedicated
> Inference Endpoint and configure `HF_ENDPOINT_URL` + `HF_ADAPTER_ID` instead
> of trying to call the adapter repo directly via the router.

### Running the app locally

1. Configure Streamlit secrets by creating `.streamlit/secrets.toml` from the
example:

```bash
cp .streamlit/secrets.toml.example .streamlit/secrets.toml
```

Then edit `.streamlit/secrets.toml` (not tracked by git) and fill in either:

**Preferred: dedicated endpoint + adapter**

```toml
HF_TOKEN = "hf_your_access_token_here"

# Dedicated Inference Endpoint / TGI URL
HF_ENDPOINT_URL = "https://your-endpoint.endpoints.huggingface.cloud"

# Adapter identifier configured in your endpoint's LORA_ADAPTERS
HF_ADAPTER_ID = "BrejBala/analytics-copilot-mistral7b-text2sql-adapter"
```

**Fallback: provider/router-based merged model (no adapters)**

```toml
HF_TOKEN = "hf_your_access_token_here"
HF_MODEL_ID = "your-username/your-merged-text2sql-model"
HF_PROVIDER = "auto" # optional provider hint
```

`HF_INFERENCE_BASE_URL` is also supported as an alias for `HF_ENDPOINT_URL`.
The app will always prefer secrets over environment variables when both are
set.
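The precedence and alias rules just described could be implemented roughly like this (a sketch; `resolve_hf_config` is a hypothetical name, not the app's actual function):

```python
def resolve_hf_config(secrets: dict, environ: dict) -> dict:
    """Merge Streamlit secrets with environment variables.

    Secrets win when a key is present in both sources, and the legacy
    HF_INFERENCE_BASE_URL is accepted as an alias for HF_ENDPOINT_URL.
    """
    keys = ["HF_TOKEN", "HF_ENDPOINT_URL", "HF_ADAPTER_ID",
            "HF_MODEL_ID", "HF_PROVIDER", "HF_INFERENCE_BASE_URL"]
    # `or` also treats empty strings (as in the example file) as unset.
    config = {k: secrets.get(k) or environ.get(k) for k in keys}
    if not config["HF_ENDPOINT_URL"] and config["HF_INFERENCE_BASE_URL"]:
        config["HF_ENDPOINT_URL"] = config["HF_INFERENCE_BASE_URL"]
    return config
```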

2. Start the Streamlit app:

```bash
streamlit run app/streamlit_app.py
```

3. In the UI:

- Paste your database schema (DDL) into the **Schema** text area.
- Enter a natural-language question.
- Click **Generate SQL** to call the remote model.
- View the generated SQL in a code block (with a copy button).
- Optionally open the **Show prompt** expander to inspect the exact prompt
sent to the model (useful for debugging and prompt engineering).
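A minimal prompt builder matching the flow above might look like this (the exact template is an assumption on my part; use the **Show prompt** expander to see the real one):

```python
def build_prompt(schema_ddl: str, question: str) -> str:
    """Assemble a text-to-SQL prompt from schema DDL and a user question."""
    return (
        "You are a text-to-SQL assistant. Using only the tables below, "
        "write one SQL query that answers the question.\n\n"
        f"### Schema\n{schema_ddl.strip()}\n\n"
        f"### Question\n{question.strip()}\n\n"
        "### SQL\n"
    )


print(build_prompt("CREATE TABLE users (id INT, name TEXT);",
                   "How many users are there?"))
```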

### Deploying on Streamlit Community Cloud

When deploying to Streamlit Cloud:

- Add `HF_TOKEN`, `HF_ENDPOINT_URL`, and `HF_ADAPTER_ID` (or `HF_MODEL_ID` /
`HF_PROVIDER` for the router fallback) to the app's **Secrets** in the
Streamlit Cloud UI.
- The app will automatically construct an `InferenceClient` from those values
and use the dedicated endpoint when `HF_ENDPOINT_URL` is set.
- No GPU is required on the Streamlit side; all heavy lifting is done by the
remote Hugging Face Inference backend.
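The Secrets editor in Streamlit Community Cloud accepts the same TOML keys as the local file; for example (all values below are placeholders):

```toml
HF_TOKEN = "hf_your_access_token_here"
HF_ENDPOINT_URL = "https://your-endpoint.endpoints.huggingface.cloud"
HF_ADAPTER_ID = "your-username/your-adapter-repo"
```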

---

@@ -278,19 +410,23 @@

```text
.
├── app/ # Streamlit UI (remote inference via HF InferenceClient)
│ └── streamlit_app.py
├── docs/ # Documentation, design notes, evaluation reports
│ ├── dataset.md
│ ├── training.md
│ ├── evaluation.md
│ └── external_validation.md
├── notebooks/ # Jupyter/Colab notebooks for experimentation
├── scripts/ # CLI scripts (dataset, training, evaluation, utilities)
│ ├── build_dataset.py
│ ├── check_syntax.py
│ ├── smoke_load_dataset.py
│ ├── smoke_infer_endpoint.py
│ ├── train_qlora.py
│ ├── evaluate_internal.py
│ ├── evaluate_spider_external.py
│ └── publish_to_hub.py
├── src/
│ └── text2sql/ # Core Python package
│ ├── __init__.py
@@ -315,11 +451,12 @@ Current high-level layout:
│ ├── test_repo_smoke.py
│ ├── test_build_dataset_offline.py
│ ├── test_data_prep.py
│ ├── test_eval_cli_args.py
│ ├── test_infer_quantization.py
│ ├── test_prompt_formatting.py
│ ├── test_normalize_sql.py
│ ├── test_schema_adherence.py
│ └── test_metrics_aggregate.py
├── .env.example # Example environment file
├── .gitignore
├── context.md # Persistent project context & decisions