64 changes: 19 additions & 45 deletions examples/demo-snowflake-project/README.md
@@ -1,23 +1,22 @@
# Databao Demo — Streamlit in Snowflake

-This project deploys the [Databao](https://github.com/JetBrains/databao-cli) Streamlit UI as a native **Streamlit-in-Snowflake (SiS)** application. It connects to a Snowflake database as its datasource, loads secrets at runtime via a Snowflake UDF, and runs the chat-based data exploration interface directly inside your Snowflake account.
+This project deploys the [Databao](https://github.com/JetBrains/databao-cli) Streamlit UI as a native **Streamlit-in-Snowflake (SiS)** application. It connects to a Snowflake database as its datasource using OAuth (the Streamlit app's own session), loads secrets at runtime via a Snowflake UDF, and runs the chat-based data exploration interface directly inside your Snowflake account.

## Prerequisites

-- A Snowflake account with `ACCOUNTADMIN` privileges (for the initial setup)
+- A Snowflake account with privileges to create databases, warehouses, integrations, and Streamlit apps
- An OpenAI API key and/or an Anthropic API key
-- A Snowflake database with data you want the Databao agent to explore (see [Datasource credentials](#datasource-credentials-sf_ds_) below)
+- A Snowflake warehouse and database with data you want the Databao agent to explore

## How It Works

1. **`setup.sql`** provisions everything needed inside Snowflake:
-- A dedicated database, warehouse, and compute pool (all named with a configurable suffix)
+- A dedicated database and warehouse (named with a configurable suffix)
- Network rules and external access integrations for outbound HTTPS
-- A service user with a permissive network policy
- A Git repository object pointing at `databao-cli` on GitHub
-- Snowflake secrets for the OpenAI/Anthropic API keys and datasource credentials
+- Snowflake secrets for the OpenAI/Anthropic API keys and datasource coordinates (warehouse, database)
- A Python UDF (`get_secret`) that reads those secrets at runtime
-- The Streamlit app itself, running on a container runtime (`CPU_X64_M`)
+- The Streamlit app itself, running on `SYSTEM_COMPUTE_POOL_CPU` (or an optional dedicated compute pool)

2. **`cleanup.sql`** removes all objects created by `setup.sql` for a given suffix.

@@ -27,7 +26,7 @@ This project deploys the [Databao](https://github.com/JetBrains/databao-cli) Str
- Locates and configures the ADBC Snowflake driver shared library so DuckDB's Snowflake extension can find it
- Launches the standard Databao UI in **read-only domain** mode

-4. **`databao/domains/root/`** contains the Databao domain definition a Snowflake datasource configured via environment variables and sample context files that ship with the demo.
+4. **`databao/domains/root/`** contains the Databao domain definition -- a Snowflake datasource configured via environment variables and sample context files that ship with the demo.

## Setup

@@ -38,49 +37,24 @@ Open `setup.sql` and fill in the placeholder values at the top:
| Variable | Description |
|---|---|
| `suffix` | Name suffix appended to all Snowflake objects. Set to e.g. `V2` to create a fully independent copy (objects will be named `STREAMLIT_DATABAO_DB_V2`, etc.). Changing the suffix lets you run multiple independent instances side by side. |
-| `openai_key` | OpenAI API key |
-| `anthropic_key` | Anthropic API key |
-| `sf_ds_account` | Snowflake datasource account identifier (see [below](#datasource-credentials-sf_ds_)) |
-| `sf_ds_warehouse` | Warehouse for the datasource (see [below](#datasource-credentials-sf_ds_)) |
-| `sf_ds_database` | Database to explore (see [below](#datasource-credentials-sf_ds_)) |
-| `sf_ds_user` | Service user for the datasource (see [below](#datasource-credentials-sf_ds_)) |
-| `sf_ds_password` | Password for that service user |
+| `openai_api_key` | OpenAI API key |
+| `anthropic_api_key` | Anthropic API key |
+| `snowflake_ds_warehouse` | Warehouse the Databao agent will use to run queries against the datasource |
+| `snowflake_ds_database` | Database containing the data the agent will explore |

-#### Datasource credentials (`sf_ds_*`)
-
-These credentials are used by the Databao agent to connect to a Snowflake database via the Snowflake API. The agent reads data from this database to answer your questions.
-
-- **`sf_ds_account`** — your Snowflake account identifier (e.g. `abc12345.us-east-1`). You can find it in Snowsight under your account menu.
-
-- **`sf_ds_warehouse`** — the warehouse the agent will use to run queries. If you don't have one, create it in **Snowsight → Admin → Warehouses → + Warehouse** (an `XSMALL` warehouse is sufficient).
-
-- **`sf_ds_database`** — the database containing the data the agent will explore.
-
-- **`sf_ds_user`** and **`sf_ds_password`** — a service user that the agent authenticates as. To create one:
-  1. Go to **Snowsight → Admin → Users & Roles → + User**
-  2. Enter a name (e.g. `STREAMLIT_SERVICE_USER`)
-  3. Set a password
-  4. Click **Create User**
-  5. Grant the user access to the target database and warehouse:
-
-     ```sql
-     GRANT USAGE ON WAREHOUSE <your_warehouse> TO USER <your_service_user>;
-     GRANT USAGE ON DATABASE <your_database> TO USER <your_service_user>;
-     GRANT USAGE ON ALL SCHEMAS IN DATABASE <your_database> TO USER <your_service_user>;
-     GRANT SELECT ON ALL TABLES IN DATABASE <your_database> TO USER <your_service_user>;
-     ```
+The datasource connection uses the Streamlit app's own OAuth session -- no separate service user or password is needed. The user running the Streamlit app must have access to the target warehouse and database.

### 2. Run the Setup Script

-Execute the entire `setup.sql` in a Snowflake worksheet (or via SnowSQL) while connected as `ACCOUNTADMIN`. The script is idempotent it uses `CREATE OR REPLACE` throughout, so re-running it is safe.
+Execute the entire `setup.sql` in a Snowflake worksheet (or via SnowSQL). The script is idempotent -- it uses `CREATE OR REPLACE` throughout, so re-running it is safe.

### 3. Open the App

-Once the script finishes, navigate to **Streamlit** in Snowsight and open the app (named `STREAMLIT_DATABAO_APP_<suffix>`, e.g. `STREAMLIT_DATABAO_APP_DEMO`). The compute pool may take a minute or two to resume on first launch.
+Once the script finishes, navigate to **Streamlit** in Snowsight and open the app (named `STREAMLIT_DATABAO_APP_<suffix>`, e.g. `STREAMLIT_DATABAO_APP_DEMO`). The system compute pool may take a minute or two to resume on first launch.

## Cleanup

-To remove all Snowflake objects created by `setup.sql`, open `cleanup.sql`, set the same `suffix` you used during setup, and run the script as `ACCOUNTADMIN`. This drops the database (cascading to all database-scoped objects), compute pool, integrations, service user, network policy, and warehouse.
+To remove all Snowflake objects created by `setup.sql`, open `cleanup.sql`, set the same `suffix` you used during setup, and run the script. This drops the database (cascading to all database-scoped objects), integrations, and warehouse.

## Local Development

@@ -91,11 +65,11 @@ uv sync
# Set the required environment variables
export OPENAI_API_KEY="..."
export ANTHROPIC_API_KEY="..."
-export SNOWFLAKE_DS_ACCOUNT="..."
+export SNOWFLAKE_ACCOUNT="..." # e.g. abc12345
+export SNOWFLAKE_HOST="..." # e.g. abc12345.snowflakecomputing.com
+export SNOWFLAKE_DS_TOKEN="..." # OAuth token for datasource access
export SNOWFLAKE_DS_WAREHOUSE="..."
export SNOWFLAKE_DS_DATABASE="..."
-export SNOWFLAKE_DS_USER="..."
-export SNOWFLAKE_DS_PASSWORD="..."

# Run the Streamlit app
uv run streamlit run src/databao_snowflake_demo/app.py -- \
@@ -118,5 +92,5 @@ To test a dev/pre-release version, update the version specifier in `pyproject.to
## Notes

- The app runs in **read-only domain** mode — datasource configuration and domain builds are disabled in the UI. All domain setup is done ahead of time via the files in `databao/domains/root/`.
-- The compute pool uses `CPU_X64_M` instances with auto-suspend after 5 minutes and auto-resume enabled.
+- The app uses `SYSTEM_COMPUTE_POOL_CPU` by default. To use a dedicated compute pool, uncomment the relevant section in `setup.sql`.
- Network egress is allowed on ports 80 and 443 to enable OpenAI/Anthropic API calls and Snowflake datasource connections.
61 changes: 16 additions & 45 deletions examples/demo-snowflake-project/cleanup.sql
@@ -3,55 +3,26 @@
-- Set the same suffix you used in setup.sql, then run this script.
-- ============================================================

SET suffix = 'DEMO';

-USE ROLE ACCOUNTADMIN;
+-- Derived object names (must match setup.sql)
+SET database_name = 'STREAMLIT_DATABAO_DB_' || $suffix;
+SET app_warehouse = 'STREAMLIT_DATABAO_WAREHOUSE_' || $suffix;
+SET git_integration = 'STREAMLIT_DATABAO_GIT_INTEGRATION_' || $suffix;
+SET app_eai = 'STREAMLIT_DATABAO_EAI_' || $suffix;
+SET secrets_access = 'STREAMLIT_DATABAO_SECRETS_ACCESS_' || $suffix;

--- Bootstrap warehouse: needed for expression evaluation in the scripting block.
-CREATE WAREHOUSE IF NOT EXISTS STREAMLIT_DATABAO_BOOTSTRAP_WH
-  WAREHOUSE_SIZE = 'XSMALL'
-  AUTO_SUSPEND = 60
-  AUTO_RESUME = TRUE;
-USE WAREHOUSE STREAMLIT_DATABAO_BOOTSTRAP_WH;
+-- Database (cascades: Streamlit app, UDF, secrets, git repo, network rule)
+DROP DATABASE IF EXISTS IDENTIFIER($database_name);

-DECLARE
-  _sql VARCHAR;
+-- External access integrations
+DROP INTEGRATION IF EXISTS IDENTIFIER($secrets_access);
+DROP INTEGRATION IF EXISTS IDENTIFIER($app_eai);

-  -- Derived object names (must match setup.sql)
-  _db VARCHAR DEFAULT 'STREAMLIT_DATABAO_DB_' || $suffix;
-  _wh VARCHAR DEFAULT 'STREAMLIT_DATABAO_WAREHOUSE_' || $suffix;
-  _user VARCHAR DEFAULT 'STREAMLIT_DATABAO_USER_' || $suffix;
-  _network_policy VARCHAR DEFAULT 'STREAMLIT_DATABAO_NETWORK_POLICY_' || $suffix;
-  _git_integration VARCHAR DEFAULT 'STREAMLIT_DATABAO_GIT_INTEGRATION_' || $suffix;
-  _eai VARCHAR DEFAULT 'STREAMLIT_DATABAO_EAI_' || $suffix;
-  _secrets_access VARCHAR DEFAULT 'STREAMLIT_DATABAO_SECRETS_ACCESS_' || $suffix;
-  _compute_pool VARCHAR DEFAULT 'STREAMLIT_DATABAO_COMPUTE_POOL_' || $suffix;
-BEGIN
-  -- Database (cascades: Streamlit app, UDF, secrets, git repo, network rule)
-  EXECUTE IMMEDIATE 'DROP DATABASE IF EXISTS ' || :_db;
+-- API integration (git)
+DROP INTEGRATION IF EXISTS IDENTIFIER($git_integration);

-  -- Compute pool
-  EXECUTE IMMEDIATE 'DROP COMPUTE POOL IF EXISTS ' || :_compute_pool;
-
-  -- External access integrations
-  EXECUTE IMMEDIATE 'DROP INTEGRATION IF EXISTS ' || :_secrets_access;
-  EXECUTE IMMEDIATE 'DROP INTEGRATION IF EXISTS ' || :_eai;
-
-  -- API integration (git)
-  EXECUTE IMMEDIATE 'DROP INTEGRATION IF EXISTS ' || :_git_integration;
-
-  -- User (unset network policy first, then drop)
-  _sql := 'ALTER USER IF EXISTS ' || :_user || ' UNSET NETWORK_POLICY';
-  EXECUTE IMMEDIATE :_sql;
-  EXECUTE IMMEDIATE 'DROP USER IF EXISTS ' || :_user;
-
-  -- Network policy
-  EXECUTE IMMEDIATE 'DROP NETWORK POLICY IF EXISTS ' || :_network_policy;
-
-  -- Warehouse
-  EXECUTE IMMEDIATE 'DROP WAREHOUSE IF EXISTS ' || :_wh;
-END;
-
-DROP WAREHOUSE IF EXISTS STREAMLIT_DATABAO_BOOTSTRAP_WH;
+-- Warehouse
+DROP WAREHOUSE IF EXISTS IDENTIFIER($app_warehouse);

SELECT 'Streamlit app STREAMLIT_DATABAO_APP_' || $suffix || ' cleaned up successfully.' AS STATUS;
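The suffix-based naming convention that setup.sql and cleanup.sql must keep in sync can also be mirrored in a few lines of Python. A sketch under stated assumptions: the `databao_object_names` helper is hypothetical, but the name templates match the `SET` statements in the new cleanup.sql above, plus the app name from the README.

```python
def databao_object_names(suffix: str) -> dict[str, str]:
    """Derive the per-suffix Snowflake object names used by setup.sql/cleanup.sql."""
    templates = {
        "database": "STREAMLIT_DATABAO_DB_{}",
        "warehouse": "STREAMLIT_DATABAO_WAREHOUSE_{}",
        "git_integration": "STREAMLIT_DATABAO_GIT_INTEGRATION_{}",
        "eai": "STREAMLIT_DATABAO_EAI_{}",
        "secrets_access": "STREAMLIT_DATABAO_SECRETS_ACCESS_{}",
        "app": "STREAMLIT_DATABAO_APP_{}",
    }
    # Snowflake stores unquoted identifiers uppercased, so normalize the suffix.
    return {kind: tpl.format(suffix.upper()) for kind, tpl in templates.items()}
```

Keeping one derivation like this (rather than hand-editing names in two scripts) is what lets a new `suffix` spin up a fully independent instance.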
@@ -1,9 +1,10 @@
type: snowflake
name: snowflake
connection:
-  account: {{ env_var('SNOWFLAKE_DS_ACCOUNT') }}
+  account: {{ env_var('SNOWFLAKE_ACCOUNT') }}
  warehouse: {{ env_var('SNOWFLAKE_DS_WAREHOUSE') }}
  database: {{ env_var('SNOWFLAKE_DS_DATABASE') }}
-  user: {{ env_var('SNOWFLAKE_DS_USER') }}
+  additional_properties:
+    host: {{ env_var('SNOWFLAKE_HOST') }}
  auth:
-    password: {{ env_var('SNOWFLAKE_DS_PASSWORD') }}
+    token: {{ env_var('SNOWFLAKE_DS_TOKEN') }}
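The `{{ env_var('...') }}` placeholders in the datasource definition are resolved from the environment when the domain is loaded. A rough illustration of that substitution (this regex-based renderer is a stand-in for demonstration, not Databao's actual template engine):

```python
import os
import re

# Matches {{ env_var('NAME') }} with optional whitespace around the tokens.
_ENV_VAR = re.compile(r"\{\{\s*env_var\(\s*'([^']+)'\s*\)\s*\}\}")


def render_env_vars(text: str) -> str:
    """Replace {{ env_var('NAME') }} placeholders with os.environ values.

    Raises KeyError if a referenced variable is not set, which surfaces
    missing configuration early.
    """
    return _ENV_VAR.sub(lambda m: os.environ[m.group(1)], text)
```

So with `SNOWFLAKE_ACCOUNT=abc12345` exported, the `account:` line above would resolve to `account: abc12345` before the connection is opened.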
9 changes: 8 additions & 1 deletion examples/demo-snowflake-project/pyproject.toml
@@ -5,7 +5,14 @@ description = "Databao demo project for Snowflake"
readme = "README.md"
requires-python = ">=3.11,<3.12"
dependencies = [
-    "databao>=0.3.3",
+    "databao==0.3.4.dev1",
+    "databao-agent==0.2.1.dev12",
+    "databao-context-engine[snowflake]==0.7.1.dev2",
+    "streamlit[snowflake]>=1.53.0",
+    "adbc-driver-snowflake>=1.10.0",
+    "adbc-driver-manager>=1.10.0",
+    "snowflake-sqlalchemy>=1.6.0",
+    "duckdb>=0.10.0",
]

[tool.uv]