Intugle
diff --git a/README.md b/README.md
Lines changed: 20 additions & 0 deletions
diff --git a/docsite/docs/streamlit-app.md b/docsite/docs/streamlit-app.md
Lines changed: 66 additions & 0 deletions
diff --git a/pyproject.toml b/pyproject.toml
Lines changed: 12 additions & 3 deletions
diff --git a/src/intugle/adapters/factory.py b/src/intugle/adapters/factory.py
Lines changed: 1 addition & 1 deletion
diff --git a/src/intugle/cli.py b/src/intugle/cli.py
Lines changed: 45 additions & 0 deletions
diff --git a/streamlit_app/.streamlit/config.toml b/src/intugle/streamlit_app/.streamlit/config.toml (renamed)
diff --git a/streamlit_app/README.md b/src/intugle/streamlit_app/README.md (renamed)
Lines changed: 1 addition & 1 deletion
diff --git a/streamlit_app/helper.py b/src/intugle/streamlit_app/helper.py (renamed)
Lines changed: 11 additions & 5 deletions
diff --git a/streamlit_app/intugle_assets/Intugle_main_logo.png b/src/intugle/streamlit_app/intugle_assets/Intugle_main_logo.png (renamed)
@@ -275,6 +275,26 @@ For detailed instructions on setting up the server and connecting your favorite
 
 <!-- mcp-name: io.github.intugle/intugle-vibe-mcp -->
 
+### Streamlit App
+
+The `intugle` library includes a Streamlit application that provides an interactive web interface for building and visualizing semantic data models.
+
+To use the Streamlit app, install `intugle` with the `streamlit` extra:
+
+```bash
+pip install "intugle[streamlit]"
+```
+
+You can launch the Streamlit application using the `intugle-streamlit` command or `uvx`:
+
+```bash
+intugle-streamlit
+# Or using uvx
+uvx --from "intugle[streamlit]" intugle-streamlit
+```
+
+Open the URL provided in your terminal (usually `http://localhost:8501`) to access the application. For more details, refer to the [Streamlit App documentation](https://intugle.github.io/data-tools/docs/streamlit-app).
+
 ## Community
 
 Join our community to ask questions, share your projects, and connect with other users.
 
@@ -0,0 +1,66 @@
+---
+sidebar_position: 8
+title: Streamlit App
+---
+
+# Intugle - Streamlit App
+
+This Streamlit application provides an interactive web interface for the `intugle` library. It allows users to upload their tabular data (CSV/Excel), configure a Large Language Model (LLM), and step through the process of building a semantic data model. The app profiles the data, generates a business glossary, identifies relationships between datasets, and visualizes the resulting semantic graph.
+
+## ✨ Features
+
+- **File Upload**: Upload multiple CSV or Excel files directly in the browser.
+- **Interactive Data Prep**: Interactively rename tables and select, rename, or drop columns before processing.
+- **LLM Configuration**: Securely configure and connect to your preferred LLM provider (OpenAI, Azure OpenAI, Gemini).
+- **Automated Data Profiling**: Automatically calculates key metrics like uniqueness, completeness, and data types for every column.
+- **AI-Powered Business Glossary**: Leverages an LLM to generate a business glossary for all tables and columns, adding crucial context.
+- **Automated Link Prediction**: Discovers potential relationships (foreign keys) between your tables.
+- **Interactive Visualization**: Displays the final semantic model as an interactive network graph.
+- **Detailed Results**: Provides a tabular view of all predicted links with detailed metrics.
+- **Export Artifacts**: Download the generated semantic model artifacts (`.yml` files) as a ZIP archive for use in other systems.
+
+## 🚀 Getting Started
+
+Follow these instructions to set up and run the application on your local machine.
+
+### Prerequisites
+
+- Python 3.10+
+- `uv` (Optional: for `uvx` command)
+
+### 1. Installation
+
+To use the Streamlit app, install `intugle` with the `streamlit` extra:
+
+```bash
+pip install "intugle[streamlit]"
+```
+
+### 2. Configuration
+
+The application requires credentials for a Large Language Model to generate the business glossary and perform other AI-powered tasks.
+
+You can configure your LLM provider and API keys directly in the application's sidebar after launching it. The app will guide you on which credentials are required for your chosen provider (e.g., `OPENAI_API_KEY` for OpenAI).
+
+### 3. Running the App
+
+You can launch the Streamlit application using the `intugle-streamlit` command or `uvx`:
+
+```bash
+intugle-streamlit
+# Or using uvx
+uvx --from "intugle[streamlit]" intugle-streamlit
+```
+
+Open the URL provided in your terminal (usually `http://localhost:8501`) to access the application.
+
+## ⚙️ How It Works
+
+The application guides you through a simple, multi-step process, which is tracked in the sidebar:
+
+1.  **Upload Files**: Start by uploading one or more CSV or Excel files. The app will display a summary of the uploaded tables.
+2.  **Configure LLM**: In the sidebar, choose your LLM provider (OpenAI, Azure, or Gemini) and enter the necessary API keys and configuration details.
+3.  **Prepare Data**: Review the uploaded tables. You can rename tables and rename or drop columns. Once you are satisfied, click **"Freeze column names"** to lock in your changes.
+4.  **Build Semantic Model**: After preparing your data, click **"Create Semantic Model"**. You will be prompted to provide a "domain" (e.g., *Healthcare*, *Manufacturing*) to give the LLM context. The app will then profile the data and generate a business glossary for each table.
+5.  **Predict Links**: Once profiling is complete, click **"Run Link Prediction"** to discover the relationships between your datasets.
+6.  **Explore & Download**: View the results as an interactive graph or a detailed table. You can download the underlying YAML configuration files from the sidebar at any time.
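
The profiling step described in the documentation above reports metrics such as uniqueness and completeness per column. As a rough illustration only, the sketch below computes both for a single column with the standard library. The function name `profile_column` and the exact metric definitions are assumptions made for this example; they are not taken from the `intugle` source.

```python
from typing import Any, Optional, Sequence


def profile_column(values: Sequence[Optional[Any]]) -> dict:
    """Compute simple profiling metrics for one column.

    Assumed definitions (hypothetical, not read from intugle):
      completeness = non-null values / total values
      uniqueness   = distinct non-null values / non-null values
    """
    total = len(values)
    non_null = [v for v in values if v is not None]
    # Guard against empty columns so we never divide by zero.
    completeness = len(non_null) / total if total else 0.0
    uniqueness = len(set(non_null)) / len(non_null) if non_null else 0.0
    return {
        "count": total,
        "completeness": completeness,
        "uniqueness": uniqueness,
    }


if __name__ == "__main__":
    # One missing value out of four, and no repeated identifiers.
    print(profile_column([101, 102, 103, None]))
```

Under these definitions, a column whose uniqueness and completeness are both close to 1.0 would be a plausible key candidate, which is the kind of signal a link-prediction step could build on.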
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "intugle"
-version = "1.0.11"
+version = "1.0.12"
 authors = [
     { name="Intugle", email="hello@intugle.ai" },
 ]
@@ -69,14 +69,23 @@ postgres = [
     "sqlglot>=27.20.0",
 ]
 
+streamlit = [
+    "streamlit==1.50.0",
+    "pyngrok==7.4.0",
+    "python-dotenv==1.1.1",
+    "xlsxwriter==3.2.9",
+    "plotly",
+    "graphviz"
+]
+
 
 [project.urls]
 "Homepage" = "https://github.com/Intugle/data-tools"
 "Bug Tracker" = "https://github.com/Intugle/data-tools/issues"
 
 [project.scripts]
 intugle-mcp = "intugle.mcp.server:main"
-intugle-streamlit = "intugle.cli:export_data"
+intugle-streamlit = "intugle.cli:run_streamlit_app"
 
 [dependency-groups]
 test = [
@@ -111,7 +120,7 @@ src = ["src"]
 where = ["src"]
 
 [tool.setuptools.package-data]
-"intugle" = ["**/*.yaml", "**/*.txt", "**/*.pkl", "mcp/semantic_layer/prompts/*.md"]
+"intugle" = ["**/*.yaml", "**/*.txt", "**/*.pkl", "mcp/semantic_layer/prompts/*.md", "streamlit_app/**/*"]
 
 [tool.pytest.ini_options]
 markers = [
 
@@ -75,7 +75,7 @@ def get_dataset_data_type(cls) -> Type[Any]:
             return Any
         if len(cls.config_types) == 1:
             return cls.config_types[0]
-        return Union[tuple(cls.config_types)]  # type: ignore
+        return Union[tuple(cls.config_types)]  # noqa: UP007
 
     @classmethod
     def create(cls, df: Any) -> Adapter:
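
For context on the `# noqa: UP007` change above: the `UP007` lint rule suggests the `X | Y` union syntax, but that syntax cannot be written for a list of types that is only known at runtime. The sketch below mirrors the three branches of `get_dataset_data_type` as a standalone function. The name `union_of` is hypothetical and exists only for this example.

```python
from typing import Any, Sequence, Union


def union_of(types: Sequence[type]) -> Any:
    """Build a type from a runtime list of types (illustrative sketch)."""
    if not types:
        # No registered types: fall back to Any.
        return Any
    if len(types) == 1:
        # A single type needs no Union wrapper.
        return types[0]
    # Subscripting Union with a tuple is equivalent to Union[A, B, ...].
    return Union[tuple(types)]  # noqa: UP007


if __name__ == "__main__":
    print(union_of([]))
    print(union_of([int]))
    print(union_of([int, str]))
```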
 
@@ -0,0 +1,45 @@
+import importlib.util
+import os
+import subprocess
+
+
+def run_streamlit_app():
+    # Maps each importable module name to the pip package that provides it.
+    # These correspond to the dependencies in the `[project.optional-dependencies].streamlit` section of pyproject.toml.
+    required_modules = {
+        "streamlit": "streamlit",
+        "pyngrok": "pyngrok",
+        "dotenv": "python-dotenv",
+        "xlsxwriter": "xlsxwriter",
+        "plotly": "plotly",
+        "graphviz": "graphviz",
+    }
+
+    missing_modules = []
+    for module_name, package_name in required_modules.items():
+        if not importlib.util.find_spec(module_name):
+            missing_modules.append(package_name)
+
+    if missing_modules:
+        print("Error: The Streamlit app is missing required dependencies.")
+        print("The following packages are not installed:", ", ".join(missing_modules))
+        print("\nTo use the Streamlit app, please install 'intugle' with the 'streamlit' extra:")
+        print("  pip install 'intugle[streamlit]'")
+        raise SystemExit(1)
+
+    # Get the absolute path to the main.py of the Streamlit app
+    app_dir = os.path.join(os.path.dirname(__file__), 'streamlit_app')
+    app_path = os.path.join(app_dir, 'main.py')
+    
+    # Ensure the app_path exists
+    if not os.path.exists(app_path):
+        print(f"Error: Streamlit app not found at {app_path}")
+        raise SystemExit(1)
+
+    # Run the Streamlit app using subprocess, setting the working directory
+    print(f"Launching Streamlit app from: {app_path} with working directory {app_dir}")
+    subprocess.run(["streamlit", "run", app_path], cwd=app_dir)
+
+
+if __name__ == "__main__":
+    run_streamlit_app()
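
The dependency check in `run_streamlit_app` can be isolated into a small helper that is easy to test without launching anything. This is a minimal sketch of the same `importlib.util.find_spec` technique; the helper name `find_missing_packages` is hypothetical and does not exist in the PR.

```python
import importlib.util


def find_missing_packages(required: dict[str, str]) -> list[str]:
    """Return the pip package names whose import module cannot be found.

    `required` maps an importable module name to the package that ships it,
    for example {"dotenv": "python-dotenv"}.
    """
    missing = []
    for module_name, package_name in required.items():
        # find_spec returns None for a top-level module that is not installed.
        if importlib.util.find_spec(module_name) is None:
            missing.append(package_name)
    return missing


if __name__ == "__main__":
    # "json" ships with Python; the second module name is made up for the demo.
    demo = {"json": "json", "no_such_module_for_intugle_demo": "demo-package"}
    print(find_missing_packages(demo))
```

Looking up the spec avoids importing the modules themselves, so the check stays cheap even for heavy packages such as `plotly`.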
@@ -1,4 +1,4 @@
-# Intugle - Streamlit App
+w# Intugle - Streamlit App
 
 This Streamlit application provides an interactive web interface for the `intugle` library. It allows users to upload their tabular data (CSV/Excel), configure a Large Language Model (LLM), and step through the process of building a semantic data model. The app profiles the data, generates a business glossary, identifies relationships between datasets, and visualizes the resulting semantic graph.
 
 
@@ -175,6 +175,7 @@ def safe_filename(name: str, ext: str) -> str:
     str
         A sanitized filename like 'my_table.csv'.
     """
+    name = os.path.basename(name)  # Sanitize against path traversal
     base = re.sub(r"[^A-Za-z0-9_.-]+", "_", name).strip("._")
     if not base:
         base = "table"
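
To show what the added `os.path.basename` call changes, here is a standalone sketch of the sanitizing logic visible in this hunk. Only the first four statements come from the diff; the return expression, and the way the extension is joined, are assumptions, since the rest of `safe_filename` is not shown. The name `safe_filename_sketch` is hypothetical.

```python
import os
import re


def safe_filename_sketch(name: str, ext: str) -> str:
    """Sanitize an uploaded file name (illustrative sketch)."""
    # Drop any directory components so "../../x" cannot escape the target folder.
    name = os.path.basename(name)
    # Collapse runs of disallowed characters into "_" and trim leading/trailing "." and "_".
    base = re.sub(r"[^A-Za-z0-9_.-]+", "_", name).strip("._")
    if not base:
        base = "table"
    # Assumption: the extension is appended with a single dot.
    return f"{base}.{ext.lstrip('.')}"


if __name__ == "__main__":
    print(safe_filename_sketch("../../etc/passwd", "csv"))
    print(safe_filename_sketch("my table", "csv"))
```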
@@ -904,7 +905,8 @@ def plotly_table_graph(
     node_x, node_y, node_text, node_deg, node_labels = [], [], [], [], []
     for n in G.nodes():
         x, y = pos[n]
-        node_x.append(x); node_y.append(y)
+        node_x.append(x)
+        node_y.append(y)
         indeg = G.in_degree(n)
         outdeg = G.out_degree(n)
         node_text.append(f"<b>{n}</b><br>in: {indeg} • out: {outdeg}")
@@ -947,14 +949,16 @@ def edge_hover(u: str, v: str, data: Mapping[str, Any]) -> str:
         # Fast path: one trace with constant width (keeps things interactive for big graphs)
         edge_x, edge_y, edge_hover_texts = [], [], []
         for u, v, data in edges_list:
-            x0, y0 = pos[u]; x1, y1 = pos[v]
+            x0, y0 = pos[u]
+            x1, y1 = pos[v]
             edge_x += [x0, x1, None]
             edge_y += [y0, y1, None]
             edge_hover_texts.append(edge_hover(u, v, data))
 
             # midpoint label text
             mx, my = (x0 + x1) / 2, (y0 + y1) / 2
-            edge_label_x.append(mx); edge_label_y.append(my)
+            edge_label_x.append(mx)
+            edge_label_y.append(my)
             if len(data["labels"]) == 1:
                 edge_label_text.append(data["labels"][0])
             else:
@@ -974,7 +978,8 @@ def edge_hover(u: str, v: str, data: Mapping[str, Any]) -> str:
     else:
         # Accurate path: one trace per edge so we can vary width by mean accuracy
         for u, v, data in edges_list:
-            x0, y0 = pos[u]; x1, y1 = pos[v]
+            x0, y0 = pos[u]
+            x1, y1 = pos[v]
             acc_mean = sum(data["accs"]) / max(1, len(data["accs"]))
             width = max(edge_min_width, acc_mean * edge_width_scale)
 
@@ -992,7 +997,8 @@ def edge_hover(u: str, v: str, data: Mapping[str, Any]) -> str:
 
             # midpoint label
             mx, my = (x0 + x1) / 2, (y0 + y1) / 2
-            edge_label_x.append(mx); edge_label_y.append(my)
+            edge_label_x.append(mx)
+            edge_label_y.append(my)
             if len(data["labels"]) == 1:
                 edge_label_text.append(data["labels"][0])
             else:
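
The two edge-drawing paths touched in these hunks can be summarized in a few lines. The sketch below uses plain Python lists rather than Plotly traces. The helper names and the default values for `min_width` and `scale` are hypothetical; the arithmetic follows the lines shown in the diff.

```python
from typing import Mapping, Optional, Sequence, Tuple

Point = Tuple[float, float]


def edge_segments(
    pos: Mapping[str, Point], edges: Sequence[Tuple[str, str]]
) -> Tuple[list, list]:
    """Fast path: flatten all edges into one x list and one y list.

    A None entry after each pair of endpoints marks a break, so a single
    line trace can draw many disconnected segments.
    """
    xs: list = []
    ys: list = []
    for u, v in edges:
        x0, y0 = pos[u]
        x1, y1 = pos[v]
        xs += [x0, x1, None]
        ys += [y0, y1, None]
    return xs, ys


def edge_width(accs: Sequence[float], min_width: float = 1.0, scale: float = 4.0) -> float:
    """Accurate path: scale line width by mean accuracy, with a lower bound."""
    acc_mean = sum(accs) / max(1, len(accs))  # max(1, ...) avoids division by zero
    return max(min_width, acc_mean * scale)


if __name__ == "__main__":
    pos = {"a": (0.0, 0.0), "b": (1.0, 2.0), "c": (3.0, 1.0)}
    print(edge_segments(pos, [("a", "b"), ("b", "c")]))
    print(edge_width([0.5, 1.0]))
```

The trade-off is the one the code comments describe: a single trace stays responsive on large graphs, while one trace per edge allows a different width for each link.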