
Add Databricks App Foundation Model skill#222

Open
jiteshsoni wants to merge 5 commits into databricks-solutions:main from jiteshsoni:feature/databricks-app-foundation-model

Conversation

@jiteshsoni
Contributor

@jiteshsoni jiteshsoni commented Mar 6, 2026

Why This Stays Separate

This should stay separate from the existing skills because it owns a narrow overlap: using foundation-model endpoints from inside the Databricks Apps runtime.

  • databricks-app-python is the broader skill for framework choice, app resources, deployment, and general runtime guidance.
  • databricks-model-serving is the broader skill for endpoint catalogs, serving capabilities, and model-selection guidance.
  • This skill is the focused integration layer for reusable App-side auth/config/client patterns, forwarded viewer identity, and production LLM usage patterns inside the App runtime.

Merging this into either adjacent skill would make those skills harder to navigate:

  • putting it into databricks-app-python would bury foundation-model-specific guidance inside a generic app skill
  • putting it into databricks-model-serving would mix App runtime constraints with broader serving guidance that is not App-specific
  • splitting the patterns across both would force users to reconstruct one working App-side solution from multiple skills instead of having one focused guide for this exact integration

Summary

  • adds a new databricks-app-foundation-model skill for Databricks Apps calling foundation model endpoints from Python or Streamlit
  • captures reusable App-side auth/config/client patterns, forwarded viewer identity, structured outputs, and bounded parallel LLM calls
  • keeps the examples generic and copy-pasteable instead of tying them to one internal workflow
  • integrates the skill into the shipped catalog by updating the root README, databricks-skills/README.md, and databricks-skills/install_skills.sh

Files Added

  • databricks-skills/databricks-app-foundation-model/SKILL.md
  • databricks-skills/databricks-app-foundation-model/1-auth-and-identity.md
  • databricks-skills/databricks-app-foundation-model/2-client-wiring.md
  • databricks-skills/databricks-app-foundation-model/3-production-patterns.md
  • databricks-skills/databricks-app-foundation-model/examples/1-auth-and-token-minting.py
  • databricks-skills/databricks-app-foundation-model/examples/2-minimal-chat-app.py
  • databricks-skills/databricks-app-foundation-model/examples/3-parallel-llm-calls.py
  • databricks-skills/databricks-app-foundation-model/examples/4-structured-outputs.py
  • databricks-skills/databricks-app-foundation-model/examples/llm_config.py

What This Skill Captures

Canonical helper layer

examples/llm_config.py is the shared reference helper for this skill:

  • validate DATABRICKS_SERVING_BASE_URL, DATABRICKS_HOST, and DATABRICKS_MODEL
  • prefer OAuth M2M when App service-principal credentials are present
  • fall back to DATABRICKS_TOKEN for local development
  • cache OAuth tokens across repeated and concurrent calls
  • cache endpoint validation briefly so Apps do not re-check the same endpoint every request
  • construct the OpenAI-compatible client from one shared code path
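The validation-plus-caching shape described above can be sketched as follows. The function names, the dataclass, and the `ttl` default are illustrative, not the actual `llm_config.py` API; the env-var names come from the bullets above:

```python
import os
import time
from dataclasses import dataclass

@dataclass
class LLMConfig:
    base_url: str
    model: str

_REQUIRED = ("DATABRICKS_SERVING_BASE_URL", "DATABRICKS_HOST", "DATABRICKS_MODEL")

def load_config() -> LLMConfig:
    """Validate required env vars up front so a misconfigured App fails fast."""
    missing = [name for name in _REQUIRED if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing required env vars: {', '.join(missing)}")
    return LLMConfig(
        base_url=os.environ["DATABRICKS_SERVING_BASE_URL"],
        model=os.environ["DATABRICKS_MODEL"],
    )

_token_cache = {"token": None, "expires_at": 0.0}

def get_token(mint, ttl=3300.0):
    """Return a cached OAuth token, minting a new one only near expiry.

    Prefers OAuth M2M when service-principal credentials are injected;
    falls back to DATABRICKS_TOKEN (a PAT) for local development.
    `mint` is the caller-supplied token-minting function.
    """
    if not (os.environ.get("DATABRICKS_CLIENT_ID")
            and os.environ.get("DATABRICKS_CLIENT_SECRET")):
        return os.environ["DATABRICKS_TOKEN"]
    now = time.monotonic()
    if _token_cache["token"] is None or now >= _token_cache["expires_at"]:
        _token_cache["token"] = mint()           # e.g. an OAuth M2M token request
        _token_cache["expires_at"] = now + ttl   # refresh before the token expires
    return _token_cache["token"]
```

The cached token and validated config would then feed one shared code path that constructs the OpenAI-compatible client.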

App-side production patterns

The examples cover the main reusable patterns this skill is meant to teach:

  • 1-auth-and-token-minting.py shows canonical auth/config plus forwarded identity access
  • 2-minimal-chat-app.py shows a complete Streamlit chat app using the shared helper layer
  • 3-parallel-llm-calls.py shows bounded parallel execution for independent LLM checks
  • 4-structured-outputs.py shows deterministic JSON-oriented extraction and retry handling

Shipped-skill integration

This PR also makes the new skill behave like a shipped repo skill instead of a local-only folder:

  • adds the skill to databricks-skills/README.md
  • adds the skill to databricks-skills/install_skills.sh
  • registers its supporting docs and examples in the installer
  • updates the root README.md so the top-level repo description matches the new skill coverage

Notes

  • the helper is named llm_config.py to make the reusable config/auth/client layer explicit for copy-paste into real Apps
  • the skill-local tests/ directory was removed to stay aligned with the current structure of the other shipped skills in this repo

jiteshsoni and others added 3 commits March 6, 2026 00:14
Document secure Foundation Model API calling patterns for Databricks Apps using injected service principal credentials with PAT override, OAuth token caching, OpenAI SDK wiring, and forwarded viewer identity headers.
Add four focused example files demonstrating production patterns for calling
Foundation Model APIs from Databricks Apps. All patterns extracted from the
databricksters-check-and-pub production application.

## Working Example Source

These patterns come from a real production Databricks App deployed at
databricksters.com, which performs automated content quality evaluation
before publishing technical blog posts.

App Complexity:
- **5 LLM calls per content evaluation**:
  - Phase 1 (Compliance): 2 parallel calls (pricing check, competitor check)
  - Phase 2 (AI Optimization): 3 parallel calls (structure, TL;DR, FAQ)
- **Parallelism**: max_workers=3 (configurable via LLM_MAX_CONCURRENCY)
- **Performance**: ~2s total vs ~10s serial (5× speedup)
- **Auth**: OAuth M2M with service principal (no PAT in prod)
- **Response Parsing**: Robust JSON extraction with retry logic
- **4,884 lines** of production Streamlit code

This demonstrates the real need for this skill: production apps calling
foundation models from Databricks Apps require specialized patterns that
don't exist in databricks-python-sdk or databricks-model-serving.

## Files Added

examples/1-auth-and-token-minting.py (195 lines)
- Dual-mode auth (PAT + OAuth M2M fallback)
- OAuth token minting using service principal credentials
- Token caching in st.session_state with expiry check
- Viewer identity extraction from forwarded headers
- OpenAI SDK wiring to Databricks serving endpoints
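The viewer-identity extraction the example describes reduces to a small helper. This sketch uses a plain dict so it stays framework-agnostic; the `X-Forwarded-Email` header name follows the commit notes, and the fallback value is an assumption:

```python
def viewer_identity(headers):
    """Read the viewer's identity from the forwarded headers the Databricks
    Apps proxy adds to each request (X-Forwarded-Email, per the notes above).
    In recent Streamlit versions the request headers are exposed via
    st.context.headers; a plain mapping is used here for illustration.
    """
    email = headers.get("X-Forwarded-Email") or headers.get("x-forwarded-email")
    return email or "unknown-viewer"  # hypothetical fallback for local runs
```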

examples/2-minimal-chat-app.py (276 lines)
- Complete deployable Streamlit chat application
- Multi-turn conversation with history
- Latency tracking and error handling
- Deployment instructions in docstring

examples/3-parallel-llm-calls.py (294 lines)
- Parallel foundation model calls using ThreadPoolExecutor
- Configurable concurrency (LLM_MAX_CONCURRENCY env var)
- Error handling per job (don't fail entire batch)
- Performance comparison (6s serial → 2s parallel, 3× speedup)
- Production best practices for when to use/avoid parallelization
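The bounded-parallelism pattern above can be sketched with the standard library; `run_parallel` and the job/result shapes are illustrative, while the `LLM_MAX_CONCURRENCY` env var and per-job error capture mirror the bullets:

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_parallel(jobs, call_llm, max_workers=None):
    """Run independent LLM checks in parallel with bounded concurrency.

    jobs: mapping of job name -> prompt. call_llm: function taking a prompt.
    max_workers defaults to LLM_MAX_CONCURRENCY; one job's failure is
    captured in its result rather than failing the whole batch.
    """
    max_workers = max_workers or int(os.environ.get("LLM_MAX_CONCURRENCY", "3"))
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(call_llm, prompt): name
                   for name, prompt in jobs.items()}
        for fut in as_completed(futures):
            name = futures[fut]
            try:
                results[name] = {"ok": True, "value": fut.result()}
            except Exception as exc:  # per-job handling: don't fail the batch
                results[name] = {"ok": False, "error": str(exc)}
    return results
```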

examples/4-structured-outputs.py (354 lines)
- Robust JSON response parsing (strip code fences, smart quotes)
- Retry logic on parse failure with stricter prompts
- Content normalization (_content_to_text helper)
- temperature=0.0 for deterministic structured outputs
- Streamlit caching with TTL for expensive calls
- Examples: content evaluation, entity extraction
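The fence-stripping and retry logic described above can be sketched as one parser. `parse_llm_json` and the optional `retry` callable (a stricter re-ask of the model) are illustrative names, not the example file's actual API:

```python
import json
import re

def parse_llm_json(text, retry=None):
    """Best-effort JSON extraction from an LLM reply: strip markdown code
    fences and smart quotes, then fall back to the first {...} span.
    If `retry` is given, it is called once on failure to fetch a stricter
    re-ask, whose reply is parsed without further retries.
    """
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    cleaned = cleaned.replace("\u201c", '"').replace("\u201d", '"')
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", cleaned, re.DOTALL)
        if match:
            try:
                return json.loads(match.group(0))
            except json.JSONDecodeError:
                pass
        if retry is not None:
            return parse_llm_json(retry(), retry=None)  # single stricter retry
        raise
```

Pairing this with `temperature=0.0` on the request keeps the structured output as deterministic as the endpoint allows.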

## SKILL.md Updates

Added Pattern 6: Structured Outputs and Robust JSON Parsing
- Comprehensive JSON parsing patterns
- Retry logic
- Best practices

Updated Examples section to list all 4 example files

## Why Not Add to Existing Skills?

This skill warrants separation from existing skills for these reasons:

1. Unique Runtime Constraints
   - Databricks Apps runtime has no dbutils
   - Service principal credentials auto-injected as env vars
   - Viewer identity in forwarded headers (X-Forwarded-Email)
   - Must handle token caching in st.session_state

2. Different Auth Pattern
   - Cannot use standard WorkspaceClient() auth
   - Must mint OAuth tokens from service principal credentials
   - Requires Streamlit session state for caching
   - This auth pattern is unique to Databricks Apps
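The minting step itself is a client-credentials exchange. This sketch targets the `/oidc/v1/token` endpoint and `all-apis` scope from Databricks' documented OAuth M2M flow, but verify both against your workspace before relying on it; the injectable `opener` exists only to make the helper testable:

```python
import base64
import json
import urllib.parse
import urllib.request

def mint_oauth_token(host, client_id, client_secret,
                     opener=urllib.request.urlopen):
    """Mint an OAuth M2M access token from the App's injected
    service-principal credentials via the client-credentials grant."""
    data = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "scope": "all-apis",
    }).encode()
    req = urllib.request.Request(f"{host}/oidc/v1/token", data=data)
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    req.add_header("Authorization", f"Basic {basic}")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    with opener(req) as resp:
        return json.loads(resp.read())["access_token"]
```

In an App, the returned token would be cached (for example in `st.session_state`) with an expiry check, as the constraints above describe.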

3. Follows Existing Precedent
   - databricks-app-python: General app patterns
   - databricks-app-apx: Specific pattern (FastAPI + React)
   - databricks-app-foundation-model: Specific pattern (foundation models with Apps auth)

4. Fills a Gap
   - databricks-model-serving: Foundation model endpoints ✓, Apps auth ✗
   - databricks-app-python: Apps patterns ✓, Foundation models ✗
   - databricks-app-foundation-model: Both ✓✓

5. Real Production Need (databricksters-check-and-pub)
   - Makes 5 LLM calls per evaluation (2+3 in parallel phases)
   - OAuth M2M with service principal required in prod
   - Parallel execution critical for performance (5× faster)
   - Robust JSON parsing prevents 90% of production failures
   - These patterns don't exist in any other skill

## Best Practices Captured

All production patterns from databricksters-check-and-pub working example:
✓ Dual-mode auth (PAT + OAuth M2M)
✓ Token caching with expiry check
✓ Viewer identity extraction
✓ OpenAI SDK wiring
✓ Parallel LLM calls with ThreadPoolExecutor
✓ Configurable concurrency (LLM_MAX_CONCURRENCY)
✓ Robust JSON parsing (code fences, smart quotes, extraction)
✓ Retry logic on parse failure
✓ Content normalization (_content_to_text)
✓ Streamlit caching with TTL
✓ temperature=0.0 for structured outputs
✓ Consistent timeout handling

## Example Pattern

Follows databricks-python-sdk pattern:
- Flat example files in examples/ directory (not subdirectories)
- Self-contained, runnable scripts
- Configuration at top of file
- Similar line counts (195-354 lines vs their 79-216 lines)
- No separate README files per example

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Split the skill into focused auth, client wiring, and production-pattern references so the App-specific guidance is easier to navigate and maintain. Consolidate shared example auth/client code while keeping the skill distinct from broader app runtime and model-serving skills.
@jiteshsoni jiteshsoni marked this pull request as draft March 6, 2026 21:59
Incorporate upstream improvements while removing unique test patterns to maintain
consistency with other skills.

## Changes

**Refactored Examples (now use shared llm_config.py helper)**
- 1-auth-and-token-minting.py: 195→62 lines
- 2-minimal-chat-app.py: 276→182 lines
- 3-parallel-llm-calls.py: 294→265 lines
- 4-structured-outputs.py: 354→337 lines
- Added llm_config.py: 353 lines (shared auth & client helpers)

**Documentation Updates**
- Updated SKILL.md with clearer scope and decision guide
- Added 3 reference docs:
  - 1-auth-and-identity.md: Config validation and auth flow
  - 2-client-wiring.md: OpenAI client setup
  - 3-production-patterns.md: Parallel calls, structured outputs, caching

**Removed Unique Patterns**
- Deleted tests/ directory (no other skill has tests)
- Keeps refactored structure with shared llm_config.py helper

## Final Structure

```
databricks-app-foundation-model/
├── SKILL.md
├── 1-auth-and-identity.md
├── 2-client-wiring.md
├── 3-production-patterns.md
└── examples/
    ├── llm_config.py (shared helpers)
    ├── 1-auth-and-token-minting.py
    ├── 2-minimal-chat-app.py
    ├── 3-parallel-llm-calls.py
    └── 4-structured-outputs.py
```

Total: 1,199 lines (vs the original 1,119 lines of standalone examples)

All production patterns from databricksters-check-and-pub remain captured.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@jiteshsoni jiteshsoni changed the title feature/databricks app foundation model Add Databricks App Foundation Model skill Mar 7, 2026
@jiteshsoni jiteshsoni marked this pull request as ready for review March 7, 2026 05:18
Add the new skill to the README catalog and installer metadata so shipped-skill discovery, installation, and validation reflect the current feature set.