Add OBO credential flow integration tests #352

Open
dhruv0811 wants to merge 22 commits into main from obo-integration-tests

dhruv0811 commented Mar 2, 2026

Summary

  • End-to-end integration tests for OBO (On-Behalf-Of) identity forwarding across both Model Serving and Databricks Apps CUJs
  • Invokes pre-deployed agents as two different service principals, asserts each sees their own identity via a whoami() UC function tool
  • App fixture code committed to repo — CI redeploys the app with latest code on each run, then stops it after tests
  • Serving endpoint deploy script for a weekly redeploy with the latest SDK; the endpoint also scales to zero to reduce cost
  • Fixes typo in ModelServingUserCredentials docstring (credential_strategy → credentials_strategy)

Test plan

dhruv0811 and others added 6 commits March 2, 2026 12:00
Test that identity is forwarded correctly through both the Model Serving
(ModelServingUserCredentials) and Databricks Apps (direct token) OBO paths
using two different service principals and a whoami() UC function.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ials

WorkspaceClient doesn't accept credential_strategy directly.
Use Config object as shown in the existing unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…t kwarg

The docstring had a typo (credential_strategy vs credentials_strategy).
Fixed both the test and the source docstring to use the correct parameter
name that WorkspaceClient actually accepts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The SQL current_user() returns the SP's UUID, not its display_name.
Compare the two whoami() results against each other instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
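
The comparison described in the commit above can be sketched as follows. This is a minimal illustration, not the PR's test code: `assert_distinct_identities`, the `whoami_as` callable, and the fake backend are hypothetical stand-ins for invoking the deployed agent as each service principal.

```python
from typing import Callable, Tuple


def assert_distinct_identities(
    whoami_as: Callable[[str], str],
    principal_a: str,
    principal_b: str,
) -> Tuple[str, str]:
    """Call whoami as two principals and check that the results differ."""
    identity_a = whoami_as(principal_a).strip()
    identity_b = whoami_as(principal_b).strip()
    # An empty result should fail rather than count as a distinct identity.
    assert identity_a and identity_b, "whoami() returned an empty identity"
    assert identity_a != identity_b, (
        f"both principals resolved to {identity_a!r}; identity was not forwarded"
    )
    return identity_a, identity_b


# Stand-in for the deployed agent: maps a caller label to a made-up UUID.
fake_backend = {
    "sp_a": "11111111-1111-1111-1111-111111111111",
    "sp_b": "22222222-2222-2222-2222-222222222222",
}
print(assert_distinct_identities(fake_backend.__getitem__, "sp_a", "sp_b"))
```

Comparing the two results against each other sidesteps the display_name versus UUID mismatch, at the cost of not checking which identity each caller actually received.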
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dhruv0811 and others added 5 commits March 2, 2026 15:05
Invoke pre-deployed Model Serving endpoint and Databricks App as two
different SPs, assert each sees their own identity via whoami() tool.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- App fixture: committed agent code so CI redeploys with latest on each run
- deploy_serving_agent.py: script to log + deploy ChatModel with OBO to serving endpoint
- Warm-start fixture: polls serving endpoint until scaled up before tests
- Remove -k TestAppsOBO filter — both Apps and Serving tests run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hatch couldn't find the package directory because the project name
didn't match any directory. Explicitly list agent_server and scripts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- whoami_serving_agent.py: ResponsesAgent using SQL Statement Execution
  with ModelServingUserCredentials for OBO
- deploy_serving_agent.py: logs with AuthPolicy + deploys with scale_to_zero
- Warehouse ID from env var (not hardcoded)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dhruv0811 force-pushed the obo-integration-tests branch from 4dfdae7 to 2f6ae09 on March 4, 2026 19:44
dhruv0811 and others added 8 commits March 4, 2026 11:50
- Remove databricks-openai from test deps (breaks core_test lowest-direct)
- Use pytest.importorskip instead
- Convert print() to logging in deploy script
- Fix ruff/format issues in all OBO files
- Remove hardcoded warehouse ID, use env var

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The serving endpoint returns the SP's UUID via SQL current_user(),
not the display_name. Use the client ID from env var which matches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
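
A stricter check, in the spirit of the commit above, compares the whoami() result with the caller's client ID read from the environment. The sketch below is an assumption-laden illustration: the function name and the `OBO_TEST_SP_A_CLIENT_ID` variable are hypothetical, and the case-insensitive comparison is a choice made here, not something the PR states.

```python
import os


def assert_sees_own_identity(whoami_result: str, client_id_env_var: str) -> None:
    """Check that whoami() returned the client ID stored in the given env var."""
    expected = os.environ.get(client_id_env_var, "").strip()
    if not expected:
        raise RuntimeError(f"{client_id_env_var} is not set")
    # UUIDs are hex strings, so compare case-insensitively (assumption).
    actual = whoami_result.strip()
    assert actual.lower() == expected.lower(), (
        f"expected identity {expected!r}, got {actual!r}"
    )


# Demo only: a made-up client ID under a hypothetical variable name.
os.environ["OBO_TEST_SP_A_CLIENT_ID"] = "11111111-1111-1111-1111-111111111111"
assert_sees_own_identity(
    "11111111-1111-1111-1111-111111111111\n", "OBO_TEST_SP_A_CLIENT_ID"
)
```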
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These are model artifacts and deploy scripts that use MLflow/agents
types not available in the core type checking environment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The serving env doesn't have OBO_TEST_WAREHOUSE_ID. The deploy script
now replaces the placeholder in the agent file before logging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
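
One way to substitute the warehouse ID before logging is sketched below. The placeholder token `__WAREHOUSE_ID__`, the function name, and the file names are hypothetical; the PR does not show the actual token or whether the file is rewritten in place. This version writes a rendered copy so the committed source keeps its placeholder.

```python
import tempfile
from pathlib import Path

PLACEHOLDER = "__WAREHOUSE_ID__"  # hypothetical token, not taken from the PR


def render_agent_source(template: Path, warehouse_id: str, out_dir: Path) -> Path:
    """Write a copy of the agent file with the placeholder filled in."""
    source = template.read_text(encoding="utf-8")
    if PLACEHOLDER not in source:
        # Fail loudly so a renamed placeholder does not ship unreplaced.
        raise ValueError(f"{PLACEHOLDER} not found in {template}")
    rendered = out_dir / template.name
    rendered.write_text(source.replace(PLACEHOLDER, warehouse_id), encoding="utf-8")
    return rendered


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    template = tmp_path / "agent_template.py"
    template.write_text('WAREHOUSE_ID = "__WAREHOUSE_ID__"\n', encoding="utf-8")
    out_dir = tmp_path / "rendered"
    out_dir.mkdir()
    rendered = render_agent_source(template, "abc123", out_dir)
    print(rendered.read_text(encoding="utf-8"))
```

The rendered file, rather than the template, would then be the one passed to the model logging step.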
…serving

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
agents.deploy() auto-derives endpoint name from UC model name.
Passing endpoint_name was creating a new endpoint instead of
updating the existing one. Match notebook pattern exactly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dhruv0811 requested a review from bbqiu March 5, 2026 00:07
bbqiu removed their request for review March 5, 2026 05:56
aravind-segu left a comment

A couple of comments, but looks good overall. Can I also get the URL to the jobs in the workspace?

- Consolidate invoke/stream into create_whoami_agent() helper
- Use auth_type="pat" instead of env var pop/restore hack
- Increase SQL warehouse wait_timeout to 120s for cold starts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dhruv0811 requested a review from aravind-segu March 6, 2026 23:57
- Increase warmup from 10 to 20 attempts (10 min total)
- Poll endpoint state via SDK before sending expensive LLM requests
- Only send real request once endpoint reports READY

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
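
The warm-up behaviour described in the commit above might look like the sketch below. It assumes a 30-second interval (inferred from 20 attempts over roughly 10 minutes) and takes a hypothetical `get_state` callable in place of the real SDK lookup, so no endpoint API is assumed here.

```python
import time
from typing import Callable


def wait_until_ready(
    get_state: Callable[[], str],
    attempts: int = 20,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll a cheap state check until it reports READY or attempts run out."""
    for attempt in range(attempts):
        if get_state() == "READY":
            return True
        # Skip the wait after the final attempt; nothing follows it.
        if attempt < attempts - 1:
            sleep(interval_s)
    return False


# Demo with a scripted sequence of states and a no-op sleep.
scripted = iter(["NOT_READY", "NOT_READY", "READY"])
if wait_until_ready(lambda: next(scripted), sleep=lambda _: None):
    print("endpoint ready; safe to send the real request")
```

Injecting `sleep` keeps the helper testable without real delays, and the expensive LLM request is sent only after the function returns `True`.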
dhruv0811 force-pushed the obo-integration-tests branch from 4997076 to 96e6b72 on March 10, 2026 00:30

2 participants