Add OBO credential flow integration tests by dhruv0811 · Pull Request #352 · databricks/databricks-ai-bridge

dhruv0811 · 2026-03-02T21:01:18Z

Summary

- End-to-end integration tests for OBO (On-Behalf-Of) identity forwarding across both the Model Serving and Databricks Apps CUJs
- Invokes pre-deployed agents as two different service principals and asserts that each sees its own identity via a whoami() UC function tool
- App fixture code is committed to the repo; CI redeploys the app with the latest code on each run, then stops it after the tests
- Serving endpoint deploy script for a weekly redeploy with the latest SDK (the deploy also enables scale-to-zero to reduce cost)
- Fixes a typo in the ModelServingUserCredentials docstring (credential_strategy → credentials_strategy)

Test plan

OBO Test CI: https://github.com/databricks-eng/ai-oss-integration-tests-runner/actions/runs/22693563383
Model Serving Re-deploy CI: https://github.com/databricks-eng/ai-oss-integration-tests-runner/actions/runs/22694859161

Test that identity is forwarded correctly through both the Model Serving (ModelServingUserCredentials) and Databricks Apps (direct token) OBO paths using two different service principals and a whoami() UC function.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…t kwarg

The docstring had a typo (credential_strategy vs credentials_strategy). Fixed both the test and the source docstring to use the correct parameter name that WorkspaceClient actually accepts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- App fixture: committed agent code so CI redeploys with latest on each run
- deploy_serving_agent.py: script to log + deploy ChatModel with OBO to serving endpoint
- Warm-start fixture: polls serving endpoint until scaled up before tests
- Remove -k TestAppsOBO filter; both Apps and Serving tests run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- whoami_serving_agent.py: ResponsesAgent using SQL Statement Execution with ModelServingUserCredentials for OBO
- deploy_serving_agent.py: logs with AuthPolicy + deploys with scale_to_zero
- Warehouse ID from env var (not hardcoded)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Remove databricks-openai from test deps (breaks core_test lowest-direct)
- Use pytest.importorskip instead
- Convert print() to logging in deploy script
- Fix ruff/format issues in all OBO files
- Remove hardcoded warehouse ID, use env var

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

agents.deploy() auto-derives the endpoint name from the UC model name. Passing endpoint_name was creating a new endpoint instead of updating the existing one. Match notebook pattern exactly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

aravind-segu

A couple of comments, but looks good overall. Can I also get the URLs to the jobs in the workspace?

tests/integration_tests/obo/app_fixture/agent_server/agent.py

tests/integration_tests/obo/app_fixture/agent_server/utils.py

tests/integration_tests/obo/model_serving_fixture/whoami_serving_agent.py

tests/integration_tests/obo/deploy_serving_agent.py

dhruv0811 · 2026-03-06T23:51:23Z

Relevant links to compute, @aravind-segu:

Model Serving Endpoint: https://ai-oss-ecosystem-integration-testing.cloud.databricks.com/ml/endpoints/agents_integration_testing-databricks_ai_bridge_mcp_test-test_e?o=3272836215725701

App: https://ai-oss-ecosystem-integration-testing.cloud.databricks.com/apps/agent-obo-test?o=3272836215725701

Relevant links to Job Runs in ai-oss CI:

OBO Test CI: https://github.com/databricks-eng/ai-oss-integration-tests-runner/actions/runs/22786939289

Model Serving Re-deploy CI: https://github.com/databricks-eng/ai-oss-integration-tests-runner/actions/runs/22694859161

- Increase warmup from 10 to 20 attempts (10 min total)
- Poll endpoint state via SDK before sending expensive LLM requests
- Only send real request once endpoint reports READY

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
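The warm-up behaviour described in this commit could look roughly like the sketch below. It is not the fixture from the PR: the function name `wait_until_ready`, the injected `get_state` callable, and the 30-second interval (inferred from "20 attempts, 10 min total") are all assumptions made for illustration. In the real test, `get_state` would wrap whatever SDK call reports the endpoint state.

```python
import time
from typing import Callable


def wait_until_ready(
    get_state: Callable[[], str],
    attempts: int = 20,
    interval_s: float = 30.0,  # assumed: 20 attempts at 30 s is roughly 10 minutes
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_state() until it returns "READY" or the attempts run out.

    Returns True as soon as the endpoint reports READY, False otherwise.
    No request is sent to the agent itself; only the cheap state check runs.
    """
    for attempt in range(1, attempts + 1):
        if get_state() == "READY":
            return True
        if attempt < attempts:  # no point sleeping after the final attempt
            sleep(interval_s)
    return False


# Usage with a fake state source and a no-op sleep, so the example runs instantly.
fake_states = iter(["NOT_READY", "NOT_READY", "READY"])
endpoint_ready = wait_until_ready(lambda: next(fake_states), sleep=lambda _s: None)
```

Injecting `get_state` and `sleep` keeps the polling logic testable without a live endpoint; the expensive LLM request would only be issued after this returns True.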

dhruv0811 and others added 6 commits March 2, 2026 12:00

Fix: use Config(credentials_strategy=...) for ModelServingUserCredent…

f7ab1ca

…ials

WorkspaceClient doesn't accept credential_strategy directly. Use Config object as shown in the existing unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fix whoami assertions: compare deployer vs end-user SQL results directly

1f189c1

The SQL current_user() returns the SP's UUID, not its display_name. Compare the two whoami() results against each other instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Format test file with ruff

1b5a825

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fix type checker errors: add None guards for SDK optional types

d6a6208

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

dhruv0811 requested review from annzhang-db and aravind-segu March 2, 2026 21:58

dhruv0811 and others added 5 commits March 2, 2026 15:05

Replace simulated OBO tests with end-to-end agent invocation tests

8876ea4

Invoke pre-deployed Model Serving endpoint and Databricks App as two different SPs, assert each sees their own identity via whoami() tool.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add databricks-openai to test dependencies for OBO e2e tests

0aafbae

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fix app fixture: add hatch wheel packages config

11609a9

Hatch couldn't find the package directory because the project name didn't match any directory. Explicitly list agent_server and scripts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

dhruv0811 force-pushed the obo-integration-tests branch from 4dfdae7 to 2f6ae09 Compare March 4, 2026 19:44

dhruv0811 and others added 8 commits March 4, 2026 11:50

Fix SP-B identity check: use OBO_TEST_CLIENT_ID directly

eefadf0

The serving endpoint returns the SP's UUID via SQL current_user(), not the display_name. Use the client ID from env var which matches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
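A minimal sketch of the identity check these commits describe: the two callers must resolve to different identities, and the second caller's identity must equal its client ID (a UUID), not a display name. The helper name `check_obo_identities` and the sample UUIDs are made up for illustration, and the sketch assumes the whoami() tool output has already been reduced to a bare identity string. In the actual test the expected value comes from the OBO_TEST_CLIENT_ID environment variable.

```python
def check_obo_identities(whoami_a: str, whoami_b: str, client_id_b: str) -> None:
    """Raise AssertionError unless the callers saw distinct identities
    and caller B's identity matches its expected client ID."""
    a = whoami_a.strip().lower()
    b = whoami_b.strip().lower()
    expected_b = client_id_b.strip().lower()

    assert a and b, "whoami() returned an empty identity"
    assert a != b, "both callers resolved to the same identity"
    assert b == expected_b, f"caller B saw {b!r}, expected {expected_b!r}"


# Made-up UUIDs standing in for the two service principals' client IDs.
SP_A = "11111111-1111-4111-8111-111111111111"
SP_B = "22222222-2222-4222-8222-222222222222"

# Passes silently: the identities differ and B matches its client ID.
check_obo_identities(SP_A, SP_B, client_id_b=SP_B)
```

If the agent ran every request as the deployer, both results would be identical and the second assertion would fail, which is the failure mode the OBO tests are meant to catch.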

Move whoami_serving_agent.py into model_serving_fixture/

a4701f7

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Exclude OBO fixture/deploy files from type checking

fe4feca

These are model artifacts and deploy scripts that use MLflow/agents types not available in the core type checking environment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Inject warehouse ID at deploy time instead of reading env at import

47bb76b

The serving env doesn't have OBO_TEST_WAREHOUSE_ID. The deploy script now replaces the placeholder in the agent file before logging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
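One way to picture the deploy-time injection described here is the sketch below. It is an illustration only: the placeholder token `__WAREHOUSE_ID__`, the function `render_agent_file`, and the file names are hypothetical, since the PR text does not show the real placeholder or script internals. The idea is that the agent file carries a literal token, and the deploy step writes a rendered copy with the real warehouse ID before the model is logged.

```python
import tempfile
from pathlib import Path

# Hypothetical token; the real placeholder used in the PR is not shown.
PLACEHOLDER = "__WAREHOUSE_ID__"


def render_agent_file(template: Path, warehouse_id: str, out_dir: Path) -> Path:
    """Write a copy of the agent file with the placeholder replaced.

    Fails loudly if the placeholder is missing, so a stale template
    cannot be logged with no warehouse ID baked in.
    """
    source = template.read_text(encoding="utf-8")
    if PLACEHOLDER not in source:
        raise ValueError(f"{template} does not contain {PLACEHOLDER!r}")
    rendered = out_dir / template.name
    rendered.write_text(source.replace(PLACEHOLDER, warehouse_id), encoding="utf-8")
    return rendered


def demo() -> str:
    """Render a throwaway template in a temp directory and return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        template = root / "agent_template.py"
        template.write_text('WAREHOUSE_ID = "__WAREHOUSE_ID__"\n', encoding="utf-8")
        out_dir = root / "rendered"
        out_dir.mkdir()
        rendered = render_agent_file(template, "abc123", out_dir)
        return rendered.read_text(encoding="utf-8")
```

Baking the value in at deploy time means the served model does not depend on an environment variable that only exists in CI.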

Fix app whoami tool: return user_name (UUID for SPs) for parity with …

363611d

…serving

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fix ruff: remove unused imports (shutil, os)

5d2787c

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

dhruv0811 requested a review from bbqiu March 5, 2026 00:07

Merge remote-tracking branch 'origin/main' into obo-integration-tests

54494aa

bbqiu removed their request for review March 5, 2026 05:56

aravind-segu reviewed Mar 6, 2026

Address PR review comments

7e17920

- Consolidate invoke/stream into create_whoami_agent() helper
- Use auth_type="pat" instead of env var pop/restore hack
- Increase SQL warehouse wait_timeout to 120s for cold starts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

dhruv0811 requested a review from aravind-segu March 6, 2026 23:57

dhruv0811 force-pushed the obo-integration-tests branch from 4997076 to 96e6b72 Compare March 10, 2026 00:30

dhruv0811 wants to merge 22 commits into main from obo-integration-tests
