Skip to content

feat(provider): Integrate vLLM Semantic Router as a model provider#33

Merged
Xunzhuo merged 1 commit into
agentic-in:mainfrom
FAUST-BENCHOU:feat/semantic-provider
May 21, 2026
Merged

feat(provider): Integrate vLLM Semantic Router as a model provider#33
Xunzhuo merged 1 commit into
agentic-in:mainfrom
FAUST-BENCHOU:feat/semantic-provider

Conversation

@FAUST-BENCHOU
Copy link
Copy Markdown

@FAUST-BENCHOU FAUST-BENCHOU commented May 18, 2026

Summary

Why Now

Linked Issue

Affected Surfaces

  • repo-docs
  • agent-text
  • agent-exec
  • release-ops
  • app-scaffold

Validation

(base) zhoujinyu@zhoujinyudeMacBook-Air elephant-agent % python -m pytest tests/unit/models/test_vllm_semantic_router_provider.py tests/integration/models_auth/test_vllm_semantic_router_provider.py -q

/Users/zhoujinyu/miniconda3/lib/python3.12/site-packages/requests/__init__.py:113: RequestsDependencyWarning: urllib3 (2.3.0) or chardet (7.4.2)/charset_normalizer (3.4.4) doesn't match a supported version!
  warnings.warn(
..........                                                                                        [100%]
10 passed in 21.17s

i use a most simple semantic-router config.yml like

version: v0.3

listeners:
  - name: http-8899
    address: 0.0.0.0
    port: 8899
    timeout: 300s

providers:
  defaults:
    default_model: deepseek-chat
  models:
    - name: deepseek-chat
      provider_model_id: deepseek-chat
      backend_refs:
        - name: primary
          base_url: https://api.deepseek.com/v1
          provider: openai
          protocol: https
          weight: 100
          api_key_env: OPENAI_API_KEY

routing:
  modelCards:
    - name: deepseek-chat
      modality: text
  decisions:
    - name: default-route
      description: Route all requests to deepseek-chat
      priority: 100
      rules:
        operator: AND
        conditions: []
      modelRefs:
        - model: deepseek-chat
          use_reasoning: false

And after elephant provider configure to my own config it is like

(base) zhoujinyu@zhoujinyudeMacBook-Air elephant-agent % elephant provider status

╭──────────────────────────────────────  🐘 Provider status  ───────────────────────────────────────╮
│                                                                                                   │
│  🐘 Provider status                                                                               │
│  The active provider and model posture Elephant Agent will use for the next turn.                 │
│                                                                                                   │
│                                             /  \~~~/  \                                           │
│                                           (     ..    )---.                                       │
│                                            \__     __/    \                                       │
│                                             )|  /)         |                                      │
│                                            / | / /~~~\    /                                       │
│                                           '-'-'     `---'                                         │
│                                                                                                   │
│  Provider                                                                                         │
│  • provider_id · vllm-semantic-router                                                             │
│  • display_name · vLLM Semantic Router                                                            │
│  • base_url · http://127.0.0.1:8899/v1                                                            │
│  • transport · OpenAI Chat-Compatible                                                             │
│  • secret_status · not-required                                                                   │
│  • secret_source · not-required                                                                   │
│  • discovery_status · configured                                                                  │
│  • discovery_source · profile                                                                     │
│                                                                                                   │
│                                                                                                   │
│  Model selection                                                                                  │
│  • model · deepseek-chat                                                                          │
│  • context_window_tokens · 163840                                                                 │
│  • context_window_mode · auto                                                                     │
│  • reasoning_effort · <unset>                                                                     │
│  • reasoning_efforts · <none>                                                                     │
│                                                                                                   │
│                                                                                                   │
│  Embedding selection                                                                              │
│  • source · local-default                                                                         │
│  • provider_id · elephant-local-embed                                                             │
│  • model_id · llm-semantic-router/elephant-embeddings-v1-text-small                               │
│  • dimensions · 256                                                                               │
│  • base_url · <unset>                                                                             │
│  • secret_status · not-required                                                                   │
│  • embedding_bootstrap_status · ready                                                             │
│  • embedding_bootstrap_ready · ready                                                              │
│  • embedding_bootstrap_summary · local embedding root is available at /Users/zhoujinyu/.elephant  │
│                                                                                                   │
│                                                                                                   │
│  Background bootstrap                                                                             │
│  • elephant-embed is ready for local semantic recall.                                             │
│  • elephant status will continue to show this path as ready.                                      │
│                                                                                                   │
│                                                                                                   │
│  Next invocations                                                                                 │
│  🧩 elephant provider                                                                             │
│  🧩 elephant provider models                                                                      │
│  🧩 elephant provider embeddings status                                                           │
│                                                                                                   │
│                                                                                                   │

And through adpter we can get

{
  "vsr_response_header:x-vsr-selected-decision": "default-route",
  "vsr_response_header:x-vsr-selected-reasoning": "off",
  "vsr_response_header:x-vsr-selected-model": "deepseek-chat",
  "vsr_response_header:x-vsr-injected-system-prompt": "false"
}

these for elephant-agnent for now

Checklist

  • Ran make agent-report CHANGED_FILES="..." or provided the equivalent changed-file context
  • Updated the matching source of truth when docs, workflows, templates, or executable rules changed
  • Kept the PR to one atomic behavior, harness, or documentation unit
  • Commits are signed and use scoped Conventional Commit subjects

Follow-Ups Or Debt

@netlify
Copy link
Copy Markdown

netlify Bot commented May 18, 2026

Deploy Preview for rad-granita-26ed35 ready!

Name Link
🔨 Latest commit 8b51aca
🔍 Latest deploy log https://app.netlify.com/projects/rad-granita-26ed35/deploys/6a0adbc125bfdf0008442848
😎 Deploy Preview https://deploy-preview-33--rad-granita-26ed35.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

@FAUST-BENCHOU FAUST-BENCHOU changed the title feat(provider): Integrate vLLM Semantic Router as a model provider [WIP]feat(provider): Integrate vLLM Semantic Router as a model provider May 18, 2026
@FAUST-BENCHOU FAUST-BENCHOU marked this pull request as ready for review May 18, 2026 07:37
@FAUST-BENCHOU FAUST-BENCHOU changed the title [WIP]feat(provider): Integrate vLLM Semantic Router as a model provider feat(provider): Integrate vLLM Semantic Router as a model provider May 18, 2026
@FAUST-BENCHOU
Copy link
Copy Markdown
Author

/cc @Xunzhuo ptal.Btw i also found some integration tests failing ,which has nothing to do with my changes.

(base) zhoujinyu@zhoujinyudeMacBook-Air elephant-agent % python -m pytest tests/integration/models_auth/ -v

/Users/zhoujinyu/miniconda3/lib/python3.12/site-packages/requests/__init__.py:113: RequestsDependencyWarning: urllib3 (2.3.0) or chardet (7.4.2)/charset_normalizer (3.4.4) doesn't match a supported version!
  warnings.warn(
========================================== test session starts ==========================================
platform darwin -- Python 3.12.2, pytest-8.0.0, pluggy-1.6.0 -- /Users/zhoujinyu/miniconda3/bin/python
cachedir: .pytest_cache
rootdir: /Users/zhoujinyu/CursorProject/elephant-agent
plugins: flask-1.3.0, langsmith-0.8.3, mock-3.14.0, hydra-core-1.3.2, cov-7.0.0, anyio-4.13.0
collected 70 items                                                                                      

tests/integration/models_auth/test_anthropic_provider.py::AnthropicProviderAdapterTests::test_capability_bridge_uses_shared_runtime_contract PASSED [  1%]
tests/integration/models_auth/test_anthropic_provider.py::AnthropicProviderAdapterTests::test_copilot_claude_uses_bearer_auth_and_default_headers PASSED [  2%]
tests/integration/models_auth/test_anthropic_provider.py::AnthropicProviderAdapterTests::test_embed_is_explicitly_unsupported PASSED [  4%]
tests/integration/models_auth/test_anthropic_provider.py::AnthropicProviderAdapterTests::test_generate_parses_native_tool_use_blocks PASSED [  5%]
tests/integration/models_auth/test_anthropic_provider.py::AnthropicProviderAdapterTests::test_generate_returns_native_result_without_leaking_secret_material PASSED [  7%]
tests/integration/models_auth/test_anthropic_provider.py::AnthropicProviderAdapterTests::test_native_request_preserves_history_and_tool_result_blocks PASSED [  8%]
tests/integration/models_auth/test_anthropic_provider.py::AnthropicProviderAdapterTests::test_native_request_uses_anthropic_messages_shape PASSED [ 10%]
tests/integration/models_auth/test_anthropic_provider.py::AnthropicProviderAdapterTests::test_native_request_uses_bearer_headers_for_anthropic_oauth_tokens PASSED [ 11%]
tests/integration/models_auth/test_anthropic_provider.py::AnthropicProviderAdapterTests::test_reasoning_effort_maps_to_thinking_payload PASSED [ 12%]
tests/integration/models_auth/test_anthropic_provider.py::AnthropicProviderAdapterTests::test_session_header_does_not_override_explicit_extra_header PASSED [ 14%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_api_provider_list_surfaces_codex_and_copilot_discovery PASSED [ 15%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_auth_profile_factory_supports_compatible_endpoint_inputs PASSED [ 17%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_auth_profiles_persist_provider_metadata_and_secret_references PASSED [ 18%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_model_registry_routes_provider_neutral_adapters PASSED [ 20%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_model_request_can_be_constructed_for_preview_runtime PASSED [ 21%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_preview_auth_provider_selects_provider_profile PASSED [ 22%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_preview_model_capability_uses_resolved_credentials PASSED [ 24%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_provider_auth_state_persists_discovery_metadata FAILED [ 25%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_provider_runtime_lists_catalog_and_guided_setup PASSED [ 27%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_provider_runtime_resolves_transport_and_runtime_metadata PASSED [ 28%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_secret_reference_resolution_returns_redacted_bundle PASSED [ 30%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_adds_tool_fallback_prompt_without_native_tool_calling PASSED [ 31%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_detects_copilot_claude_context_with_bearer_auth PASSED [ 32%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_detects_ollama_runtime_context_from_show_api PASSED [ 34%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_discovers_claude_code_from_local_credentials PASSED [ 35%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_discovers_copilot_acp_process PASSED [ 37%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_discovers_copilot_models_from_provider_specific_catalog_path PASSED [ 38%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_discovers_copilot_skips_classic_pat_env PASSED [ 40%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_discovers_external_provider_credentials PASSED [ 41%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_discovers_models_with_saved_non_active_provider_key PASSED [ 42%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_does_not_invent_placeholder_models_for_openai_compatible PASSED [ 44%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_falls_back_to_curated_codex_models_when_live_probe_fails PASSED [ 45%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_includes_enabled_custom_mcp_tools_in_model_request PASSED [ 47%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_uses_model_specific_context_hints_when_live_probe_fails PASSED [ 48%]
tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_surface_runtime_uses_models_dev_fallback_after_endpoint_metadata_miss PASSED [ 50%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_chat_request_flattens_all_system_context_into_one_system_message PASSED [ 51%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_chat_request_preserves_history_and_tool_result_roles PASSED [ 52%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_chat_requests_accept_base_url_without_v1_suffix PASSED [ 54%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_chat_transport_strips_tagged_reasoning_from_final_content PASSED [ 55%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_codex_responses_backfills_completed_response_from_stream_items PASSED [ 57%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_codex_responses_does_not_duplicate_output_text_done_content PASSED [ 58%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_codex_responses_omits_internal_metadata_from_request_payload PASSED [ 60%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_copilot_sanitizes_tool_schema_for_strict_function_contracts PASSED [ 61%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_embed_requests_use_the_shared_compatible_transport PASSED [ 62%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_generate_streams_and_parses_native_tool_calls PASSED [ 64%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_generate_streams_chat_completions_when_observer_is_present PASSED [ 65%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_plans_chat_requests_with_custom_base_url_and_headers FAILED [ 67%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_rendered_prompt_is_forwarded_without_provider_guardrail_prepended PASSED [ 68%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_responses_stream_reasoning_collapses_fragmented_newlines_without_breaking_mixed_language_text PASSED [ 70%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_responses_stream_reasoning_is_split_from_final_answer PASSED [ 71%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_responses_stream_reasoning_prioritizes_spaces_and_uses_completed_reasoning_when_available PASSED [ 72%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_responses_strict_schema_adds_array_items_for_tool_properties PASSED [ 74%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_responses_transport_includes_reasoning_effort_when_supported PASSED [ 75%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_responses_transport_parses_native_tool_calls PASSED [ 77%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_session_header_does_not_override_explicit_extra_header PASSED [ 78%]
tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_usage_accepts_openai_compatible_cache_token_aliases PASSED [ 80%]
tests/integration/models_auth/test_openai_compatible_provider.py::UrllibJSONHTTPTransportFallbackTests::test_default_timeout_allows_long_live_model_responses PASSED [ 81%]
tests/integration/models_auth/test_openai_compatible_provider.py::UrllibJSONHTTPTransportFallbackTests::test_html_http_errors_are_summarized_with_codex_reauth_hint FAILED [ 82%]
tests/integration/models_auth/test_openai_compatible_provider.py::UrllibJSONHTTPTransportFallbackTests::test_retries_with_curl_on_tls_unexpected_eof PASSED [ 84%]
tests/integration/models_auth/test_openai_compatible_provider.py::UrllibJSONHTTPTransportFallbackTests::test_retries_with_curl_on_tls_version_mismatch PASSED [ 85%]
tests/integration/models_auth/test_openai_compatible_provider.py::UrllibJSONHTTPTransportFallbackTests::test_stream_retries_with_curl_on_tls_unexpected_eof PASSED [ 87%]
tests/integration/models_auth/test_openai_provider.py::OpenAIProviderAdapterTests::test_openai_adapter_metadata_is_exportable PASSED [ 88%]
tests/integration/models_auth/test_openai_provider.py::OpenAIProviderAdapterTests::test_openai_credentials_flow_reuses_shared_secret_resolution PASSED [ 90%]
tests/integration/models_auth/test_openai_provider.py::OpenAIProviderAdapterTests::test_openai_profile_builder_preserves_first_party_metadata PASSED [ 91%]
tests/integration/models_auth/test_openai_provider.py::OpenAIProviderAdapterTests::test_openai_provider_exposes_reasoning_for_gpt5_models PASSED [ 92%]
tests/integration/models_auth/test_openai_provider.py::OpenAIProviderAdapterTests::test_openai_provider_uses_first_party_responses_transport FAILED [ 94%]
tests/integration/models_auth/test_vllm_semantic_router_provider.py::VllmSemanticRouterProviderIntegrationTests::test_build_model_adapter_registry_selects_vsr_builder PASSED [ 95%]
tests/integration/models_auth/test_vllm_semantic_router_provider.py::VllmSemanticRouterProviderIntegrationTests::test_integration_fallback_on_router_503 PASSED [ 97%]
tests/integration/models_auth/test_vllm_semantic_router_provider.py::VllmSemanticRouterProviderIntegrationTests::test_integration_routes_chat_and_records_vsr_metadata PASSED [ 98%]
tests/integration/models_auth/test_vllm_semantic_router_provider.py::VllmSemanticRouterProviderIntegrationTests::test_integration_sends_routing_policy_header PASSED [100%]

=============================================== FAILURES ================================================
____________ ModelsAuthIntegrationTests.test_provider_auth_state_persists_discovery_metadata ____________

self = <models_auth.test_models_auth_integration.ModelsAuthIntegrationTests testMethod=test_provider_auth_state_persists_discovery_metadata>

    def test_provider_auth_state_persists_discovery_metadata(self) -> None:
        database_path = Path(self.tempdir.name) / "auth-state.sqlite3"
        repository = RuntimeStorageRepository(database_path)
        repository.bootstrap()
    
        repository.upsert_provider_auth_state(
            ProviderAuthState(
                provider_id="copilot",
                auth_type="api_key",
                status="authenticated",
                source="gh-cli",
                transport_id="openai_responses",
                provider_kind="aggregator",
                base_url="https://api.githubcopilot.com",
                default_model="gpt-5.4",
                runtime_enabled=True,
                summary="authenticated via gh-cli",
                metadata={"reasoning_efforts": "minimal,low,medium,high"},
                discovered_at=datetime(2026, 4, 13),
                updated_at=datetime(2026, 4, 13),
            )
        )
    
        loaded = repository.load_provider_auth_state("copilot")
>       self.assertIsNotNone(loaded)
E       AssertionError: unexpectedly None

tests/integration/models_auth/test_models_auth_integration.py:736: AssertionError
________ OpenAICompatibleProviderTests.test_plans_chat_requests_with_custom_base_url_and_headers ________

self = <models_auth.test_openai_compatible_provider.OpenAICompatibleProviderTests testMethod=test_plans_chat_requests_with_custom_base_url_and_headers>

    def test_plans_chat_requests_with_custom_base_url_and_headers(self) -> None:
        adapter = OpenAICompatibleProviderAdapter(
            config=OpenAICompatibleProviderConfig(
                provider_id="openai-compatible",
                base_url=self.server.openai_base_url,
                model_id="openai/gpt-4o-mini",
                extra_headers={"x-tenant": "elephant"},
            ),
            runtime_resolver=ProviderRuntimeResolver.default(),
            credential_source=_StaticCredentialSource(
                {"openai-compatible": {"api_key": "sk-test-123"}}
            ),
        )
        request = ModelRequest(
            request_id="request-1",
            profile_id="profile-companion",
            session_id="session-1",
            provider_id="openai-compatible",
            model_id="openai/gpt-4o-mini",
            prompt="Summarize the provider runtime.",
            metadata={"trace_id": "trace-123"},
        )
    
        plan = adapter.plan_request(request)
        result = adapter.generate(request, {"api_key": "sk-test-123"})
    
        self.assertEqual(plan.url, self.server.openai_base_url + "/chat/completions")
        self.assertEqual(plan.request_family, "chat_completions")
        self.assertEqual(plan.transport_id, "openai_chat_compatible")
        self.assertEqual(plan.headers["Authorization"], "Bearer sk-test-123")
        self.assertEqual(plan.headers["x-tenant"], "elephant")
        self.assertEqual(plan.headers["x-session-id"], "session-1")
        self.assertEqual(plan.payload["model"], "openai/gpt-4o-mini")
        self.assertEqual(plan.payload["messages"][0]["role"], "system")
>       self.assertIn("### System Layer Contract", plan.payload["messages"][0]["content"])
E       AssertionError: '### System Layer Contract' not found in "#### Understanding System\n- You are the active Elephant Agent identity for one durable elephant.\n- Personal Model is the durable understanding layer: active claims grouped by Identity, World, Pulse, and Journey.\n- Evidence explains why a claim exists; it is recalled only when useful and is not prompt truth by itself.\n- Elephant State is identity plus a background-learned continuation note; live commitments belong in Episode, Step, recall, or explicit task tools.\n- Episode is the current wake/open runtime window; Step is one atomic event inside it.\n- This prompt is the current Episode projection, not the durable source of truth.\n- Stay truthful and bounded; never fake recall, certainty, capability, intimacy, or identity.\n#### Episode Continuity\n- Use active Personal Model claims, the Elephant context note, Episode summaries, and current-turn recall before asking the user to repeat context.\n- Keep the active elephant explicit when there is real ongoing work; otherwise let the user's current message set the pace.\n- Do not promise a hidden planner or background workflow that is not represented in Episode, Step, or tools.\n- Treat the Episode resume snapshot as stable during the wake window and current-turn recall support attachments as current-turn evidence only.\n- Keep updates concise, inspectable, and tied to one Personal Model lens/topic or Elephant context note.\n#### Session Work\n- Ongoing work continuity lives in Episode summaries and current-turn recall, not in a durable blocker/next-step board.\n- Use `tool.todo.manage` only as an in-session execution board when the active task benefits from explicit step tracking.\n- Do not use todos for greetings, biography, identity facts, preferences, relationship notes, ordinary social chat, one-off answers, or completed-work logs.\n#### Understanding tools\n- Use tools silently when needed; do not narrate routing, storage, or internal state mechanics unless the user asks.\n- Use `tool.personal_model.search` for durable claims and `tool.personal_model.update` for user-stated changes.\n- If the user explicitly asks you to remember, save, note, or keep a durable personal fact, call `tool.personal_model.update` before replying; do not say it was remembered unless the update tool succeeded.\n- Use `tool.conversation.search` for prior conversation history: mode=discover finds relevant ranges; mode=recall returns details from a selected range.\n- Be patient with time wording. If the user says yesterday, last night, this morning, recently, or gives dates, first construct top-level `expr` carefully (`last_night`, `yesterday`, `last:3d`, `this:week`, or an ISO interval); do not run discover without `expr` or explicit `start_at`/`end_at`.\n- Prefer mode=discover for broad windows, then copy the returned range `start_at`, `end_at`, and `timezone` into mode=recall; keep default `view=conversation` and do not include the current episode for historical recall.\n- Keep durable writes small, human-legible, grounded in the user's words, and owned by one lens/topic.\n- Use `tool.personal_model.questions` only when one timely question would improve future help.\n- Route in-session execution boards through `tool.todo.manage`.\n- If a default elephant file path is provided, use it for user-requested files, downloads, repositories, and generated artifacts unless the user gives another path."

tests/integration/models_auth/test_openai_compatible_provider.py:592: AssertionError
___ UrllibJSONHTTPTransportFallbackTests.test_html_http_errors_are_summarized_with_codex_reauth_hint ____

self = <models_auth.test_openai_compatible_provider.UrllibJSONHTTPTransportFallbackTests testMethod=test_html_http_errors_are_summarized_with_codex_reauth_hint>

    def test_html_http_errors_are_summarized_with_codex_reauth_hint(self) -> None:
        transport = UrllibJSONHTTPTransport()
        exc = error.HTTPError(
            url="https://chatgpt.com/backend-api/codex/v1/responses",
            code=403,
            msg="Forbidden",
            hdrs=None,
            fp=io.BytesIO(
                b"<html><head><title>Forbidden</title></head><body><h1>Forbidden</h1><p>Access denied.</p></body></html>"
            ),
        )
    
>       message = transport._error_message(exc, url="https://chatgpt.com/backend-api/codex/v1/responses")
E       AttributeError: 'UrllibJSONHTTPTransport' object has no attribute '_error_message'

tests/integration/models_auth/test_openai_compatible_provider.py:1345: AttributeError
_________ OpenAIProviderAdapterTests.test_openai_provider_uses_first_party_responses_transport __________

self = <models_auth.test_openai_provider.OpenAIProviderAdapterTests testMethod=test_openai_provider_uses_first_party_responses_transport>

    def test_openai_provider_uses_first_party_responses_transport(self) -> None:
        adapter = OPENAI.OpenAIProviderAdapter()
    
        manifest = adapter.manifest
        resolution = adapter.runtime_resolution(model_id="gpt-4.1", base_url="https://api.openai.com/v1")
        guide = adapter.setup_guide()
    
        self.assertEqual(manifest.provider_id, "openai")
        self.assertEqual(manifest.transport_id, "openai_responses")
        self.assertEqual(resolution.transport_id, "openai_responses")
        self.assertEqual(resolution.request_family, "responses")
>       self.assertFalse(resolution.supports_streaming)
E       AssertionError: True is not false

tests/integration/models_auth/test_openai_provider.py:35: AssertionError
======================================== short test summary info ========================================
FAILED tests/integration/models_auth/test_models_auth_integration.py::ModelsAuthIntegrationTests::test_provider_auth_state_persists_discovery_metadata - AssertionError: unexpectedly None
FAILED tests/integration/models_auth/test_openai_compatible_provider.py::OpenAICompatibleProviderTests::test_plans_chat_requests_with_custom_base_url_and_headers - AssertionError: '### System Layer Contract' not found in "#### Understanding System\n- You are the a...
FAILED tests/integration/models_auth/test_openai_compatible_provider.py::UrllibJSONHTTPTransportFallbackTests::test_html_http_errors_are_summarized_with_codex_reauth_hint - AttributeError: 'UrllibJSONHTTPTransport' object has no attribute '_error_message'
FAILED tests/integration/models_auth/test_openai_provider.py::OpenAIProviderAdapterTests::test_openai_provider_uses_first_party_responses_transport - AssertionError: True is not false
===================================== 4 failed, 66 passed in 37.62s ===============

Though for now we dont get CI to test these.But I think we should fix it first. I'll check what caused the error later on.

@Xunzhuo
Copy link
Copy Markdown
Contributor

Xunzhuo commented May 18, 2026

@FAUST-BENCHOU can you rebase and check if has any errors?

@FAUST-BENCHOU FAUST-BENCHOU force-pushed the feat/semantic-provider branch from 4571146 to 780c10f Compare May 18, 2026 09:24
@FAUST-BENCHOU
Copy link
Copy Markdown
Author

FAUST-BENCHOU commented May 18, 2026

ya sure.

Signed-off-by: zhoujinyu <2319109590@qq.com>
@FAUST-BENCHOU FAUST-BENCHOU force-pushed the feat/semantic-provider branch from 780c10f to 8b51aca Compare May 18, 2026 09:28
@Xunzhuo
Copy link
Copy Markdown
Contributor

Xunzhuo commented May 21, 2026

@FAUST-BENCHOU more integrations and UX improves can be added later

@Xunzhuo
Copy link
Copy Markdown
Contributor

Xunzhuo commented May 21, 2026

can you add this into macos app as well? if you are using mac
Clipboard_Screenshot_1779372814

@Xunzhuo Xunzhuo merged commit 95f483a into agentic-in:main May 21, 2026
5 checks passed
@FAUST-BENCHOU
Copy link
Copy Markdown
Author

ya i'm using mac.i will add this.Thanks for your review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Integrate vLLM Semantic Router as a model provider

2 participants