LCORE-332: Lightspeed core needs to fully support WatsonX LLM provider #943
- added override model and provider to e2e tests
- patched conversation tests to include model and provider in the call
Walkthrough

This PR adds comprehensive WatsonX provider support to the Llama Stack project: a watsonx environment matrix entry in the CI/CD workflow, docker-compose environment variable additions, new example and E2E test configurations, documentation updates, and E2E test enhancements that support explicit model/provider parameters with an environment override capability.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
PS: Please ignore issues with the VertexAI provider, as they are independent of these changes; the project has been migrated by the cloud team, and I will need to talk to them to re-enable it.
Actionable comments posted: 0
🧹 Nitpick comments (6)
.github/workflows/e2e_tests.yaml (1)
263-268: Document the watsonx override as temporary technical debt.

The override `E2E_DEFAULT_MODEL_OVERRIDE=watsonx/watsonx/meta-llama/llama-3-3-70b-instruct` duplicates the provider name, which is a workaround for WatsonX's model identifier format containing `/`. Per the PR description, this is intentional until an upstream fix is applied. Consider tracking this as technical debt with a TODO comment or issue reference to ensure it's revisited when the upstream fix is available.
🔎 Suggested improvement

```diff
 # watsonx has a different convention than "<provider>/<model>"
+# TODO(LCORE-XXX): Remove this workaround once Llama Stack properly handles WatsonX model identifiers with embedded slashes
 - name: Set watsonx test overrides
   if: matrix.environment == 'watsonx'
   run: |
     echo "E2E_DEFAULT_MODEL_OVERRIDE=watsonx/watsonx/meta-llama/llama-3-3-70b-instruct" >> $GITHUB_ENV
     echo "E2E_DEFAULT_PROVIDER_OVERRIDE=watsonx" >> $GITHUB_ENV
```

tests/e2e/features/environment.py (1)
61-86: Consider validating partial override configuration.

The current logic requires both `E2E_DEFAULT_MODEL_OVERRIDE` and `E2E_DEFAULT_PROVIDER_OVERRIDE` to be set (line 65). If only one is provided, both are silently ignored and the code falls back to service detection. This could be confusing for users who expect a partial override to work. Consider adding a warning when only one override is set, or raising an error if they should be used together.
🔎 Proposed validation for partial overrides

```diff
 # Check for environment variable overrides first
 model_override = os.getenv("E2E_DEFAULT_MODEL_OVERRIDE")
 provider_override = os.getenv("E2E_DEFAULT_PROVIDER_OVERRIDE")
+# Validate that overrides are provided together
+if bool(model_override) != bool(provider_override):
+    print(
+        "⚠ Warning: Both E2E_DEFAULT_MODEL_OVERRIDE and E2E_DEFAULT_PROVIDER_OVERRIDE "
+        "must be set together. Falling back to service detection."
+    )
+
 if model_override and provider_override:
     context.default_model = model_override
     context.default_provider = provider_override
     print(
         f"Using override LLM: {context.default_model} (provider: {context.default_provider})"
     )
```

examples/watsonx-run.yaml (2)
65-65: Clarify the hardcoded asterisks for openai_api_key.

The `openai_api_key` is set to a literal string of asterisks (`'********'`). If the braintrust scoring provider is intended to be functional, this will fail authentication. If braintrust is disabled or not used in this example, consider removing this provider entry or adding a comment explaining it's a placeholder.
50-50: Track the workaround for disabled safety shields.

Safety shields are disabled with a warning comment about infinite loop issues with LLM calls (lines 50 and 149). This is a significant security/safety feature being disabled. Consider creating a tracking issue for this known limitation so it can be re-enabled once the upstream issue is resolved.
Would you like me to help create a GitHub issue to track this limitation?
Also applies to: 149-149
tests/e2e/configs/run-watsonx.yaml (2)
65-65: Clarify the hardcoded asterisks for openai_api_key.

The `openai_api_key` is set to a literal string of asterisks (`'********'`). If the braintrust scoring provider is needed for E2E tests, this will fail authentication. If braintrust is not used in these tests, consider removing this provider entry or documenting that it's intentionally disabled.
50-50: Document the workaround for disabled safety shields in E2E tests.

Safety shields are disabled with a warning comment about infinite loop issues with LLM calls (lines 50 and 149). This affects the test coverage for safety features. Consider adding a comment or documentation about which safety-related test scenarios are skipped due to this limitation.
Also applies to: 149-149
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- .github/workflows/e2e_tests.yaml
- README.md
- docker-compose-library.yaml
- docker-compose.yaml
- docs/providers.md
- examples/watsonx-run.yaml
- src/app/endpoints/query.py
- tests/e2e/configs/run-watsonx.yaml
- tests/e2e/features/conversations.feature
- tests/e2e/features/environment.py
- tests/e2e/features/query.feature
- tests/e2e/features/streaming_query.feature
🧰 Additional context used
📓 Path-based instructions (5)
tests/e2e/features/**/*.feature
📄 CodeRabbit inference engine (CLAUDE.md)
Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Files:
- tests/e2e/features/conversations.feature
- tests/e2e/features/streaming_query.feature
- tests/e2e/features/query.feature
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Use absolute imports for internal modules in LCS project (e.g., `from auth import get_auth_dependency`)
All modules must start with descriptive docstrings explaining their purpose
Use the `logger = logging.getLogger(__name__)` pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (`str | int`) and `Optional[Type]` or `Type | None`
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Use `async def` for I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes: `Configuration` for config classes, `Error`/`Exception` for exceptions, `Resolver` for strategy patterns, `Interface` for abstract base classes
Abstract classes must use ABC with `@abstractmethod` decorators
Include complete type annotations for all class attributes in Python classes
Use `import logging` and the module logger pattern with standard log levels: debug, info, warning, error
Files:
src/app/endpoints/query.py
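For illustration, a minimal sketch of a module following these conventions (the module name, function, and behavior are hypothetical, not taken from the repository):

```python
"""Utilities for validating model identifiers (hypothetical example module)."""

import logging

logger = logging.getLogger(__name__)


def validate_model_id(model_id: str | None) -> str:
    """Validate a model identifier and return it.

    Args:
        model_id: The identifier to validate, or None.

    Returns:
        The validated identifier.

    Raises:
        ValueError: If the identifier is missing or empty.
    """
    if not model_id:
        logger.error("Missing model identifier")
        raise ValueError("model_id must be a non-empty string")
    return model_id
```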
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/query.py
src/**/{client,app/endpoints/**}.py
📄 CodeRabbit inference engine (CLAUDE.md)
Handle `APIConnectionError` from Llama Stack in integration code
Files:
src/app/endpoints/query.py
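Taken together, these two guidelines suggest an endpoint shape roughly like the following sketch; the route, the helper function, and the exception import location are assumptions for illustration, not the project's actual code:

```python
"""Sketch of an endpoint that maps Llama Stack connection failures to HTTP errors."""

import logging

from fastapi import APIRouter, HTTPException
from llama_stack_client import APIConnectionError  # assumed import location

logger = logging.getLogger(__name__)
router = APIRouter()


async def call_llama_stack(payload: dict) -> dict:
    """Hypothetical async helper that forwards the payload to Llama Stack."""
    raise NotImplementedError


@router.post("/query")
async def query_endpoint(payload: dict) -> dict:
    """Forward a query to Llama Stack, translating connection errors to HTTP 500."""
    try:
        return await call_llama_stack(payload)
    except APIConnectionError as exc:
        logger.error("Unable to reach Llama Stack: %s", exc)
        raise HTTPException(
            status_code=500, detail="Unable to connect to Llama Stack"
        ) from exc
```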
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use `pytest-mock` with AsyncMock objects for mocking in tests
Files:
tests/e2e/features/environment.py
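As a concrete illustration of that pattern, a sketch of an async test using pytest-mock's AsyncMock; the patch target and payload are hypothetical, and pytest-asyncio is assumed as the async test runner:

```python
import pytest


@pytest.mark.asyncio
async def test_override_model_is_used(mocker):
    """Patch a hypothetical async Llama Stack call with an AsyncMock."""
    mock_call = mocker.patch(
        "app.endpoints.query.call_llama_stack",  # hypothetical import path
        new_callable=mocker.AsyncMock,
        return_value={"model": "watsonx/meta-llama/llama-3-3-70b-instruct"},
    )
    result = await mock_call({"query": "hello"})
    assert result["model"].startswith("watsonx/")
    mock_call.assert_awaited_once()
```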
🧠 Learnings (1)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
- examples/watsonx-run.yaml
- tests/e2e/configs/run-watsonx.yaml
🧬 Code graph analysis (2)
src/app/endpoints/query.py (2)
tests/unit/app/endpoints/test_streaming_query.py (1)
ToolExecutionStep (207-221)

src/models/responses.py (1)

QueryResponse (341-452)
tests/e2e/features/environment.py (1)
src/models/responses.py (1)
model_override (1497-1516)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: unit_tests (3.13)
- GitHub Check: Pyright
- GitHub Check: Pylinter
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: library mode / vertexai
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: server mode / vertexai
- GitHub Check: E2E: library mode / azure
- GitHub Check: E2E: server mode / azure
- GitHub Check: E2E: library mode / ci
🔇 Additional comments (11)
docs/providers.md (1)
58-58: LGTM! The provider change from `ibm_watsonx_ai` to `litellm` correctly reflects the implementation approach for WatsonX support, with the supported status appropriately maintained.

README.md (1)
125-125: LGTM! The WatsonX provider documentation additions are properly formatted and include appropriate references to setup documentation and example configurations.
Also applies to: 181-181
.github/workflows/e2e_tests.yaml (1)
13-13: LGTM! The watsonx environment additions to the test matrix and the credential configuration for both server and library modes are properly structured and consistent with other provider configurations.
Also applies to: 203-204, 231-232
docker-compose.yaml (1)
35-38: LGTM! The WatsonX environment variables follow the same pattern as other provider configurations, with appropriate default empty values.
tests/e2e/features/conversations.feature (1)
14-14: LGTM! The payload extensions to include explicit `model` and `provider` fields align with the PR's goal of supporting provider-specific testing. The placeholder substitution pattern is consistent across all scenarios.

Also applies to: 31-31, 53-53, 100-100, 138-138, 152-152, 190-190
docker-compose-library.yaml (1)
37-40: LGTM! The WatsonX environment variables are consistently configured for library mode, mirroring the server mode configuration in docker-compose.yaml.
tests/e2e/features/query.feature (1)
13-13: LGTM! The test payload extensions appropriately add explicit `model` and `provider` fields to positive test scenarios while preserving error-condition tests that validate missing parameters. This aligns with the PR's goal of supporting explicit provider/model specification.

Also applies to: 25-25, 37-37, 54-54, 72-72
src/app/endpoints/query.py (1)
542-551: Provider filtering provides sufficient protection against identifier collisions.

The matching logic at lines 543-544 uses an AND condition between the identifier check and the provider_id check: `m.identifier in (llama_stack_model_id, model_id) and m.provider_id == provider_id`. This ensures that only models from the specified provider are matched, preventing unintended collisions across providers even when multiple providers have models with the same plain identifier (e.g., "custom-model"). While the logic accepts both the `provider_id/model_id` format and the plain `model_id`, the strict provider_id filter isolates the search to a single provider, eliminating the collision risk.
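As an illustration, the accepted match condition reduces to the following self-contained sketch; model records are simplified to a small dataclass and are not the actual LCS types:

```python
from dataclasses import dataclass


@dataclass
class Model:
    identifier: str
    provider_id: str


def find_model(models: list[Model], provider_id: str, model_id: str) -> Model | None:
    """Return the first model matching either identifier form, scoped to one provider."""
    llama_stack_model_id = f"{provider_id}/{model_id}"
    for m in models:
        # AND condition: the identifier may take either form, but provider_id must
        # match, which prevents collisions when two providers register the same
        # plain model_id
        if m.identifier in (llama_stack_model_id, model_id) and m.provider_id == provider_id:
            return m
    return None


# With WatsonX, model_id itself contains "/" (e.g. "meta-llama/llama-3-3-70b-instruct"),
# so the plain-identifier branch is what makes the lookup succeed.
models = [Model("meta-llama/llama-3-3-70b-instruct", "watsonx")]
assert find_model(models, "watsonx", "meta-llama/llama-3-3-70b-instruct") is not None
```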
144-148: Verify the provider_model_id format aligns with the upstream fix.

The `provider_model_id` on line 148 uses the format `watsonx/meta-llama/llama-3-3-70b-instruct`, which includes the provider prefix. According to the PR description, this is a workaround because WatsonX model identifiers contain slashes that cause parsing issues. Ensure that when the upstream fix is applied, this configuration format remains compatible or that migration guidance is provided.
tests/e2e/configs/run-watsonx.yaml (1)
144-148: Verify the provider_model_id format aligns with the upstream fix.

The `provider_model_id` on line 148 uses the format `watsonx/meta-llama/llama-3-3-70b-instruct`, which includes the provider prefix. According to the PR description, this is a workaround because WatsonX model identifiers contain slashes that cause parsing issues. Ensure that when the upstream fix is applied, this E2E test configuration format remains compatible or is updated accordingly.
tests/e2e/features/streaming_query.feature (1)
14-14: Placeholder substitution is properly configured.

The `{MODEL}` and `{PROVIDER}` placeholders are correctly substituted at runtime. The `before_all` hook in environment.py initializes `context.default_model` and `context.default_provider`, and the `replace_placeholders()` function is consistently called by all step implementations (in llm_query_response.py and common_http.py) before processing JSON payloads. The mechanism is working as intended.
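A simplified picture of the mechanism, sketched as plain payload-string substitution; the function signature and the fallback defaults here are placeholders for illustration, not the project's real implementation:

```python
import os


def replace_placeholders(payload: str, default_model: str, default_provider: str) -> str:
    """Substitute {MODEL} and {PROVIDER} markers in a raw JSON payload string."""
    return payload.replace("{MODEL}", default_model).replace("{PROVIDER}", default_provider)


# before_all-style setup: environment overrides win over service detection
default_model = os.getenv("E2E_DEFAULT_MODEL_OVERRIDE", "some-detected-model")
default_provider = os.getenv("E2E_DEFAULT_PROVIDER_OVERRIDE", "some-detected-provider")

raw = '{"query": "Say hello", "model": "{MODEL}", "provider": "{PROVIDER}"}'
print(replace_placeholders(raw, default_model, default_provider))
```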
tisnik left a comment
LGTM
Description
- Workaround in `query.py` for the issue mentioned below

When calling `query` without explicitly specifying a model or provider, `query.py` in LCS checks whether the model is registered. In this case, LCS fails to find the WatsonX model. This happens because WatsonX model identifiers are not built in the expected `<provider_id>/<model_id>` format. WatsonX is one of the few providers that registers a `model_id` containing a `/`, for example `meta-llama/llama-4-maverick-17b-128e-instruct-fp8`, which in llama-stack is stored as the `model.identifier`, while LCS expects it to be `watsonx/meta-llama/llama-4-maverick-17b-128e-instruct-fp8`.
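A short illustrative snippet of why a naive `<provider_id>/<model_id>` split misbehaves for such identifiers:

```python
# LCS-style expectation: identifier == "<provider_id>/<model_id>"
identifier = "meta-llama/llama-4-maverick-17b-128e-instruct-fp8"  # as registered by WatsonX

provider_id, _, model_id = identifier.partition("/")
print(provider_id)  # "meta-llama" -- wrongly treated as the provider
print(model_id)     # "llama-4-maverick-17b-128e-instruct-fp8"

# What LCS would actually need to see for the lookup to succeed:
expected = f"watsonx/{identifier}"
print(expected)  # "watsonx/meta-llama/llama-4-maverick-17b-128e-instruct-fp8"
```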
As a workaround, I propose using environment variables to override the model and provider used in end-to-end tests (`E2E_DEFAULT_MODEL_OVERRIDE` and `E2E_DEFAULT_PROVIDER_OVERRIDE`). Additionally, the e2e tests should explicitly specify the model and provider instead of sending only the `query` field.

We can keep the existing tests in `query.feature` and `streaming.feature` that verify calling the endpoint without specifying a provider or model. These tests will continue to fail for WatsonX until the issue is fixed upstream.

Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
NA
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Documentation
Tests
Improvements