Conversation
Co-authored-by: openhands <openhands@all-hands.dev>
Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

Hi! I started running the behavior tests on your PR. You will receive a comment with the results shortly.

Python API breakage checks — ✅ PASSED
REST API breakage checks (OpenAPI) — ✅ PASSED
all-hands-bot
left a comment
🟢 Good taste - Clean release version bump
All packages consistently updated from 1.14.0 → 1.14.1, lock file synced, and eval workflow default updated. No issues found. Ready to merge once checklist items are completed. 🚀
🧪 Integration Tests Results
Overall Success Rate: 76.7%

📁 Detailed Logs & Artifacts
Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.
📊 Summary
📋 Detailed Results

litellm_proxy_deepseek_deepseek_reasoner
Skipped Tests:
litellm_proxy_gemini_3_pro_preview
litellm_proxy_anthropic_claude_sonnet_4_6
litellm_proxy_moonshot_kimi_k2_thinking
Skipped Tests:
Failed Tests:
🔄 Running Examples with
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 25.3s | $0.02 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 20.9s | $0.02 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 13.3s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 31.6s | $0.02 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 14.4s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 54.3s | $0.05 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 33.9s | $0.04 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 11.4s | $0.00 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 32.2s | $0.02 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 2m 36s | $0.18 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 17.3s | $0.01 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 23.8s | $0.02 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 15.9s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 17.4s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 10.3s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 17.4s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 1m 12s | $0.01 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 4m 29s | $0.34 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 1m 16s | $0.08 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 20.5s | $0.03 |
| 01_standalone_sdk/28_ask_agent_example.py | ❌ FAIL (Exit code 1) | 12.3s | -- |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 48.1s | $0.04 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 21.0s | $0.02 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 3m 31s | $0.24 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 14.0s | $0.01 |
| 01_standalone_sdk/34_critic_example.py | ✅ PASS | 2m 49s | $0.23 |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 10.9s | $0.00 |
| 01_standalone_sdk/37_llm_profile_store/main.py | ✅ PASS | 9.2s | $0.00 |
| 01_standalone_sdk/38_browser_session_recording.py | ✅ PASS | 27.7s | $0.03 |
| 01_standalone_sdk/39_llm_fallback.py | ✅ PASS | 10.1s | $0.01 |
| 01_standalone_sdk/40_acp_agent_example.py | ✅ PASS | 26.9s | $0.10 |
| 01_standalone_sdk/41_task_tool_set.py | ✅ PASS | 28.0s | $0.03 |
| 01_standalone_sdk/42_file_based_subagents.py | ✅ PASS | 56.6s | $0.06 |
| 01_standalone_sdk/43_mixed_marketplace_skills/main.py | ✅ PASS | 7.2s | $0.00 |
| 01_standalone_sdk/44_model_switching_in_convo.py | ✅ PASS | 8.7s | $0.01 |
| 01_standalone_sdk/45_parallel_tool_execution.py | ✅ PASS | 2m 16s | $0.17 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 37.6s | $0.02 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 1m 28s | $0.02 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 1m 3s | $0.05 |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ✅ PASS | 1m 15s | $0.04 |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 38.1s | $0.04 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ✅ PASS | 3m 40s | $0.02 |
| 02_remote_agent_server/09_acp_agent_with_remote_runtime.py | ✅ PASS | 1m 0s | $0.05 |
| 02_remote_agent_server/10_cloud_workspace_share_credentials.py | ❌ FAIL (Exit code 1) | 6.8s | -- |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 39.6s | $0.03 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 54.1s | $0.09 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 19.1s | $0.01 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 27.5s | $0.03 |
❌ Some tests failed
Total: 48 | Passed: 46 | Failed: 2 | Total Cost: $2.28
Failed examples:
- examples/01_standalone_sdk/28_ask_agent_example.py: Exit code 1
- examples/02_remote_agent_server/10_cloud_workspace_share_credentials.py: Exit code 1
🧪 Integration Tests Results
Overall Success Rate: 60.0%

📁 Detailed Logs & Artifacts
Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.
📊 Summary
📋 Detailed Results

litellm_proxy_deepseek_deepseek_reasoner
Failed Tests:
Problems:
Positive Aspects:
Expected Behavior:
The unauthorized modification of LLM defaults and over-verification of non-terminal tests represents a deviation from the evaluation criteria. (confidence=0.70) (Cost: $0.09)

litellm_proxy_gemini_3_pro_preview
Failed Tests:
Issues with the Approach:
What Was Done Correctly:
Assessment: While the end result is technically correct, the execution pattern violates explicit instructions (use

litellm_proxy_anthropic_claude_sonnet_4_6
Failed Tests:
However, the agent violated the explicit evaluation criteria by creating an unrequested file:
AGENTS.md is:
While AGENTS.md represents a good-faith effort to document repository patterns for future agents, it falls outside the scope of what was requested. The user asked only for a training script following the format of the existing example - nothing more. (confidence=0.88) (Cost: $1.77)

litellm_proxy_moonshot_kimi_k2_thinking
Failed Tests:
@OpenHands Read this report #2548 (comment) and focus on Kimi-K2 thinking. Follow the execution path and understand where we set reasoning_effort for Kimi K2 thinking. Then fix it. Try to use clean code, without an overly ugly special case; or, if we need this as a special case, list the possible places in the code and choose the one where we get info from litellm on reasoning. Unless this was enabled by code somewhere else, in which case remove that code. This PR is a release PR; we must not work here. New branch, open a new PR. Babysit it to green CI, make it ready for review, and monitor for some 5 mins until an AI reviewer posts comments and inline reviews; read them and address them if necessary. Tag enyst, and only enyst, in your PR description.

I'm on it! enyst can track my progress at all-hands.dev
Here’s the status based on the work so far:

Summary
I have not completed the full request yet. I investigated the Kimi-K2 thinking failure path and made a partial code change locally, but I stopped before validation / PR creation because of the follow-up design concern about depending more on LiteLLM.

...

Re-stating the answer to your question about dependency on LiteLLM
Your concern is valid. The clean proposed fix makes us rely more directly on LiteLLM’s parameter support metadata instead of its broader reasoning-model metadata. I think that is still the better signal here, because the failure is specifically about whether LiteLLM will accept the
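The gating idea discussed here (only send `reasoning_effort` when the target provider actually accepts it) can be sketched as a small pure helper. This is a hypothetical illustration, not the SDK's actual code: in practice the `supported` set could come from LiteLLM's parameter-support metadata rather than being passed in by hand, and the helper name is made up.

```python
def filter_unsupported_params(call_kwargs: dict, supported: set) -> dict:
    """Drop optional reasoning params the target provider will not accept.

    `supported` stands in for LiteLLM's parameter-support metadata; it is
    passed in explicitly here so the sketch stays self-contained.
    """
    # Params that are safe to silently drop when unsupported.
    OPTIONAL_PARAMS = {"reasoning_effort", "thinking"}
    return {
        key: value
        for key, value in call_kwargs.items()
        if key not in OPTIONAL_PARAMS or key in supported
    }


# A provider that rejects reasoning_effort simply never sees it:
kwargs = {"temperature": 0.2, "reasoning_effort": "high"}
print(filter_unsupported_params(kwargs, supported=set()))
```

This keeps the model-specific knowledge in one place instead of scattering per-model special cases through the call path.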
Status update for the Kimi K2 thinking fix:
CI status on PR #2549 right now:
Note: this comment was created by an AI assistant (OpenHands) on behalf of the user.
HUMAN: cc @xingyaoww The agent fixed Kimi K2 here:
Edited to add: technically it's the same for the other LLMs in tests. But it's the kind of thing that would make me re-run the other 2 labels (behavior and examples) 🤔
Or rather, I wouldn't re-run behavior, but I would re-run
🔄 Running Examples with
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 23.7s | $0.02 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 16.4s | $0.02 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 14.3s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 28.6s | $0.02 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 14.8s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 36.2s | $0.02 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 30.5s | $0.04 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 11.3s | $0.00 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 42.6s | $0.03 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 4m 15s | $0.30 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 16.5s | $0.01 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 24.4s | $0.02 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 16.0s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 18.5s | $0.03 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 13.1s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 24.8s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 1m 21s | $0.02 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 4m 0s | $0.33 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 54.3s | $0.07 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 17.0s | $0.03 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 29.2s | $0.02 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 45.6s | $0.03 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 9.5s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 4m 26s | $0.34 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 21.1s | $0.02 |
| 01_standalone_sdk/34_critic_example.py | ✅ PASS | 2m 13s | $0.17 |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 12.1s | $0.01 |
| 01_standalone_sdk/37_llm_profile_store/main.py | ✅ PASS | 7.0s | $0.00 |
| 01_standalone_sdk/38_browser_session_recording.py | ✅ PASS | 42.1s | $0.03 |
| 01_standalone_sdk/39_llm_fallback.py | ✅ PASS | 10.1s | $0.01 |
| 01_standalone_sdk/40_acp_agent_example.py | ✅ PASS | 30.0s | $0.10 |
| 01_standalone_sdk/41_task_tool_set.py | ✅ PASS | 29.3s | $0.03 |
| 01_standalone_sdk/42_file_based_subagents.py | ✅ PASS | 42.4s | $0.05 |
| 01_standalone_sdk/43_mixed_marketplace_skills/main.py | ✅ PASS | 5.3s | $0.00 |
| 01_standalone_sdk/44_model_switching_in_convo.py | ✅ PASS | 8.2s | $0.01 |
| 01_standalone_sdk/45_parallel_tool_execution.py | ✅ PASS | 3m 9s | $0.41 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 40.9s | $0.03 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 1m 39s | $0.05 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 1m 6s | $0.07 |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ✅ PASS | 1m 3s | $0.03 |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 33.6s | $0.03 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ✅ PASS | 3m 53s | $0.03 |
| 02_remote_agent_server/09_acp_agent_with_remote_runtime.py | ✅ PASS | 1m 2s | $0.03 |
| 02_remote_agent_server/10_cloud_workspace_share_credentials.py | ❌ FAIL (Exit code 1) | 6.8s | -- |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 30.9s | $0.03 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 1m 42s | $0.09 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 19.6s | $0.01 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 23.4s | $0.03 |
❌ Some tests failed
Total: 48 | Passed: 47 | Failed: 1 | Total Cost: $2.68
Failed examples:
- examples/02_remote_agent_server/10_cloud_workspace_share_credentials.py: Exit code 1
Is the cloud example supposed to work right now? I guess maybe it uses the version currently deployed on prod.
@OpenHands Look at this #2548 (comment) and #2548 (comment). Find the PR that added this test. Clone yourself the OpenHands repo from the same org, and look up the linked issue and the counterpart PR or PRs in that repo. I suspect maybe we haven't made a release on OH with those included? Verify all I said. You have time; investigate deeply, and feel free to actually search the repos for the right timeline and status here.
I'm on it! enyst can track my progress at all-hands.dev |
Correction to my earlier wording: I re-checked the actual value returned from cloud for

Using:

I verified safely that the returned

I also tried a cheap completion against the returned base URL using the cloud-returned key and got

So the more accurate prod status is:

So for SDK credential inheritance on prod right now, both halves are effectively broken:
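The "cheap completion" probe described above can be sketched with the standard library alone. Everything here is illustrative: the function name, base URL, model name, and payload shape are assumptions (an OpenAI-compatible `/chat/completions` endpoint), not the actual script that was run against prod.

```python
import json
import urllib.request


def build_probe_request(base_url: str, api_key: str, model: str) -> urllib.request.Request:
    """Build a minimal one-token chat completion request to sanity-check a key."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 1,  # keep the probe as cheap as possible
    }).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_probe_request("https://llm-proxy.example.invalid/v1", "sk-example", "some-model")
print(req.full_url)
```

Actually sending the request with `urllib.request.urlopen(req)` raises `HTTPError` on a 401, which is how an invalid cloud-returned key would show up.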
cc: @xingyaoww It seems my initial guess on the Cloud example may have been right: the OH enterprise side is not deployed. Shall we remove it from the PR for now, or do we want to wait for a patch / deployment on Cloud? |
Yeah, I think it is probably OK to ignore that for now in this release, since the get-secret functionality should be up in the next few days - as long as we revert OpenHands/docs#401 first, and then re-introduce its content when it is deployed. @OpenHands can you create two PRs for me: the first reverts docs PR 401, and the second re-introduces it (merge AFTER release).
I'm on it! xingyaoww can track my progress at all-hands.dev
Summary of Work
I created two PRs in the OpenHands/docs repository as requested.

Checklist
Conciseness
The changes are minimal and focused — each PR touches only the single file (
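At the git level, the revert-then-reintroduce flow described here is just two reverts stacked on the original commit: the first backs the docs change out before the release, and the second (merged after the release) reverts the revert, restoring the content. A minimal sketch in a throwaway repo, driven from Python; the file and commit names are made up, and the real change is whatever OpenHands/docs#401 touched:

```python
import subprocess
import tempfile
from pathlib import Path


def git(*args: str, repo: str) -> str:
    """Run a git command inside `repo` and return its stdout."""
    result = subprocess.run(
        ("git", *args), cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


repo = tempfile.mkdtemp()
git("init", "-q", repo=repo)
git("config", "user.email", "dev@example.com", repo=repo)
git("config", "user.name", "dev", repo=repo)

# Original docs change (stand-in for PR 401).
Path(repo, "cloud-secrets.md").write_text("secrets docs\n")
git("add", "cloud-secrets.md", repo=repo)
git("commit", "-qm", "docs: add cloud secrets page", repo=repo)

# PR 1: revert before cutting the release (the file disappears).
git("revert", "--no-edit", "HEAD", repo=repo)

# PR 2, merged AFTER the release: revert the revert, restoring the content.
git("revert", "--no-edit", "HEAD", repo=repo)
```

In the real workflow each revert would of course land on its own branch and go through a PR, but the resulting history is the same: three commits, with the file present again at HEAD.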
Since there have been more changes in the past few days, I'll close this and just cut a v1.15.0.
Release v1.14.1
This PR prepares the release for version 1.14.1.
Release Checklist
- `integration-test`
- `behavior-test`
- `test-examples`
- `v1.14.1`
- `rel-1.14.1`

Next Steps
Once the release is published on GitHub, the PyPI packages will be automatically published via the `pypi-release.yml` workflow.

Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- `eclipse-temurin:17-jdk`
- `nikolaik/python-nodejs:python3.13-nodejs22`
- `golang:1.21-bookworm`

Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:e77cdd1-python

Run
All tags pushed for this build
About Multi-Architecture Support
- `e77cdd1-python` is a multi-arch manifest supporting both amd64 and arm64
- Platform-specific tags (e.g. `e77cdd1-python-amd64`) are also available if needed