new tests #362

Merged
shreymodi1 merged 1 commit into main from shrey/glm-streaming-compliance on Dec 9, 2025
shreymodi1 (Contributor) commented on Dec 9, 2025



Note

Switches tests to a new default model, generalizes content handling, inlines a tool schema, replaces a required-params test, removes a recovery/retry test, and updates reasoning tests to use the default model.

  • Benchmarks/Test Suite:
    • Model config: Set DEFAULT_MODEL_ID to fireworks_ai/accounts/pyroworks/deployedModels/minimax-m2-zmi4qk9f; updated all reasoning/structured-output tests to use DEFAULT_MODEL_ID instead of deepseek-v3p1.
    • Content handling: Relaxed _coerce_content_to_str param typing to list[Any] and removed ChatCompletionContentPartTextParam import.
    • Tool payloads:
      • Inlined test_brace_bug tool schema in PEER_TOOL_BRACE_PAYLOAD (was referencing external definition).
      • Removed recovery scenario: deleted PEER_TOOL_RECOVERY_FAILURE_PAYLOAD, its row, and test_streaming_tool_retry_behavior.
    • Tool param validation test:
      • Renamed the test and inverted its assertion, from checking for a missing required param to checking that required params are present: now _PEER_TOOL_REQUIRED_PARAMS_ROW and test_streaming_tool_required_params_present; updated the metric (required_params_present) and reasons.
    • Misc: Updated multiple evaluation blocks (streaming and non-streaming) to reference the new default model for reasoning and tools + reasoning cases.

Written by Cursor Bugbot for commit d0570aa.



- DEFAULT_MODEL_ID = "fireworks_ai/accounts/fireworks/models/glm-4p6"
+ DEFAULT_MODEL_ID = "fireworks_ai/accounts/pyroworks/deployedModels/minimax-m2-zmi4qk9f"
Collaborator

let's keep it as serverless

Collaborator

unless this was meant to be used only for internal use and we don't mind keeping it pointed at pyroworks

Contributor, Author

yeah it is just internal use

shreymodi1 merged commit de3aba0 into main on Dec 9, 2025
9 checks passed
shreymodi1 deleted the shrey/glm-streaming-compliance branch on December 9, 2025, 21:21