Fix at least ten type errors #150

Merged
benjibc merged 2 commits into main from cursor/fix-at-least-ten-type-errors-9836 on Sep 1, 2025

@benjibc benjibc commented Sep 1, 2025


name: Pull Request
about: Propose changes to the codebase
title: "Fix: Resolve multiple type errors across the codebase"
labels: ''
assignees: ''


Description

This pull request resolves at least 10 type errors across several files in the codebase. The goals are to improve robustness and maintainability and to enforce stricter adherence to type hints.

Key changes include:

  • ep.rollout Asynchronous Signature Update: The ep.rollout function in eval_protocol/mcp_env.py has been updated to be an async function and now directly returns List[EvaluationRow]. All call sites have been adjusted to await its execution and handle the direct list return.
  • LangGraphRolloutProcessor Awaitable Handling: Modified eval_protocol/pytest/default_langchain_rollout_processor.py to correctly handle callables that may or may not return awaitable objects, preventing "object is not awaitable" errors.
  • SimulationServerBase Typing Enhancements: Addressed type issues in eval_protocol/mcp/simulation_server.py by explicitly typing the set_logging_level parameter, adding an optional create_environment_with_seed hook, and clarifying AnyUrl usage.
  • EvaluationPipeline Type Refinements: Improved typing in eval_protocol/execution/pipeline.py by asserting self.model_client before use and providing more precise type hints for asyncio.gather results and list appends.
  • Benchmark Test Data Structure Alignment: Updated various benchmark test files (eval_protocol/benchmarks/test_tau_bench_airline.py, test_tau_bench_retail.py, eval_protocol/mcp_servers/tau2/tests/test_tau2_e2e.py, tests/pytest/test_tau_bench_airline.py) to correctly instantiate Task, ToolCall, and ToolMessage objects with all required and optional fields, such as requestor, env_assertions, persona, description, ticket, and initial_state.
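The awaitable-handling change in the second bullet can be illustrated with a minimal sketch. The helper name `maybe_await` and the two sample callables are hypothetical and are not taken from the diff; the sketch only shows the general pattern of awaiting a result when, and only when, it is awaitable.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, List, Union


async def maybe_await(fn: Callable[..., Union[Any, Awaitable[Any]]], *args: Any) -> Any:
    """Call fn and await its result only if that result is awaitable."""
    result = fn(*args)
    if inspect.isawaitable(result):
        # Coroutines, Tasks, and Futures land here.
        return await result
    # Plain return values are passed through untouched.
    return result


def sync_invoke(x: int) -> int:
    # Hypothetical synchronous callable.
    return x + 1


async def async_invoke(x: int) -> int:
    # Hypothetical asynchronous callable.
    await asyncio.sleep(0)
    return x + 1


async def main() -> List[int]:
    return [
        await maybe_await(sync_invoke, 1),
        await maybe_await(async_invoke, 1),
    ]


results = asyncio.run(main())
```

Checking the result with `inspect.isawaitable` rather than inspecting the callable itself also covers synchronous functions that return a coroutine or future, which is one common source of "object is not awaitable" errors.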

Taken together, these changes are intended to reduce the number of reported type errors and make the codebase more reliable and easier to reason about.
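For reviewers, the call-site pattern behind the `ep.rollout` and `EvaluationPipeline` bullets might look roughly like the sketch below. Every name in it (`EvaluationRow`, `ModelClient`, `rollout`, `Pipeline`) is a simplified, hypothetical stand-in written for illustration; none of it reproduces the real `eval_protocol` signatures.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class EvaluationRow:
    # Hypothetical stand-in, not the real eval_protocol class.
    prompt: str


class ModelClient:
    # Hypothetical stand-in for a model client.
    async def generate(self, prompt: str) -> str:
        await asyncio.sleep(0)
        return prompt.upper()


async def rollout(prompts: List[str]) -> List[EvaluationRow]:
    """Async function that returns the list directly, so callers must await it."""
    return [EvaluationRow(prompt=p) for p in prompts]


class Pipeline:
    def __init__(self, model_client: Optional[ModelClient] = None) -> None:
        self.model_client = model_client

    async def run(self, prompts: List[str]) -> List[str]:
        # The assert narrows Optional[ModelClient] to ModelClient for the type checker.
        assert self.model_client is not None, "model_client must be set"
        client = self.model_client
        # asyncio.gather returns a list; the annotation makes the element type explicit.
        outputs: List[str] = list(
            await asyncio.gather(*(client.generate(p) for p in prompts))
        )
        return outputs


async def main() -> Tuple[List[EvaluationRow], List[str]]:
    rows = await rollout(["a", "b"])  # call site awaits and receives the list
    outputs = await Pipeline(ModelClient()).run([r.prompt for r in rows])
    return rows, outputs


rows, outputs = asyncio.run(main())
```

Note that an `assert` used for narrowing is skipped when Python runs with `-O`, so an explicit `if ... is None: raise` may be preferable where the check must also hold at runtime.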

Fixes # (issue)
Implements # (issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Refactoring/Code cleanup
  • Build/CI/CD related changes
  • Other (please describe):

How Has This Been Tested?

The changes were developed based on static analysis of type checker outputs and manual code inspection.

  • Test A
  • Test B

Test Configuration:

  • Firmware version:
  • Hardware:
  • Toolchain:
  • SDK:

To verify these changes, please run the project's type checker (e.g., make pre-commit if configured, or mypy/pyright directly) in your local environment.

Checklist:

  • My code follows the style guidelines of this project (ran black ., isort ., flake8 .)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Screenshots (if applicable)

Additional context

Due to limitations in the development environment, I was unable to run the type checker locally to confirm the exact reduction in error count. However, the changes directly address the reported type errors based on their descriptions and code context. Running the project's type checker after merging should reflect the intended error reduction.



Co-authored-by: bchen <bchen@fireworks.ai>

cursor bot commented Sep 1, 2025

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.

@benjibc benjibc marked this pull request as ready for review September 1, 2025 22:44
@benjibc benjibc merged commit dcf7b0e into main Sep 1, 2025
12 of 14 checks passed
@benjibc benjibc deleted the cursor/fix-at-least-ten-type-errors-9836 branch September 1, 2025 22:52