
fix: preserve API-provided verifier SHA in bundle creation #62

Open
shihan-fleet wants to merge 65 commits into main from verifier-issue

Conversation

@shihan-fleet

No description provided.

mikesklar and others added 30 commits January 13, 2026 12:21
When verifier code contains multiple functions (e.g., a main verifier
function and helper functions), the helper functions were not accessible
from the main function due to namespace isolation.

The exec() call created functions in local_namespace, but the main
function's __globals__ pointed to exec_globals which didn't contain
the helper functions. This caused NameError when the main function
tried to call helpers, which was silently caught and returned 0.0.

Fix: Merge local_namespace into exec_globals after exec() so all
defined functions are accessible when the verifier is called.
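
The namespace problem described above can be reproduced in isolation. This is a minimal sketch, not the SDK's actual loader: `VERIFIER_SOURCE`, `load_isolated`, and `load_merged` are hypothetical names, and only `exec_globals` and `local_namespace` come from the commit message.

```python
# Hypothetical verifier source with a main function and one helper.
VERIFIER_SOURCE = '''
def helper(x):
    return x * 2

def verify(x):
    return helper(x) + 1
'''


def load_isolated(source, name):
    """Old behavior: definitions land in local_namespace only."""
    exec_globals = {}
    local_namespace = {}
    exec(source, exec_globals, local_namespace)
    # verify.__globals__ is exec_globals, which has no 'helper' entry.
    return local_namespace[name]


def load_merged(source, name):
    """Fixed behavior: merge local_namespace into exec_globals after exec()."""
    exec_globals = {}
    local_namespace = {}
    exec(source, exec_globals, local_namespace)
    exec_globals.update(local_namespace)
    return local_namespace[name]


try:
    load_isolated(VERIFIER_SOURCE, "verify")(3)
    isolated_result = "ok"
except NameError as exc:
    isolated_result = f"NameError: {exc}"

merged_result = load_merged(VERIFIER_SOURCE, "verify")(3)

print(isolated_result)  # NameError: name 'helper' is not defined
print(merged_result)    # 7
```

Because a function resolves free names through its `__globals__`, updating the same dict object after `exec()` is enough; the functions do not need to be redefined.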
…mespace

fix: allow verifier helper functions to be called from main verifier
InstanceRequest changes:
- Add: profile_id, async_provision, instance_mode, ssh_public_keys, snapshot_interval_minutes, version (deprecated)
- Fix: region default changed from 'us-west-1' to None (server decides)
- Fix: created_from default changed from None to 'api'

TaskRequest changes:
- Add: verifier_func, project_key, data_id, data_version, writer_metadata
- Add: model_config with extra='ignore' and populate_by_name=True
- Add: alias='env_id' for environment_id field
- Remove: metadata (doesn't exist in orchestrator TaskRequest, only in TaskResponse)
…odels

Add factual_answer field to support research/factual tasks:
- Task model: stores expected answer for verification
- TaskRequest: accept factual_answer when creating tasks
- TaskResponse: return factual_answer from API

Part of: https://linear.app/fleet-ai/issue/ENG-843/import-script-needs-to-support-output-json-schemas

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
feat: add factual_answer field to Task and API models
Add task_modality field to Task and TaskResponse models to support
copying task modality (computer_use, tool_use, browser) when importing
tasks via the SDK.

Changes:
- Add task_modality to TaskResponse model (API response)
- Add task_modality to Task model (SDK model)
- Pass task_modality from TaskResponse to Task in load_tasks

Co-authored-by: Cursor <cursoragent@cursor.com>
Addresses Bugbot comment: load_task_from_json wasn't extracting
task_modality from JSON data, causing tasks loaded from JSON files
to have task_modality=None even when the JSON contains this field.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Add task_modality field to async Task model, TaskResponse model,
and update load_task_from_json and load_tasks to preserve task_modality.

Co-authored-by: Cursor <cursoragent@cursor.com>
andrew-stelmach-fleet and others added 28 commits February 4, 2026 22:15
- Change ScenarioResponse.id from str to int
- Change task_scenario_id from Optional[str] to Optional[int] in Task and TaskResponse models
- Bump version to 0.2.112

Co-authored-by: Cursor <cursoragent@cursor.com>
fix: use int for scenario IDs to match database schema
The API returns `environment_id` but load_task_from_json was only
looking for `env_id` or `env_key`. Now it checks all three field names.

Bump version to 0.2.113.

Co-authored-by: Cursor <cursoragent@cursor.com>
fix: handle environment_id in load_task_from_json
Previously, import_single_task would catch all exceptions and return
None, making it impossible to debug import failures. Now it raises
the exception so callers can handle or report the actual error.

Bump version to 0.2.114.

Co-authored-by: Cursor <cursoragent@cursor.com>
…dling

fix: propagate errors from import_single_task instead of swallowing
This field was missing from the SDK, causing the lifecycle status
to be lost when copying tasks. The API returns this field but the
SDK wasn't capturing it.

Changes:
- Add task_lifecycle_status field to Task model (sync and async)
- Map task_lifecycle_status in load_task_from_json (sync and async)
- Bump version to 0.2.115

Co-authored-by: Cursor <cursoragent@cursor.com>
The API returns 'environment_id', so just use that directly instead of
a fallback chain of env_id/env_key/environment_id.

Co-authored-by: Cursor <cursoragent@cursor.com>
The database uses env_key, so the SDK model should match.
Added alias="environment_id" so the API response still maps correctly.

Updated all references:
- Task.env_id -> Task.env_key
- TaskInfo.env_id -> TaskInfo.env_key
- Updated docstrings and examples

Co-authored-by: Cursor <cursoragent@cursor.com>
The API expects env_id (or environment_id), so we map env_key to env_id
in import_single_task before sending. This keeps the SDK using env_key
internally (matching DB) while maintaining API compatibility.

No API changes needed - this is SDK-only.

Co-authored-by: Cursor <cursoragent@cursor.com>
The task_lifecycle_status field was added to the Task model but was
missing from:
- TaskResponse model (sync and async) - needed to parse API response
- load_tasks method - needed to pass the field to Task constructor

This completes the task_lifecycle_status support in the SDK.

Co-authored-by: Cursor <cursoragent@cursor.com>
The field was renamed to env_key but there was already a property with
the same name, causing infinite recursion. Renamed the property to
get_env_key() method.

Also restored fallback for env_key in load_task_from_json to support
JSON files that use env_key field.

Co-authored-by: Cursor <cursoragent@cursor.com>
The field was renamed to env_key but there was already a property with
the same name, causing infinite recursion. Renamed the property to
get_env_key() method.

Also restored env_id fallback in load_task_from_json for backward
compatibility with existing JSON files.

Co-authored-by: Cursor <cursoragent@cursor.com>
The make() method was using self.env_key (raw field) instead of
self.get_env_key() (computed method with version). This would cause
environments to be created without the version suffix.

Co-authored-by: Cursor <cursoragent@cursor.com>
The API returns env_id but TaskInfo was renamed to use env_key.
Added alias="env_id" so Pydantic accepts both field names during
deserialization of API responses.

Co-authored-by: Cursor <cursoragent@cursor.com>
When export_tasks serializes tasks, it outputs env_key. The loading
function needs to check for env_key first (canonical name), then
fall back to environment_id (API) and env_id (legacy).
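
The lookup order described above could be sketched roughly as follows. `resolve_env_key` and `ENV_KEY_FIELDS` are hypothetical names used for illustration; the real logic lives inside load_task_from_json and may differ in detail, for example in how it treats empty strings.

```python
from typing import Any, Mapping, Optional

# Canonical name first, then the API name, then the legacy name.
ENV_KEY_FIELDS = ("env_key", "environment_id", "env_id")


def resolve_env_key(data: Mapping[str, Any]) -> Optional[str]:
    """Return the first non-empty environment key found in the task JSON."""
    for field in ENV_KEY_FIELDS:
        value = data.get(field)
        if value:  # assumption: None and "" both count as missing
            return value
    return None


print(resolve_env_key({"env_id": "legacy", "env_key": "canonical"}))  # canonical
print(resolve_env_key({"environment_id": "from-api"}))                # from-api
print(resolve_env_key({}))                                            # None
```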

Co-authored-by: Cursor <cursoragent@cursor.com>
- TaskResponse: rename environment_id -> env_key (alias="environment_id")
- TaskRequest: rename environment_id -> env_key (alias="environment_id")
- Add ConfigDict(populate_by_name=True) for alias support
- Add Task.env_spec property for env_key:version string
- Use task.env_spec in Task.make() and make_for_task()
- Clean up load_tasks to use task_response.env_key directly
- Remove scattered inline env_key:version string building
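
As a rough illustration of centralizing the env_key:version string in one property, here is a sketch using a plain dataclass. `TaskSketch` is a hypothetical stand-in for the SDK's Pydantic Task model, and returning the bare key when no version is set is an assumption, not something the commit message states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskSketch:
    """Hypothetical stand-in for the SDK Task model."""

    env_key: str
    version: Optional[str] = None

    @property
    def env_spec(self) -> str:
        # Single place that builds the "env_key:version" string.
        if self.version:
            return f"{self.env_key}:{self.version}"
        # Assumption: without a version, the spec is just the key.
        return self.env_key


print(TaskSketch(env_key="fira", version="v1.2").env_spec)  # fira:v1.2
print(TaskSketch(env_key="fira").env_spec)                  # fira
```

Callers such as Task.make() and make_for_task() can then read `task.env_spec` instead of formatting the string inline.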

Co-authored-by: Cursor <cursoragent@cursor.com>
- data_spec: renamed from data_key (data_key kept as alias)
- has_verifier: whether task has verifier_func or verifier
- is_research_based: whether task has a factual_answer
- is_action_based: inverse of is_research_based

Co-authored-by: Cursor <cursoragent@cursor.com>
TaskInfo has alias="env_id" on env_key field but was missing
model_config = ConfigDict(populate_by_name=True). Without this,
creating TaskInfo(env_key="...") would fail since only the alias
name was accepted.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
feat: Add task_lifecycle_status field to Task model
The PUT /v1/tasks/{task_key} endpoint can return environment_id: null,
which caused a Pydantic validation error since env_key was required.
This made update_task crash instead of returning a TaskResponse.

- TaskResponse.env_key: str -> Optional[str]
- Task.env_key: str -> Optional[str]
- Task.env_spec now returns None when env_key is absent

Co-authored-by: Cursor <cursoragent@cursor.com>
When a task has env_key=None, make_for_task would pass None to make(),
causing a TypeError at the `":" in env_key` check. It now raises a clear
ValueError matching the guard in Task.make().
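
The difference between the two failure modes can be shown with simplified stand-ins. The function bodies and the error message below are assumptions; only the names make and make_for_task and the `":" in env_key` check come from the commit message.

```python
from typing import Optional, Tuple


def make(env_key: str) -> Tuple[str, Optional[str]]:
    """Simplified stand-in: split an optional ':version' suffix."""
    if ":" in env_key:  # raises TypeError when env_key is None
        key, version = env_key.split(":", 1)
        return key, version
    return env_key, None


def make_for_task(env_key: Optional[str]) -> Tuple[str, Optional[str]]:
    """Guarded variant: fail early with a descriptive error."""
    if not env_key:
        raise ValueError("Task has no env_key; cannot create an environment")
    return make(env_key)


try:
    make(None)
except TypeError as exc:
    unguarded_error = type(exc).__name__

try:
    make_for_task(None)
except ValueError as exc:
    guarded_error = type(exc).__name__

print(unguarded_error)  # TypeError
print(guarded_error)    # ValueError
```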

Co-authored-by: Cursor <cursoragent@cursor.com>
fix: make TaskResponse.env_key optional to handle null API responses
…al-env-key"

This reverts commit 3a4f711, reversing
changes made to 7ec526b.
…v-key

revert: restore env_key as required in TaskResponse and Task

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

        raise OSError(f"Cannot create bundle for {self.key}: {e}")

    if self._bundle_sha is None:
        self._bundle_sha = _get_bundle_sha(self._bundle_data)

Empty SHA now bypasses hash generation

Medium Severity

_get_or_create_bundle now recomputes the hash only when _bundle_sha is None, but API-loaded verifiers pass missing hashes as "". That preserves an empty sha256 instead of deriving one from bundle_data, so later check/execute calls can run with an invalid SHA.
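
One possible way to address this is to treat any falsy SHA as missing rather than testing only for None. The sketch below is an illustration under stated assumptions: `resolve_bundle_sha` and `bundle_sha` are hypothetical helpers, and hashing the raw bundle bytes with `hashlib.sha256` is an assumption about what `_get_bundle_sha` does.

```python
import hashlib
from typing import Optional


def bundle_sha(bundle_data: bytes) -> str:
    """Assumed stand-in for _get_bundle_sha: SHA-256 hex digest of the bundle."""
    return hashlib.sha256(bundle_data).hexdigest()


def resolve_bundle_sha(existing: Optional[str], bundle_data: bytes) -> str:
    """Keep an API-provided SHA, but derive one when it is None or empty."""
    if not existing:  # covers both None and ""
        return bundle_sha(bundle_data)
    return existing


data = b"example bundle"
print(resolve_bundle_sha("abc123", data))                  # abc123
print(resolve_bundle_sha("", data) == bundle_sha(data))    # True
print(resolve_bundle_sha(None, data) == bundle_sha(data))  # True
```

This keeps the PR's intent of preserving a SHA supplied by the API while closing the empty-string gap; normalizing "" to None where API-loaded verifiers are constructed would be an alternative.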



7 participants