
Separate run-level information from task-level in the JSONL #118

Closed

areebtariq-skylabs wants to merge 1 commit into main from chore/separate-run-and-task-json

Conversation

@areebtariq-skylabs
Collaborator

Currently, the JSONL schema treats every item as a task_result. This forces us to repeat run-level metadata (like run_id, dataset_id, and agent_provenance) in every single task entry, often requiring prefixes like TASKS_ to distinguish between scopes (as with the TAGS).

This PR introduces a dedicated run item type at the start of the JSONL file. By decoupling run-level information from individual tasks, we can store provenance data once at the beginning of the file and read it directly, removing our dependency on TEMPO for the data ingestion endpoint. cc @jhaag-skylabs-ai
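
A minimal sketch of the proposed layout (field names here are illustrative, not the final schema): the first line is a dedicated run item carrying the run-level metadata, and every subsequent line is a plain task_result that no longer repeats it:

```python
import json

# One run-level header, then one line per task; task entries no longer
# repeat run_id / dataset_id / agent provenance.
run_item = {
    "item": "run",
    "run_id": "r-001",
    "dataset_id": "ds-001",
    "agent_provenance": {"agent_cls_checksum": "abc123"},
}
task_items = [
    {"item": "task_result", "task_id": "t-1", "status": "passed"},
    {"item": "task_result", "task_id": "t-2", "status": "failed"},
]

with open("results.jsonl", "w") as f:
    f.write(json.dumps(run_item) + "\n")
    for t in task_items:
        f.write(json.dumps(t) + "\n")

# Readers can consume run metadata directly from the first line,
# without any backend round-trip.
with open("results.jsonl") as f:
    header = json.loads(f.readline())
```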

@areebtariq-skylabs
Collaborator Author

Right now I have only modified the JSONL output schema; if it looks okay, I can move on to updating the backend ingestion logic.

@skylabs-ai-ci

skylabs-ai-ci Bot commented Jan 19, 2026

CI summary (Details)

Active Repos

| Repo | Job Branch | Job Commit | Base commit | PR |
| --- | --- | --- | --- | --- |
| fmdeps/rocq-agent-toolkit/ | chore/separate-run-and-task-json | 88eb505 | 704b99e | #118 |

Passive Repos

| Repo | Job Branch | Job Commit |
| --- | --- | --- |
| ./ | main | f1faa4e |
| fmdeps/BRiCk/ | main | b62cb51 |
| fmdeps/auto/ | main | 2a53fc0 |
| fmdeps/auto-docs/ | main | b22de96 |
| bluerock/NOVA/ | skylabs-proof | 220d4a8 |
| bluerock/bhv/ | skylabs-main | 20ba397 |
| fmdeps/brick-libcpp/ | main | 204cf18 |
| fmdeps/ci/ | main | 68178de |
| vendored/elpi/ | skylabs-master | aa4475f |
| fmdeps/fm-ci/ | main | cfedfa4 |
| fmdeps/fm-tools/ | main | 6e85551 |
| psi/protos/ | main | 8fe3e7c |
| psi/backend/ | main | bcf0062 |
| psi/ide/ | main | 6b596cf |
| psi/data/ | main | d7b9b68 |
| vendored/rocq/ | skylabs-master | 6d192b5 |
| vendored/rocq-elpi/ | skylabs-master | e7c8227 |
| vendored/rocq-equations/ | skylabs-main | 737fdf9 |
| vendored/rocq-ext-lib/ | skylabs-master | 8172052 |
| vendored/rocq-iris/ | skylabs-master | 51c753a |
| vendored/rocq-lsp/ | skylabs-main | a8b7272 |
| vendored/rocq-stdlib/ | skylabs-master | 10bd9d7 |
| vendored/rocq-stdpp/ | skylabs-master | 8307c10 |
| fmdeps/skylabs-fm/ | main | 9de54e4 |
| vendored/vsrocq/ | skylabs-main | 39c9c5b |

Performance

| Relative | Master | MR | Change | Filename |
| --- | --- | --- | --- | --- |
| -0.00% | 121503.0 | 121503.0 | -0.0 | total |
| -0.00% | 24210.1 | 24210.1 | -0.0 | ├ translation units |
| +0.00% | 97292.9 | 97292.9 | +0.0 | └ proofs and tests |

Contributor

@jhaag-skylabs-ai jhaag-skylabs-ai left a comment


I added Rodolphe+Lennart as reviewers as well. Fwiw, I like the proposed changes; deduplicating verbose data is always a good idea in my book :~)

Comment on lines +69 to +70
agent_cls <doc text="Agent class provenance."> : agent_class_info;
agent <doc text="Agent instance provenance."> : agent_info;
Contributor


I think we will always deploy a single agent class at a time over a batch of tasks, so putting the agent_class_info in the header makes sense.


While we currently construct agent instances in the same way for each task, that might not always be the case. For the sake of forwards compatibility I propose:

  1. in run_task we add agent_info into a (shared/threadsafe) map that uses the Agent instance checksum as a key
  2. we include a footer with agent-instance specific data

which should still ensure that we deduplicate this verbose data in the current/common case of having agent instances with identical provenance.

Note: we should ensure that the (potentially partial) footer is always written into the output task file, even if some run_tasks raise errors; we may not want to add the agent_info into the map until after run returns without error.
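
The map-plus-footer idea above could be sketched roughly as follows; this is an illustration only, and names like AgentProvenanceRegistry are hypothetical, not from the toolkit:

```python
import json
import threading

class AgentProvenanceRegistry:
    """Collects agent-instance provenance, deduplicated by checksum.

    Hypothetical sketch: the toolkit's actual types and fields may differ.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._by_checksum = {}

    def record(self, agent_checksum: str, agent_info: dict) -> None:
        # Per the note above, only call this after run_task returns
        # without error, so the footer never claims provenance for a
        # task that crashed mid-run.
        with self._lock:
            self._by_checksum.setdefault(agent_checksum, agent_info)

    def footer_item(self) -> dict:
        with self._lock:
            return {"item": "footer", "agents": dict(self._by_checksum)}

# Usage: agents with identical provenance collapse to one footer entry,
# and the (potentially partial) footer is appended even if some tasks failed.
registry = AgentProvenanceRegistry()
registry.record("abc123", {"model": "m-1", "temperature": 0.0})
registry.record("abc123", {"model": "m-1", "temperature": 0.0})  # deduplicated
with open("results.jsonl", "a") as f:
    f.write(json.dumps(registry.footer_item()) + "\n")
```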

for reproducibility/comparison purposes
*)
agent_cls_checksum <doc text="Pseudo-unique checksum of the Agent class that attempted the task."> : string;
agent_checksum <doc text="Pseudo-unique checksum of the Agent instance that attempted the task."> : string;
Contributor


cf. comment above regarding run_header; if we adopt my suggestion, we should retain the agent_checksum here for correlation purposes.

Comment on lines +345 to +351
dataset_id = None
if getattr(arguments, "task_file", None) is not None:
project_name = getattr(project, "name", "").strip()
if project_name:
dataset_id = project_name
if not dataset_id:
dataset_id = os.getenv("DATASET_NAME") or "default"
Contributor


cc/ @rlepigre-skylabs-ai @LennartATSkylabsAI

I propose:

  1. we use Project.get_id() as a stable interface for project identity instead of directly accessing project.name
  2. we adopt the invariant that project.get_id() must yield a non-empty value -- eagerly porting existing task.yaml files as necessary, and failing fast with an explicit error instead of using "default" if this invariant is violated
    • Note: an invariant is something that is required for the code to behave correctly, and invariant violations are unremediable at runtime. This is different from Exceptions and other forms of "soft" errors which callers may be able to (attempt to) remediate at runtime, e.g. by retrying a network call that timed out.
  3. we rename dataset_id to project_id to match the semantic change in this PR
  4. re dataset_id: if we care to "name" a particular set of tasks in addition to tracking the project_id, we do so using tags -- which already support effective filtering in the dashboard
    • Note: if we find it too annoying to supply these ENV variables on the command line, we can probably add extra metadata (i.e. "tags") to the TaskFile

cc/ @ehtesham-zahoor @gmalecha-at-skylabs

I realize that adopting my suggestions will require DB/schema changes; I think it's prudent to eagerly make these changes now since we're already planning to archive/refresh the data in the dashboard (maybe this week, otherwise next week; cf. https://github.com/SkyLabsAI/psi-verifier/issues/1001)
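
Under suggestions 1 and 2, the quoted snippet might become a fail-fast lookup along these lines; Project.get_id() exists here only as proposed above, and the error type is illustrative:

```python
class InvariantViolation(Exception):
    """Raised when a condition required for correctness does not hold.

    Unlike "soft" errors, callers cannot remediate this at runtime.
    """

class Project:
    """Minimal stand-in for the real Project type (hypothetical)."""

    def __init__(self, name: str = ""):
        self.name = name

    def get_id(self) -> str:
        # Stable interface for project identity; callers should not
        # reach into .name directly.
        return self.name.strip()

def resolve_project_id(project: Project) -> str:
    project_id = project.get_id()
    if not project_id:
        # Fail fast instead of silently falling back to "default":
        # an empty id is an invariant violation, not a soft error.
        raise InvariantViolation(
            "project.get_id() returned an empty id; "
            "port the task.yaml to supply one"
        )
    return project_id
```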

@jhaag-skylabs-ai
Contributor

@areebtariq-skylabs should this PR be revived? If not, should any parts of it be cannibalized for other PRs/issues?

@jhaag-skylabs-ai
Contributor

Based on discussion w/Areeb, we're going to close this for now. We're now relying more on telemetry, and we can always change the JSONL in the future if necessary.
