Add Generalist agent environment plugin #13

Open
eric-tramel wants to merge 4 commits into main from codex/generalist-agent-env


eric-tramel commented May 7, 2026

What

Add data-designer-generalist-agent-env, a Data Designer plugin with two column types:

  • generalist-agent-environment assembles generated topics, constraints, database schemas, and database records into executable tool environments.
  • generalist-agent-task consumes those environments and emits task prompts, tool-only solutions, verifiers, reference answers, and simple-to-hard task traces.
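To make the relationship between tools, a tool-only solution, a verifier, and a reference answer concrete, here is a minimal sketch. It is not the plugin's implementation: the tool names (`list_records`, `get_record`), the helper names, and the example task ("highest-scoring record carrying a tag") are all hypothetical and chosen only for illustration.

```python
from typing import Any, Callable

Record = dict[str, Any]


def make_tools(records: list[Record]) -> dict[str, Callable[..., Any]]:
    """Hypothetical tool set backed by in-memory records."""

    def list_records() -> list[str]:
        return [r["record_id"] for r in records]

    def get_record(record_id: str) -> Record:
        for r in records:
            if r["record_id"] == record_id:
                return dict(r)  # copy so callers cannot mutate the environment
        raise KeyError(record_id)

    return {"list_records": list_records, "get_record": get_record}


def solve(tools: dict[str, Callable[..., Any]], required_tag: str) -> str:
    """Tool-only solution: reads data exclusively through the tools."""
    candidates = [tools["get_record"](rid) for rid in tools["list_records"]()]
    tagged = [r for r in candidates if required_tag in r["tags"]]
    # Ties on score are broken by record_id so the result is deterministic.
    best = max(tagged, key=lambda r: (r["score"], r["record_id"]))
    return best["record_id"]


def verify(answer: str, reference_answer: str) -> bool:
    """Deterministic verifier: exact match against the reference answer."""
    return answer.strip() == reference_answer
```

In this sketch the reference answer would be produced once by running `solve` against the environment, and a rollout's final answer is then checked with `verify` without any model call.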

The old single-column compatibility surface (the legacy generalist-agent-env entry point and its wrapper) was removed. The plugin now expects grounding data to be generated by upstream Data Designer columns rather than fabricated inside the plugin.

Why

The two-column split better matches the Generalist workflow: generate an environment first, including schema and data, then generate tasks from that environment. It lets Data Designer use model generation for novel grounding data while keeping the RL rollout verifier deterministic and executable.

Usage

builder.add_column(
    name="task_topic",
    column_type="llm-text",
    model_alias="deepseek-v4-pro-live",
    prompt="From {{ seed }} and {{ brief }}, write a concise task topic.",
)
builder.add_column(
    name="task_constraints",
    column_type="llm-structured",
    model_alias="deepseek-v4-pro-live",
    prompt="Generate constraints for {{ task_topic }}.",
    output_format=constraint_schema,
)
builder.add_column(
    name="database_schema",
    column_type="llm-structured",
    model_alias="deepseek-v4-pro-live",
    prompt="Generate a database schema for {{ task_topic }} and {{ task_constraints }}.",
    output_format=database_schema_format,
)
builder.add_column(
    name="database_records",
    column_type="llm-structured",
    model_alias="deepseek-v4-pro-live",
    prompt="Generate 8 records that follow {{ database_schema }}.",
    output_format=records_format,
)
builder.add_column(
    name="agent_environment",
    column_type="generalist-agent-environment",
    task_topic_column="task_topic",
    task_constraints_column="task_constraints",
    database_schema_column="database_schema",
    database_records_column="database_records",
    context_columns=["brief"],
)
builder.add_column(
    name="agent_task",
    column_type="generalist-agent-task",
    environment_column="agent_environment",
    difficulty="hard",
    required_tag="reliable",
)

Generated records must include record_id, name, summary, cost, duration, score, and tags; additional generated fields are preserved.
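As a rough illustration of that requirement, the sketch below checks a generated record for the required fields and passes extra fields through untouched. The function name, the error type, and the tag coercion are assumptions made for this example, not the plugin's actual validation code.

```python
from typing import Any

# Field names come from the PR description; everything else is assumed.
REQUIRED_FIELDS = ("record_id", "name", "summary", "cost", "duration", "score", "tags")


def normalize_record(record: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: reject incomplete records, keep extra fields."""
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    if missing:
        raise ValueError(f"record is missing required fields: {missing}")
    normalized = dict(record)  # shallow copy, so additional fields survive
    # Assumption: tags may arrive as a single string; coerce to a list of strings.
    tags = normalized["tags"]
    normalized["tags"] = [tags] if isinstance(tags, str) else [str(t) for t in tags]
    return normalized
```

A record with an extra field such as `region` would come back with `region` intact, while a record lacking `score` would raise `ValueError`.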

How

  • Split the plugin surface into environment assembly and task synthesis columns.
  • Removed the legacy generalist-agent-env entry point and compatibility wrapper.
  • Removed search/MCP/source-record configuration and deterministic fallback record generation.
  • Added validation/normalization for generated schema and generated record columns.
  • Added a describe_schema tool so rollouts can inspect generated schemas through tools.
  • Kept executable helper validation for tools, solutions, verifiers, task iterations, and Parquet-restored rows.
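For readers unfamiliar with the idea behind describe_schema, the following sketch shows one way such a tool could expose a generated schema to a rollout. The schema layout (`{"tables": [{"name": ..., "columns": [...]}]}`) and the factory function are assumptions for illustration; the PR does not specify the real format or signature.

```python
from typing import Any, Callable


def make_describe_schema_tool(schema: dict[str, Any]) -> Callable[[], dict[str, Any]]:
    """Return a zero-argument tool that reports the environment's schema.

    The schema layout used here is an assumed shape, not the plugin's format.
    """

    def describe_schema() -> dict[str, Any]:
        return {
            "tables": [
                {
                    "name": table["name"],
                    "columns": [
                        {"name": col["name"], "type": col.get("type", "unknown")}
                        for col in table.get("columns", [])
                    ],
                }
                for table in schema.get("tables", [])
            ]
        }

    return describe_schema
```

Because the tool only reads the schema captured at construction time, repeated calls return the same description, which fits the goal of deterministic rollouts.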

Validation

  • make all
  • .venv/bin/python -m pytest plugins/data-designer-generalist-agent-env/tests/ -q
  • make lint
  • make validate
  • make check
  • DATA_DESIGNER_HOME=/private/tmp/ddp-live-preview-home-nvidia-internal .venv/bin/data-designer validate /private/tmp/ddp-live-preview/generalist_agent_env_preview_validated_8.py

Live data-designer preview with deepseek-ai/deepseek-v4-pro on nvidia-internal is currently blocked by provider rate limiting during model health checks before row generation starts.

eric-tramel requested a review from a team as a code owner May 7, 2026 00:53
eric-tramel self-assigned this May 7, 2026