LLM-as-a-judge: use same model list as prompts #4005

@mmabrouk

Description

Problem

The LLM-as-a-judge evaluator has a hardcoded list of models in its settings template (multiple_choice type with static options). Meanwhile, prompts in the playground use the dynamic model list from the model registry / vault.

Users expect to use the same models for evaluation that they use for their prompts.

Blocked By

This is blocked by the evaluator playground migration (AGE-3656). The fix requires rethinking how evaluator schemas work. Specifically, the model field in LLM-as-a-judge should reference the same dynamic model list that prompts use, instead of being a hardcoded multiple_choice in the evaluator catalogue.
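
To make the intended change concrete, here is a minimal sketch of the two schema shapes. It is not the actual evaluator catalogue code: the dictionary layout, the `model_ref` field type, the `source` key, the placeholder model names, and the `resolve_model_options` helper are all assumptions made for illustration. Only the `multiple_choice` type with static options comes from the description above.

```python
from typing import Callable, Dict, List

# Assumed shape of today's settings template: the options are frozen
# in the evaluator catalogue. Model names are placeholders.
HARDCODED_TEMPLATE: Dict[str, dict] = {
    "model": {
        "type": "multiple_choice",
        "options": ["model-a", "model-b"],
    }
}

# Hypothetical replacement: the field only points at the shared model
# source, so the options are looked up when the form is rendered.
DYNAMIC_TEMPLATE: Dict[str, dict] = {
    "model": {
        "type": "model_ref",         # hypothetical field type
        "source": "model_registry",  # hypothetical source name
    }
}


def resolve_model_options(
    template: Dict[str, dict],
    list_models: Callable[[], List[str]],
) -> List[str]:
    """Return the models to offer for the evaluator's model field.

    `list_models` stands in for whatever call the playground uses to
    read the model registry / vault; it is injected so this sketch
    does not assume a specific API.
    """
    field = template["model"]
    if field["type"] == "multiple_choice":
        # Legacy behaviour: static options baked into the template.
        return list(field["options"])
    if field["type"] == "model_ref":
        # Proposed behaviour: same dynamic list the prompts use.
        return list(list_models())
    raise ValueError(f"Unsupported model field type: {field['type']}")
```

With a shape like this, the evaluator and the prompt playground would call the same model source, so a model added to the registry or vault would show up in both places without a catalogue change.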

Notes from Sprint Planning (Mar 11)

  • Related to the new schema design for evaluators
  • Should be addressed as part of the evaluator playground migration work
