** Please make sure you read the contribution guide and file the issues in the right place. **
Contribution guide.
🔴 Required Information
Please ensure all items in this section are completed to allow for efficient
triaging. Requests without complete information may be rejected / deprioritized.
If an item is not applicable to you - please mark it as N/A
Is your feature request related to a specific problem?
Yes. Currently, when using custom metrics defined in `eval_config.custom_metrics`, the `adk eval` CLI command successfully registers them into `DEFAULT_METRIC_EVALUATOR_REGISTRY` via inline logic in `cli_eval`.
However, this registration logic is completely absent in `cli_optimize` and the underlying `LocalEvalSampler`. As a result, developers leveraging custom metrics in prompt optimization workflows face two frustrating friction points:
- Running `adk optimize agent/ agent/sampler_config.json` via the CLI fails with an unregistered metric `KeyError`.
- Initializing `LocalEvalSampler` programmatically in custom Python scripts or Jupyter Notebooks forces developers to write boilerplate code to manually register their custom evaluators into `DEFAULT_METRIC_EVALUATOR_REGISTRY` prior to running the optimizer.
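The failure mode can be reproduced in miniature. The sketch below is a toy model, not ADK code: `ToyRegistry`, `toy_sampler_run`, and the metric name `omission_rate` are hypothetical stand-ins for the default registry, the sampler's evaluation step, and one of our custom metrics. It only illustrates why a lookup fails unless something registers the metric first.

```python
class ToyRegistry:
    """Hypothetical stand-in for a metric evaluator registry."""

    def __init__(self):
        self._evaluators = {}

    def register_evaluator(self, metric_name, evaluator):
        self._evaluators[metric_name] = evaluator

    def get_evaluator(self, metric_name):
        # A plain dict lookup raises KeyError for unregistered metrics.
        return self._evaluators[metric_name]


def toy_sampler_run(registry, metric_names):
    """Looks up each configured metric, as an evaluation step would."""
    return [registry.get_evaluator(name) for name in metric_names]


registry = ToyRegistry()
configured_metrics = ["omission_rate"]  # hypothetical custom metric name

try:
    toy_sampler_run(registry, configured_metrics)
    lookup_failed = False
except KeyError:
    lookup_failed = True  # nothing registered the custom metric

# The manual boilerplate that callers currently have to write themselves.
registry.register_evaluator("omission_rate", object())
evaluators = toy_sampler_run(registry, configured_metrics)
```

In this toy version the first run fails and the second succeeds only because the caller registered the metric by hand, which is the step we would like the sampler to take over.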
Describe the Solution You'd Like
We propose centralizing the custom metric registration logic in a shared helper and invoking it directly from `LocalEvalSampler`.
Requested Changes:
- Shared Helper (`eval_config.py`): Create a modular helper function `register_custom_metrics_from_config(eval_config: EvalConfig)` to eliminate duplicate registration logic across the codebase.
- Sampler Integration (`local_eval_sampler.py`): Invoke the shared helper directly inside `LocalEvalSampler.__init__`. This guarantees that any sampler instantiated with an `EvalConfig` containing custom metrics automatically registers them before `LocalEvalService` is invoked.
- CLI Refactor (`cli_tools_click.py`): Clean up `cli_eval` by replacing its verbose inline registration loop with the new helper function.
Impact on your work
This feature directly impacts our automated prompt engineering workflows for healthcare and ambient clinical note generation (e.g., a healthcare customer's Ambient Scribe project). We use custom clinical evaluation metrics (measuring hallucination and omission rates) to optimize scribe agent prompts. Currently, we are forced to maintain custom runner scripts and local monkey-patches to work around the CLI limitation. Native custom metric support in `adk optimize` would allow us to run clean, automated prompt tuning directly in our CI/CD pipelines.
Willingness to contribute
Are you interested in implementing this feature yourself or submitting a PR?
(Yes/No) Yes
🟡 Recommended Information
Describe Alternatives You've Considered
- Patching `cli_optimize` only: We considered adding the registration loop directly to `cli_optimize` in `cli_tools_click.py` (matching `cli_eval`). However, this is architecturally suboptimal because it leaves `LocalEvalSampler` dependent on the CLI layer. If a developer initializes `LocalEvalSampler` programmatically in Python/Jupyter, `cli_optimize` is bypassed, and custom metrics would still fail without manual boilerplate registration.
Proposed API / Implementation
**1. Shared Helper (`google/adk/evaluation/eval_config.py`):**

    def register_custom_metrics_from_config(eval_config: EvalConfig) -> None:
        """Registers custom metrics defined in EvalConfig into the default registry."""
        if not eval_config or not eval_config.custom_metrics:
            return

        metric_evaluator_registry = DEFAULT_METRIC_EVALUATOR_REGISTRY
        for metric_name, config in eval_config.custom_metrics.items():
            if config.metric_info:
                metric_info = config.metric_info.model_copy()
                metric_info.metric_name = metric_name
            else:
                from ..cli.cli_eval import get_default_metric_info
                metric_info = get_default_metric_info(
                    metric_name=metric_name, description=config.description
                )
            metric_evaluator_registry.register_evaluator(
                metric_info, _CustomMetricEvaluator
            )

**2. Sampler Integration (`google/adk/optimization/local_eval_sampler.py`):**

    class LocalEvalSampler(Sampler[UnstructuredSamplingResult]):
        def __init__(
            self,
            config: LocalEvalSamplerConfig,
            eval_sets_manager: EvalSetsManager,
        ):
            self._config = config
            self._eval_sets_manager = eval_sets_manager

            # Automatically register custom metrics if present
            if self._config.eval_config:
                register_custom_metrics_from_config(self._config.eval_config)

            # ... existing init logic ...

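A unit test for this behaviour could look roughly like the following. Everything here is a simplified stand-in so the sketch runs on its own: `FakeRegistry`, `FakeEvalConfig`, `FakeCustomEvaluator`, `FakeSampler`, and `register_custom_metrics` are hypothetical names that mirror the shape of the proposal, not the ADK classes. It exercises three properties we would want from the real change: constructing the sampler is enough to make the metric resolvable, constructing it twice does not create duplicate entries, and a config without custom metrics registers nothing.

```python
from dataclasses import dataclass, field


class FakeRegistry:
    """Hypothetical registry keyed by metric name."""

    def __init__(self):
        self.evaluators = {}

    def register_evaluator(self, metric_name, evaluator_cls):
        # Re-registering the same name overwrites, so repeat calls are harmless.
        self.evaluators[metric_name] = evaluator_cls


@dataclass
class FakeEvalConfig:
    # Maps metric name -> description; stands in for custom_metrics.
    custom_metrics: dict = field(default_factory=dict)


class FakeCustomEvaluator:
    """Placeholder for the evaluator class bound to custom metrics."""


def register_custom_metrics(eval_config, registry):
    """Registers every custom metric in the config; no-op when there are none."""
    if not eval_config or not eval_config.custom_metrics:
        return
    for metric_name in eval_config.custom_metrics:
        registry.register_evaluator(metric_name, FakeCustomEvaluator)


class FakeSampler:
    def __init__(self, eval_config, registry):
        self._eval_config = eval_config
        # Registration happens at construction time, before any evaluation runs.
        register_custom_metrics(eval_config, registry)


registry = FakeRegistry()
config = FakeEvalConfig(custom_metrics={"hallucination_rate": "toy metric"})

FakeSampler(config, registry)
FakeSampler(config, registry)  # second construction must not duplicate entries

empty_registry = FakeRegistry()
FakeSampler(FakeEvalConfig(), empty_registry)  # no custom metrics: nothing added
```

Passing the registry in explicitly is a choice made for this sketch so the checks stay isolated; the proposal itself registers into the module-level default registry.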
Additional Context
We have already created and verified a local patch using this exact logic in our development environment. It successfully runs our custom clinical metrics via adk optimize without any errors. We have a fully structured implementation and unit testing plan ready, and we will submit the formal Pull Request (PR) as soon as this issue is approved.