Having automated tests for tool metadata quality (tool names and parameter names/descriptions), while not foolproof, would make it much easier to change existing servers confidently without worrying about regressing existing use cases. Such tests could catch issues like a new tool whose description collides with another tool's and confuses LLMs. We should be able to write some example tests using, e.g., Mosaic AI Agent Evaluation or open-source eval frameworks; a sketch of one such test follows.
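As a minimal sketch of what a description-collision test might look like, the snippet below uses plain pytest and stdlib string similarity as a cheap stand-in for an LLM judge or embedding-based check. The tool metadata shape (a list of dicts with `name` and `description` keys), the example tools, and the similarity threshold are all assumptions for illustration; a real test would load metadata from the server and could swap in an eval framework's judge instead.

```python
# Hedged sketch: flag pairs of tool descriptions that are nearly identical,
# one common way descriptions collide and confuse an LLM's tool choice.
from difflib import SequenceMatcher
from itertools import combinations

import pytest

# Hypothetical tool metadata; in practice, load this from the actual server.
TOOLS = [
    {"name": "run_query", "description": "Execute a SQL query and return the result rows."},
    {"name": "list_tables", "description": "List the tables available in a catalog."},
]

# Arbitrary cutoff; tune against known-good servers before enforcing in CI.
SIMILARITY_THRESHOLD = 0.9


def description_similarity(a: str, b: str) -> float:
    """Return a rough [0, 1] similarity score between two descriptions."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


@pytest.mark.parametrize("tool_a,tool_b", list(combinations(TOOLS, 2)))
def test_descriptions_do_not_collide(tool_a, tool_b):
    score = description_similarity(tool_a["description"], tool_b["description"])
    assert score < SIMILARITY_THRESHOLD, (
        f"Descriptions for {tool_a['name']!r} and {tool_b['name']!r} are "
        f"{score:.2f} similar; an LLM may struggle to pick between them."
    )
```

The same pairwise structure would carry over to an LLM-as-judge version: replace `description_similarity` with a call into whichever eval framework we pick, and keep the test parametrization as the harness.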