Commit dec9a84

Fix: Correct healthcare MCP server in xlam2_70b eval script
Use the local healthcare MCP server instead of the Airbnb server, and drop the now-unused SERVER_ARGS.
1 parent 31ea45a commit dec9a84

1 file changed

Lines changed: 1 addition & 3 deletions

benchmarks/healthcare/eval_script_xlam2_70b.sh

@@ -6,8 +6,7 @@ MODEL="xlam_2_70b"
 MAX_TURNS=30
 REPORT_MODEL="gpt-4.1-2025-04-14"
 JUDGE_MODEL="gpt-4o"
-SERVER="@openbnb/mcp-server-airbnb"
-SERVER_ARGS="--ignore-robots-txt"
+SERVER="mcp_servers/${DOMAIN}/server.py"
 MODEL_CONFIG="benchmarks/${DOMAIN}/eval_models/${MODEL}.json"
 TASKS_FILE="data/${DOMAIN}/evaluation_tasks_verified.jsonl"
 OUTPUT="benchmarks/${DOMAIN}/results/${MODEL}_task_evaluation.json"
@@ -22,7 +21,6 @@ REPORT_DIR="benchmarks/${DOMAIN}/report"
 
 mcp-eval evaluate \
 --server $SERVER \
---server-args="$SERVER_ARGS" \
 --model-config $MODEL_CONFIG \
 --tasks-file $TASKS_FILE \
 --output $OUTPUT \
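
The bug fixed here was a copy-pasted SERVER value from another domain's script. A guard along the following lines could catch that kind of mismatch before an evaluation starts. This is only a sketch and not part of the commit: the helper names `server_path_for` and `check_server` are hypothetical, and it assumes every domain keeps its server at `mcp_servers/${DOMAIN}/server.py`, as the new SERVER line suggests.

```shell
#!/usr/bin/env bash
# Hypothetical guard, not part of this commit: verify that SERVER points at
# the local server script for DOMAIN before the evaluation is started.

# Print the expected local server path for a domain.
# Assumption: the mcp_servers/<domain>/server.py layout seen in the diff.
server_path_for() {
  local domain="$1"
  printf 'mcp_servers/%s/server.py\n' "$domain"
}

# Return 0 if SERVER is the local script for the domain and the file exists;
# otherwise print a message to stderr and return 1.
check_server() {
  local domain="$1" server="$2" expected
  expected="$(server_path_for "$domain")"
  if [ "$server" != "$expected" ]; then
    echo "SERVER is '$server', expected '$expected'" >&2
    return 1
  fi
  if [ ! -f "$server" ]; then
    echo "server script not found: $server" >&2
    return 1
  fi
}
```

In the eval script, a call such as `check_server "$DOMAIN" "$SERVER" || exit 1` could be placed just before `mcp-eval evaluate`, assuming DOMAIN is set earlier in the file.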
