@@ -16,72 +16,122 @@ Each evaluator comes with a predefined input and output schema. When using an ev
 
 ## Evaluator Types
 
+### Style
+
 <CardGroup cols={3}>
   <Card title="Character Count" icon="text">
     Analyze response length and verbosity to ensure outputs meet specific length requirements.
   </Card>
-
+
   <Card title="Character Count Ratio" icon="hashtag">
     Measure the ratio of characters to the input to assess response proportionality and expansion.
   </Card>
-
+
   <Card title="Word Count" icon="align-left">
     Ensure appropriate response detail level by tracking the total number of words in outputs.
   </Card>
-
+
   <Card title="Word Count Ratio" icon="hashtag">
     Measure the ratio of words to the input to compare input/output verbosity and expansion patterns.
   </Card>
-
+
+  <Card title="Tone Detection" icon="smile">
+    Classify emotional tone of responses (joy, anger, sadness, etc.).
+  </Card>
+</CardGroup>
+
+### Quality & Correctness
+
+<CardGroup cols={3}>
   <Card title="Answer Relevancy" icon="bullseye">
     Verify responses address the query to ensure AI outputs stay on topic and remain relevant.
   </Card>
-
+
   <Card title="Faithfulness" icon="circle-check">
     Detect hallucinations and verify facts to maintain accuracy and truthfulness in AI responses.
   </Card>
-
+
+  <Card title="Answer Correctness" icon="circle-check">
+    Evaluate factual accuracy by comparing answers against ground truth.
+  </Card>
+
+  <Card title="Answer Completeness" icon="check-circle">
+    Measure how completely responses use relevant context.
+  </Card>
+
+  <Card title="Topic Adherence" icon="hashtag">
+    Validate topic adherence to ensure responses stay focused on the specified subject matter.
+  </Card>
+
+  <Card title="Semantic Similarity" icon="hashtag">
+    Validate semantic similarity between expected and actual responses to measure content alignment.
+  </Card>
+
+  <Card title="Prompt Perplexity" icon="brain">
+    Measure how predictable/familiar a prompt is to a language model.
+  </Card>
+
+  <Card title="Measure Perplexity" icon="hashtag">
+    Measure text perplexity from logprobs to assess the predictability and coherence of generated text.
+  </Card>
+
+  <Card title="Uncertainty Detector" icon="gauge">
+    Generate responses and measure model uncertainty from logprobs.
+  </Card>
+</CardGroup>
+
+### Security & Compliance
+
+<CardGroup cols={3}>
   <Card title="PII Detection" icon="shield">
     Identify personal information exposure to protect user privacy and ensure data security compliance.
   </Card>
-
+
   <Card title="Profanity Detection" icon="triangle-exclamation">
     Flag inappropriate language use to maintain content quality standards and professional communication.
   </Card>
-
+
+  <Card title="Sexism Detection" icon="triangle-exclamation">
+    Detect sexist and discriminatory content.
+  </Card>
+
+  <Card title="Prompt Injection" icon="shield-exclamation">
+    Detect prompt injection attacks in user inputs.
+  </Card>
+
+  <Card title="Toxicity Detector" icon="skull">
+    Detect toxic content including personal attacks, mockery, hate, and threats.
+  </Card>
+
   <Card title="Secrets Detection" icon="lock">
     Monitor for credential and key leaks to prevent accidental exposure of sensitive information.
   </Card>
-
+</CardGroup>
+
+### Formatting
+
+<CardGroup cols={3}>
   <Card title="SQL Validation" icon="database">
     Validate SQL queries to ensure proper syntax and structure in database-related AI outputs.
   </Card>
-
+
   <Card title="JSON Validation" icon="code">
     Validate JSON responses to ensure proper formatting and structure in API-related outputs.
   </Card>
-
+
   <Card title="Regex Validation" icon="asterisk">
     Validate regex patterns to ensure correct regular expression syntax and functionality.
   </Card>
-
+
   <Card title="Placeholder Regex" icon="asterisk">
     Validate placeholder regex patterns to ensure proper template and variable replacement structures.
   </Card>
-
-  <Card title="Semantic Similarity" icon="hashtag">
-    Validate semantic similarity between expected and actual responses to measure content alignment.
-  </Card>
-
+</CardGroup>
+
+### Agents
+
+<CardGroup cols={3}>
   <Card title="Agent Goal Accuracy" icon="bullseye">
     Validate agent goal accuracy to ensure AI systems achieve their intended objectives effectively.
   </Card>
-
-  <Card title="Topic Adherence" icon="hashtag">
-    Validate topic adherence to ensure responses stay focused on the specified subject matter.
-  </Card>
-
-  <Card title="Measure Perplexity" icon="hashtag">
-    Measure text perplexity from logprobs to assess the predictability and coherence of generated text.
-  </Card>
 </CardGroup>
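
Review note on the "Measure Perplexity" card: the description says perplexity is derived from logprobs but does not show the calculation. The sketch below uses the standard definition (the exponential of the negative mean token log-probability) and assumes natural-log logprobs. The function name `perplexity` and the sample values are hypothetical, for illustration only; the evaluator's real implementation and input schema are not shown in this diff and may differ.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Return exp(-mean(logprob)) for natural-log token log-probabilities.

    Hypothetical helper for illustration; not the evaluator's actual code.
    """
    if not token_logprobs:
        raise ValueError("need at least one token logprob")
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(-mean_logprob)


# Four tokens, each given probability 0.5 by an imagined model.
example_logprobs = [math.log(0.5)] * 4
example_ppl = perplexity(example_logprobs)
```

Lower values mean the model found the text more predictable. A text whose tokens all have probability 1 (logprob 0) has a perplexity of exactly 1.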