| 1 | +name: starlight_qa_engagement |
| 2 | +type: ai |
| 3 | +target: messages |
| 4 | +description: | |
| 5 | + Evaluates the ENGAGEMENT quality of a Brent Council Housing Benefits call. |
| 6 | + This is 1 of 4 equally weighted QA categories for the Starlight project. |
| 7 | + |
| 8 | + IMPORTANT - AUTO-FAIL RULES: |
| 9 | + Questions 1.3, 1.4, and 1.5 are auto-fail. If ANY of these receives a "no" result, |
| 10 | + set auto_fail to true. When auto_fail is true across ANY of the 4 QA categories, |
| 11 | + the ENTIRE call evaluation fails (not just this section). |
| 12 | +
|
| 13 | + MULTILINGUAL TRANSCRIPTS: |
| 14 | + The call may be conducted in any language. Evaluate the transcript in whatever language |
| 15 | + it occurs in. Do not penalise the agent for using a language other than English if the |
| 16 | + caller initiated in that language. |
| 17 | +
|
| 18 | + AI AGENT ADAPTATION NOTES: |
| 19 | + - Question 1.3 (data security check): Use not_applicable if the call scenario did not |
| 20 | + require identity verification (e.g. general enquiry with no account lookup). |
| 21 | + - Question 1.6 (hold time): Use not_applicable if no hold occurred during the call. |
| 22 | + - Question 1.7 (after call work): Use not_applicable as AI agents do not perform ACW. |
| 23 | +
|
| 24 | + GLOSSARY OF BRENT COUNCIL TERMS: |
| 25 | + RSF - Resident Support Fund | DHP - Discretionary Housing Payment | |
| 26 | + CIC/s - Change in Circumstances | CTS - Council Tax Support | |
| 27 | + HB - Housing Benefit | UC - Universal Credit | Recons - Reconsideration | |
| 28 | + Portal/My Account/CAS - Citizen Access Service (customer self-service portal) | |
| 29 | + Non Dep - Non dependants | OP - Overpayments | LHA - Local Housing Allowance | |
| 30 | + HSF - Household Support Fund | SB - Switchboard | |
| 31 | + Welfare Benefit - PIP, Disability Allowance, ESA, etc. |
| 32 | +model: |
| 33 | + provider: openai |
| 34 | + model: gpt-4.1 |
| 35 | + temperature: 0 |
| 36 | +assistant_ids: [] |
| 37 | +workflow_ids: [] |
| 38 | +schema: |
| 39 | + type: object |
| 40 | + description: "Engagement QA evaluation for Brent Council Housing Benefits calls." |
| 41 | + properties: |
| 42 | + question_1_1: |
| 43 | + type: object |
| 44 | + description: "1.1 Warm greeting, gave service and own name and asked for their name if not SB." |
| 45 | + properties: |
| 46 | + result: |
| 47 | + type: string |
| 48 | + description: "yes if the agent provided a warm greeting with service name and own name and asked for caller name; no if not; not_applicable if this was a switchboard transfer." |
| 49 | + enum: |
| 50 | + - "yes" |
| 51 | + - "no" |
| 52 | + - "not_applicable" |
| 53 | + reasoning: |
| 54 | + type: string |
| 55 | + description: "Explanation of why this result was given, referencing specific parts of the conversation." |
| 56 | + evidence: |
| 57 | + type: array |
| 58 | + description: "Relevant excerpts from the transcript supporting the evaluation." |
| 59 | + items: |
| 60 | + type: object |
| 61 | + properties: |
| 62 | + message_text: |
| 63 | + type: string |
| 64 | + description: "The exact text from the transcript." |
| 65 | + timestamp: |
| 66 | + type: string |
| 67 | + description: "The timestamp or position in the conversation where this occurred." |
| 68 | + question_1_2: |
| 69 | + type: object |
| 70 | + description: "1.2 Apology given for the long wait / acknowledged and recognised service failure if mentioned." |
| 71 | + properties: |
| 72 | + result: |
| 73 | + type: string |
| 74 | + description: "yes if an apology or acknowledgement was given when appropriate; no if the caller mentioned a wait or service failure and it was not acknowledged; not_applicable if the caller did not mention any wait or service failure." |
| 75 | + enum: |
| 76 | + - "yes" |
| 77 | + - "no" |
| 78 | + - "not_applicable" |
| 79 | + reasoning: |
| 80 | + type: string |
| 81 | + description: "Explanation of why this result was given." |
| 82 | + evidence: |
| 83 | + type: array |
| 84 | + description: "Relevant excerpts from the transcript." |
| 85 | + items: |
| 86 | + type: object |
| 87 | + properties: |
| 88 | + message_text: |
| 89 | + type: string |
| 90 | + description: "The exact text from the transcript." |
| 91 | + timestamp: |
| 92 | + type: string |
| 93 | + description: "The timestamp or position in the conversation." |
| 94 | + question_1_3: |
| 95 | + type: object |
| 96 | + description: "1.3 Completed data security check. AUTO-FAIL: If result is 'no', the entire evaluation fails." |
| 97 | + properties: |
| 98 | + result: |
| 99 | + type: string |
| 100 | + description: "yes if identity/security verification was completed before accessing account details; no if account details were accessed without verification; not_applicable if the call did not require account access." |
| 101 | + enum: |
| 102 | + - "yes" |
| 103 | + - "no" |
| 104 | + - "not_applicable" |
| 105 | + reasoning: |
| 106 | + type: string |
| 107 | + description: "Explanation of why this result was given." |
| 108 | + evidence: |
| 109 | + type: array |
| 110 | + description: "Relevant excerpts from the transcript." |
| 111 | + items: |
| 112 | + type: object |
| 113 | + properties: |
| 114 | + message_text: |
| 115 | + type: string |
| 116 | + description: "The exact text from the transcript." |
| 117 | + timestamp: |
| 118 | + type: string |
| 119 | + description: "The timestamp or position in the conversation." |
| 120 | + question_1_4: |
| 121 | + type: object |
| 122 | + description: "1.4 Controlled the call and maintained professionalism throughout. AUTO-FAIL: If result is 'no', the entire evaluation fails." |
| 123 | + properties: |
| 124 | + result: |
| 125 | + type: string |
| 126 | + description: "yes if the agent maintained control and professionalism throughout; no if the agent lost control or was unprofessional at any point; not_applicable only in exceptional circumstances." |
| 127 | + enum: |
| 128 | + - "yes" |
| 129 | + - "no" |
| 130 | + - "not_applicable" |
| 131 | + reasoning: |
| 132 | + type: string |
| 133 | + description: "Explanation of why this result was given." |
| 134 | + evidence: |
| 135 | + type: array |
| 136 | + description: "Relevant excerpts from the transcript." |
| 137 | + items: |
| 138 | + type: object |
| 139 | + properties: |
| 140 | + message_text: |
| 141 | + type: string |
| 142 | + description: "The exact text from the transcript." |
| 143 | + timestamp: |
| 144 | + type: string |
| 145 | + description: "The timestamp or position in the conversation." |
| 146 | + question_1_5: |
| 147 | + type: object |
| 148 | + description: "1.5 Listened actively, positive tone, showed interest, empathy, patience and helpfulness. AUTO-FAIL: If result is 'no', the entire evaluation fails." |
| 149 | + properties: |
| 150 | + result: |
| 151 | + type: string |
| 152 | + description: "yes if the agent demonstrated active listening, positive tone, interest, empathy, patience and helpfulness; no if the agent was dismissive, impatient, or unhelpful; not_applicable only in exceptional circumstances." |
| 153 | + enum: |
| 154 | + - "yes" |
| 155 | + - "no" |
| 156 | + - "not_applicable" |
| 157 | + reasoning: |
| 158 | + type: string |
| 159 | + description: "Explanation of why this result was given." |
| 160 | + evidence: |
| 161 | + type: array |
| 162 | + description: "Relevant excerpts from the transcript." |
| 163 | + items: |
| 164 | + type: object |
| 165 | + properties: |
| 166 | + message_text: |
| 167 | + type: string |
| 168 | + description: "The exact text from the transcript." |
| 169 | + timestamp: |
| 170 | + type: string |
| 171 | + description: "The timestamp or position in the conversation." |
| 172 | + question_1_6: |
| 173 | + type: object |
| 174 | + description: "1.6 Explained any hold time, kept the customer updated, apologised for the hold." |
| 175 | + properties: |
| 176 | + result: |
| 177 | + type: string |
| 178 | + description: "yes if hold time was explained, the caller was kept updated, and an apology was given; no if the caller was put on hold without explanation, updates, or apology; not_applicable if no hold occurred during the call."
| 179 | + enum: |
| 180 | + - "yes" |
| 181 | + - "no" |
| 182 | + - "not_applicable" |
| 183 | + reasoning: |
| 184 | + type: string |
| 185 | + description: "Explanation of why this result was given." |
| 186 | + evidence: |
| 187 | + type: array |
| 188 | + description: "Relevant excerpts from the transcript." |
| 189 | + items: |
| 190 | + type: object |
| 191 | + properties: |
| 192 | + message_text: |
| 193 | + type: string |
| 194 | + description: "The exact text from the transcript." |
| 195 | + timestamp: |
| 196 | + type: string |
| 197 | + description: "The timestamp or position in the conversation." |
| 198 | + question_1_7: |
| 199 | + type: object |
| 200 | + description: "1.7 Was the After Call Work necessary and justified for the full duration?" |
| 201 | + properties: |
| 202 | + result: |
| 203 | + type: string |
| 204 | + description: "yes if ACW was necessary and justified; no if ACW was unnecessary or excessive; not_applicable if this is an AI agent call (AI agents do not perform ACW)." |
| 205 | + enum: |
| 206 | + - "yes" |
| 207 | + - "no" |
| 208 | + - "not_applicable" |
| 209 | + reasoning: |
| 210 | + type: string |
| 211 | + description: "Explanation of why this result was given." |
| 212 | + evidence: |
| 213 | + type: array |
| 214 | + description: "Relevant excerpts from the transcript." |
| 215 | + items: |
| 216 | + type: object |
| 217 | + properties: |
| 218 | + message_text: |
| 219 | + type: string |
| 220 | + description: "The exact text from the transcript." |
| 221 | + timestamp: |
| 222 | + type: string |
| 223 | + description: "The timestamp or position in the conversation." |
| 224 | + auto_fail: |
| 225 | + type: boolean |
| 226 | + description: "Set to true if ANY auto-fail question (1.3, 1.4, 1.5) received a 'no' result. When true, the ENTIRE call evaluation fails across all categories." |
| 227 | + overall_pass: |
| 228 | + type: boolean |
| 229 | + description: "Set to true only if auto_fail is false. When auto_fail is true, this must be false regardless of other question results." |
| 230 | + category_score: |
| 231 | + type: string |
| 232 | + description: "Fraction of questions that received 'yes' out of total applicable questions, e.g. '5/7' or '4/5'. Exclude not_applicable questions from both numerator and denominator." |
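
The `auto_fail`, `overall_pass`, and `category_score` descriptions above leave the arithmetic to the model. A minimal sketch of how a consumer might recompute the three fields from the per-question results, as a cross-check on the model's output, is shown below. It assumes the evaluation has already been parsed into a Python dict shaped like the schema; the function name `summarise_engagement` and the two constants are hypothetical and not part of this PR.

```python
# Hypothetical cross-check for the derived fields in the engagement schema.
# Assumes `evaluation` is a dict with keys question_1_1 .. question_1_7,
# each holding a "result" of "yes", "no", or "not_applicable".

ALL_QUESTIONS = tuple(f"question_1_{i}" for i in range(1, 8))
AUTO_FAIL_QUESTIONS = ("question_1_3", "question_1_4", "question_1_5")


def summarise_engagement(evaluation: dict) -> dict:
    """Recompute auto_fail, overall_pass and category_score from the results."""
    results = {q: evaluation[q]["result"] for q in ALL_QUESTIONS}

    # Auto-fail triggers when any of 1.3, 1.4, 1.5 is answered "no".
    auto_fail = any(results[q] == "no" for q in AUTO_FAIL_QUESTIONS)

    # not_applicable answers are excluded from numerator and denominator.
    applicable = [r for r in results.values() if r != "not_applicable"]
    yes_count = sum(1 for r in applicable if r == "yes")

    return {
        "auto_fail": auto_fail,
        "overall_pass": not auto_fail,
        "category_score": f"{yes_count}/{len(applicable)}",
    }


example = {
    "question_1_1": {"result": "yes"},
    "question_1_2": {"result": "not_applicable"},
    "question_1_3": {"result": "yes"},
    "question_1_4": {"result": "yes"},
    "question_1_5": {"result": "no"},
    "question_1_6": {"result": "not_applicable"},
    "question_1_7": {"result": "not_applicable"},
}
summary = summarise_engagement(example)
```

In this example three of the four applicable questions are "yes", but the "no" on 1.5 sets `auto_fail`, so `overall_pass` is false regardless of the score. Comparing these recomputed values against the model's own fields is one way to catch inconsistent outputs.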