
@peterbjohnson
Contributor

Two changes.

  1. Allow prompts written in the app to include {{answer}}, {{question}}, and {{response}}, and have these parsed in the eval function. This is a much-needed change and is quite simple. See:
    def process_prompt(prompt, question, response, answer):

Notes: the question is not currently passed to the eval function, so the app needs updating to pass it. There is no harm in having the eval function ready for this, though, as long as users don't read the code and expect it to work already (unlikely).
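For reference, here is a minimal sketch of the substitution; only the signature above appears in the PR, and the body is an illustrative assumption:

    def process_prompt(prompt, question, response, answer):
        # Map each supported placeholder to the value it should receive.
        substitutions = {
            "{{question}}": question,
            "{{response}}": response,
            "{{answer}}": answer,
        }
        for placeholder, value in substitutions.items():
            # Skip placeholders without a value (e.g. question, which the
            # app does not pass yet) so prompts can use any subset of them.
            if value is not None:
                prompt = prompt.replace(placeholder, str(value))
        return prompt

For example, process_prompt("Compare {{response}} with {{answer}}.", None, "x = 2", "x = 2") substitutes the two supplied values and leaves any {{question}} placeholder untouched until the app starts passing the question.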

Review needs: check for functionality and good practice.

  2. Introduce a moderator prompt that rules out deceitful responses before they are evaluated.

This is needed to avoid abuse. However, it's a dangerous game: an imprecise moderation prompt can abort the whole process. The prompt is free for the teacher to define, and they can simply say 'Output True' if they want to skip moderation, but I'm still a bit concerned about this approach.
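As a sketch of the gating flow described above, assuming the OpenAI Python client (the PR mentions gpt-4o-mini) and a True/False output convention; the function name and return convention are assumptions, not the PR's actual code:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def passes_moderation(moderation_prompt, response, model="gpt-4o-mini"):
        # Apply the teacher-defined moderation prompt to the student's
        # response before any evaluation happens.
        completion = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": moderation_prompt},
                {"role": "user", "content": response},
            ],
        )
        verdict = completion.choices[0].message.content.strip().lower()
        # Evaluation proceeds only on an explicit True; anything else aborts,
        # which is why an imprecise moderation prompt can block everything.
        return verdict.startswith("true")

A teacher who wants no moderation can set the prompt to 'Output True', which makes every response pass this gate.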

Review needs 1: check for functionality and good practice.

Review needs 2: is this the best way to moderate?

Tests have been added to cover the new features, and existing tests were modified because they stopped working once I switched from gpt-3.5-turbo (now very expensive) to gpt-4o-mini.

@m-messer
Member

m-messer commented Dec 8, 2025

This looks good, all well implemented.

I would add at least one test for parsing question as a parameter.
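A hypothetical test along those lines (assuming the process_prompt sketch above, not code from the PR) could look like:

    def test_process_prompt_parses_question():
        # Verify that {{question}} is substituted once the app passes it.
        prompt = "The student was asked: {{question}}"
        result = process_prompt(prompt, "What is 2 + 2?", None, None)
        assert result == "The student was asked: What is 2 + 2?"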

Moderation-wise, I think this is a good approach and will catch some of the issues.
