Knowunity Hackathon CLI Scripts

These scripts run manual and automated conversations against the Student Simulation API, log full transcripts, generate understanding-level predictions, and submit those predictions to the /evaluate/mse endpoint.

Setup

Create a .env file in the repo root with:

BASE_URL=...
TEAM_API_KEY=...
OPENAI_API_KEY=...

Optional:

LOG_FILE=logs/conversations.jsonl
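
How each script loads these values is not shown in this README. As a minimal sketch, assuming the file contains only plain KEY=VALUE lines, a parser and a required-key check could look like this (`parse_env` and `missing_keys` are hypothetical helpers, not part of the repo):

```python
# Hypothetical helpers for illustration; the repo's scripts may load .env differently.
REQUIRED = ("BASE_URL", "TEAM_API_KEY", "OPENAI_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED if not env.get(key)]
```

Running `missing_keys(parse_env(open(".env").read()))` before a long run is a cheap way to catch an incomplete .env early.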

Scripts

`scripts/knu_api.sh`

Manual CLI helper to list students/topics and run a chat by hand.

Examples:

./scripts/knu_api.sh list-students mini_dev
./scripts/knu_api.sh student-topics <student_id>
./scripts/knu_api.sh start <student_id> <topic_id>
./scripts/knu_api.sh interact <conversation_id> "Explain how you would solve x^2 - 5x + 6 = 0"
./scripts/knu_api.sh chat

Notes:

Reads BASE_URL and TEAM_API_KEY from .env.
Logs every start and interact to logs/conversations.jsonl (JSONL format).

`scripts/knu_auto_chat.py`

Automates conversations for a set, generates tutor messages with GPT, and produces a predicted understanding level after each conversation.

Examples:

./scripts/knu_auto_chat.py
./scripts/knu_auto_chat.py --set-type mini_dev --model gpt-5.2 --mode responses
./scripts/knu_auto_chat.py --max-turns 6

Notes:

Reads BASE_URL, TEAM_API_KEY, and OPENAI_API_KEY from .env.
Writes a conversation_summary entry with the full transcript and prediction to logs/conversations.jsonl.
Use --mode chat if your account does not support the responses API.

`scripts/knu_submit_mse.py`

Submits the latest predictions (per student/topic pair) from logs/conversations.jsonl to /evaluate/mse.

Examples:

./scripts/knu_submit_mse.py --set-type mini_dev
./scripts/knu_submit_mse.py --set-type mini_dev --dry-run

Notes:

Picks the most recent conversation_summary per student/topic pair.
Fails if any required pair is missing.
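
The selection rule in the notes above can be sketched as follows. The field names `event`, `student_id`, `topic_id`, and `prediction` are assumptions made for illustration; check the actual log entries before relying on them.

```python
import json


def latest_summaries(lines):
    """Map (student_id, topic_id) to the last conversation_summary seen."""
    latest = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # "event" is an assumed key name for the entry type.
        if entry.get("event") != "conversation_summary":
            continue
        # Later lines overwrite earlier ones, so the newest entry wins
        # as long as the log is append-only.
        latest[(entry["student_id"], entry["topic_id"])] = entry
    return latest


def require_pairs(latest, required):
    """Return entries in the required order, or exit if any pair is missing."""
    missing = [pair for pair in required if pair not in latest]
    if missing:
        raise SystemExit(f"missing predictions for: {missing}")
    return [latest[pair] for pair in required]
```

This relies on file order rather than timestamps, which is only valid while entries are appended and never rewritten.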

`scripts/knu_submit_tutoring.py`

Submits a tutoring evaluation request to /evaluate/tutoring.

Examples:

./scripts/knu_submit_tutoring.py --set-type mini_dev

Notes:

Requires at least one conversation per student/topic pair in the set.

`scripts/knu_run_and_submit.sh`

One-shot flow: run conversations, then submit predictions.

Examples:

./scripts/knu_run_and_submit.sh --set-type mini_dev --model gpt-5.2 --mode responses

`scripts/knu_list_pairs.py`

Lists student/topic pairs for a set (one pair per line, space-separated).

Examples:

./scripts/knu_list_pairs.py --set-type dev
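
Because the output is one space-separated pair per line, it is easy to consume from another script. A minimal sketch (`parse_pairs` is a hypothetical helper, and the IDs in the usage note are made up):

```python
def parse_pairs(output: str) -> list[tuple[str, str]]:
    """Turn 'student_id topic_id' lines into (student_id, topic_id) tuples."""
    pairs = []
    for line in output.splitlines():
        parts = line.split()
        # Ignore blank or malformed lines rather than guessing at their meaning.
        if len(parts) == 2:
            pairs.append((parts[0], parts[1]))
    return pairs
```

For example, `parse_pairs("s1 t1\ns2 t2\n")` yields two tuples that can be passed on to a runner one by one.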

`scripts/knu_run_dev_parallel.sh`

Runs all dev student-topic pairs in parallel and writes logs per pair.

Examples:

PARALLEL=4 LOG_DIR=new_logs ./scripts/knu_run_dev_parallel.sh

Notes:

Uses scripts/knu_list_pairs.py to enumerate dev pairs.
Writes one JSONL log per pair to the directory given by LOG_DIR (new_logs/ in the example above).
You can set MODEL, MODE, SLEEP, or MAX_TURNS via env vars.
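
The script itself is a shell script and its internals are not reproduced here. As a rough Python illustration of the same idea, a bounded worker pool with one log file per pair could look like the sketch below. `run_pair` is a placeholder that only computes a path, and both the `<student>_<topic>.jsonl` file name and the fallback directory are assumptions.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_pair(pair):
    """Placeholder for one conversation run; returns the log path it would use."""
    student_id, topic_id = pair
    # Falling back to "new_logs" is an assumption, not the script's documented default.
    log_dir = os.environ.get("LOG_DIR", "new_logs")
    return os.path.join(log_dir, f"{student_id}_{topic_id}.jsonl")


def run_all(pairs, parallel=4):
    """Run every pair with at most `parallel` workers active at once."""
    # map() preserves input order even though the work runs concurrently.
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(run_pair, pairs))
```

Separate files per pair mean concurrent workers never interleave writes within a single JSONL file.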

`scripts/knu_run_eval_parallel.sh`

Runs all eval student-topic pairs in parallel, writes logs per pair, scores with strict student-only diagnostic scoring, and submits to /evaluate/mse.

Examples:

PARALLEL=4 LOG_DIR=eval_logs ./scripts/knu_run_eval_parallel.sh

Notes:

Uses scripts/knu_list_pairs.py to enumerate eval pairs.
Writes one JSONL log per pair to the directory given by LOG_DIR (eval_logs/ in the example above).
Scoring uses scripts/knu_score_only.py --diagnostic-only.

`scripts/knu_score_only.py`

Runs LLM scoring on existing conversations (no new API conversations) and can optionally submit to /evaluate/mse.

Examples:

./scripts/knu_score_only.py --prompt-version A
./scripts/knu_score_only.py --prompt-version B --submit-mse
./scripts/knu_score_only.py --prompt-version C --set-type mini_dev --mode responses

Notes:

Uses the most recent conversation_summary per student/topic pair from logs/conversations.jsonl.
Writes results to logs/score_only_<version>_<timestamp>.json.

`scripts/knu_score_abc.sh`

Runs A/B/C scoring back-to-back and submits each to /evaluate/mse.

Examples:

./scripts/knu_score_abc.sh --set-type mini_dev --model gpt-5.2 --mode responses

Notes:

Forwards any args to knu_score_only.py (except --prompt-version and --submit-mse).

`scripts/knu_self_report.py`

Asks each student to self-report their understanding level (1–5) and submits to /evaluate/mse.

Examples:

./scripts/knu_self_report.py --set-type mini_dev
./scripts/knu_self_report.py --set-type mini_dev --no-submit-mse

Notes:

If the student does not return a number, the script can use an LLM to map their reply to 1–5.
Disable LLM mapping with --no-llm-parse; non-numeric replies then default to 3.
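
The non-LLM path can be pictured as a small parser with a fallback. This is a sketch of one plausible rule (take the first standalone digit from 1 to 5), not necessarily the rule the script applies:

```python
import re


def parse_level(reply: str, default: int = 3) -> int:
    """Return the first standalone digit 1-5 in the reply, else the default."""
    # The lookarounds reject digits that are part of a longer number such as "10".
    match = re.search(r"(?<!\d)[1-5](?!\d)", reply)
    return int(match.group()) if match else default
```

With this rule, a reply like "I'd say 4 out of 5" maps to 4, while a reply with no usable digit falls back to 3.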

`scripts/knu_infer_truth.py`

Infers true levels for mini_dev using controlled MSE probes (multiple submissions).

Example:

./scripts/knu_infer_truth.py --set-type mini_dev

Notes:

Makes multiple /evaluate/mse calls; refuses to run on sets other than mini_dev unless --force is passed.
Writes inferred levels to logs/inferred_levels.json.
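
One way such probing can work, assuming the endpoint returns the plain mean squared error over all n pairs: submit a baseline where every prediction is `base`, then change one prediction to `probe` and compare scores. The difference satisfies n * delta = (probe - t)^2 - (base - t)^2, which can be solved for the true level t. The sketch below runs against a local stand-in for the endpoint and is not taken from the script itself.

```python
def mse(preds, truth):
    """Local stand-in for the scoring endpoint (assumed to be plain MSE)."""
    return sum((p - t) ** 2 for p, t in zip(preds, truth)) / len(truth)


def infer_levels(score, n, base=3, probe=4):
    """Recover each true level using n + 1 calls to score(preds)."""
    baseline = [base] * n
    base_mse = score(baseline)
    levels = []
    for i in range(n):
        preds = list(baseline)
        preds[i] = probe  # change exactly one prediction
        delta = score(preds) - base_mse
        # n * delta = (probe - t)^2 - (base - t)^2
        #           = (probe - base) * (probe + base - 2t)
        t = (base + probe) / 2 - n * delta / (2 * (probe - base))
        levels.append(round(t))  # levels are integers, so rounding absorbs float error
    return levels
```

Under these assumptions each pair costs one extra submission, which is why the approach is only practical on a small set such as mini_dev.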

Logs

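By default, scripts append to logs/conversations.jsonl (JSON Lines); set LOG_FILE in .env to change the path. The parallel runners instead write one file per pair under LOG_DIR. Each line is a JSON object with an event type such as start, interact, or conversation_summary.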
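
A minimal sketch of appending to and reading back a JSON Lines log. The `event` key is an assumed field name, and the helpers are illustrative rather than copied from the scripts.

```python
import json
import os


def append_event(path, event):
    """Append one JSON object as a single line, creating the directory if needed."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def read_events(path):
    """Load every non-empty line of the log as a JSON object."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Because every entry is a complete line, the log can be filtered by event type without loading any special format, for example `[e for e in read_events(path) if e.get("event") == "conversation_summary"]`.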
