Actions: sbintuitions/flexeval
Showing runs from all workflows
686 workflow runs
reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run batch-api tests #146: Pull request #285 synchronize by junya-takayama

reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run tests #993: Pull request #285 synchronize by junya-takayama

reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run tests #992: Pull request #285 synchronize by junya-takayama

reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run batch-api tests #145: Pull request #285 synchronize by junya-takayama

reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run tests #991: Pull request #285 opened by junya-takayama

reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run batch-api tests #144: Pull request #285 opened by junya-takayama

bool / int being saved as string in outputs.jsonl
Run tests #989: Pull request #284 synchronize by junya-takayama

bool / int being saved as string in outputs.jsonl
Run tests #988: Pull request #284 synchronize by junya-takayama

bool / int being saved as string in outputs.jsonl
Run tests #987: Pull request #284 opened by junya-takayama