johayang-amd changed the title from "Add flydsl2flydsl task type with FlyDSL kernel test examples" to "feat(flydsl2flydsl): add flydsl2flydsl task type with FlyDSL kernel test examples" on May 7, 2026
Two medium follow-ups worth addressing before merge:
The compile_command for the new FlyDSL tasks currently only does an AST parse of kernel.py. That catches syntax errors, but it does not validate that FlyDSL can be imported, that the builder/JIT path is valid, or that the generated module can be constructed. Could we strengthen this to at least import the kernel and instantiate/build a small representative module, if runtime cost is acceptable?
The PR introduces a new FlyDSL task family, but I do not see an install/dependency story for FlyDSL in requirements.txt, docs, or task setup. If FlyDSL is expected to be preinstalled in the benchmark image, please document that assumption; otherwise please add the required dependency/setup instructions so these tasks are reproducible in a clean environment.
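To make the first follow-up concrete, here is a minimal sketch of a check that goes one step beyond an AST parse by actually importing the kernel file. It uses only the Python standard library. The function name `check_kernel` and the `build_module` hook are hypothetical, introduced only for illustration; the real FlyDSL build entry point may look different.

```python
import ast
import importlib.util
from pathlib import Path


def check_kernel(path, build_attr="build_module"):
    """Parse, import, and optionally build a kernel file.

    `build_attr` names a hypothetical zero-argument builder function;
    it is an assumption, not part of any documented FlyDSL API.
    """
    source = Path(path).read_text()

    # Stage 1: syntax only. This is all an AST parse can catch.
    ast.parse(source, filename=str(path))

    # Stage 2: execute the module so that missing dependencies
    # (for example an uninstalled DSL package) fail here, not at run time.
    spec = importlib.util.spec_from_file_location("kernel_under_test", path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Stage 3: if the module exposes the (hypothetical) builder, call it
    # so that module construction is exercised as well.
    builder = getattr(module, build_attr, None)
    if callable(builder):
        return builder()
    return None
```

A `compile_command` could then invoke something like `python -c "from check import check_kernel; check_kernel('kernel.py')"`, assuming the sketch is saved as `check.py`. A file that only contains `import some_missing_package` passes stage 1 but fails stage 2, which is the gap described above.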
Add flydsl2flydsl task type with FlyDSL kernel test examples
Summary
Adds a new flydsl2flydsl task type for benchmarking FlyDSL (FlyDSL Python DSL) kernel optimization on AMD GPUs.
Framework changes:
- src/prompt_builder.py: route flydsl2flydsl to the new task type persona
- src/prompts/task_type.py: add FlyDSL optimization specialist persona
- src/prompts/cheatsheet/default_cheatsheet.yaml: register FlyDSL cheatsheet
- src/prompts/cheatsheet/flydsl_cheatsheet.md: FlyDSL optimization guide for agent prompt context
4 initial kernel tasks (sourced from FlyDSL):
- rmsnorm_kernel: RMSNorm with float32 accumulation
- layernorm_kernel: LayerNorm with mean/variance reduction
- softmax_kernel: Numerically stable softmax using exp2
- fused_rope_cache_kernel: Fused Q/K RoPE rotation + KV cache write
Each task includes kernel.py, test_kernel_harness.py (supports --correctness / --full-benchmark), and config.yaml. Harnesses write build/performance_report.json for evaluator compatibility.
Scope
This PR is strictly additive. All existing task types are unaffected — no existing configs, evaluator, scorer, or agent templates are modified.
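For readers unfamiliar with the harness contract described in the summary, the following is a rough sketch of what a harness with `--correctness` and `--full-benchmark` modes that writes `build/performance_report.json` might look like. Everything here is an assumption made for illustration: the function names, the placeholder "kernels" (plain Python, not FlyDSL), the choice to write the report in both modes, and the report fields (`mode`, `passed`, `mean_ms`). The actual harnesses and the schema the evaluator expects may differ.

```python
import argparse
import json
import time
from pathlib import Path


def reference(xs):
    """Stand-in reference implementation (placeholder, not a FlyDSL kernel)."""
    return [x * 2.0 for x in xs]


def candidate(xs):
    """Stand-in for the kernel under test (placeholder)."""
    return [x + x for x in xs]


def run_harness(argv, out_dir="build"):
    parser = argparse.ArgumentParser(description="kernel test harness sketch")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--correctness", action="store_true")
    mode.add_argument("--full-benchmark", action="store_true")
    args = parser.parse_args(argv)

    # Correctness is checked in both modes against the reference output.
    data = [float(i) for i in range(1024)]
    passed = candidate(data) == reference(data)

    # Hypothetical report fields; the real schema is set by the evaluator.
    report = {
        "mode": "full-benchmark" if args.full_benchmark else "correctness",
        "passed": passed,
    }

    if args.full_benchmark:
        # Average wall-clock time over a fixed number of iterations.
        iters = 20
        start = time.perf_counter()
        for _ in range(iters):
            candidate(data)
        report["mean_ms"] = (time.perf_counter() - start) * 1000.0 / iters

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "performance_report.json").write_text(json.dumps(report, indent=2))
    return report
```

Calling `run_harness(["--correctness"])` writes the report under `build/` and returns the same dictionary, which keeps the sketch easy to exercise from a test without spawning a subprocess.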
Testing
task_validator