perf(liger/cutile): tune cross_entropy BLOCK_SIZE by GPU arch & Modularize transformer HF benchmark & other updates#153

Open
hannahli-nv wants to merge 4 commits into main from tilegym_update

Conversation

@hannahli-nv hannahli-nv commented Jun 12, 2026

Collaborator

Description

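Tunes the cross_entropy BLOCK_SIZE by GPU architecture (liger/cutile), modularizes the transformer HF benchmark, and updates test_mla_decoding.py.

This PR contains 3 new commits.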

Commits included:

5322025 Update test_mla_decoding.py
f3fb7f8 Modularize transformer HF benchmark
8f360e3 perf(liger/cutile): tune cross_entropy BLOCK_SIZE by GPU arch
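
For reviewers unfamiliar with arch-dependent tuning, here is a minimal sketch of what choosing a BLOCK_SIZE by GPU architecture could look like. It is not the code from commit 8f360e3: the function names, the lookup table, and every numeric cap below are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only. The table values and names are assumptions,
# not the tuned values from this PR.

# Hypothetical upper bound on BLOCK_SIZE keyed by compute-capability major version.
_MAX_BLOCK_BY_MAJOR = {8: 4096, 9: 8192, 10: 16384}
_DEFAULT_MAX_BLOCK = 4096  # fallback for architectures missing from the table


def next_power_of_2(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n - 1).bit_length()


def pick_block_size(capability: tuple[int, int], n_cols: int) -> int:
    """Pick a block size for a row of n_cols logits on the given architecture.

    The block covers the row when possible, but is clamped to the
    per-architecture cap so larger vocabularies are processed in chunks.
    """
    major, _minor = capability
    cap = _MAX_BLOCK_BY_MAJOR.get(major, _DEFAULT_MAX_BLOCK)
    return min(next_power_of_2(n_cols), cap)
```

In a real kernel launcher the `capability` tuple would typically come from `torch.cuda.get_device_capability()`, which returns `(major, minor)`; the sketch takes it as an argument so it can run without a GPU.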

CI Configuration

config:
  build: true
  # valid options are "ops" and "benchmark"
  test: ["ops", "benchmark"]

Checklist

  • Code formatted and imports sorted via repo specifications (./format.sh)
  • Documentation updated (if needed)
  • CI configuration reviewed

copy-pr-bot Bot commented Jun 12, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: Rundong Li <davidli@nvidia.com>
@hannahli-nv changed the title from "perf(liger/cutile): tune cross_entropy BLOCK_SIZE by GPU arch" to "perf(liger/cutile): tune cross_entropy BLOCK_SIZE by GPU arch & Modularize transformer HF benchmark" on Jun 12, 2026
@hannahli-nv

Collaborator Author

/ok to test f3fb7f8

@hannahli-nv

Collaborator Author

/ok to test 66c3d8b

@hannahli-nv force-pushed the tilegym_update branch 3 times, most recently from 6566c06 to 53ba9aa, on June 15, 2026 08:28
@hannahli-nv

Collaborator Author

/ok to test 53ba9aa

@hannahli-nv

Collaborator Author

/ok to test 4fdae8c

@hannahli-nv changed the title from "perf(liger/cutile): tune cross_entropy BLOCK_SIZE by GPU arch & Modularize transformer HF benchmark" to "perf(liger/cutile): tune cross_entropy BLOCK_SIZE by GPU arch & Modularize transformer HF benchmark & other updates" on Jun 15, 2026
@hannahli-nv

Collaborator Author

/ok to test 60acd3d

4 participants