
Conversation

@JohnQinAMD (Collaborator)

update llama and grok training config

Copilot AI review requested due to automatic review settings on January 6, 2026, 06:28.
@Xiaoming-AMD merged commit 639b793 into main on Jan 6, 2026. 9 checks passed.
Copilot AI (Contributor) left a comment

Pull request overview

This PR updates training configuration parameters for LLaMA 3.1 405B and Grok1 models on MI355X hardware. The changes adjust parallelism and batch size settings that affect training behavior and resource utilization.

  • Reduces tensor parallelism degree for LLaMA 3.1 405B from 8 to 1
  • Increases global batch size for Grok1 (both BF16 and FP8 variants) from 128 to 512

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File / Description:
  • examples/torchtitan/configs/MI355X/llama3.1_405B-pretrain.yaml: Reduces tensor_parallel_degree from 8 to 1, aligning with other large model configs
  • examples/megatron/configs/MI355X/grok1-FP8-pretrain.yaml: Increases global_batch_size from 128 to 512, quadrupling gradient accumulation steps
  • examples/megatron/configs/MI355X/grok1-BF16-pretrain.yaml: Increases global_batch_size from 128 to 512, quadrupling gradient accumulation steps
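As a rough illustration, the edits would look something like the sketch below in the YAML configs. Only the tensor_parallel_degree and global_batch_size values are stated in this PR; the surrounding section names and other keys are assumptions for illustration.

```yaml
# Sketch of examples/torchtitan/configs/MI355X/llama3.1_405B-pretrain.yaml
# ("parallelism" section name is assumed; only the value change is confirmed by this PR)
parallelism:
  tensor_parallel_degree: 1   # previously 8

# Sketch of examples/megatron/configs/MI355X/grok1-BF16-pretrain.yaml and
# grok1-FP8-pretrain.yaml ("training" section and micro_batch_size are assumed)
training:
  micro_batch_size: 1         # assumed unchanged
  global_batch_size: 512      # previously 128
```

Since global_batch_size = micro_batch_size × data_parallel_size × gradient_accumulation_steps, raising the global batch size 4x while the micro batch size and data-parallel layout stay fixed implies 4x as many gradient accumulation steps per optimizer update, which is why the review notes that accumulation steps quadruple.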
