Describe the bug
Compiling nvidia/Cosmos-Embed1-336p with torch_neuronx.trace for inf2 fails inside neuronx-cc with the internal error NCC_IBIR243 (access pattern out of bounds).
Environment
- torch-neuronx 2.9.0.2.13.26312
- PyTorch 2.9.1
- AMI: Deep Learning AMI Neuron PyTorch Inference vLLM 0.16 (Ubuntu 24.04) 20260511
- Target: inf2
Reproduction
import torch
import torch_neuronx
from transformers import AutoModel

# Model: nvidia/Cosmos-Embed1-336p (1.1B params, EVA-ViT-G + Q-Former)
model = AutoModel.from_pretrained("nvidia/Cosmos-Embed1-336p", trust_remote_code=True, torch_dtype=torch.float32)
model.eval()

# Wrap the model so that trace() sees a single forward() returning one tensor.
class Wrapper(torch.nn.Module):
    def __init__(self, m):
        super().__init__()
        self.m = m

    def forward(self, v):
        return self.m.get_video_embeddings(v).visual_proj

w = Wrapper(model)
w.eval()
v = torch.randn(1, 8, 3, 336, 336)

# Fails with NCC_IBIR243:
torch_neuronx.trace(w, v, compiler_args=["--target", "inf2", "--optlevel", "1"])
Tracing only the ViT encoder fails with the same error. With --optlevel 0, compilation fails with NCC_ILSA062 instead.
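To compare compiler-argument variants in one pass, a small harness along these lines may help. It is only a sketch: `sweep_compiles`, `VARIANTS`, and the `compile_fn` callable are hypothetical names introduced here, not part of torch_neuronx. The harness only records whether each variant compiled and, if it did not, the last line of the exception text.

```python
from typing import Callable, Dict, Sequence


def sweep_compiles(
    compile_fn: Callable[[Sequence[str]], object],
    variants: Dict[str, Sequence[str]],
) -> Dict[str, str]:
    """Run compile_fn once per variant; record 'ok' or the error's last line."""
    results = {}
    for label, args in variants.items():
        try:
            compile_fn(args)
            results[label] = "ok"
        except Exception as exc:  # in this report the failure is a RuntimeError
            lines = str(exc).strip().splitlines()
            results[label] = lines[-1] if lines else type(exc).__name__
    return results


# The two argument sets tried in this report (hypothetical labels).
VARIANTS = {
    "optlevel-1": ["--target", "inf2", "--optlevel", "1"],
    "optlevel-0": ["--target", "inf2", "--optlevel", "0"],
}

# Usage on a Neuron instance, with `w` and `v` from the script above:
# print(sweep_compiles(lambda a: torch_neuronx.trace(w, v, compiler_args=a), VARIANTS))
```

Passing the compile step in as a callable keeps the harness independent of the model, so the same loop can be pointed at a submodule (for example the ViT encoder alone) without changes.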
Full error log
(Load: I-66527-0), tensorizer(output tensor: float32<88 x 512> $66527[i322_0_0, i308_0_1, i289_1_1, i289_1_0_64247, i243_0_45014_64246_64247], id: 47030 output tensor: float32<88 x 65> $66528[i322_0_0, i308_0_1, i289_1_1, i289_1_0_64247, i243_0_45014_64246_64247], id: 47030) [INTERNAL_ERROR] [NCC_IBIR243] Access pattern out of bounds. Pattern: [[1024,88],[1,1024],[1,512]] - Please open a support ticket at https://github.com/aws-neuron/aws-neuron-sdk/issues/new. You may also be able to obtain more information using the 'XLA_IR_DEBUG' and 'XLA_HLO_DEBUG' environment variables.
2026-05-14T03:21:10Z Non-signal exit. Backend exited with code 1 and stderr: (Load: I-66527-0), tensorizer(output tensor: float32<88 x 512> $66527[i322_0_0, i308_0_1, i289_1_1, i289_1_0_64247, i243_0_45014_64246_64247], id: 47030 output tensor: float32<88 x 65> $66528[i322_0_0, i308_0_1, i289_1_1, i289_1_0_64247, i243_0_45014_64246_64247], id: 47030) [INTERNAL_ERROR] [NCC_IBIR243] Access pattern out of bounds. Pattern: [[1024,88],[1,1024],[1,512]] - Please open a support ticket at https://github.com/aws-neuron/aws-neuron-sdk/issues/new. You may also be able to obtain more information using the 'XLA_IR_DEBUG' and 'XLA_HLO_DEBUG' environment variables.
RuntimeError: neuronx-cc failed with 70
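The compiler message points at the XLA_IR_DEBUG and XLA_HLO_DEBUG environment variables. One way to set them for a single compile attempt is sketched below. The helper name `xla_debug_env` is hypothetical, and the value "1" is an assumption based on the usual PyTorch/XLA convention. The variables may need to be set before torch_xla or torch_neuronx is imported, in which case exporting them in the shell for a fresh process is the safer route.

```python
import os
from contextlib import contextmanager


@contextmanager
def xla_debug_env():
    """Temporarily set the debug variables named in the compiler message."""
    names = ("XLA_IR_DEBUG", "XLA_HLO_DEBUG")
    saved = {n: os.environ.get(n) for n in names}
    for n in names:
        os.environ[n] = "1"  # assumed value; "1" is the usual convention
    try:
        yield
    finally:
        # Restore the previous values so later runs are unaffected.
        for n, old in saved.items():
            if old is None:
                os.environ.pop(n, None)
            else:
                os.environ[n] = old


# Usage on a Neuron instance, with `w` and `v` from the reproduction script:
# with xla_debug_env():
#     torch_neuronx.trace(w, v, compiler_args=["--target", "inf2", "--optlevel", "1"])
```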
With --optlevel 0 (different error)
(Save: spill0_SpillSave) [INTERNAL_ERROR] [NCC_ILSA062] Internal invariant violated - Please open a support ticket at https://github.com/aws-neuron/aws-neuron-sdk/issues/new. You may also be able to obtain more information using the 'XLA_IR_DEBUG' and 'XLA_HLO_DEBUG' environment variables.
2026-05-14T03:38:18Z Non-signal exit. Backend exited with code 1 and stderr: (Save: spill0_SpillSave) [INTERNAL_ERROR] [NCC_ILSA062] Internal invariant violated
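Because each message appears twice in the logs (once directly and once in the "Backend exited" line), a short script can reduce a log to its distinct error codes when comparing runs. This is a sketch only: the function name `ncc_codes` is hypothetical, and the pattern assumes codes look like the two seen here (NCC_, uppercase letters, then digits, inside square brackets). The sample text is an abbreviated excerpt of the logs above.

```python
import re

# Assumed code shape, based only on the two codes in this report.
NCC_CODE = re.compile(r"\[(NCC_[A-Z]+\d+)\]")


def ncc_codes(log_text: str) -> list:
    """Return the distinct NCC_* codes in log_text, in order of first appearance."""
    seen = []
    for code in NCC_CODE.findall(log_text):
        if code not in seen:
            seen.append(code)
    return seen


# Abbreviated excerpts from the two failing runs in this report.
sample = (
    "[INTERNAL_ERROR] [NCC_IBIR243] Access pattern out of bounds.\n"
    "Backend exited with code 1 and stderr: [INTERNAL_ERROR] [NCC_IBIR243] Access pattern out of bounds.\n"
    "[INTERNAL_ERROR] [NCC_ILSA062] Internal invariant violated\n"
)
print(ncc_codes(sample))  # ['NCC_IBIR243', 'NCC_ILSA062']
```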
Model Name
Cosmos-embeddings
Describe the workload type
model compilation
Instance Type
inf2 (package versions and AMI are listed under Environment above)
Release version
No response
Reproduction Steps
Identical to the script under Reproduction above.
Regression Issue
Possible Solution
No response
Logs/Context/Additional Information
No response