[Bug] NCC_IBIR243 Access pattern out of bounds when compiling EVA-ViT-G for inf2 #1326

@sec486

Describe the bug

Environment

  • torch-neuronx 2.9.0.2.13.26312
  • PyTorch 2.9.1
  • AMI: Deep Learning AMI Neuron PyTorch Inference vLLM 0.16 (Ubuntu 24.04) 20260511
  • Target: inf2

Reproduction

import torch
import torch_neuronx
from transformers import AutoModel

# Model: nvidia/Cosmos-Embed1-336p (1.1B params, EVA-ViT-G + Q-Former)
model = AutoModel.from_pretrained("nvidia/Cosmos-Embed1-336p", trust_remote_code=True, torch_dtype=torch.float32)
model.eval()

class Wrapper(torch.nn.Module):
    def __init__(self, m):
        super().__init__()
        self.m = m
    def forward(self, v):
        return self.m.get_video_embeddings(v).visual_proj

w = Wrapper(model)
w.eval()
v = torch.randn(1, 8, 3, 336, 336)

# Fails with NCC_IBIR243:
torch_neuronx.trace(w, v, compiler_args=["--target", "inf2", "--optlevel", "1"])

I also tried compiling the ViT encoder on its own (same NCC_IBIR243 error) and compiling with --optlevel 0 (fails with NCC_ILSA062 instead).
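
The attempts above can be organized into a small sweep that records which internal error code each --optlevel produces. This is a minimal sketch, not part of the original run: `extract_ncc_codes`, `sweep_optlevels`, and the `compile_fn` callback are hypothetical names. In practice `compile_fn` would wrap the `torch_neuronx.trace(w, v, compiler_args=...)` call from the reproduction. Note that in the logs here the NCC code appears in the compiler's stderr, while the RuntimeError text only says "neuronx-cc failed with 70", so the sketch falls back to the exception type name when the message carries no code.

```python
import re

# Matches compiler error codes such as NCC_IBIR243 or NCC_ILSA062.
NCC_CODE = re.compile(r"\[(NCC_[A-Z]+\d+)\]")

def extract_ncc_codes(message):
    """Return the distinct NCC error codes in a message, in order of first appearance."""
    seen = []
    for code in NCC_CODE.findall(message):
        if code not in seen:
            seen.append(code)
    return seen

def sweep_optlevels(compile_fn, optlevels=(0, 1), target="inf2"):
    """Call compile_fn(compiler_args) once per optlevel.

    Returns a dict mapping optlevel to "ok", to the NCC codes found in the
    exception message, or to the exception type name if no code is present.
    """
    results = {}
    for level in optlevels:
        args = ["--target", target, "--optlevel", str(level)]
        try:
            compile_fn(args)
            results[level] = "ok"
        except Exception as exc:  # the reproduction surfaces a RuntimeError
            results[level] = extract_ncc_codes(str(exc)) or [type(exc).__name__]
    return results
```

`extract_ncc_codes` can also be applied directly to a saved stderr log, which is where the codes show up in this report.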

Full error log

(Load: I-66527-0), tensorizer(output tensor: float32<88 x 512> $66527[i322_0_0, i308_0_1, i289_1_1, i289_1_0_64247, i243_0_45014_64246_64247], id: 47030 output tensor: float32<88 x 65> $66528[i322_0_0, i308_0_1, i289_1_1, i289_1_0_64247, i243_0_45014_64246_64247], id: 47030) [INTERNAL_ERROR] [NCC_IBIR243] Access pattern out of bounds. Pattern: [[1024,88],[1,1024],[1,512]] - Please open a support ticket at https://github.com/aws-neuron/aws-neuron-sdk/issues/new. You may also be able to obtain more information using the 'XLA_IR_DEBUG' and 'XLA_HLO_DEBUG' environment variables.

2026-05-14T03:21:10Z Non-signal exit. Backend exited with code 1 and stderr: (Load: I-66527-0), tensorizer(output tensor: float32<88 x 512> $66527[i322_0_0, i308_0_1, i289_1_1, i289_1_0_64247, i243_0_45014_64246_64247], id: 47030 output tensor: float32<88 x 65> $66528[i322_0_0, i308_0_1, i289_1_1, i289_1_0_64247, i243_0_45014_64246_64247], id: 47030) [INTERNAL_ERROR] [NCC_IBIR243] Access pattern out of bounds. Pattern: [[1024,88],[1,1024],[1,512]] - Please open a support ticket at https://github.com/aws-neuron/aws-neuron-sdk/issues/new. You may also be able to obtain more information using the 'XLA_IR_DEBUG' and 'XLA_HLO_DEBUG' environment variables.

RuntimeError: neuronx-cc failed with 70
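
For triage, the reported pattern can be checked by hand against the 88 x 512 tensor named in the same message. The sketch below assumes each pair in the pattern is [stride, count] in elements and that the pattern indexes the float32<88 x 512> tensor from offset 0. Both are guesses about the compiler's notation, not documented semantics. Under that reading, the largest offset touched is roughly twice the tensor's element count, which would be consistent with the out-of-bounds message.

```python
def max_offset(pattern):
    """Largest element offset reached, assuming each entry is [stride, count]."""
    return sum(stride * (count - 1) for stride, count in pattern)

pattern = [[1024, 88], [1, 1024], [1, 512]]  # copied from the NCC_IBIR243 message
num_elements = 88 * 512                      # float32<88 x 512> from the same message

# Compare the furthest offset against the number of elements in the tensor.
print(max_offset(pattern), num_elements)
print("in bounds" if max_offset(pattern) < num_elements else "out of bounds")
```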

With --optlevel 0 (different error)

(Save: spill0_SpillSave) [INTERNAL_ERROR] [NCC_ILSA062] Internal invariant violated - Please open a support ticket at https://github.com/aws-neuron/aws-neuron-sdk/issues/new. You may also be able to obtain more information using the 'XLA_IR_DEBUG' and 'XLA_HLO_DEBUG' environment variables.

2026-05-14T03:38:18Z Non-signal exit. Backend exited with code 1 and stderr: (Save: spill0_SpillSave) [INTERNAL_ERROR] [NCC_ILSA062] Internal invariant violated
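
Both error messages point at the XLA_IR_DEBUG and XLA_HLO_DEBUG environment variables. A minimal sketch of enabling them before re-running the trace follows. The value "1" and the helper name `enable_xla_debug` are assumptions, and the variables presumably need to be set before the model is traced.

```python
import os

DEBUG_VARS = ("XLA_IR_DEBUG", "XLA_HLO_DEBUG")  # names taken from the error message

def enable_xla_debug(env=None):
    """Set both debug variables to "1" (assumed value) in env, defaulting to os.environ."""
    env = os.environ if env is None else env
    for name in DEBUG_VARS:
        env[name] = "1"
    return env

enable_xla_debug()
# Then re-run the failing call from the reproduction, e.g.:
# torch_neuronx.trace(w, v, compiler_args=["--target", "inf2", "--optlevel", "1"])
```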

Model Name

nvidia/Cosmos-Embed1-336p (Cosmos embeddings)

Describe the workload type

model compilation

Instance Type

Target: inf2 (software versions as listed under Environment above)
