[None][feat] EXAONE-4.5 Support#12873
Conversation
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
📝 Walkthrough

The PR adds EXAONE-4.5 multimodal VLM support with new model implementations and weight mappers. Vision models are refactored to derive dtype/device directly from tensors instead of transformers utilities. Qwen vision models undergo significant RoPE computation and positional-embedding pipeline updates. Model loader APIs are updated for kosmos-2, and the test infrastructure gains skip mechanisms for conditional test execution.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client
    participant InputProc as Input Processor
    participant VisionEnc as Vision Encoder
    participant TextEmbed as Text Embedding
    participant LLM as Language Model
    participant Output
    Client->>InputProc: Text + Images
    InputProc->>InputProc: Preprocess text & images
    InputProc->>InputProc: Fuse multimodal placeholders
    InputProc->>VisionEnc: pixel_values + grid_thw
    VisionEnc->>VisionEnc: Compute windowed RoPE (cos, sin)
    VisionEnc->>VisionEnc: Apply vision attention with position_ids
    VisionEnc-->>InputProc: Vision embeddings
    InputProc->>TextEmbed: Fused input_ids + multimodal_data
    TextEmbed->>TextEmbed: Embed text tokens
    TextEmbed->>LLM: Fused embeddings (text + vision)
    LLM->>LLM: Causal language modeling
    LLM-->>Output: Logits / Tokens
```
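The "Compute windowed RoPE (cos, sin)" step in the diagram builds per-position cosine/sine tables from inverse frequencies. A minimal standard-library sketch of that idea (the `rope_cos_sin` helper and its signature are illustrative, not the TensorRT-LLM API):

```python
import math

def rope_cos_sin(positions, head_dim, theta=10000.0):
    # Hypothetical helper: one (cos, sin) row per position,
    # one column per rotary frequency pair (head_dim // 2 of them).
    inv_freq = [theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]
    cos = [[math.cos(p * f) for f in inv_freq] for p in positions]
    sin = [[math.sin(p * f) for f in inv_freq] for p in positions]
    return cos, sin

cos, sin = rope_cos_sin(range(4), head_dim=8)
```

In real vision encoders the positions come from the 2D patch grid (and window indices), but the cos/sin construction follows the same pattern.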
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 7
🧹 Nitpick comments (2)
tensorrt_llm/_torch/models/checkpoints/hf/exaone4_5_weight_mapper.py (1)
19-19: Missing return type annotation. The `preprocess_weights` method is missing a return type hint. Per coding guidelines, functions should be annotated with type hints.

✏️ Proposed fix

```diff
- def preprocess_weights(self, weights: dict):
+ def preprocess_weights(self, weights: dict) -> dict:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/_torch/models/checkpoints/hf/exaone4_5_weight_mapper.py` at line 19, the method `preprocess_weights(self, weights: dict)` is missing a return type annotation; update its signature to include an explicit return type such as `-> Dict[str, Any]` (or `Mapping[str, Any]` if preferred) and ensure you import the corresponding typing symbols (`Dict` and `Any`, or `Mapping`) at the top of the module so the signature reads e.g. `def preprocess_weights(self, weights: Dict[str, Any]) -> Dict[str, Any]:` while keeping the existing behavior in the `preprocess_weights` implementation.

tests/unittest/_torch/modeling/test_modeling_exaone4_5.py (1)
183-186: Hardcoded local path for test weights. The test config contains a hardcoded developer-specific path (`/code/yechan-models/exaone45_beta_2026-03-19_bf16`). While the `skip_test` property handles missing paths gracefully, consider using an environment variable (e.g., `EXAONE_4_5_MODEL_PATH`) for configurability:

♻️ Suggested improvement

```diff
+import os
+
+_EXAONE_4_5_DEFAULT_PATH = "/code/yechan-models/exaone45_beta_2026-03-19_bf16"
+
 EXAONE_4_5_TEST_CONFIG = {
     # ... other config ...
-    "_name_or_path": str(
-        os.path.join("/code/yechan-models", "exaone45_beta_2026-03-19_bf16")
-    ),  # str(os.path.join(llm_models_root(), "Qwen2.5-VL-7B-Instruct"))
+    "_name_or_path": os.environ.get("EXAONE_4_5_MODEL_PATH", _EXAONE_4_5_DEFAULT_PATH),
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unittest/_torch/modeling/test_modeling_exaone4_5.py` around lines 183 - 186, Replace the hardcoded developer path in the test config for "_name_or_path" with an environment-configurable value: read os.environ.get("EXAONE_4_5_MODEL_PATH") and fall back to the existing os.path.join("/code/yechan-models", "exaone45_beta_2026-03-19_bf16") if the env var is not set; keep the str(...) cast and preserve the existing skip_test behavior that already handles missing paths. Update the assignment where "_name_or_path" is set in tests/unittest/_torch/modeling/test_modeling_exaone4_5.py to use this env var fallback.
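The env-var fallback suggested here pairs naturally with the "skip mechanisms for conditional test execution" mentioned in the walkthrough. A self-contained sketch of how the two could combine (the `EXAONE_4_5_MODEL_PATH` name follows the suggestion above; `skip_test` and the test class are hypothetical, not the repository's actual test code):

```python
import os
import unittest

_EXAONE_4_5_DEFAULT_PATH = "/code/yechan-models/exaone45_beta_2026-03-19_bf16"
# Env var takes precedence; the developer path is only a fallback.
MODEL_PATH = os.environ.get("EXAONE_4_5_MODEL_PATH", _EXAONE_4_5_DEFAULT_PATH)

def skip_test() -> bool:
    # Skip when the checkpoint directory is not present on this machine.
    return not os.path.isdir(MODEL_PATH)

@unittest.skipIf(skip_test(), f"EXAONE-4.5 weights not found at {MODEL_PATH}")
class TestExaone45(unittest.TestCase):
    def test_weights_dir_exists(self):
        self.assertTrue(os.path.isdir(MODEL_PATH))
```

This keeps CI green on machines without the weights while letting developers point the test at a local checkout.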
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a4fc1689-50d7-46db-9958-52340c2e4a19
📒 Files selected for processing (18)
- examples/models/core/exaone/README.md
- requirements.txt
- tensorrt_llm/_torch/models/__init__.py
- tensorrt_llm/_torch/models/checkpoints/hf/exaone4_5_weight_mapper.py
- tensorrt_llm/_torch/models/modeling_clip.py
- tensorrt_llm/_torch/models/modeling_exaone4_5.py
- tensorrt_llm/_torch/models/modeling_exaone_moe.py
- tensorrt_llm/_torch/models/modeling_llama.py
- tensorrt_llm/_torch/models/modeling_qwen2vl.py
- tensorrt_llm/_torch/models/modeling_qwen3vl.py
- tensorrt_llm/_torch/models/modeling_siglip.py
- tensorrt_llm/_torch/visual_gen/models/wan/transformer_wan.py
- tensorrt_llm/models/gpt/convert.py
- tensorrt_llm/serve/chat_utils.py
- tensorrt_llm/serve/openai_server.py
- tensorrt_llm/tools/multimodal_builder.py
- tests/unittest/_torch/modeling/test_modeling_exaone4_5.py
- tests/unittest/_torch/modeling/test_modeling_multimodal.py
```bash
export HF_MODEL_DIR=hf_models/exaone4_5
git clone https://huggingface.co/LGAI-EXAONE/<TODO: FILL> $HF_MODEL_DIR
```
Unresolved TODO placeholders in documentation.
The EXAONE-4.5 section contains placeholders that need to be filled:
- Line 104: `<TODO: FILL>` in the git clone URL
- Lines 143-145: `TODO: FILL` for expected output
Please update these with the actual HuggingFace model repository name and expected output before merging.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/models/core/exaone/README.md` around lines 102 - 106, Replace the
unresolved TODOs in the EXAONE-4.5 README: set the actual HuggingFace repo name
in the git clone command (replace `<TODO: FILL>` in the git clone URL that
follows HF_MODEL_DIR) and update the example's expected output section (the
lines labeled `TODO: FILL` around the expected output) with the real output
produced by the model so the README contains the correct repository path and
sample result.
```python
multimodal_params = kwargs.get("multimodal_params", [])
mm_embeds = []

if len(multimodal_params) > 0:
    if not _is_disagg():
        mm_embeds = get_multimodal_embeddings(
            encoder_forward_fn=self.mm_encoder.forward,
            multimodal_params=multimodal_params,
        )
    else:
        raise NotImplementedError(
            "Exaone4.5-VL does not support disaggregated inference yet. "
            "Unset TLLM_MULTIMODAL_DISAGGREGATED or set it to '0'."
        )
    mm_embeds = find_input_mm_embeds(mm_embeds, multimodal_params)
```
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Check if Exaone4_5InputProcessor populates multimodal_params for text-only prompts
rg -n "multimodal_data" tensorrt_llm/_torch/models/modeling_exaone4_5.py -A 3 -B 3
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 1333

🏁 Script executed:

```bash
# First, let's see the full Exaone4_5InputProcessor implementation to understand multimodal_params creation
fd "modeling_exaone4_5.py" | xargs wc -l
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 193

🏁 Script executed:

```bash
# Get the complete InputProcessor class and how it creates multimodal_params
cat -n tensorrt_llm/_torch/models/modeling_exaone4_5.py | sed -n '60,115p'
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 2460

🏁 Script executed:

```bash
# Check how forward method at lines 175-189 uses multimodal_params
cat -n tensorrt_llm/_torch/models/modeling_exaone4_5.py | sed -n '170,195p'
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 1268

🏁 Script executed:

```bash
# Look for the get_multimodal_embeddings function to understand what it does with multimodal_params
rg -n "def get_multimodal_embeddings" tensorrt_llm/_torch/ -A 10
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 1115

🏁 Script executed:

```bash
# Compare with Qwen2VL's _get_requests_with_mm_data filtering mentioned in the review
rg -n "_get_requests_with_mm_data|get_multimodal_embeddings" tensorrt_llm/_torch/models/ -B 2 -A 5
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 16857

🏁 Script executed:

```bash
# Check Exaone4_5ForCausalLM class definition and inheritance
cat -n tensorrt_llm/_torch/models/modeling_exaone4_5.py | sed -n '130,170p'
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 1907

🏁 Script executed:

```bash
# Check if Exaone4_5ForCausalLM or its parent has _get_requests_with_mm_data
rg -n "_get_requests_with_mm_data" tensorrt_llm/_torch/models/modeling_exaone4_5.py
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 45

🏁 Script executed:

```bash
# Check Qwen2VLModelBase to see if it has _get_requests_with_mm_data
rg -n "class Qwen2VLModelBase|def _get_requests_with_mm_data" tensorrt_llm/_torch/models/modeling_qwen2vl.py -A 1
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 221

🏁 Script executed:

```bash
# Check the full class declaration for Exaone4_5ForCausalLM
rg -n "class Exaone4_5ForCausalLM" tensorrt_llm/_torch/models/modeling_exaone4_5.py -A 5
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 45

🏁 Script executed:

```bash
# Get the full _get_requests_with_mm_data method from Qwen2VLModelBase
cat -n tensorrt_llm/_torch/models/modeling_qwen2vl.py | sed -n '1163,1180p'
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 1140

🏁 Script executed:

```bash
# Search for the Exaone4_5ForCausalLM class definition in the file
rg -n "class Exaone4_5" tensorrt_llm/_torch/models/modeling_exaone4_5.py
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 334

🏁 Script executed:

```bash
# Get full context around line 120 to see the class definition
cat -n tensorrt_llm/_torch/models/modeling_exaone4_5.py | sed -n '115,135p'
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 922

🏁 Script executed:

```bash
# Verify that the forward method at lines 166-189 is in Exaone4_5_VLModel
cat -n tensorrt_llm/_torch/models/modeling_exaone4_5.py | sed -n '124,200p'
```

Repository: NVIDIA/TensorRT-LLM. Length of output: 3413
Add filtering of multimodal_params to match Qwen2VLModelBase behavior.
Exaone4_5_VLModel inherits from Qwen2VLModelBase but doesn't use the parent's _get_requests_with_mm_data() method to filter multimodal_params. The parent implementation and Qwen3VLModel both filter to only process entries with actual multimodal data (image/video), because Qwen-VL models include mrope_config entries even for text-only prompts. Without filtering, empty entries are unnecessarily passed to the encoder. Consider calling self._get_requests_with_mm_data(multimodal_params) before processing, similar to the parent class.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tensorrt_llm/_torch/models/modeling_exaone4_5.py` around lines 175 - 189,
Exaone4_5_VLModel is processing all multimodal_params including text-only
entries; call the parent filter to only keep requests with actual multimodal
data before calling the encoder. Modify the block handling multimodal_params to
first call self._get_requests_with_mm_data(multimodal_params) (or assign its
return to a local filtered list) and use that filtered list when deciding to
call get_multimodal_embeddings, then pass the filtered mm_embeds into
find_input_mm_embeds; ensure you still raise NotImplementedError for
disaggregated mode in the same place.
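A sketch of the suggested filtering, using plain dicts as stand-ins for the real `MultimodalParams` objects (the `multimodal_data` key and the free-function form of the helper are assumptions based on the parent-class behavior described above, not the actual TensorRT-LLM API):

```python
def get_requests_with_mm_data(multimodal_params):
    # Keep only requests that actually carry image/video payloads;
    # text-only entries (e.g. those holding just mrope_config) are dropped
    # before the vision encoder is ever invoked.
    return [
        p for p in multimodal_params
        if p.get("multimodal_data")
    ]

params = [
    {"multimodal_data": {"image": ["<pixel tensor>"]}},
    {"mrope_config": {"positions": [0, 1, 2]}},  # text-only request
]
filtered = get_requests_with_mm_data(params)
```

With this filter in place, the encoder call and `find_input_mm_embeds` would both operate on `filtered` rather than the raw list.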
```python
def load_weights(self, weights, weight_mapper: BaseWeightMapper):
    assert isinstance(weight_mapper, Exaone4_5HfWeightMapper)
    weights = weight_mapper.preprocess_weights(weights)
```
Avoid assert for runtime type validation in production code.
Using assert for type checking can be bypassed when Python is run with -O (optimizations). Use an explicit check with raise TypeError instead.
🛡️ Proposed fix

```diff
 def load_weights(self, weights, weight_mapper: BaseWeightMapper):
-    assert isinstance(weight_mapper, Exaone4_5HfWeightMapper)
+    if not isinstance(weight_mapper, Exaone4_5HfWeightMapper):
+        raise TypeError(
+            f"Expected Exaone4_5HfWeightMapper, got {type(weight_mapper).__name__}"
+        )
     weights = weight_mapper.preprocess_weights(weights)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tensorrt_llm/_torch/models/modeling_exaone4_5.py` around lines 235 - 237,
Replace the runtime type check in load_weights to avoid using assert: explicitly
verify that weight_mapper is an instance of Exaone4_5HfWeightMapper and if not
raise a TypeError with a clear message (e.g., indicating expected
Exaone4_5HfWeightMapper but got type(weight_mapper)); then proceed to call
weight_mapper.preprocess_weights(weights). Ensure you reference the load_weights
method and the Exaone4_5HfWeightMapper/BaseWeightMapper types when making the
change.
```python
def prepare_attn_metadata(self, batch_size: int, seq_lens: List[int],
                          attn_metadata: AttentionMetadata):
    batch_size = len(seq_lens)
```
Unused batch_size parameter is immediately shadowed.
Same issue as in modeling_qwen3vl.py: the batch_size parameter is passed but immediately overwritten. This should be consistent with the fix applied to qwen3vl.
🧹 Proposed fix

```diff
-    def prepare_attn_metadata(self, batch_size: int, seq_lens: List[int],
+    def prepare_attn_metadata(self, seq_lens: List[int],
                               attn_metadata: AttentionMetadata):
         batch_size = len(seq_lens)
```

Update call sites at lines 976-980:

```diff
-        full_attn_metadata = self.prepare_attn_metadata(len(grid_rows),
-                                                        seq_lens,
+        full_attn_metadata = self.prepare_attn_metadata(seq_lens,
                                                         self.full_attn_metadata)
-        window_attn_metadata = self.prepare_attn_metadata(
-            len(grid_rows), window_seq_lens, self.window_attn_metadata)
+        window_attn_metadata = self.prepare_attn_metadata(window_seq_lens,
+                                                          self.window_attn_metadata)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tensorrt_llm/_torch/models/modeling_qwen2vl.py` around lines 914 - 916, The
prepare_attn_metadata method in modeling_qwen2vl.py shadows the incoming
batch_size parameter by reassigning it to len(seq_lens); remove that
reassignment so the passed batch_size is used, mirroring the fix from
modeling_qwen3vl.py; then ensure any callers that currently rely on the old
behavior (e.g., the call sites referenced around lines 976-980) are updated to
pass the correct batch_size value (or compute len(seq_lens) before calling) so
prepare_attn_metadata(batch_size: int, seq_lens: List[int], attn_metadata:
AttentionMetadata) uses its batch_size argument as intended.
```python
def prepare_attn_metadata(
    self, batch_size: int, seq_lens: List[int], attn_metadata: AttentionMetadata
):
    batch_size = len(seq_lens)
```
Unused batch_size parameter is immediately shadowed.
The batch_size parameter is passed to prepare_attn_metadata but immediately overwritten by batch_size = len(seq_lens) on line 771. Either remove the parameter or remove the reassignment.
🧹 Proposed fix (remove parameter)

```diff
     def prepare_attn_metadata(
-        self, batch_size: int, seq_lens: List[int], attn_metadata: AttentionMetadata
+        self, seq_lens: List[int], attn_metadata: AttentionMetadata
     ):
         batch_size = len(seq_lens)
```

Then update the call site at line 789:

```diff
-        attn_metadata = self.prepare_attn_metadata(len(grid_thw), seq_lens, self.attn_metadata)
+        attn_metadata = self.prepare_attn_metadata(seq_lens, self.attn_metadata)
```
+ attn_metadata = self.prepare_attn_metadata(seq_lens, self.attn_metadata)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def prepare_attn_metadata( | |
| self, batch_size: int, seq_lens: List[int], attn_metadata: AttentionMetadata | |
| ): | |
| batch_size = len(seq_lens) | |
| def prepare_attn_metadata( | |
| self, seq_lens: List[int], attn_metadata: AttentionMetadata | |
| ): | |
| batch_size = len(seq_lens) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tensorrt_llm/_torch/models/modeling_qwen3vl.py` around lines 768 - 771, The
prepare_attn_metadata function currently shadows its batch_size parameter with
batch_size = len(seq_lens); remove the parameter from prepare_attn_metadata's
signature and update every call site that passes batch_size (e.g., where
prepare_attn_metadata(...) is invoked) to stop supplying that argument, or
alternatively keep the parameter and delete the reassignment so the passed
batch_size is used; modify the function signature and all references
consistently (function name: prepare_attn_metadata, local variable: seq_lens) so
there is no shadowing or dead parameter.
```python
mm_data_tracker = MultimodalDataTracker(
    type(model_config).model_type, multimodal_server_config)
```
Pipeline failure: formatting enforcement.
The CI indicates a pre-commit formatting failure on this file. Please run yapf or the pre-commit hooks to fix line wrapping around the MultimodalDataTracker initialization and function calls.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tensorrt_llm/serve/chat_utils.py` around lines 289 - 290, The line
initializing MultimodalDataTracker is misformatted causing pre-commit/yapf
failures; reformat the statement that constructs
MultimodalDataTracker(type(model_config).model_type, multimodal_server_config)
to satisfy the project's formatter (wrap arguments across lines or adjust
spacing consistent with other calls), then run the project's pre-commit hooks or
`yapf` to enforce line wrapping for this and any adjacent function calls; ensure
you update the call sites referencing MultimodalDataTracker, model_config, and
multimodal_server_config so they conform to the repository's line-length and
formatting rules.
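For illustration, one formatter-friendly wrapping of that constructor call, with stub classes standing in for the real tensorrt_llm types (everything except the wrapping style is a placeholder):

```python
class MultimodalDataTracker:
    # Stub for the real tensorrt_llm.serve class (illustration only).
    def __init__(self, model_type, multimodal_server_config):
        self.model_type = model_type
        self.multimodal_server_config = multimodal_server_config

class _ModelConfig:
    model_type = "exaone4_5"

model_config = _ModelConfig()
multimodal_server_config = None

# One argument per line under the opening paren stays within typical
# line-length limits and matches yapf's default continuation style.
mm_data_tracker = MultimodalDataTracker(
    type(model_config).model_type,
    multimodal_server_config,
)
```

Running the repository's pre-commit hooks is still the authoritative way to settle the exact wrapping.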
```diff
 model_type = model_config.model_type
 registry_format = MULTIMODAL_PLACEHOLDER_REGISTRY.get_content_format(
-    model_type)
+    type(model_config).model_type)
```
Unused variable model_type after refactor.
Line 305 assigns model_type = model_config.model_type, but subsequent code (lines 307, 336, 339, 343) now uses type(model_config).model_type instead. This leaves model_type as dead code.
🧹 Proposed fix to remove unused variable

```diff
- model_type = model_config.model_type
  registry_format = MULTIMODAL_PLACEHOLDER_REGISTRY.get_content_format(
      type(model_config).model_type)
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| model_type = model_config.model_type | |
| registry_format = MULTIMODAL_PLACEHOLDER_REGISTRY.get_content_format( | |
| model_type) | |
| type(model_config).model_type) | |
| registry_format = MULTIMODAL_PLACEHOLDER_REGISTRY.get_content_format( | |
| type(model_config).model_type) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tensorrt_llm/serve/chat_utils.py` around lines 305 - 307, Remove the dead
local assignment "model_type = model_config.model_type" since subsequent calls
(e.g.,
MULTIMODAL_PLACEHOLDER_REGISTRY.get_content_format(type(model_config).model_type))
use type(model_config).model_type instead; delete the unused "model_type"
variable or alternatively replace other uses to reference the local variable
consistently, ensuring references around model_config and
MULTIMODAL_PLACEHOLDER_REGISTRY.get_content_format remain correct.
/bot run

PR_Github #42598 [ run ] triggered by Bot. Commit:

PR_Github #42598 [ run ] completed with state
This PR adds Day-0 support for LG AI's new VLM, EXAONE-4.5.
It includes both text and multimodal support.
Prerequisite
Sample command
Summary by CodeRabbit
Release Notes
New Features
Documentation
Updates