
feat: Support llm-d manual runs #16

Open
albertoperdomo2 wants to merge 3 commits into openshift-psap:main from albertoperdomo2:feat/support-llm-d-manual-runs

Conversation

albertoperdomo2 commented Mar 11, 2026

Add support for llm-d manual runs with the necessary new fields.

Summary by CodeRabbit

  • New Features
    • Added LLM‑D mode with support for deployment metadata (DP, EP, replicas, prefill pod count, decode pod count, router configuration, notes).
    • New CLI flags to enable LLM‑D and supply the deployment metadata; mode-specific flag validation updated.
    • CSV export updated to include the new deployment metadata columns when LLM‑D is active (columns present even if values are empty).

Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
albertoperdomo2 (Author) commented:

cc: @Harshith-umesh @MML-coder


coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

Adds LLM-D deployment mode to the JSON import script, introducing deployment metadata fields (DP, EP, replicas, prefill_pod_count, decode_pod_count, router_config, notes), CLI flags and validation, propagated parameters through parsing/processing, and conditional CSV headers/rows including the new columns.
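The flag surface described above can be sketched with argparse. This is a hypothetical minimal sketch: the flag names follow the walkthrough, but the defaults, types, and exact mode-specific validation rule are assumptions, not the script's actual code.

```python
import argparse

# Hypothetical sketch of the LLM-D CLI flags; not the script's actual code.
parser = argparse.ArgumentParser(description="Import manual guidellm runs into a CSV")
parser.add_argument("--llm-d", action="store_true", help="enable LLM-D deployment mode")
parser.add_argument("--dp", type=int, help="data parallelism size (LLM-D mode)")
parser.add_argument("--ep", type=int, help="expert parallelism size (LLM-D mode)")
parser.add_argument("--replicas", type=int, help="number of replicas (LLM-D mode)")
parser.add_argument("--prefill-pod-count", type=int, help="prefill pod count (LLM-D mode)")
parser.add_argument("--decode-pod-count", type=int, help="decode pod count (LLM-D mode)")
parser.add_argument("--router-config", help="router configuration (LLM-D mode)")
parser.add_argument("--notes", help="freeform notes stored with each row")

args = parser.parse_args(
    ["--llm-d", "--dp", "2", "--ep", "8", "--replicas", "4", "--router-config", "round-robin"]
)

# Mode-specific validation: require the deployment metadata when --llm-d is set.
if args.llm_d and None in (args.dp, args.ep, args.replicas, args.router_config):
    parser.error("--llm-d requires --dp, --ep, --replicas, and --router-config")
```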

Changes

LLM-D Mode & CLI (manual_runs/scripts/import_manual_runs_json_v2.py)
  Adds --llm-d flag and new CLI flags --dp, --ep, --replicas, --prefill-pod-count, --decode-pod-count, --router-config, --notes. Enforces mode-specific required flags and updates help/validation logic.

Parsing & Processing Signatures (manual_runs/scripts/import_manual_runs_json_v2.py)
  Extends parse_guidellm_json(...) and process_benchmark_section(...) signatures to accept dp, ep, replicas, prefill_pod_count, decode_pod_count, router_config, notes and propagates them through processing paths and row construction.

CSV Output & Row Composition (manual_runs/scripts/import_manual_runs_json_v2.py)
  Conditional CSV header/row generation: when LLM-D is active, includes DP, EP, replicas, prefill_pod_count, decode_pod_count, router_config, notes columns; ensures the new columns exist in the output regardless of data presence and includes the fields in per-benchmark rows.

Other (manifest file analyzer / diff metadata)
  Bulk change summary: +196 / -50 lines changed per manifest analyzer.
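The conditional header behavior can be illustrated with a self-contained csv.DictWriter sketch; the LLM-D column names mirror the summary above, but the base columns and their ordering are invented for illustration.

```python
import csv
import io

# Invented base columns for illustration; the real script has more.
BASE_FIELDS = ["model_name", "accelerator", "tp_size", "requests_per_second"]
LLM_D_FIELDS = ["DP", "EP", "replicas", "prefill_pod_count",
                "decode_pod_count", "router_config", "notes"]

def build_fieldnames(llm_d: bool) -> list:
    """Return the CSV columns for the current mode."""
    return BASE_FIELDS + LLM_D_FIELDS if llm_d else list(BASE_FIELDS)

# restval="" keeps the LLM-D columns present even when a row has no value.
row = {"model_name": "my-model", "accelerator": "H200",
       "tp_size": 1, "requests_per_second": 10.5, "DP": 2}
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=build_fieldnames(llm_d=True), restval="")
writer.writeheader()
writer.writerow(row)
```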

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐇 I hopped into parsers to plant a few notes,
DP and EP waving like flags on the moats.
Replicas and pods lined up in a row,
Router and notes helping benchmarks go,
LLM‑D dances—data ready to float. 🌱

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'feat: Support llm-d manual runs' clearly and concisely summarizes the main change: adding support for LLM-D mode in the manual runs import script with new deployment metadata fields.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.


@coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
manual_runs/scripts/import_manual_runs_json_v2.py (2)

28-52: Docstring missing documentation for new parameters.

The new LLM-D parameters (dp, ep, replicas, prefill_pod_count, decode_pod_count, router_config) are not documented in the Args section of the docstring.

📝 Suggested docstring addition

```diff
     Args:
         benchmark: Benchmark data from JSON (guidellm 0.5.x format).
         accelerator: Accelerator type (e.g., H200, MI300X).
         model_name: Name of the AI model.
         version: Version of the inference server.
         tp_size: Tensor parallelism size.
         runtime_args: Runtime configuration arguments.
         global_data_config: Global data configuration from top-level args.
         image_tag: Container image tag used for the run.
         guidellm_version: Version of guidellm used to run the benchmark.
         guidellm_start_time_ms: Aggregated start time in milliseconds.
         guidellm_end_time_ms: Aggregated end time in milliseconds.
+        dp: Data parallelism size (LLM-D mode).
+        ep: Expert parallelism size (LLM-D mode).
+        replicas: Number of replicas (LLM-D mode).
+        prefill_pod_count: Number of prefill pods (LLM-D mode).
+        decode_pod_count: Number of decode pods (LLM-D mode).
+        router_config: Router/endpoint picker configuration (LLM-D mode).
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@manual_runs/scripts/import_manual_runs_json_v2.py` around lines 28 - 52, Add
docstring entries for the new LLM-D parameters (dp, ep, replicas,
prefill_pod_count, decode_pod_count, router_config) in the function's Args
section: describe each parameter's expected type and purpose (e.g., dp: int or
None — data parallel size; ep: int or None — expert/experts or expert
parallelism; replicas: int or None — number of model replicas;
prefill_pod_count: int or None — number of pods used for prefill stage;
decode_pod_count: int or None — number of pods used for decode stage;
router_config: dict or None — routing configuration for request distribution),
mark optional parameters as None default, and keep formatting consistent with
the existing docstring style used in this function.

199-212: Docstring missing documentation for new parameters.

Similar to process_benchmark_section, the new LLM-D parameters should be documented in the Args section.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@manual_runs/scripts/import_manual_runs_json_v2.py` around lines 199 - 212,
The docstring for the function that starts with "Parse guidellm 0.5.x JSON
benchmark results." is missing entries for the new LLM-D parameters; update its
Args section to document each new parameter exactly as done in
process_benchmark_section (include parameter name, type, and brief description
for LLM-D specific fields such as any tokenizer/sequence/prompt config,
temperature/beam/candidate settings, or other runtime flags added), ensuring
names match the function signature and runtime_args/guidellm_version/other
existing params are preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@manual_runs/scripts/import_manual_runs_json_v2.py`:
- Around line 426-476: The LLM-D branch (when args.llm_d is true) is missing the
guidellm_start_time_ms and guidellm_end_time_ms columns, so rows produced by
parse_guidellm_json (which computes and adds guidellm_start_time_ms and
guidellm_end_time_ms) are truncated when written; update the fieldnames list
inside the args.llm_d block to include guidellm_start_time_ms and
guidellm_end_time_ms (matching the standard-mode fieldnames) so those timing
fields are preserved in the CSV output.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8c34a60d-9fcd-4e64-96af-b02dfcabf30a

📥 Commits

Reviewing files that changed from the base of the PR and between 52a7418 and 6f1670a.

📒 Files selected for processing (1)
  • manual_runs/scripts/import_manual_runs_json_v2.py

@coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
manual_runs/scripts/import_manual_runs_json_v2.py (1)

413-529: ⚠️ Potential issue | 🟡 Minor

Data loss risk when appending CSV files with different modes.

The script supports two CSV modes: standard (default) and --llm-d. When appending to an existing CSV file (line 414), if the existing file was created with a different mode, the combined_df[fieldnames] filtering at line 529 will silently drop all columns not in the current mode's fieldnames list.

For example: Running in standard mode on an LLM-D CSV will drop DP, EP, replicas, prefill_pod_count, decode_pod_count, and router_config columns from all rows.

Note: The --llm-d flag is not documented in the README and requires explicit parameters (--dp, --ep, --replicas, --router-config), reducing the likelihood of accidental mode mixing. However, the vulnerability exists and should be mitigated to prevent silent data loss.
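The silent drop is easy to reproduce in isolation. This is a minimal pandas sketch with invented data, not the script's actual code, showing how reindexing to the current mode's fieldnames discards the LLM-D columns:

```python
import pandas as pd

# Existing CSV was written in LLM-D mode, so it carries deployment metadata.
existing_df = pd.DataFrame({"model_name": ["model-a"], "DP": [2], "EP": [8]})
# New rows come from a standard-mode run; its fieldnames omit LLM-D columns.
new_data_df = pd.DataFrame({"model_name": ["model-b"]})
fieldnames = ["model_name"]  # standard-mode columns only

combined_df = pd.concat([existing_df, new_data_df], ignore_index=True)
# Reindexing to the current mode's fieldnames silently drops DP and EP
# from every row, including the pre-existing LLM-D rows.
combined_df = combined_df[fieldnames]
```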

Consider adding a schema mismatch check when appending:

🛡️ Proposed fix: Add schema mismatch detection

```diff
         if os.path.exists(args.csv_file):
             print(f"Appending {len(new_data_df)} new rows to {args.csv_file}...")
             existing_df = pd.read_csv(args.csv_file)
+            # Detect schema mismatch
+            llm_d_columns = {"DP", "EP", "replicas", "prefill_pod_count", "decode_pod_count", "router_config"}
+            existing_has_llm_d = bool(llm_d_columns & set(existing_df.columns))
+            if existing_has_llm_d != args.llm_d:
+                print(f"Warning: Existing CSV {'has' if existing_has_llm_d else 'lacks'} LLM-D columns, "
+                      f"but current mode is {'LLM-D' if args.llm_d else 'standard'}. "
+                      f"Some columns may be dropped or added with null values.")
             combined_df = pd.concat([existing_df, new_data_df], ignore_index=True)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@manual_runs/scripts/import_manual_runs_json_v2.py` around lines 413 - 529,
Appending to an existing CSV ignores schema differences between modes and then
drops columns by reindexing to the current mode's fieldnames; detect and prevent
this by comparing the existing_df.columns set to the current mode fieldnames
(use args.llm_d to choose the expected list), and if there is a mismatch
raise/print a clear error or merge-safe warning instead of blindly doing
combined_df = combined_df[fieldnames]; update the logic around
existing_df/combined_df and the fieldnames construction (refer to variables
fieldnames, existing_df, combined_df, and args.llm_d) to either (a) preserve all
existing columns by unioning fieldnames with existing_df.columns before
reindexing, or (b) abort with a schema-mismatch message that explains which
columns differ and suggests running with the matching --llm-d flag.
🧹 Nitpick comments (1)
manual_runs/scripts/import_manual_runs_json_v2.py (1)

28-51: Docstrings lack documentation for new parameters.

The new LLM-D parameters (dp, ep, replicas, prefill_pod_count, decode_pod_count, router_config) are not documented in the function docstrings. Consider adding Args entries for completeness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@manual_runs/scripts/import_manual_runs_json_v2.py` around lines 28 - 51, The
docstring for the function that processes a single benchmark section is missing
entries for the new LLM-D parameters (dp, ep, replicas, prefill_pod_count,
decode_pod_count, router_config); update the Args block in that function's
docstring to add a short description and expected type for each of these
parameters (e.g., dp: data parallelism size (int), ep: expert parallelism (int),
replicas: number of model replicas (int), prefill_pod_count: pods used for
prefill (int), decode_pod_count: pods used for decode (int), router_config:
routing configuration dict/str) so the docstring remains complete and consistent
with other parameters.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bb9730ea-0526-4f4f-a6f9-c31219aef13d

📥 Commits

Reviewing files that changed from the base of the PR and between 6f1670a and a3d954b.

📒 Files selected for processing (1)
  • manual_runs/scripts/import_manual_runs_json_v2.py

Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
@coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
manual_runs/scripts/import_manual_runs_json_v2.py (1)

28-52: Docstring missing documentation for new parameters.

The new parameters (dp, ep, replicas, prefill_pod_count, decode_pod_count, router_config, notes) are not documented in the function docstring. Consider adding them for completeness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@manual_runs/scripts/import_manual_runs_json_v2.py` around lines 28 - 52,
Update the function docstring for process_benchmark_section to document the
newly added parameters dp, ep, replicas, prefill_pod_count, decode_pod_count,
router_config, and notes; for each parameter add a short one-line description
and expected type (e.g., int, dict, str, or None) and include any semantic
meaning (e.g., dp/ep are data/engine parallel sizes, replicas is number of
server replicas, prefill_pod_count/decode_pod_count are pod counts for
prefill/decode stages, router_config is routing settings, notes is freeform
metadata). Ensure these entries follow the existing Args style and placement
with the other parameters (accelerator, model_name, etc.) in the same docstring
block for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@manual_runs/scripts/import_manual_runs_json_v2.py`:
- Around line 273-279: The call to process_benchmark_section is missing the
notes argument so the CLI --notes value parsed in parse_guidellm_json is never
forwarded; update the call that currently passes dp, ep, replicas,
prefill_pod_count, decode_pod_count, router_config (the invocation inside
parse_guidellm_json) to include notes=notes so the notes parameter is propagated
into process_benchmark_section and stored on each row.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b23b331d-2b0f-41d7-940e-fbb39bf04705

📥 Commits

Reviewing files that changed from the base of the PR and between a3d954b and 46a22d8.

📒 Files selected for processing (1)
  • manual_runs/scripts/import_manual_runs_json_v2.py

Comment on lines +273 to 279

```python
    dp=dp,
    ep=ep,
    replicas=replicas,
    prefill_pod_count=prefill_pod_count,
    decode_pod_count=decode_pod_count,
    router_config=router_config,
)
```

⚠️ Potential issue | 🔴 Critical

Missing notes parameter in call to process_benchmark_section.

The notes parameter is accepted by parse_guidellm_json (line 200) but is never forwarded to process_benchmark_section. This means the CLI --notes value will be silently ignored and all rows will have notes=None.

🐛 Proposed fix to forward the notes parameter

```diff
             dp=dp,
             ep=ep,
             replicas=replicas,
             prefill_pod_count=prefill_pod_count,
             decode_pod_count=decode_pod_count,
             router_config=router_config,
+            notes=notes,
         )
```
