-
Notifications
You must be signed in to change notification settings - Fork 0
Clean project naming, update scripts and readme #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request rebrands the project from "WorkloadSim" to "FlowSim" and introduces developer workflow improvements. The changes include comprehensive path updates across the codebase, documentation enhancements with step-by-step guides, and new automation scripts for profiling and simulation workflows.
Key changes:
- Renamed all references from
/workloadsimto/flowsimacross tests, Docker files, and core modules - Added
scripts/run_profile.pyandscripts/run_simulate.pyto automate profiling and simulation tasks - Expanded README.md with detailed Getting Started and Developer sections including concrete usage examples
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/utils.py | Updated default artifact and sglang directory paths to use /flowsim |
| tests/unit/*.py | Updated trace file paths in test fixtures and docstrings to use /flowsim |
| tests/integration/*.py | Updated model config paths, working directories, and artifact paths to use /flowsim |
| simulator/base_parser.py | Updated kernel database, NCCL test binary, and unknown kernels paths to use /flowsim |
| scripts/run_simulate.py | New script automating kernel submission to LLMCompass backend with result tracking |
| scripts/run_profile.py | New script automating sglang server launch and profiling workload execution |
| dockerfiles/cuda12.6.dockerfile | Updated working directories, copy paths, maintainer label, and build commands to use /flowsim; added patch application steps |
| backend/LLMCompass | Updated submodule commit to latest version |
| README.md | Complete rewrite with structured Getting Started guide, detailed workflow steps, and developer documentation |
| Makefile | Updated image/container names to flowsim-*; removed MOUNT_VOLUME parameter |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
scripts/run_profile.py
Outdated
| default="/workloadsim/server_profile", | ||
| help="Directory where profiler traces (.trace.json.gz) will be written", | ||
| ) | ||
| p.add_argument( | ||
| "--log-dir", | ||
| default="/workloadsim/tests/test-artifacts", |
Copilot
AI
Dec 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default path still uses old /workloadsim prefix instead of /flowsim. This inconsistency will cause the script to fail when users rely on the default value.
| default="/workloadsim/server_profile", | |
| help="Directory where profiler traces (.trace.json.gz) will be written", | |
| ) | |
| p.add_argument( | |
| "--log-dir", | |
| default="/workloadsim/tests/test-artifacts", | |
| default="/flowsim/server_profile", | |
| help="Directory where profiler traces (.trace.json.gz) will be written", | |
| ) | |
| p.add_argument( | |
| "--log-dir", | |
| default="/flowsim/tests/test-artifacts", |
scripts/run_profile.py
Outdated
|
|
||
|
|
||
| def wait_for_port(host: str, port: int, timeout: int = 600) -> bool: | ||
| """Wait until a TCP port becomes reachable.""" |
Copilot
AI
Dec 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function uses tab characters for indentation instead of spaces. Python PEP 8 recommends using 4 spaces per indentation level. This is inconsistent with the rest of the codebase which uses spaces.
scripts/run_profile.py
Outdated
| def clean_dir(path: str) -> None: | ||
| """Clean or create a directory.""" |
Copilot
AI
Dec 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function uses tab characters for indentation instead of spaces. Python PEP 8 recommends using 4 spaces per indentation level. This is inconsistent with the rest of the codebase which uses spaces.
scripts/run_profile.py
Outdated
| """Wait until a TCP port becomes reachable.""" | ||
| deadline = time.time() + timeout | ||
| while time.time() < deadline: | ||
| try: | ||
| with socket.create_connection((host, port), timeout=2): | ||
| return True | ||
| except Exception: | ||
| time.sleep(1) | ||
| return False | ||
|
|
||
|
|
||
| def clean_dir(path: str) -> None: | ||
| """Clean or create a directory.""" | ||
| if os.path.exists(path): | ||
| for name in os.listdir(path): | ||
| fp = os.path.join(path, name) | ||
| if os.path.isfile(fp) or os.path.islink(fp): | ||
| os.unlink(fp) | ||
| elif os.path.isdir(fp): | ||
| shutil.rmtree(fp) | ||
| else: | ||
| os.makedirs(path, exist_ok=True) | ||
|
|
||
|
|
||
| def parse_args(argv: Optional[list] = None) -> argparse.Namespace: | ||
| p = argparse.ArgumentParser(description="Run sglang profiling workload") | ||
|
|
||
| p.add_argument( | ||
| "--profile-dir", | ||
| default="/workloadsim/server_profile", | ||
| help="Directory where profiler traces (.trace.json.gz) will be written", | ||
| ) | ||
| p.add_argument( | ||
| "--log-dir", | ||
| default="/workloadsim/tests/test-artifacts", | ||
| help="Directory to write server/client logs", | ||
| ) | ||
| p.add_argument( | ||
| "--server-opts", | ||
| required=True, | ||
| help=( | ||
| "All options for sglang.launch_server (include --host, --port, --model-path, --tp, etc). " | ||
| "Example: '--model-path /path --tp 1 --host 0.0.0.0 --port 30001 --disable-cuda-graph'" | ||
| ), | ||
| ) | ||
| p.add_argument( | ||
| "--bench-opts", | ||
| required=True, | ||
| help=( | ||
| "All options for bench_serving.py (include --backend, --host, --port, --dataset-name, --profile, etc). " | ||
| "Example: '--backend sglang --host 0.0.0.0 --port 30001 --dataset-name defined-len --num-prompts 16 --profile'" | ||
| ), | ||
| ) | ||
| p.add_argument( | ||
| "--bench-timeout", | ||
| type=int, | ||
| default=1200, | ||
| help="Timeout in seconds for bench_serving.py", | ||
| ) | ||
|
|
||
| return p.parse_args(argv) |
Copilot
AI
Dec 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function uses tab characters for indentation instead of spaces. Python PEP 8 recommends using 4 spaces per indentation level. This is inconsistent with the rest of the codebase which uses spaces.
| """Wait until a TCP port becomes reachable.""" | |
| deadline = time.time() + timeout | |
| while time.time() < deadline: | |
| try: | |
| with socket.create_connection((host, port), timeout=2): | |
| return True | |
| except Exception: | |
| time.sleep(1) | |
| return False | |
| def clean_dir(path: str) -> None: | |
| """Clean or create a directory.""" | |
| if os.path.exists(path): | |
| for name in os.listdir(path): | |
| fp = os.path.join(path, name) | |
| if os.path.isfile(fp) or os.path.islink(fp): | |
| os.unlink(fp) | |
| elif os.path.isdir(fp): | |
| shutil.rmtree(fp) | |
| else: | |
| os.makedirs(path, exist_ok=True) | |
| def parse_args(argv: Optional[list] = None) -> argparse.Namespace: | |
| p = argparse.ArgumentParser(description="Run sglang profiling workload") | |
| p.add_argument( | |
| "--profile-dir", | |
| default="/workloadsim/server_profile", | |
| help="Directory where profiler traces (.trace.json.gz) will be written", | |
| ) | |
| p.add_argument( | |
| "--log-dir", | |
| default="/workloadsim/tests/test-artifacts", | |
| help="Directory to write server/client logs", | |
| ) | |
| p.add_argument( | |
| "--server-opts", | |
| required=True, | |
| help=( | |
| "All options for sglang.launch_server (include --host, --port, --model-path, --tp, etc). " | |
| "Example: '--model-path /path --tp 1 --host 0.0.0.0 --port 30001 --disable-cuda-graph'" | |
| ), | |
| ) | |
| p.add_argument( | |
| "--bench-opts", | |
| required=True, | |
| help=( | |
| "All options for bench_serving.py (include --backend, --host, --port, --dataset-name, --profile, etc). " | |
| "Example: '--backend sglang --host 0.0.0.0 --port 30001 --dataset-name defined-len --num-prompts 16 --profile'" | |
| ), | |
| ) | |
| p.add_argument( | |
| "--bench-timeout", | |
| type=int, | |
| default=1200, | |
| help="Timeout in seconds for bench_serving.py", | |
| ) | |
| return p.parse_args(argv) | |
| """Wait until a TCP port becomes reachable.""" | |
| deadline = time.time() + timeout | |
| while time.time() < deadline: | |
| try: | |
| with socket.create_connection((host, port), timeout=2): | |
| return True | |
| except Exception: | |
| time.sleep(1) | |
| return False | |
| def clean_dir(path: str) -> None: | |
| """Clean or create a directory.""" | |
| if os.path.exists(path): | |
| for name in os.listdir(path): | |
| fp = os.path.join(path, name) | |
| if os.path.isfile(fp) or os.path.islink(fp): | |
| os.unlink(fp) | |
| elif os.path.isdir(fp): | |
| shutil.rmtree(fp) | |
| else: | |
| os.makedirs(path, exist_ok=True) | |
| def parse_args(argv: Optional[list] = None) -> argparse.Namespace: | |
| p = argparse.ArgumentParser(description="Run sglang profiling workload") | |
| p.add_argument( | |
| "--profile-dir", | |
| default="/workloadsim/server_profile", | |
| help="Directory where profiler traces (.trace.json.gz) will be written", | |
| ) | |
| p.add_argument( | |
| "--log-dir", | |
| default="/workloadsim/tests/test-artifacts", | |
| help="Directory to write server/client logs", | |
| ) | |
| p.add_argument( | |
| "--server-opts", | |
| required=True, | |
| help=( | |
| "All options for sglang.launch_server (include --host, --port, --model-path, --tp, etc). " | |
| "Example: '--model-path /path --tp 1 --host 0.0.0.0 --port 30001 --disable-cuda-graph'" | |
| ), | |
| ) | |
| p.add_argument( | |
| "--bench-opts", | |
| required=True, | |
| help=( | |
| "All options for bench_serving.py (include --backend, --host, --port, --dataset-name, --profile, etc). " | |
| "Example: '--backend sglang --host 0.0.0.0 --port 30001 --dataset-name defined-len --num-prompts 16 --profile'" | |
| ), | |
| ) | |
| p.add_argument( | |
| "--bench-timeout", | |
| type=int, | |
| default=1200, | |
| help="Timeout in seconds for bench_serving.py", | |
| ) | |
| return p.parse_args(argv) |
scripts/run_profile.py
Outdated
| def main(argv: Optional[list] = None) -> int: | ||
| args = parse_args(argv) |
Copilot
AI
Dec 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function uses tab characters for indentation instead of spaces. Python PEP 8 recommends using 4 spaces per indentation level. This is inconsistent with the rest of the codebase which uses spaces.
| else: | ||
| profiled_duration = nb.run_nccl_all_gather_perf( | ||
| cmd_path="/workloadsim/third_party/nccl-tests/build/all_gather_perf", | ||
| cmd_path="/flowsim/third_party/nccl-tests/build/all_gather_perf", |
Copilot
AI
Dec 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The path to NCCL all_gather_perf binary was updated to /flowsim, but the corresponding all_reduce_perf path at line 558 still uses /workloadsim. This inconsistency will cause failures when calibrating all_reduce operations.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
@copilot open a new pull request to apply changes based on the comments in this thread |
|
@TerrenceZhangX I've opened a new pull request, #7, to work on those changes. Once the pull request is ready, I'll request review from you. |
* Initial plan * Fix indentation and path inconsistencies per review comments Co-authored-by: TerrenceZhangX <39916879+TerrenceZhangX@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: TerrenceZhangX <39916879+TerrenceZhangX@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This pull request rebrands the project from "WorkloadSim" to "FlowSim" and introduces several improvements to the developer and user experience. The most significant changes include renaming project references, updating Docker and build scripts, modernizing and expanding the documentation, and adding a new profiling script for simplified workflows.
Project rebranding and documentation overhaul:
README.mdto provide clearer, step-by-step instructions for profiling, parsing, and simulation workflows, as well as improved developer guidance and troubleshooting tips. [1] [2]Docker and build system improvements:
/flowsiminstead of/workloadsim, including all code copies, working directories, and build instructions. Also updated the maintainer label and improved comments for clarity. [1] [2]Makefileto match the new naming and workflow. [1] [2]Developer workflow enhancements:
scripts/run_profile.py, which automates launching an sglang server and running profiling workloads with customizable options, making it easier to generate traces in Docker or Kubernetes environments.scripts/run_simulate.py, which simulates the kernels in the profiled workloads on LLMCompass backend. Return results will be a summary csv/json file, concluding all kernels' simulation status.(References: [1] [2] [3] [4] [5] [6] [7] [8] [9]