Introduce unified test framework for benchmark and functional tests by lajagapp · Pull Request #2803 · ROCm/TheRock

lajagapp · 2026-01-07T06:51:36Z

Motivation

The existing benchmarks/ directory was originally designed only for performance benchmarks, but as testing needs have evolved, we need support for multiple test types:

Benchmark tests: Performance regression detection against LKG (Last Known Good) baselines
Functional tests: Extended correctness validation with detailed reporting

This PR restructures the testing infrastructure to accommodate both test types under a unified framework, improving maintainability, code reuse, and extensibility.

Naming Note: "Benchmark tests" could also be called "performance tests" - they measure performance and detect regressions. Current naming aligns with existing conventions.

Technical Details

1. Directory Restructure (with git history preservation)

Renamed: benchmarks/ → extended_tests/
New structure: Documented in extended_tests/README.md

2. Key Changes

Shared Infrastructure

Renamed BenchmarkClient → TestClient for use across all test types
Moved utils/ to top-level (extended_tests/utils/) for sharing across test types
Updated all import paths in benchmark scripts to reflect new structure

Functional Test Framework (New)

scripts/: Base class and actual test file for functional tests
configs/: Shared functional test configuration
functional_test_matrix.py: Test matrix for functional tests

Documentation Updates

extended_tests/README.md: Framework overview comparing benchmark vs functional tests
extended_tests/benchmark/README.md: Benchmark-specific documentation
extended_tests/functional/README.md: Functional test documentation
extended_tests/utils/README.md: Updated paths from benchmarks/ to extended_tests/

CI Integration

Updated fetch_test_configurations.py: New module path (extended_tests.benchmark.benchmark_test_matrix)
Updated configure_ci_test.py: Support for functional test matrix
Both benchmark and functional tests run on nightly CI only
Added CI workflow for extended functional tests (test_extended_functional_tests.yml)

Test Plan

Benchmark Tests:

Verified all existing benchmark tests still work with new paths
Confirmed imports updated correctly (extended_tests.benchmark...)

Functional Tests:

Ran test_miopendriver_conv.py locally on gfx94x GPUs
Verified correct parsing of MIOpenDriver output
Confirmed pass/fail detection and summary tables

Results

Nightly-CI run with this PR - https://github.com/ROCm/TheRock/actions/runs/21151021971

Next Steps

Add more functional tests for other libraries
Rename "benchmark tests" to "performance tests" for clarity (separate PR if needed)

Renamed benchmarks/ to test_framework/ with support for benchmark, and functional tests. - Framework restructure (benchmarks → test_framework) - Test type organization (benchmark, functional) - Shared utilities moved to framework level - benchmark_client.py → test_client.py rename - benchmark_test_matrix.py → benchmark_matrix.py rename - Functional test base class (functional_base.py) - Updated documentation (main + benchmark READMEs) - Updated all imports Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Created FunctionalBase class and MIOpen driver convolution test. Features: two-table display, 4 status types, JSON configs, accurate per-command parsing with 'Verifies OK' detection, GitHub Actions integration. Consistent with benchmark test patterns. Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

geomin12 · 2026-01-08T15:47:29Z

+                        log.error(f"Error running command: {e}")
+                        f.write(f"ERROR: {e}\n\n")
+
+        log.info("Test execution complete")


more details to this log pls

geomin12 · 2026-01-08T15:48:13Z

+        test_results = []
+
+        try:
+            with open(self.log_file, "r") as f:


why not add the data to JSON previously? instead of having to parse a .txt file which may be prone to error

Done! I've refactored the code to write test results directly to JSON during execution in run_tests(), eliminating all the complex regex-based log parsing in parse_results(). Now parse_results() simply reads the JSON file - much cleaner, more maintainable, and less error-prone.
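A minimal sketch of the approach described in this reply, assuming results are recorded as structured data while the commands run and then read back as JSON. The function bodies, the result fields, and the two stand-in commands are hypothetical; only the names run_tests() and parse_results() come from the discussion.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_tests(commands, results_json):
    """Run each command and record its outcome as structured data."""
    results = []
    for name, cmd in commands:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results.append(
            {
                "name": name,
                "command": cmd,
                "return_code": proc.returncode,
                "status": "PASS" if proc.returncode == 0 else "FAIL",
            }
        )
    # Write the file once, after all commands have run.
    Path(results_json).write_text(json.dumps(results, indent=2))
    return results


def parse_results(results_json):
    """Read the structured results back; no regex over log text is needed."""
    return json.loads(Path(results_json).read_text())


# Hypothetical commands standing in for real test invocations.
commands = [
    ("passing_case", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("failing_case", [sys.executable, "-c", "raise SystemExit(3)"]),
]

with tempfile.TemporaryDirectory() as tmp:
    results_path = Path(tmp) / "results.json"
    run_tests(commands, results_path)
    parsed = parse_results(results_path)

statuses = {r["name"]: r["status"] for r in parsed}
```

Because the status is decided from the return code at execution time, the parsing step has no "command not found in log" or "unknown result" branches to handle.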

geomin12 · 2026-01-08T15:48:36Z

+                    if cmd_match:
+                        # Extract content after this command until next command or end
+                        start_pos = cmd_match.end()
+                        next_cmd_match = re.search(
+                            r"\nCommand:", suite_content[start_pos:]
+                        )
+                        if next_cmd_match:
+                            cmd_section = suite_content[
+                                start_pos : start_pos + next_cmd_match.start()
+                            ]
+                        else:
+                            cmd_section = suite_content[start_pos:]
+
+                        # Check return code in THIS command's section only
+                        return_code_match = re.search(
+                            r"Return code:\s*(\d+)", cmd_section
+                        )
+                        if return_code_match:
+                            return_code = int(return_code_match.group(1))
+                            status = "PASS" if return_code == 0 else "FAIL"
+                        elif "PASSED" in cmd_section:
+                            status = "PASS"
+                        elif "FAILED" in cmd_section or "ERROR" in cmd_section:
+                            status = "FAIL"
+                        else:
+                            status = "FAIL"  # Unknown result, assume fail
+                    else:
+                        status = "FAIL"  # Command not found in log


I think a lot of this can be avoided by using JSON

geomin12 · 2026-01-08T15:49:39Z

just curious, are functional tests run once a day in rocm-ci?

These functional and benchmark tests were not run in rocm-ci; they were covered by QA validation. Previously, rocm-ci ran only a limited set of unit tests.

geomin12 · 2026-01-08T15:50:23Z


 from github_actions_utils import *
-from benchmarks.benchmark_test_matrix import benchmark_matrix
+from test_framework.benchmark.benchmark_matrix import benchmark_matrix


could we perhaps be more specific than test_framework? maybe extended or a better word

test_framework is broad

Thanks for the feedback. Renamed top-level folder from test_framework to extended_tests for better clarity and updated all references accordingly.

geomin12 · 2026-01-08T15:53:38Z

is this needed?

No, it's not required. Removed now.

geomin12 · 2026-01-08T15:54:38Z

+            display_name: Display name for reports (e.g., 'MIOpen Driver Convolution')
+        """
+        self.test_name = test_name
+        self.display_name = display_name or test_name.replace("_", " ").title()


i think it's fine to have _ in the test name, no need to do replacing
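A sketch of the simpler default this comment asks for, assuming the display name just falls back to the raw test name. The class name and the example test name are hypothetical stand-ins, not the PR's actual code.

```python
class FunctionalBaseSketch:
    """Hypothetical stand-in for the base class discussed above."""

    def __init__(self, test_name, display_name=None):
        self.test_name = test_name
        # Fall back to the raw test name; underscores are kept as-is.
        self.display_name = display_name or test_name


default_case = FunctionalBaseSketch("miopen_driver_conv")
named_case = FunctionalBaseSketch("miopen_driver_conv", "MIOpen Driver Convolution")
```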

geomin12 · 2026-01-08T15:55:35Z

+        Returns:
+            True if upload successful, False otherwise
+        """
+        log.info("Uploading Results to API")


can we add functional test results to the log

Already logged! The SUMMARY section prints the full results table with all stats right before the upload (lines 274-276), so the information is already in the logs.

no, as in:

Suggested change

-        log.info("Uploading Results to API")
+        log.info("Uploading Functional Tests Results to API")

geomin12 · 2026-01-08T15:56:00Z

+    def get_gpu_id(self) -> str:
+        """Detect GPU ID using rocminfo."""
+        try:
+            result = subprocess.run(
+                ["rocminfo"], capture_output=True, text=True, check=True
+            )
+            # Extract GPU name (e.g., gfx90a, gfx942)
+            match = re.search(r"Name:\s+(gfx\w+)", result.stdout)
+            if match:
+                return match.group(1)
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            log.warning("Could not detect GPU ID, assuming default")
+        return "unknown"


shouldn't this be a part of the utils?

Good catch! Moved get_gpu_id() to the shared utils module (utils/system/hardware.py) since it's common functionality that can be reused across benchmark and functional tests.
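One way such a shared helper could be laid out, sketched under assumptions: the regex and the "unknown" fallback come from the snippet under review, while splitting the text parsing into its own function (parse_gfx_target, a hypothetical name) is an illustration that lets the parsing be tested without a GPU. This sketch also catches OSError, which is broader than the FileNotFoundError in the original snippet.

```python
import re
import subprocess
from typing import Optional


def parse_gfx_target(rocminfo_output: str) -> Optional[str]:
    """Return the first gfx target found in rocminfo-style text, else None."""
    match = re.search(r"Name:\s+(gfx\w+)", rocminfo_output)
    return match.group(1) if match else None


def get_gpu_id() -> str:
    """Detect the GPU target via rocminfo; fall back to "unknown"."""
    try:
        result = subprocess.run(
            ["rocminfo"], capture_output=True, text=True, check=True
        )
    except (subprocess.CalledProcessError, OSError):
        # rocminfo is missing, not executable, or exited with an error.
        return "unknown"
    return parse_gfx_target(result.stdout) or "unknown"


# Illustrative text only, not captured from a real machine.
sample = "  Name:                    CPU\n  Name:                    gfx942\n"
detected = parse_gfx_target(sample)
```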

geomin12 · 2026-01-08T15:56:46Z

+    try:
+        exit_code = test_instance.run()
+        if exit_code != 0:
+            raise RuntimeError(f"Test failed with exit code {exit_code}")


can we let python handle errors? in the long run, this won't be helpful as failures will just have a very broad error. better to let python do the errors in my opinion

Done! Removed try/except wrapper in run_functional_test_main(), exceptions propagate naturally with full stack traces. Added semantic TestResultError to distinguish execution vs result failures.
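A rough sketch of the distinction described here, assuming TestResultError is a plain exception subclass raised only when tests ran but reported failures. The runner function and the three example callables are hypothetical; execution errors are simply not caught, so they keep their original type and stack trace.

```python
class TestResultError(Exception):
    """Raised when tests executed but at least one reported a failure."""


def run_functional_test(run_fn):
    """Run a test callable and raise if it reports failed cases.

    Exceptions raised inside run_fn (execution errors) are not caught,
    so they propagate with their original type and stack trace.
    """
    failed = run_fn()
    if failed:
        raise TestResultError(f"{len(failed)} test case(s) failed: {failed}")


def all_pass():
    return []


def one_fails():
    return ["conv_fp16"]


def crashes():
    raise FileNotFoundError("test binary not found")
```

A caller (or the CI log reader) can then tell "the tests ran and some failed" apart from "the tests could not run" by the exception type alone.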

- Consolidate common setup and architecture in main README - Move implementation details to benchmark/functional sub-READMEs Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

- benchmark_matrix.py → benchmark_test_matrix.py - functional_matrix.py → functional_test_matrix.py - Update all imports and documentation references Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

- Move get_gpu_architecture() to hardware.py - Simplify display_name default Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

- Add TestResultError for result failures vs execution errors - Remove sys.exit() - let exceptions propagate naturally - Add progress indicators and better error messages - Consistent exception pattern across all test files Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

HereThereBeDragons · 2026-01-20T13:34:56Z

The reaction to my review seem AI created and in some place absolutely wrong.
I will not further review this PR until you, @lajagapp, manually reviewed it and fixed it.

lajagapp · 2026-01-20T16:08:11Z

The reaction to my review seem AI created and in some place absolutely wrong.
I will not further review this PR until you, @lajagapp, manually reviewed it and fixed it.

I’ve addressed all the comments except for this one, which I’m checking now - #2803 (comment).
If there are any specific responses that still need clarification or further fixes, please let me know and I will update them accordingly.

The reaction to my review seem AI created and in some place absolutely wrong. I will not further review this PR until you, @lajagapp, manually reviewed it and fixed it.

Hi @HereThereBeDragons - I’ve reviewed and addressed all the comments.
If any response needs further correction or clarification, please let me know and I will update it.

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

stellaraccident · 2026-01-20T17:19:34Z

The reaction to my review seem AI created and in some place absolutely wrong.
I will not further review this PR until you, @lajagapp, manually reviewed it and fixed it.

I’ve addressed all the comments except for this one, which I’m checking now - #2803 (comment). If there are any specific responses that still need clarification or further fixes, please let me know and I will update them accordingly.

The reaction to my review seem AI created and in some place absolutely wrong. I will not further review this PR until you, @lajagapp, manually reviewed it and fixed it.

Hi @HereThereBeDragons - I’ve reviewed and addressed all the comments. If any response needs further correction or clarification, please let me know and I will update it.

Please always review, understand and take ownership of AI generated content before asking someone else to give their attention. Repeated failures to do so will result in automatic rejection of future contributions.

geomin12

please rebase your changes with main, so I am able to see code changes properly

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

lajagapp · 2026-01-27T09:18:04Z

please rebase your changes with main, so I am able to see code changes properly

I've rebased the PR with latest main branch.

geomin12

Please take a look at the changes, there are a lot of files that are not a part of this PR. Please rebase again or perhaps open a clean branch and make updates there

geomin12 · 2026-01-31T00:14:08Z


+  run_linux_extended_functional_tests:
+    needs: [build_portable_linux_artifacts]
+    name: Run Extended Functional Tests


let's add linux to the name

Added Linux to the name.

geomin12 · 2026-01-31T00:14:22Z


+  run_windows_extended_functional_tests:
+    needs: [build_windows_artifacts]
+    name: Run Extended Functional Tests


let's add windows to the name

Added Windows to the name.

geomin12 · 2026-01-31T00:16:50Z

+          IS_EXTENDED_FUNCTIONAL_TESTS: "true"
+        run: python ./build_tools/github_actions/fetch_test_configurations.py
+
+  run_tests:


better job name please, run_tests is too generic

Updated the job name to run_extended_functional_tests.

geomin12 · 2026-01-31T00:19:25Z

instead of extended_tests folder, why not just have:

benchmark_tests/
functional_tests/

seeing benchmark tests under extended_tests leads me to believe that test_extended_functional_tests might run that. let's separate them out, or call the workflow file test_functional_tests. whatever you think is best!

Thanks for the feedback. I'm keeping extended_tests/ as the parent folder since benchmarks and functional tests share utilities and follow identical patterns (execution → parsing → upload).

Renamed the workflow from test_extended_functional_tests.yml to test_functional_tests.yml and updated all of its references and variables.

geomin12 · 2026-01-31T00:21:54Z

+        Returns:
+            True if upload successful, False otherwise
+        """
+        log.info("Uploading Results to API")


no, as in:

Suggested change

-        log.info("Uploading Results to API")
+        log.info("Uploading Functional Tests Results to API")

geomin12 · 2026-01-31T00:27:49Z

+                        backward_flags = self.gpu_specific_flags[gfx_id].get(
+                            "backward_flags", ""
+                        )
+                        if backward_flags:
+                            full_cmd = f"{full_cmd} {backward_flags}"


Suggested change

-                        backward_flags = self.gpu_specific_flags[gfx_id].get(
-                            "backward_flags", ""
-                        )
-                        if backward_flags:
-                            full_cmd = f"{full_cmd} {backward_flags}"
+                        backward_flags = self.gpu_specific_flags[gfx_id].get(
+                            "backward_flags", ""
+                        )
+                        full_cmd = f"{full_cmd} {backward_flags}"

Applied the suggested change.
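For illustration, a standalone sketch of appending GPU-specific flags without the conditional. It is a variation on the suggestion rather than the PR's code: it uses .get() for the GPU lookup as well, and adds a strip() so an empty flag string does not leave a trailing space. The function name, base command, and flag values are hypothetical.

```python
def build_command(base_cmd, gpu_specific_flags, gfx_id):
    """Append GPU-specific backward flags to a base command string."""
    # .get() at both levels so unknown GPUs or missing keys add nothing.
    backward_flags = gpu_specific_flags.get(gfx_id, {}).get("backward_flags", "")
    # strip() drops the trailing space left when there are no extra flags.
    return f"{base_cmd} {backward_flags}".strip()


# Hypothetical configuration values for illustration.
flags = {"gfx942": {"backward_flags": "--example-flag 1"}, "gfx90a": {}}

with_flags = build_command("driver conv", flags, "gfx942")
without_flags = build_command("driver conv", flags, "gfx90a")
unknown_gpu = build_command("driver conv", flags, "gfx000")
```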

geomin12 · 2026-01-31T00:29:37Z

+        Returns:
+            tuple: (test_results list, detailed PrettyTable, number of test suites)
+        """
+        log.info("Parsing Results")


parsing results for miopendriver, please add more specifics otherwise these logs are useless

Updated the log message to include {self.display_name}.

geomin12 · 2026-01-31T00:29:45Z

+        with open(self.results_json, "w") as f:
+            json.dump(all_results, f, indent=2)
+
+        log.info(f"Results saved to {self.results_json}")


for miopendriver

Updated the log message to include {self.display_name}.

geomin12 · 2026-01-31T00:30:29Z

+
+                        for line in process.stdout:
+                            log.info(line.strip())
+                            f.write(line)


i notice we are writing to a file but it isn't being used anywhere. do we need to write to a file?

You're correct. Since the results are now parsed from the generated JSON data, writing the output to a file is no longer required.
I've removed the log file definition and all of the file write operations.
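A minimal sketch of streaming a child process's output to the logger without a side log file, assuming the lines are only needed for live CI logs. The function name, the logger name, and the stand-in child command are hypothetical.

```python
import logging
import subprocess
import sys

log = logging.getLogger("functional_test_sketch")


def run_and_stream(cmd):
    """Run cmd, log each output line as it arrives, return (code, lines)."""
    lines = []
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
    ) as process:
        for line in process.stdout:
            log.info(line.rstrip())
            lines.append(line.rstrip())
    # Leaving the with-block waits for the process, so returncode is set.
    return process.returncode, lines


# Hypothetical child process standing in for a real test binary.
code, output = run_and_stream(
    [sys.executable, "-c", "print('first'); print('second')"]
)
```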

geomin12 · 2026-01-31T00:31:25Z

+        "fetch_artifact_args": "--miopen --tests",
+        "timeout_minutes": 30,
+        "test_script": f"python {_get_script_path('test_miopendriver_conv.py')}",
+        # TODO(lajagapp): Add windows support


can we make a github issue and tag it here?

Created a GitHub issue and referenced it in the TODO.

If creating an issue, this should be linked here in the code and you should not solely link your GH handle.

- Remove write_step_summary wrapper and redundant log file writing - Use display_name consistently in logging Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

- Rename test_extended_functional_tests.yml to test_functional_tests.yml - Update all variables and references in workflow files and documentation Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

marbre · 2026-02-03T14:01:40Z

This already has several reviewers involved but kind of feels off the rails just looking at the number of review comments. Most answers to comments seem AI generated. I would strongly suggest splitting this into smaller chunks and more PRs. Will unassign myself and leave review to others.

geomin12

i would agree with Marius, let's split this up into smaller chunks to make it faster to iterate and easier to understand what changes are occurring.

In my opinion, let's split it up to:

adding functional tests
organizing files under new folder

lajagapp · 2026-02-05T17:12:06Z

smaller chunks

Thanks for the suggestion. Sure, I will split this PR into multiple smaller PRs and add you as a reviewer. I'm closing this PR now.


lajagapp added 6 commits January 6, 2026 13:12

Fix import errors

64d9a40

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Fix pre-commit errors

7db3292

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Merge branch 'main' into users/lajagapp/add-miopen-driver-test

ee49925

Update overall README.md

8013e71

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

github-project-automation Bot added this to TheRock Triage Jan 7, 2026

github-project-automation Bot moved this to TODO in TheRock Triage Jan 7, 2026

lajagapp requested review from amd-aakash, geomin12, gkathirv, jayhawk-commits and kiranzangam January 7, 2026 06:52

lajagapp added 3 commits January 7, 2026 15:47

Merge main into users/lajagapp/add-miopen-driver-test

fb3521b

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Merge branch 'main' into users/lajagapp/add-miopen-driver-test

0d2b187

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Fix pre-commit errors

398c9b7

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

geomin12 reviewed Jan 8, 2026


lajagapp added 6 commits January 12, 2026 10:53

Merge branch 'main' into users/lajagapp/add-miopen-driver-test

7f6bbd3

Merge branch 'main' into users/lajagapp/add-miopen-driver-test

bf9cf14

Refactor test framework READMEs to eliminate duplication

a56f9c4

- Consolidate common setup and architecture in main README - Move implementation details to benchmark/functional sub-READMEs Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Rename test matrix files for clarity

48d75c7

- benchmark_matrix.py → benchmark_test_matrix.py - functional_matrix.py → functional_test_matrix.py - Update all imports and documentation references Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Update functional_base.py

049af2e

- Move get_gpu_architecture() to hardware.py - Simplify display_name default Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Merge branch 'main' into users/lajagapp/add-miopen-driver-test

1ef8ecb

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

subodh-dubey-amd force-pushed the main branch from 7e9ba03 to 3077ce8 Compare January 13, 2026 12:08

subodh-dubey-amd requested review from a team, ScottTodd and marbre as code owners January 13, 2026 12:08

lajagapp added 3 commits January 13, 2026 14:45

Merge branch 'main' into users/lajagapp/add-miopen-driver-test

a5957e3

Refactor test_miopendriver_conv.py with JSON result parsing

c87ecac

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

lajagapp requested a review from HereThereBeDragons January 19, 2026 20:47

lajagapp and others added 3 commits January 20, 2026 16:17

Update README with test_extended_functional_tests.yml

6590d7e

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Merge branch 'main' into users/lajagapp/add-miopen-driver-test

f0568fe

Fix pre-commit errors

52cae62

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

geomin12 reviewed Jan 26, 2026

lajagapp added 2 commits January 26, 2026 17:34

Merge branch 'main' into users/lajagapp/add-miopen-driver-test

7387c37

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Restore rocm-libraries submodule to match main

fece860

lajagapp requested a review from geomin12 January 26, 2026 17:49

geomin12 reviewed Jan 30, 2026

Merge branch 'main' into users/lajagapp/add-miopen-driver-test

39deb97

geomin12 self-requested a review January 30, 2026 17:07

geomin12 reviewed Jan 31, 2026

Merge branch 'main' into users/lajagapp/add-miopen-driver-test

e9e6ed1

lajagapp mentioned this pull request Feb 2, 2026

[CI] Add Windows support for MIOpenDriver test (extended functional test) #3207

Open

lajagapp added 3 commits February 2, 2026 20:15

Improve functional test logging and remove redundancy

49a64c0

- Remove write_step_summary wrapper and redundant log file writing
- Use display_name consistently in logging

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Rename functional test workflow for clarity

7329da8

- Rename test_extended_functional_tests.yml to test_functional_tests.yml
- Update all variables and references in workflow files and documentation

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

Fix pre-commit errors

f58d124

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>

lajagapp requested a review from geomin12 February 2, 2026 20:57

marbre removed their request for review February 3, 2026 14:01

geomin12 reviewed Feb 4, 2026

lajagapp closed this Feb 5, 2026

github-project-automation Bot moved this from TODO to Done in TheRock Triage Feb 5, 2026

rahulc-gh mentioned this pull request Mar 4, 2026

Bump rocm-systems submodule from 93bc019 to 093b66caa3 #3754

Merged

	log.info("Uploading Results to API")
	log.info("Uploading Functional Tests Results to API")

Conversation

lajagapp commented Jan 7, 2026 • edited

HereThereBeDragons commented Jan 20, 2026

lajagapp commented Jan 20, 2026

stellaraccident commented Jan 20, 2026

geomin12 left a comment

lajagapp commented Jan 27, 2026

geomin12 left a comment • edited
