JSON support in MemoryAnalysis handler #26

Merged
coleramos425 merged 4 commits into main from colramos/memAnalysis-json-v2 on Jul 11, 2025

@coleramos425 coleramos425 (Collaborator) commented Jun 30, 2025

This PR is an iteration on #21 that instead logs MemoryAnalysis output in the logDuration message handler. The plugin behaves the same as before in most cases; however, when the user provides -t json, the JSON output is now printed to the console. Additionally, the user can specify a log file (e.g. -l output.json) to save this JSON structure to a file for further post-processing.

Example usage:
root@51a53ca4a7c7:/workspace# logdur_install/bin/logDuration/omniprobe -i -a MemoryAnalysis -t json -- audacious/maestro/examples/build/bank_conflict/b2b_matrix_transpose 
  ___                  _                 _          
 / _ \ _ __ ___  _ __ (_)_ __  _ __ ___ | |__   ___ 
| | | | '_ ` _ \| '_ \| | '_ \| '__/ _ \| '_ \ / _ \
| |_| | | | | | | | | | | |_) | | | (_) | |_) |  __/
 \___/|_| |_| |_|_| |_|_| .__/|_|  \___/|_.__/ \___|
                        |_|                         

Found config file at /workspace/logdur_install/bin/logDuration/runtime_config.txt

Omniprobe is developed by Advanced Micro Devices, Research and Advanced Development
Copyright (c) 2025 Advanced Micro Devices. All rights reserved.

No Triton cache location provided; assuming HIP run.
HANDLER: libMemAnalysis64.so
Memory Analysis Wrapper loaded.
handlerManager: OpenedlibMemAnalysis64.so
Adding /workspace/audacious/maestro/examples/build/bank_conflict/b2b_matrix_transpose
Found 12 kernels
Adding linux-vdso.so.1
Adding /opt/rocm/lib/libamdhip64.so.6
Adding /lib/x86_64-linux-gnu/libstdc++.so.6
Adding /lib/x86_64-linux-gnu/libm.so.6
Adding /lib/x86_64-linux-gnu/libgcc_s.so.1
Adding /lib/x86_64-linux-gnu/libc.so.6
Adding /opt/rocm/lib/librocprofiler-register.so.0
Adding /opt/rocm/lib/libamd_comgr.so.2
Adding /opt/rocm/lib/libhsa-runtime64.so.1
Adding /lib/x86_64-linux-gnu/libnuma.so.1
Adding /lib64/ld-linux-x86-64.so.2
Adding /lib/x86_64-linux-gnu/libz.so.1
Adding /lib/x86_64-linux-gnu/libzstd.so.1
Adding /lib/x86_64-linux-gnu/libelf.so.1
Adding /opt/amdgpu/lib/x86_64-linux-gnu/libdrm.so.2
Adding /opt/amdgpu/lib/x86_64-linux-gnu/libdrm_amdgpu.so.1
Adding /workspace/logdur_install/lib/logDuration/liblogDuration64.so
Adding /workspace/logdur_install/lib/libdh_comms.so
Adding /workspace/logdur_install/lib/libkernelDB64.so.1
Adding /lib/x86_64-linux-gnu/libdwarf.so.1
Adding /workspace/logdur_install/lib/libMemAnalysis64.so
Adding /workspace/audacious/maestro/examples/build/bank_conflict/b2b_matrix_transpose
[...Removed for brevity...]
>>>>>>>> HSA intercept registered.
{
  "kernel_analyses": [
    {
      "kernel_info": {
        "name": "matrixTransposeShared_0(float*, float const*, int, int) [clone .kd]",
        "dispatch_id": 1
      },
      "cache_analysis": {
        "accesses": [
          {
            "source_location": {
              "file": "/workspace/audacious/maestro/examples/bank_conflict/b2b_matrix_transpose/b2b_matrix_transpose.hip",
              "line": 40,
              "column": 38
            },
            "code_context": "tile[threadIdx.y][threadIdx.x] = in[y * width + x];",
            "access_info": {
              "type": "read",
              "execution_count": 16384,
              "ir_bytes": 4,
              "isa_bytes": 4,
              "isa_instruction": "global_load_dword",
              "cache_lines": {
                "needed": 65536,
                "used": 65536
              }
            }
          },
          {
            "source_location": {
              "file": "/workspace/audacious/maestro/examples/bank_conflict/b2b_matrix_transpose/b2b_matrix_transpose.hip",
              "line": 49,
              "column": 25
            },
            "code_context": "out[y * height + x] = tile[threadIdx.x][threadIdx.y];",
            "access_info": {
              "type": "write",
              "execution_count": 16384,
              "ir_bytes": 4,
              "isa_bytes": 4,
              "isa_instruction": "global_store_dword",
              "cache_lines": {
                "needed": 65536,
                "used": 65536
              }
            }
          }
        ]
      },
      "bank_conflicts": {
        "accesses": [
          {
            "source_location": {
              "file": "/workspace/audacious/maestro/examples/bank_conflict/b2b_matrix_transpose/b2b_matrix_transpose.hip",
              "line": 40,
              "column": 36
            },
            "code_context": "tile[threadIdx.y][threadIdx.x] = in[y * width + x];",
            "access_info": {
              "type": "write",
              "execution_count": 16384,
              "ir_bytes": 4,
              "total_conflicts": 0
            }
          },
          {
            "source_location": {
              "file": "/workspace/audacious/maestro/examples/bank_conflict/b2b_matrix_transpose/b2b_matrix_transpose.hip",
              "line": 49,
              "column": 27
            },
            "code_context": "out[y * height + x] = tile[threadIdx.x][threadIdx.y];",
            "access_info": {
              "type": "read",
              "execution_count": 16384,
              "ir_bytes": 4,
              "total_conflicts": 229376
            }
          }
        ]
      }
    }
  ],
  "metadata": {
    "version": "null",
    "kernels_found": 1,
    "timestamp": "2025-06-30 17:00:30",
    "gpu_info": {
      "architecture": "gfx90a",
      "cache_line_size": 128
    }
  }
},
54525952 bytes processed in 1.878942 seconds (29.0 MiB/s)
{
  "kernel_analyses": [
    {
      "kernel_info": {
        "name": "matrixTransposeShared_1(float*, float const*, int, int) [clone .kd]",
        "dispatch_id": 2
      },
      "cache_analysis": {
        "accesses": [
          {
            "source_location": {
              "file": "/workspace/audacious/maestro/examples/bank_conflict/b2b_matrix_transpose/b2b_matrix_transpose.hip",
              "line": 63,
              "column": 38
            },
            "code_context": "tile[threadIdx.y][threadIdx.x] = in[y * width + x];",
            "access_info": {
              "type": "read",
              "execution_count": 16384,
              "ir_bytes": 4,
              "isa_bytes": 4,
              "isa_instruction": "global_load_dword",
              "cache_lines": {
                "needed": 65536,
                "used": 65536
              }
            }
          },
          {
            "source_location": {
              "file": "/workspace/audacious/maestro/examples/bank_conflict/b2b_matrix_transpose/b2b_matrix_transpose.hip",
              "line": 72,
              "column": 25
            },
            "code_context": "out[y * height + x] = tile[threadIdx.x][threadIdx.y];",
            "access_info": {
              "type": "write",
              "execution_count": 16384,
              "ir_bytes": 4,
              "isa_bytes": 4,
              "isa_instruction": "global_store_dword",
              "cache_lines": {
                "needed": 65536,
                "used": 65536
              }
            }
          }
        ]
      },
      "bank_conflicts": {
        "accesses": [
          {
            "source_location": {
              "file": "/workspace/audacious/maestro/examples/bank_conflict/b2b_matrix_transpose/b2b_matrix_transpose.hip",
              "line": 63,
              "column": 36
            },
            "code_context": "tile[threadIdx.y][threadIdx.x] = in[y * width + x];",
            "access_info": {
              "type": "write",
              "execution_count": 16384,
              "ir_bytes": 4,
              "total_conflicts": 0
            }
          },
          {
            "source_location": {
              "file": "/workspace/audacious/maestro/examples/bank_conflict/b2b_matrix_transpose/b2b_matrix_transpose.hip",
              "line": 72,
              "column": 27
            },
            "code_context": "out[y * height + x] = tile[threadIdx.x][threadIdx.y];",
            "access_info": {
              "type": "read",
              "execution_count": 16384,
              "ir_bytes": 4,
              "total_conflicts": 229376
            }
          }
        ]
      }
    }
  ],
  "metadata": {
    "version": "null",
    "kernels_found": 1,
    "timestamp": "2025-06-30 17:00:32",
    "gpu_info": {
      "architecture": "gfx90a",
      "cache_line_size": 128
    }
  }
}
54525952 bytes processed in 3.588229 seconds (15.2 MiB/s)
Memory Analysis Wrapper unloaded.
Comms Runner shutting down
Cache Watcher shutting down
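
For the post-processing step, a saved report can be loaded and reduced to the accesses that actually cause bank conflicts. The sketch below is illustrative only: it assumes the file written with -l output.json holds a single JSON document with the kernel_analyses layout shown above, and the names load_report and summarize_report, as well as the derived conflicts_per_execution figure, are hypothetical and not part of Omniprobe.

```python
import json


def load_report(path):
    """Load a report saved with `-l output.json` (assumed to be one JSON document)."""
    with open(path) as f:
        return json.load(f)


def summarize_report(report):
    """Return one row per access that reported at least one bank conflict."""
    rows = []
    for kernel in report.get("kernel_analyses", []):
        name = kernel["kernel_info"]["name"]
        for access in kernel.get("bank_conflicts", {}).get("accesses", []):
            info = access["access_info"]
            if info["total_conflicts"] == 0:
                continue  # skip conflict-free accesses
            rows.append({
                "kernel": name,
                "line": access["source_location"]["line"],
                "type": info["type"],
                "total_conflicts": info["total_conflicts"],
                # Hypothetical derived metric: average conflicts per execution.
                "conflicts_per_execution":
                    info["total_conflicts"] / info["execution_count"],
            })
    return rows


# Trimmed sample mirroring the bank_conflicts section of dispatch 1 above.
sample = {
    "kernel_analyses": [{
        "kernel_info": {"name": "matrixTransposeShared_0", "dispatch_id": 1},
        "bank_conflicts": {"accesses": [
            {"source_location": {"line": 40, "column": 36},
             "access_info": {"type": "write", "execution_count": 16384,
                             "ir_bytes": 4, "total_conflicts": 0}},
            {"source_location": {"line": 49, "column": 27},
             "access_info": {"type": "read", "execution_count": 16384,
                             "ir_bytes": 4, "total_conflicts": 229376}},
        ]},
    }]
}

rows = summarize_report(sample)
for row in rows:
    print(row)
```

With a real file, `summarize_report(load_report("output.json"))` would be the entry point, provided the file matches the assumed layout.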

@coleramos425 coleramos425 requested a review from rwvo July 10, 2025 16:44
@coleramos425 (Collaborator, Author)

The most recent commit (71a9d32) finalizes this PR by ensuring valid JSON syntax when producing a MemoryAnalysis report for an application with several dispatches.

The core challenge is creating a mechanism that reliably signals when all processing is complete, so that the JSON output can be finalized without disrupting the existing plugin architecture. Implementing this logic in the omniprobe module seems to be the simplest solution.
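
One way to picture the close-report idea is a writer that emits each dispatch as it arrives and writes the closing brackets only when the host signals that no more dispatches are coming. The Python sketch below illustrates that pattern under assumed names (ReportWriter, add_dispatch, close). It is not the code in this PR, and the single top-level kernel_analyses array is an assumption about the final layout.

```python
import io
import json


class ReportWriter:
    """Streams dispatch reports into one JSON document (hypothetical helper)."""

    def __init__(self, stream):
        self.stream = stream
        self.count = 0
        self.closed = False
        self.stream.write('{"kernel_analyses": [')

    def add_dispatch(self, analysis):
        if self.closed:
            raise RuntimeError("report already closed")
        # The separator goes before every entry except the first,
        # so no trailing comma is ever left behind.
        if self.count:
            self.stream.write(",")
        self.stream.write(json.dumps(analysis))
        self.count += 1

    def close(self):
        """Called once by the host when no more dispatches will arrive."""
        if self.closed:
            return
        metadata = json.dumps({"kernels_found": self.count})
        self.stream.write('], "metadata": ' + metadata + "}")
        self.closed = True


buf = io.StringIO()
writer = ReportWriter(buf)
# Kernel names trimmed from the example output for brevity.
writer.add_dispatch({"kernel_info": {"name": "matrixTransposeShared_0", "dispatch_id": 1}})
writer.add_dispatch({"kernel_info": {"name": "matrixTransposeShared_1", "dispatch_id": 2}})
writer.close()

report = json.loads(buf.getvalue())  # parses only because close() ran
print(report["metadata"]["kernels_found"])
```

The point of the explicit close step is that the writer itself cannot know which dispatch is the last one; only the component that owns the run can say so, which matches the reasoning for placing the signal in the omniprobe module.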

Signed-off-by: coleramos425 <colramos@amd.com>
Signed-off-by: coleramos425 <colramos@amd.com>
…ess for proper JSON formatting

Signed-off-by: coleramos425 <colramos@amd.com>
…patch report, implemented a mechanism in omniprobe to *close report*. This way we're certain we're done writing dispatches

Signed-off-by: coleramos425 <colramos@amd.com>
@coleramos425 coleramos425 force-pushed the colramos/memAnalysis-json-v2 branch from 71a9d32 to bf408bb July 10, 2025 16:53
@coleramos425 coleramos425 merged commit 6ebb58d into main Jul 11, 2025
4 of 16 checks passed
@coleramos425 coleramos425 deleted the colramos/memAnalysis-json-v2 branch September 25, 2025 15:09