An Example: Adding a Custom Analysis Tool
This section demonstrates how to extend AccelProf with a new tool, app_analysis, for basic GPU performance analysis. The steps are: add the tool model, implement the analysis logic, register the tool, add a GPU-side patch, and enable the patch in the backend runtime.
Step 1: Add Tool Model
Add app_analysis to the tool detection logic and internal enumeration types.
Modify the Tool Dispatcher
In bin/accelprof:
...
+elif [ ${TOOL} == "app_analysis" ]
+then
+ export YOSEMITE_TOOL_NAME=app_analysis
...
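The dispatcher passes the selection on through the YOSEMITE_TOOL_NAME environment variable. The following is a minimal sketch of how a C++ component could read it, assuming the value is consumed with std::getenv. The helper name read_tool_name and the empty-string fallback are hypothetical and not taken from the AccelProf sources.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not an AccelProf function): returns the tool
// selected by bin/accelprof, or an empty string when the variable
// YOSEMITE_TOOL_NAME is not set in the environment.
std::string read_tool_name() {
    const char* name = std::getenv("YOSEMITE_TOOL_NAME");
    return name ? std::string(name) : std::string();
}
```

Returning an empty string rather than a null pointer keeps later string comparisons safe when no tool was requested.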
Add to tool_type.h
typedef enum {
    CODE_CHECK = 0,
    APP_METRICE = 1,
    MEM_TRACE = 2,
    HOT_ANALYSIS = 3,
    UVM_ADVISOR = 4,
+   APP_ANALYSIS = 5,
    TOOL_NUMS = 6   // total number of tools; update it whenever an entry is added
} AnalysisTool_t;
Add to sanalyzer.h
typedef enum {
    GPU_NO_PATCH = 0,
    GPU_PATCH_APP_METRIC = 1,
    GPU_PATCH_MEM_TRACE = 2,
    GPU_PATCH_HOT_ANALYSIS = 3,
    GPU_PATCH_UVM_ADVISOR = 4,
+   GPU_PATCH_APP_ANALYSIS = 5,
} SanitizerPatchName_t;
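The TOOL_NUMS entry in the tool_type.h listing above works as a count of the tools, so it has to move whenever an entry is added. One way to catch a stale count at compile time is sketched below. The kToolNames table and the tool_name_of helper are hypothetical, and every name string except "app_analysis" is an assumption made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Copied from the tool_type.h listing, including the new entry.
typedef enum {
    CODE_CHECK = 0,
    APP_METRICE = 1,
    MEM_TRACE = 2,
    HOT_ANALYSIS = 3,
    UVM_ADVISOR = 4,
    APP_ANALYSIS = 5,
    TOOL_NUMS = 6
} AnalysisTool_t;

// Fails to compile if a tool is appended without bumping TOOL_NUMS.
static_assert(APP_ANALYSIS + 1 == TOOL_NUMS,
              "TOOL_NUMS must stay one past the last tool");

// Hypothetical lookup table: one name per tool, sized by TOOL_NUMS.
// Listing more names than TOOL_NUMS allows is also a compile error.
static const std::array<const char*, TOOL_NUMS> kToolNames = {{
    "code_check", "app_metric", "mem_trace",
    "hot_analysis", "uvm_advisor", "app_analysis",
}};

// Hypothetical helper: maps an enum value back to its name.
inline std::string tool_name_of(AnalysisTool_t tool) {
    assert(tool < TOOL_NUMS);
    return kToolNames[static_cast<std::size_t>(tool)];
}
```

With this in place, tool_name_of(APP_ANALYSIS) yields "app_analysis".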
Step 2: Implement Analysis Logic
Define the structure of your analysis tool by inheriting from the Tool base class.
app_analysis.h
// ...header file defining the AppAnalysis class
app_analysis.cpp
// ...implementation of event callbacks and GPU-side analysis logic
Each callback handles a specific type of event (e.g., memory allocations, kernel launches). You can modify or extend the callbacks for custom behavior.
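The event-driven shape described here can be sketched as follows. The real Tool interface is not shown in this section, so the base class below is a hypothetical stand-in. The callback names on_mem_alloc and on_kernel_launch and the statistics accessors are assumptions for illustration, not the actual AccelProf API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for AccelProf's Tool base class; the callback
// names and signatures are assumptions, not the real interface.
class Tool {
public:
    virtual ~Tool() = default;
    virtual void on_mem_alloc(std::uint64_t addr, std::size_t bytes) = 0;
    virtual void on_kernel_launch(const std::string& kernel_name) = 0;
};

// Sketch of an analysis tool that accumulates simple statistics
// from the events it receives.
class AppAnalysis : public Tool {
public:
    void on_mem_alloc(std::uint64_t /*addr*/, std::size_t bytes) override {
        ++alloc_count_;
        allocated_bytes_ += bytes;
    }
    void on_kernel_launch(const std::string& /*kernel_name*/) override {
        ++kernel_count_;
    }

    std::size_t alloc_count() const { return alloc_count_; }
    std::size_t kernel_count() const { return kernel_count_; }

    // Mean allocation size; only meaningful after at least one allocation.
    double average_alloc_bytes() const {
        assert(alloc_count_ > 0 && "no allocations recorded");
        return static_cast<double>(allocated_bytes_) / alloc_count_;
    }

private:
    std::size_t alloc_count_ = 0;
    std::size_t allocated_bytes_ = 0;
    std::size_t kernel_count_ = 0;
};
```

Keeping the counters private and updating them only inside the callbacks means the analysis state changes in exactly one place per event type.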
Step 3: Enable the Tool in Sanalyzer
In sanalyzer.cpp, register the new tool and select its patch. The three fragments below go in different places: the include list, the chain that dispatches on the tool name, and the chain that selects the patch.
+ #include "tools/app_analysis.h"
...
+ else if (std::string(tool_name) == "app_analysis") {
+     tool = APP_ANALYSIS;
+     _tools.emplace(APP_ANALYSIS, std::make_shared<AppAnalysis>());
+ }
...
+ else if (tool == APP_ANALYSIS) {
+     options.patch_name = GPU_PATCH_APP_ANALYSIS;
+     options.patch_file = "gpu_patch_app_analysis.fatbin";
+ }
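To show how the two else-if fragments fit together, here is a self-contained sketch. The enums, Tool, AppAnalysis, and Options are reduced stand-ins for the real definitions. The free functions enable_tool and select_patch and the ToolMap alias are hypothetical wrappers introduced only so the logic can be compiled and exercised on its own.

```cpp
#include <map>
#include <memory>
#include <string>

// Reduced stand-ins; the real definitions live in tool_type.h,
// sanalyzer.h, and tools/app_analysis.h.
enum AnalysisTool_t { UVM_ADVISOR = 4, APP_ANALYSIS = 5 };
enum SanitizerPatchName_t { GPU_NO_PATCH = 0, GPU_PATCH_APP_ANALYSIS = 5 };
struct Tool { virtual ~Tool() = default; };
struct AppAnalysis : Tool {};
struct Options {  // hypothetical shape of the options object
    SanitizerPatchName_t patch_name = GPU_NO_PATCH;
    std::string patch_file;
};
using ToolMap = std::map<AnalysisTool_t, std::shared_ptr<Tool>>;

// Mirrors the tool-name fragment: map the name to an enum value and
// create the tool instance. Returns false for names unknown to this sketch.
bool enable_tool(const std::string& tool_name, ToolMap& tools, AnalysisTool_t& tool) {
    if (tool_name == "app_analysis") {
        tool = APP_ANALYSIS;
        tools.emplace(APP_ANALYSIS, std::make_shared<AppAnalysis>());
        return true;
    }
    return false;
}

// Mirrors the patch fragment: choose the patch that belongs to the tool.
void select_patch(AnalysisTool_t tool, Options& options) {
    if (tool == APP_ANALYSIS) {
        options.patch_name = GPU_PATCH_APP_ANALYSIS;
        options.patch_file = "gpu_patch_app_analysis.fatbin";
    }
}
```

The tool enum is the only link between the two steps, which is why both APP_ANALYSIS and GPU_PATCH_APP_ANALYSIS have to exist before this code compiles.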
Step 4: Add a GPU Patch
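Write the patch source. It is built into the gpu_patch_app_analysis.fatbin file referenced in Step 3, and its functions are injected into kernels and run during kernel execution.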
gpu_patch_app_analysis.cu
// ...implementation of CommonCallback and memory access tracking functions
These patch functions log global, shared, and local memory accesses during kernel execution and update associated tracking data structures.
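The bookkeeping such patch functions perform can be modeled on the host. The sketch below is plain C++, not device code and not the contents of gpu_patch_app_analysis.cu. MemSpace, AccessTracker, and record_access are hypothetical names, and std::atomic stands in for the atomic updates a device-side patch would likely need when many threads report accesses at once.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical classification of the memory spaces named in the text.
enum class MemSpace : std::size_t { Global = 0, Shared = 1, Local = 2, Count = 3 };

// Hypothetical tracking structure: one access counter per memory
// space plus the total number of bytes touched.
struct AccessTracker {
    std::array<std::atomic<std::uint64_t>,
               static_cast<std::size_t>(MemSpace::Count)> accesses{};
    std::atomic<std::uint64_t> bytes{0};
};

// Called once per instrumented access. Concurrent callers are safe
// because both updates are atomic increments.
inline void record_access(AccessTracker& tracker, MemSpace space, std::uint32_t size) {
    tracker.accesses[static_cast<std::size_t>(space)]
        .fetch_add(1, std::memory_order_relaxed);
    tracker.bytes.fetch_add(size, std::memory_order_relaxed);
}
```

Relaxed ordering is enough here because the counters are independent totals that are read only after all accesses have been recorded.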
Step 5: Enable Patch Logic in the Backend
In the backend runtime, add a GPU_PATCH_APP_ANALYSIS branch at each point that dispatches on the patch name. The four fragments below belong to four separate stages: instruction patching, memory state allocation, access state initialization, and final analysis.
+ else if (sanitizer_options.patch_name == GPU_PATCH_APP_ANALYSIS) {
+     // Patch instructions for memory access
+ }
...
+ else if (sanitizer_options.patch_name == GPU_PATCH_APP_ANALYSIS) {
+     // Memory state allocation
+ }
...
+ else if (sanitizer_options.patch_name == GPU_PATCH_APP_ANALYSIS) {
+     // Initialization of access state and memory copy
+ }
...
+ else if (sanitizer_options.patch_name == GPU_PATCH_APP_ANALYSIS) {
+     // Final analysis and data transfer from device to host
+ }
Summary
By following these steps, you can extend AccelProf with a new tool for custom GPU analysis:
Add enum entries and shell hooks
Implement event-driven analysis logic
Register and activate your tool
Write a custom memory patch and wire it into the runtime backend
Because each tool is confined to an enum entry, a Tool subclass, and a patch file, new analyses can be added without changing the existing tools.