AMD GPU Services (AGS)
API for writing top-of-pipe and bottom-of-pipe markers to help track down GPU hangs.
Classes
struct AGSBreadcrumbMarker
    The breadcrumb marker struct used by agsDriverExtensionsDX11_WriteBreadcrumb.

Typedefs
typedef struct AGSBreadcrumbMarker AGSBreadcrumbMarker
    The breadcrumb marker struct used by agsDriverExtensionsDX11_WriteBreadcrumb.

Functions
AMD_AGS_API AGSReturnCode agsDriverExtensionsDX11_WriteBreadcrumb(AGSContext *context, const AGSBreadcrumbMarker *marker)
    Function to write a breadcrumb marker.
The API is available if AGSDX11ReturnedParams::ExtensionsSupported::breadcrumbMarkers is set.
To use the API, specify a nonzero value in AGSDX11ExtensionParams::numBreadcrumbMarkers. This enables the API (if available) and allocates a system memory buffer, which is returned to the user in AGSDX11ReturnedParams::breadcrumbBuffer.
The user can then write markers before and after draw calls using agsDriverExtensionsDX11_WriteBreadcrumb.
A top-of-pipe (TOP) command is scheduled for execution as soon as the command processor (CP) reaches the command. A bottom-of-pipe (BOP) command is scheduled for execution once the previous rendering commands (draw and dispatch) finish execution. TOP and BOP commands do not block the CP, i.e. the CP schedules the command for execution and then proceeds to the next command without waiting. To use TOP and BOP commands effectively, it is important to understand how they interact with rendering commands:
When the CP encounters a rendering command it queues it for execution and moves to the next command. The queued rendering commands are issued in order. There can be multiple rendering commands running in parallel. When a rendering command is issued we say it is at the top of the pipe. When a rendering command finishes execution we say it has reached the bottom of the pipe.
A BOP command remains in a waiting queue and is executed once prior rendering commands finish. The queue of BOP commands is limited to 64 entries on GCN generations 1 through 5. If the 64-entry limit is reached, the CP stops queueing both BOP commands and rendering commands. Developers should limit the number of BOP commands that write markers to avoid contention. In general, developers should limit both TOP and BOP commands to avoid stalling the CP.
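The scheduling rules above can be illustrated with a toy timing model. This is a sketch, not a description of how the driver or hardware is implemented: it assumes the CP spends one time unit per command, never stalls (so the 64-entry BOP queue limit is ignored), and that each draw starts as soon as the CP reaches it. `Command`, `simulate` and `exampleStream` are hypothetical names, and the command sequence in `exampleStream` is inferred from the marker-by-marker description that follows.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Toy model only; none of these names come from AGS.
constexpr double kNever = std::numeric_limits<double>::infinity();

enum class Kind { Draw, TopMarker, BottomMarker };

struct Command {
    Kind   kind;
    int    marker;    // marker number (unused for draws)
    double duration;  // draw execution time; kNever models a hanging draw
};

struct MarkerWrite {
    int    marker;
    double time;      // kNever if the marker is never written
};

// Assumption: the CP spends one time unit per command and never stalls.
std::vector<MarkerWrite> simulate(const std::vector<Command>& stream) {
    std::vector<MarkerWrite> writes;
    double cpTime = 0.0;          // time at which the CP reaches the command
    double priorDrawsDone = 0.0;  // time at which all earlier draws have finished
    for (const Command& c : stream) {
        switch (c.kind) {
        case Kind::Draw:
            // Draws run in parallel; track when the slowest one so far ends.
            priorDrawsDone = std::max(priorDrawsDone, cpTime + c.duration);
            break;
        case Kind::TopMarker:
            // TOP: written as soon as the CP reaches the command.
            writes.push_back({c.marker, cpTime});
            break;
        case Kind::BottomMarker:
            // BOP: written once every earlier draw has finished.
            writes.push_back({c.marker, std::max(cpTime, priorDrawsDone)});
            break;
        }
        cpTime += 1.0;  // the CP moves on without waiting
    }
    return writes;
}

// Inferred sequence: TOP 1, BOP 2, BOP 3, DrawX, BOP 4, BOP 5, TOP 6.
std::vector<Command> exampleStream(double drawXDuration) {
    return {
        {Kind::TopMarker, 1, 0.0},
        {Kind::BottomMarker, 2, 0.0},
        {Kind::BottomMarker, 3, 0.0},
        {Kind::Draw, 0, drawXDuration},
        {Kind::BottomMarker, 4, 0.0},
        {Kind::BottomMarker, 5, 0.0},
        {Kind::TopMarker, 6, 0.0},
    };
}
```

Calling `simulate(exampleStream(100.0))` gives marker 6 an earlier write time than markers 4 and 5, and passing `kNever` as the duration leaves markers 4 and 5 unwritten while marker 6 is still written.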
In the above example, the CP writes markers 1, 2 and 3 without waiting:
- Marker 1 is TOP, so it is independent of other commands.
- There is no wait for markers 2 and 3 because no draws precede those BOP commands.
- Marker 4 is only written once DrawX finishes execution.
- Marker 5 does not wait for any additional draws, so it is written right after marker 4.
- Marker 6 can be written as soon as the CP reaches the command. For instance, the CP may well write marker 6 while DrawX is still running, in which case marker 6 is written before markers 4 and 5.
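The command stream for the next example can be sketched as follows. It is a reconstruction inferred from the description below, where each draw is bracketed by a TOP marker before it and a BOP marker after it. `Recorder` and `bracketedDraw` are hypothetical helpers that only log the submission order; in a real application the marker writes would go through agsDriverExtensionsDX11_WriteBreadcrumb.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for this sketch; these are not AGS declarations.
enum class Pipe { Top, Bottom };

struct Recorder {
    std::vector<std::string> log;  // commands in submission order
    unsigned nextMarker = 1;       // next marker number to hand out

    // A real application would call agsDriverExtensionsDX11_WriteBreadcrumb here.
    void writeMarker(Pipe p) {
        log.push_back((p == Pipe::Top ? "TOP " : "BOP ") +
                      std::to_string(nextMarker++));
    }

    void draw(const std::string& name) { log.push_back(name); }
};

// Brackets one draw: a TOP marker before it and a BOP marker after it.
void bracketedDraw(Recorder& r, const std::string& name) {
    r.writeMarker(Pipe::Top);     // written when the CP gets here
    r.draw(name);
    r.writeMarker(Pipe::Bottom);  // written once prior draws have finished
}

std::vector<std::string> bracketedExample() {
    Recorder r;
    bracketedDraw(r, "DrawX");
    bracketedDraw(r, "DrawY");
    return r.log;
}
```

`bracketedExample()` yields the order TOP 1, DrawX, BOP 2, TOP 3, DrawY, BOP 4.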
In this example:
- Marker 1 is written before the start of DrawX.
- Marker 2 is written once DrawX finishes execution.
- Similarly, marker 3 is written before the start of DrawY.
- Marker 4 is written once DrawY finishes execution.
In case of a GPU hang, if markers 1 and 3 are written but markers 2 and 4 are missing, we can conclude that:
- The CP has reached both the DrawX and DrawY commands, since markers 1 and 3 are present.
- Because markers 2 and 4 are missing, either DrawX is hanging while DrawY is at the top of the pipe, or both DrawX and DrawY started and both are hanging simultaneously.
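The next example places a TOP marker before the first draw and a BOP marker after every draw. The narrowing-down it describes can be sketched as below. The helper name `suspectsFromBottomMarkers` and the way the evidence is passed in (one flag per draw for its trailing BOP marker) are assumptions made for illustration; the leading TOP marker plays no part in the narrowing and is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// bopWritten[i] says whether the BOP marker placed after draws[i] was found
// in the breadcrumb buffer after the hang.
std::vector<std::string> suspectsFromBottomMarkers(
    const std::vector<std::string>& draws,
    const std::vector<bool>& bopWritten) {
    // A written BOP marker shows that every draw before it has finished,
    // so only the draws after the last written BOP marker remain suspect.
    std::size_t firstSuspect = 0;
    for (std::size_t i = 0; i < draws.size() && i < bopWritten.size(); ++i) {
        if (bopWritten[i]) firstSuspect = i + 1;
    }
    return std::vector<std::string>(
        draws.begin() + static_cast<std::ptrdiff_t>(firstSuspect), draws.end());
}
```

With draws DrawX, DrawY and DrawZ, no written BOP marker leaves all three as suspects, and each additional written BOP marker removes one draw from the front of the list.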
In this example:
- Marker 1 is written before the start of DrawX.
- Marker 2 is written once DrawX finishes.
- Marker 3 is written once DrawY finishes.
- Marker 4 is written once DrawZ finishes.
From this we can conclude:
- If the GPU hangs and only marker 1 is written, the hang is happening in DrawX, DrawY or DrawZ.
- If the GPU hangs and only markers 1 and 2 are written, the hang is happening in DrawY or DrawZ.
- If the GPU hangs and only marker 4 is missing, the hang is happening in DrawZ.
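The text for the next example does not state the marker types. Its conclusions are consistent with a TOP marker written after each draw, and that is what this sketch assumes. `suspectsFromTopMarkers` is a hypothetical helper, not part of AGS.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// topWritten[i] says whether the TOP marker placed after draws[i] was found
// in the breadcrumb buffer after the hang (assumed layout: draw, then marker).
std::vector<std::string> suspectsFromTopMarkers(
    const std::vector<std::string>& draws,
    const std::vector<bool>& topWritten) {
    // A written TOP marker shows the CP got past the draw in front of it,
    // but says nothing about whether that draw finished. Every draw up to
    // the last written TOP marker therefore remains suspect.
    std::size_t reached = 0;
    for (std::size_t i = 0; i < draws.size() && i < topWritten.size(); ++i) {
        if (topWritten[i]) reached = i + 1;
    }
    // If no marker was written, the result is empty: the markers give no
    // evidence about any draw.
    return std::vector<std::string>(
        draws.begin(), draws.begin() + static_cast<std::ptrdiff_t>(reached));
}
```

Unlike BOP markers, each additional written TOP marker widens the suspect list, which is why TOP markers alone cannot isolate a single hanging draw.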
In this example:
- If the GPU hangs and only marker 1 is written, the hang is happening in DrawX.
- If the GPU hangs and only markers 1 and 2 are written, the hang is happening in DrawX or DrawY.
- If the GPU hangs and all 3 markers are written, the hang is happening in any of DrawX, DrawY or DrawZ.
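The final example follows each draw with both a TOP and a BOP marker, which gives two pieces of evidence per draw. A minimal sketch of how that evidence maps to a draw's state, using hypothetical names (`DrawEvidence`, `DrawState`, `classify`) that are not part of AGS:

```cpp
#include <cassert>

// Evidence gathered from the breadcrumb buffer for one draw that is followed
// by a TOP marker and then a BOP marker.
struct DrawEvidence {
    bool topWritten;  // TOP marker after the draw was written
    bool bopWritten;  // BOP marker after the draw was written
};

enum class DrawState { NotReached, InFlight, Finished };

DrawState classify(const DrawEvidence& e) {
    if (e.bopWritten) return DrawState::Finished;  // bottom of the pipe
    if (e.topWritten) return DrawState::InFlight;  // reached, not finished
    return DrawState::NotReached;                  // no evidence the CP got here
}
```

Draws classified as `InFlight` are the hang candidates. For instance, if DrawX has both markers written and DrawY has only its TOP marker, DrawY is the only candidate.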
In this example:
- Marker 1 is written right after DrawX is queued for execution.
- Marker 2 is only written once DrawX finishes execution.
- Marker 3 is written right after DrawY is queued for execution.
- Marker 4 is only written once DrawY finishes execution.
If marker 1 is written, we know that the CP has reached the DrawX command (DrawX is at the top of the pipe). If marker 2 is written, we know that DrawX has finished execution (DrawX is at the bottom of the pipe).
- If the GPU hangs and only markers 1 and 3 are written, the hang is happening in DrawX or DrawY.
- If the GPU hangs and only marker 1 is written, the hang is happening in DrawX.
- If the GPU hangs and only marker 4 is missing, the hang is happening in DrawY.
In the event of a GPU hang, the user can inspect the system memory buffer to determine which draw has caused the hang. For example:
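A minimal sketch of such an inspection, under stated assumptions: the breadcrumb buffer is treated as an array of 64-bit values with one slot per marker, the application cleared it to zero beforehand, and it only writes nonzero marker data. None of these details are confirmed by the text above, and `describeBreadcrumbs` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Assumed layout: one 64-bit slot per marker, zero meaning "never written".
std::string describeBreadcrumbs(const std::uint64_t* buffer, std::size_t count) {
    std::ostringstream out;
    for (std::size_t i = 0; i < count; ++i) {
        out << "slot " << i << ": ";
        if (buffer[i] != 0) {
            out << buffer[i];          // marker data written by the GPU
        } else {
            out << "(not written)";    // the GPU never executed this write
        }
        out << '\n';
    }
    return out.str();
}
```

The slots still reported as not written identify the markers the GPU never reached, which can then be mapped back to draws using the reasoning in the examples above.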
The console output would resemble something like:
AMD_AGS_API AGSReturnCode agsDriverExtensionsDX11_WriteBreadcrumb(AGSContext *context, const AGSBreadcrumbMarker *marker)
Function to write a breadcrumb marker.
This function inserts a marker write operation into the GPU command stream. If the GPU hangs before reaching that operation, the write command is never executed and the marker is never written to memory.
To use this function, AGSDX11ExtensionParams::numBreadcrumbMarkers must be set to a nonzero value.
[in] context: Pointer to a context.
[in] marker: Pointer to a marker.