AMD Open Capture and Analysis Tool (OCAT)
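If you want to see in real time how well a game performs on your machine, with low overhead, AMD OCAT has you covered.
Performance Guide
Our AMD Ryzen™ Performance Guide walks you through the optimization process with a collection of tips, tricks, and know-how to support your pursuit of peak performance.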
PresentMon is a command-line interface (CLI) tool that records frame times such as MsBetweenPresents.
Example
PresentMon-1.6.0-x64.exe -process_name "MyGame.exe" -stop_existing_session -terminate_on_proc_exit -terminate_after_timed -timed 60 -output_file "%CD%\result\presentmon.csv"
OCAT is a graphical user interface (GUI) tool with hotkey support that records frame times based on PresentMon.
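A frame-time log such as the presentmon.csv written by the example command can be reduced to a few headline numbers. The sketch below is one possible way to do that. It assumes the CSV has a header row containing a MsBetweenPresents column and no quoted fields with embedded commas. The FrameStats struct and the summarizeFrameTimes function are hypothetical names introduced for illustration; they are not part of PresentMon or OCAT.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Summary of a frame-time capture. All names here are illustrative.
struct FrameStats {
    std::size_t frames = 0;   // number of rows with a parsable frame time
    double averageMs = 0.0;   // mean of MsBetweenPresents
    double averageFps = 0.0;  // 1000 / averageMs
    double worstMs = 0.0;     // largest single frame time
};

// Splits one CSV line on commas.
// Assumption: fields are not quoted and contain no embedded commas.
static std::vector<std::string> splitCsvLine(std::string line) {
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();  // tolerate Windows line endings
    }
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, ',')) {
        fields.push_back(field);
    }
    return fields;
}

// Reads a CSV stream and summarizes the MsBetweenPresents column.
// Returns an all-zero FrameStats when the column is missing.
FrameStats summarizeFrameTimes(std::istream& csv) {
    FrameStats stats;
    std::string line;
    if (!std::getline(csv, line)) {
        return stats;
    }
    const std::vector<std::string> header = splitCsvLine(line);
    const auto it = std::find(header.begin(), header.end(),
                              std::string("MsBetweenPresents"));
    if (it == header.end()) {
        return stats;
    }
    const std::size_t column = static_cast<std::size_t>(it - header.begin());

    double totalMs = 0.0;
    while (std::getline(csv, line)) {
        const std::vector<std::string> fields = splitCsvLine(line);
        if (fields.size() <= column) {
            continue;  // skip short or empty rows
        }
        double ms = 0.0;
        try {
            ms = std::stod(fields[column]);
        } catch (const std::exception&) {
            continue;  // skip rows whose value is not a number
        }
        totalMs += ms;
        stats.worstMs = std::max(stats.worstMs, ms);
        ++stats.frames;
    }
    if (stats.frames > 0 && totalMs > 0.0) {
        stats.averageMs = totalMs / static_cast<double>(stats.frames);
        stats.averageFps = 1000.0 / stats.averageMs;
    }
    return stats;
}
```

To use it on a real capture, open the file with std::ifstream and pass the stream to summarizeFrameTimes. The worst frame time is often more informative than the average when hunting for hitches.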
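WPA is a highly configurable tool for finding system performance bottlenecks, and it is well suited to filtering and visualizing call stacks.
It analyzes logs created by wpr.exe or xperf.exe. wpr.exe is included in all Windows 10 installations. xperf.exe is included in the Windows SDK.
GPUView is a tool for analyzing GPU performance, particularly the processing of direct memory access (DMA) buffers.
You can use the Visual Studio Concurrency Visualizer to locate performance bottlenecks, CPU underutilization, thread contention, cross-core thread migration, synchronization delays, DirectX activity, areas of overlapped I/O, and other information.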
Radeon GPU Analyzer (RGA) is an offline compiler and performance analysis tool for DirectX, Vulkan®, SPIR-V™, OpenGL®, and OpenCL™.
In Build.h, #define FORCE_USE_STATS and #define STATS should never be enabled in shipping builds. ALLOW_CONSOLE_IN_SHIPPING can be handy.
Run the Unreal Engine UE4Editor MapCheck to find errors.
Use the Unity® AssetPostprocessor to enforce minimum standards.
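Damage caused by using your AMD processor outside of specification or in excess of factory settings is not covered under your AMD product warranty and may not be covered by your system manufacturer's warranty. Operating your AMD processor outside of specification or in excess of factory settings, including but not limited to overclocking, may damage or shorten the life of your processor or other system components, create system instabilities (e.g., data loss and corrupted images), and in extreme cases may result in total system failure. AMD does not provide support or service for issues or damage related to use of an AMD processor outside of processor specifications or in excess of factory settings.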
bcdedit.exe /deletevalue useplatformclock
bcdedit.exe /set useplatformclock yes
rem Run as administrator
rem Disable Steam Shader Pre-Caching before running this script
rem Reboot after running this script to clear any shaders still in system memory
setlocal enableextensions
cd /d "%~dp0"
rmdir /s /q "%LOCALAPPDATA%\D3DSCache"
rmdir /s /q "%LOCALAPPDATA%\AMD\DxCache"
rmdir /s /q "%LOCALAPPDATA%\AMD\GLCache"
rmdir /s /q "%LOCALAPPDATA%\AMD\VkCache"
rmdir /s /q "%ProgramData%\NVIDIA Corporation\NV_Cache"
rmdir /s /q "%ProgramFiles(x86)%\Steam\steamapps\shadercache"

Hypervisor-Protected Code Integrity (HVCI) is labeled Memory integrity in the Windows Security app.
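symstore and symbol paths can be powerful tools for loading vendor symbols and for giving hints to tools that do not check the local directory.
System environment variable _NT_SYMBOL_PATH:
_NT_SYMBOL_PATH=cache*c:\symbols;srv*https://download.amd.com/dir/bin;srv*https://driver-symbols.nvidia.com/;srv*http://msdl.microsoft.com/download/symbols
Add "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64" to PATH.
symstore.exe add /r /f *.pdb /s c:\symbols /t "MyProject"

Typically, if the GPU is idle more than 5% of the time, the application is CPU bound.
Look for bubbles of idle work on the GPU in tools such as RGP, GPUView, and the Visual Studio Concurrency Visualizer.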
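The 5% rule of thumb can be written down as a small helper. This is only a sketch: the function names, the Bound enum, and the idea of feeding it a GPU busy time measured over some capture interval are assumptions made for illustration, and treating everything at or below the threshold as "likely GPU bound" is a simplification that the guide itself does not state.

```cpp
// Coarse classification of where a frame's time is going.
// Hypothetical names; not taken from any AMD tool.
enum class Bound { LikelyCpu, LikelyGpu };

// Percentage of the capture interval during which the GPU had no work.
// gpuBusyMs: time the GPU spent executing work (for example, read from a profiler).
// intervalMs: total length of the capture interval.
double gpuIdlePercent(double gpuBusyMs, double intervalMs) {
    if (intervalMs <= 0.0) {
        return 0.0;  // no data, report no idle time
    }
    double idleMs = intervalMs - gpuBusyMs;
    if (idleMs < 0.0) {
        idleMs = 0.0;  // guard against measurement noise
    }
    return 100.0 * idleMs / intervalMs;
}

// Applies the rule of thumb: GPU idle more than 5% suggests a CPU-bound application.
// Assumption: anything else is reported as likely GPU bound.
Bound classifyBound(double gpuBusyMs, double intervalMs,
                    double idleThresholdPercent = 5.0) {
    return gpuIdlePercent(gpuBusyMs, intervalMs) > idleThresholdPercent
               ? Bound::LikelyCpu
               : Bound::LikelyGpu;
}
```

For example, a GPU that is busy for 900 ms of a 1000 ms capture is idle 10% of the time, which this helper reports as likely CPU bound. The result is a starting point for investigation in the profilers, not a verdict.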
Developers have several tools and methods available for detecting boundness.
Radeon GPU Profiler (RGP)

rem run as administrator
rem add "C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\gpuview" to path
setlocal enableextensions
cd /d "%~dp0"
rem switch active foreground window back to the game application
timeout.exe /t 5
call log.cmd light
timeout.exe /t 5
call log.cmd
rem open Merged.etl

rem run as administrator
setlocal enableextensions
cd /d "%~dp0"
rem switch active foreground window back to the game application
timeout.exe /t 5
wpr.exe -start gpu -filemode
timeout.exe /t 5
wpr.exe -stop out.etl
rem open out.etl

| Command | Recommended value |
|---|---|
| r.rhicmdbypass | 0 |
| r.rhicmdusedeferredcontexts | 1 |
| r.rhicmduseparallelalgorithms | 1 |
| r.rhithread.enable | 1 |
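Verify parallel DX12 pipeline state creation using a cold shader cache.
Add the GPUView directory to PATH.
rem run as administrator
rem clear shader cache
call log.cmd
rem collect samples while game is starting and calling D3D12.dll!CDevice::CreatePipelineState
call log.cmd
Open the etl log file.
Find D3D12.dll!CDevice::CreatePipelineState. This find command highlights the relevant samples in the CPU Usage (Precise) graph.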

Add the GPUView directory to PATH.
rem run as administrator
rem add "C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\gpuview" to path
setlocal enableextensions
cd /d "%~dp0"
rem switch active foreground window back to the game application
timeout.exe /t 5
call log.cmd
rem collect samples while game is playing and rendering frames. 1 second should be more than enough data.
timeout.exe /t 1
call log.cmd
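WinDbg can be used to set breakpoints, log, skip functions, edit memory, or edit registers.
In the Windows x64 calling convention, the first four integer or pointer arguments are passed in RCX, RDX, R8, and R9. The fifth and higher arguments are passed on the stack.
A steam_appid.txt file or the SteamAppId system environment variable is required to launch the executable from WinDbg.

Verifying DXGI_GPU_PREFERENCE_HIGH_PERFORMANCE: the expected value is DXGI_GPU_PREFERENCE_HIGH_PERFORMANCE (2).
bp dxgi!CDXGIFactory::EnumAdapterByGpuPreference ".printf \"FOUND DXGIFactory::EnumAdapterByGpuPreference DXGI_GPU_PREFERENCE=%x\\n\",@r8"

Check whether GetLogicalProcessorInformation(Ex) calls pass a non-zero input buffer length and return success. A call with a length of 0 is used to get the buffer length to malloc; the call that follows should succeed (return 1).
bp kernelbase!GetLogicalProcessorInformation "bp /1 @$ra \".printf \\\"GetLogicalProcessorInformation returned %i\\\", @rax; .echo; g\"; .printf \"GetLogicalProcessorInformation input buffer length 0x%x\", poi(@rdx); .echo; g"
bp kernelbase!GetLogicalProcessorInformationEx "bp /1 @$ra \".printf \\\"GetLogicalProcessorInformationEx returned %i\\\", @rax; .echo; g\"; .printf \"GetLogicalProcessorInformationEx input buffer length 0x%x\", poi(@r8); .echo; g"

DirectX APIs refer to accelerated processing units (APUs) or integrated graphics by the term Unified Memory Architecture (UMA).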
//
// Copyright (c) 2021 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
//

// D3D12 sample: prints adapter memory sizes, UMA status, and the video memory budget.
#include <cstdio>
#include <dxgi1_4.h>
#include <d3d12.h>

#pragma comment( lib, "dxgi" )
#pragma comment( lib, "d3d12" )

// Returns true when the device reports a Unified Memory Architecture.
bool isUMA(ID3D12Device* pDevice)
{
    bool result = false;
    D3D12_FEATURE_DATA_ARCHITECTURE data = {};
    if (S_OK == pDevice->CheckFeatureSupport(D3D12_FEATURE_ARCHITECTURE, &data, sizeof(data)))
    {
        result = data.UMA != FALSE;
    }
    return result;
}

int main()
{
    ID3D12Device* pDevice = nullptr;
    if (SUCCEEDED(D3D12CreateDevice(NULL, D3D_FEATURE_LEVEL_11_0, __uuidof(ID3D12Device), (void**)&pDevice)))
    {
        IDXGIFactory* pFactory = nullptr;
        IDXGIFactory4* pFactory4 = nullptr;
        IDXGIAdapter* pAdapter = nullptr;
        if (SUCCEEDED(CreateDXGIFactory(__uuidof(IDXGIFactory), (void**)(&pFactory)))
            && SUCCEEDED(pFactory->QueryInterface(__uuidof(IDXGIFactory4), (void**)&pFactory4)))
        {
            LUID luid = pDevice->GetAdapterLuid();
            DXGI_ADAPTER_DESC desc;
            if (SUCCEEDED(pFactory4->EnumAdapterByLuid(luid, __uuidof(IDXGIAdapter), (void**)&pAdapter))
                && SUCCEEDED(pAdapter->GetDesc(&desc)))
            {
                printf("DedicatedVideoMemory %zu\n", desc.DedicatedVideoMemory);
                printf("DedicatedSystemMemory %zu\n", desc.DedicatedSystemMemory);
                printf("SharedSystemMemory %zu\n", desc.SharedSystemMemory);
                printf("isUMA %i\n", isUMA(pDevice));

                // Alternative method: sum the adapter description fields.
                UINT64 budget = desc.DedicatedVideoMemory;
                if (isUMA(pDevice))
                {
                    budget += desc.DedicatedSystemMemory + desc.SharedSystemMemory;
                }

                // Preferred method: ask the OS for the current budget.
                IDXGIAdapter3* pAdapter3 = nullptr;
                DXGI_QUERY_VIDEO_MEMORY_INFO info = {};
                if (SUCCEEDED(pAdapter->QueryInterface(__uuidof(IDXGIAdapter3), (void**)&pAdapter3))
                    && SUCCEEDED(pAdapter3->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_LOCAL, &info)))
                {
                    budget = info.Budget;
                }
                if (pAdapter3) pAdapter3->Release();
                printf("budget %llu\n", static_cast<unsigned long long>(budget));
            }
        }
        // Release COM interfaces in reverse order of acquisition.
        if (pAdapter) pAdapter->Release();
        if (pFactory4) pFactory4->Release();
        if (pFactory) pFactory->Release();
        pDevice->Release();
    }
}

// D3D11 sample: prints adapter memory sizes, UMA status, and the video memory budget.
#include <cstdio>
#include <dxgi1_4.h>
#include <d3d11_3.h>

#pragma comment( lib, "dxgi" )
#pragma comment( lib, "d3d11" )

// Returns true when the device reports a Unified Memory Architecture.
bool isUMA(ID3D11Device* pDevice)
{
    bool result = false;
    ID3D11Device3* pD3D11Device3 = nullptr;
    if (S_OK == pDevice->QueryInterface(IID_PPV_ARGS(&pD3D11Device3)) && pD3D11Device3)
    {
        D3D11_FEATURE_DATA_D3D11_OPTIONS2 data = {};
        if (S_OK == pD3D11Device3->CheckFeatureSupport(D3D11_FEATURE_D3D11_OPTIONS2, &data, sizeof(data)))
        {
            result = data.UnifiedMemoryArchitecture != FALSE;
        }
        pD3D11Device3->Release();
    }
    return result;
}

int main()
{
    UINT flags = 0; // D3D11_CREATE_DEVICE_SINGLETHREADED;
    D3D_FEATURE_LEVEL featureLevels[] = { D3D_FEATURE_LEVEL_11_0 };
    UINT numFeatureLevels = ARRAYSIZE(featureLevels);
    D3D_FEATURE_LEVEL featureLevel;
    ID3D11Device* pDevice = nullptr;
    ID3D11DeviceContext* pImmediateContext = nullptr;
    if (SUCCEEDED(D3D11CreateDevice(NULL, D3D_DRIVER_TYPE_HARDWARE, NULL, flags, featureLevels,
        numFeatureLevels, D3D11_SDK_VERSION, &pDevice, &featureLevel, &pImmediateContext)))
    {
        IDXGIDevice* pDXGIDevice = nullptr;
        IDXGIAdapter* pAdapter = nullptr;
        DXGI_ADAPTER_DESC desc;
        if (SUCCEEDED(pDevice->QueryInterface(__uuidof(IDXGIDevice), (void**)&pDXGIDevice))
            && SUCCEEDED(pDXGIDevice->GetAdapter(&pAdapter))
            && SUCCEEDED(pAdapter->GetDesc(&desc)))
        {
            printf("DedicatedVideoMemory %zu\n", desc.DedicatedVideoMemory);
            printf("DedicatedSystemMemory %zu\n", desc.DedicatedSystemMemory);
            printf("SharedSystemMemory %zu\n", desc.SharedSystemMemory);
            printf("isUMA %i\n", isUMA(pDevice));

            // Alternative method: sum the adapter description fields.
            UINT64 budget = desc.DedicatedVideoMemory;
            if (isUMA(pDevice))
            {
                budget += desc.DedicatedSystemMemory + desc.SharedSystemMemory;
            }

            // Preferred method: ask the OS for the current budget.
            IDXGIAdapter3* pAdapter3 = nullptr;
            DXGI_QUERY_VIDEO_MEMORY_INFO info = {};
            if (SUCCEEDED(pAdapter->QueryInterface(__uuidof(IDXGIAdapter3), (void**)&pAdapter3))
                && SUCCEEDED(pAdapter3->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_LOCAL, &info)))
            {
                budget = info.Budget;
            }
            if (pAdapter3) pAdapter3->Release();
            printf("budget %llu\n", static_cast<unsigned long long>(budget));
        }
        // Release COM interfaces in reverse order of acquisition.
        if (pAdapter) pAdapter->Release();
        if (pDXGIDevice) pDXGIDevice->Release();
        pImmediateContext->Release();
        pDevice->Release();
    }
}

Integrated graphics parts that share video memory with the CPU require special consideration when detecting the VRAM budget.
Preferred method
IDXGIAdapter3* pAdapter3 = nullptr;
DXGI_QUERY_VIDEO_MEMORY_INFO info = {};
if (SUCCEEDED(pAdapter->QueryInterface(__uuidof(IDXGIAdapter3), (void**)&pAdapter3))
    && SUCCEEDED(pAdapter3->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_LOCAL, &info)))
{
    budget = info.Budget;
}

Alternative method
DXGI_ADAPTER_DESC desc;
if (SUCCEEDED(pAdapter->GetDesc(&desc)))
{
    SIZE_T budget = desc.DedicatedVideoMemory;
    if (isUMA(pDevice))
    {
        budget += desc.DedicatedSystemMemory + desc.SharedSystemMemory;
    }
}

DedicatedVideoMemory: the actual local memory on a discrete GPU, and the dedicated carve-out of system memory on an integrated GPU.
DedicatedSystemMemory: this value is always zero on AMD GPUs.
SharedSystemMemory: this value is determined by the GPU KMD and may be as much as half of system memory.
DedicatedVideoMemorySize alone may not be sufficient to run some game applications on systems with integrated graphics (UMA).
Add SharedSystemMemorySize, then rely on the GPU KMD and VidMm to allocate system memory optimally.
Query UMA with CheckFeatureSupport.

Feature scaling is sometimes necessary to achieve acceptable frame rates on thermally limited platforms.
Consider DXGI_FORMAT_R11G11B10_FLOAT instead of DXGI_FORMAT_R16G16B16A16_FLOAT.
r.SceneColorFormat
r.AmbientOcclusionLevels
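To make the format trade-off concrete, the following sketch compares the memory footprint of a single render target in the two formats. DXGI_FORMAT_R11G11B10_FLOAT packs a pixel into 32 bits and DXGI_FORMAT_R16G16B16A16_FLOAT uses 64 bits. The constant and function names are made up for this example, and the calculation deliberately ignores driver padding, alignment, mip levels, and multisampling.

```cpp
#include <cstdint>

// Bytes per pixel for the two scene color formats discussed above.
constexpr std::uint64_t kBytesPerPixelR11G11B10 = 4;     // 11 + 11 + 10 = 32 bits
constexpr std::uint64_t kBytesPerPixelR16G16B16A16 = 8;  // 4 x 16 = 64 bits

// Size of one tightly packed render target, in bytes.
// Assumption: no padding, mip chain, or MSAA samples are included.
constexpr std::uint64_t renderTargetBytes(std::uint32_t width,
                                          std::uint32_t height,
                                          std::uint64_t bytesPerPixel) {
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel;
}

// At 1920x1080 the packed format needs 8,294,400 bytes per target,
// while the 16-bit-per-channel format needs 16,588,800 bytes.
static_assert(renderTargetBytes(1920, 1080, kBytesPerPixelR11G11B10) == 8294400,
              "unexpected size for R11G11B10_FLOAT");
static_assert(renderTargetBytes(1920, 1080, kBytesPerPixelR16G16B16A16) == 16588800,
              "unexpected size for R16G16B16A16_FLOAT");
```

Halving the bytes per pixel halves the bandwidth needed to read and write the scene color buffer, which is usually the motivation for this kind of feature scaling on bandwidth-limited or thermally limited hardware. The packed format has no alpha channel and less precision, so the visual impact needs to be checked per title.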
Additional consideration may be needed to ensure the intended GPU is used on hybrid graphics platforms.
Use IDXGIFactory6::EnumAdapterByGpuPreference.
Use DXGI_GPU_PREFERENCE_HIGH_PERFORMANCE.
Verify DXGI_GPU_PREFERENCE=2 (DXGI_GPU_PREFERENCE_HIGH_PERFORMANCE).
bp dxgi!CDXGIFactory::EnumAdapterByGpuPreference ".printf \"FOUND DXGIFactory::EnumAdapterByGpuPreference DXGI_GPU_PREFERENCE=%x\\n\",@r8"
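memcpy, memset, and other C runtime optimizations.
Aligning the memcpy source and destination to 4096-byte page boundaries can reduce store-to-load forwarding events on Zen 2 (see STLIOther in AMD µProf).
Alignment to 4096-byte page boundaries may benefit probe filtering on AMD Threadripper™ and EPYC™ processors.
Cache line (64-byte) alignment can reduce false sharing.
Use _aligned_malloc or C++17 aligned new.
Align to a 64-byte cache line or a 4096-byte page.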
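The alignment advice can be illustrated with standard C++17 only. The sketch below pads a per-thread counter to a 64-byte cache line to avoid false sharing and allocates a 4096-byte-aligned buffer through the aligned form of operator new. PaddedCounter, allocatePageAligned, and isAligned are hypothetical names for this example, and the 64 and 4096 values are taken from the text rather than queried from the hardware.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

constexpr std::size_t kCacheLineBytes = 64;  // cache line size quoted in the guide
constexpr std::size_t kPageBytes = 4096;     // page size quoted in the guide

// Each counter occupies its own cache line, so two threads updating
// neighbouring counters in an array do not invalidate each other's line.
struct alignas(kCacheLineBytes) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

// Frees memory obtained from the aligned operator new below.
struct PageAlignedDeleter {
    void operator()(void* p) const noexcept {
        ::operator delete(p, std::align_val_t{kPageBytes});
    }
};
using PageAlignedBuffer = std::unique_ptr<void, PageAlignedDeleter>;

// Allocates a buffer whose start address is a multiple of kPageBytes,
// using the C++17 aligned allocation function.
PageAlignedBuffer allocatePageAligned(std::size_t bytes) {
    return PageAlignedBuffer(::operator new(bytes, std::align_val_t{kPageBytes}));
}

// Returns true when the address of p is a multiple of alignment.
bool isAligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

An array such as `PaddedCounter counters[8]` then gives each worker thread a private cache line, at the cost of 64 bytes per counter. On Windows, _aligned_malloc and _aligned_free are an alternative when C++17 is not available.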

Modern synchronization APIs include std::mutex, std::shared_mutex, SRWLock, and EnterCriticalSection.
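As a small illustration of one of the standard-library primitives named above, the sketch below guards a read-mostly lookup table with std::shared_mutex so that many readers can proceed together while writers get exclusive access. The NameCache class and its members are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical read-mostly cache: many threads look names up, few insert.
class NameCache {
public:
    // Readers take a shared lock, so concurrent lookups do not block each other.
    bool find(int id, std::string& out) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        const auto it = names_.find(id);
        if (it == names_.end()) {
            return false;
        }
        out = it->second;
        return true;
    }

    // Writers take an exclusive lock for the duration of the update.
    void insert(int id, std::string name) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        names_[id] = std::move(name);
    }

    std::size_t size() const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return names_.size();
    }

private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<int, std::string> names_;
};
```

The lock objects block in the operating system when contended instead of spinning in user code, which is the behaviour the guide contrasts with hand-written spin locks. Whether a shared lock beats a plain std::mutex depends on the read/write ratio and should be measured.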
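These APIs are faster and use less power than WaitForSingleObject or user spin locks.
They may use the mwaitx instruction to wait on an address or a timeout.
Syscall overhead.
Calls like SetEventOnCompletion() can also be as efficient as the older fence polling model while avoiding starving other threads or needlessly consuming power.
Scalability with %NUMBER_OF_PROCESSORS%

This recommendation is specific to AMD processors and is not general guidance for all processor vendors.
Generally, applications benefit from SMT, and using all logical processors is recommended. However, games often suffer from SMT contention on the main thread or render thread during gameplay.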