@@ -1757,6 +1757,55 @@ As part of the AMDGPU MC layer, AMDGPU provides the following target specific
17571757
17581758 =================== ================= ========================================================
17591759
1760+ Function Resource Usage
1761+ -----------------------
1762+
1763+ A function's resource usage depends on the resource usage of each of its
1764+ callees. The expressions used to denote resource usage reflect this by
1765+ propagating each callee's equivalent expressions. These expressions are emitted
1766+ as symbols by the compiler when compiling to either assembly or object format
1767+ and should not be overwritten or redefined.
1768+
1769+ The following describes all emitted function resource usage symbols:
1770+
1771+ .. table:: Function Resource Usage:
1772+    :name: function-usage-table
1773+
1774+    ===================================== ========= ========================================= ===============================================================================
1775+    Symbol                                Type      Description                               Example
1776+    ===================================== ========= ========================================= ===============================================================================
1777+    <function_name>.num_vgpr              Integer   Number of VGPRs used by <function_name>,  .set foo.num_vgpr, max(32, bar.num_vgpr, baz.num_vgpr)
1778+                                                    worst case of itself and its callees'
1779+                                                    VGPR use
1780+    <function_name>.num_agpr              Integer   Number of AGPRs used by <function_name>,  .set foo.num_agpr, max(35, bar.num_agpr)
1781+                                                    worst case of itself and its callees'
1782+                                                    AGPR use
1783+    <function_name>.numbered_sgpr         Integer   Number of SGPRs used by <function_name>,  .set foo.numbered_sgpr, 21
1784+                                                    worst case of itself and its callees'
1785+                                                    SGPR use (without any of the implicitly
1786+                                                    used SGPRs)
1787+    <function_name>.private_seg_size      Integer   Total stack size required for             .set foo.private_seg_size, 16+max(bar.private_seg_size, baz.private_seg_size)
1788+                                                    <function_name>; the expression is the
1789+                                                    locally used stack size plus that of the
1790+                                                    worst-case callee
1791+    <function_name>.uses_vcc              Bool      Whether <function_name>, or any of its    .set foo.uses_vcc, or(0, bar.uses_vcc)
1792+                                                    callees, uses vcc
1793+    <function_name>.uses_flat_scratch     Bool      Whether <function_name>, or any of its    .set foo.uses_flat_scratch, 1
1794+                                                    callees, uses flat scratch
1795+    <function_name>.has_dyn_sized_stack   Bool      Whether <function_name>, or any of its    .set foo.has_dyn_sized_stack, 1
1796+                                                    callees, has a dynamically sized stack
1797+    <function_name>.has_recursion         Bool      Whether <function_name>, or any of its    .set foo.has_recursion, 0
1798+                                                    callees, contains recursion
1799+    <function_name>.has_indirect_call     Bool      Whether <function_name>, or any of its    .set foo.has_indirect_call, max(0, bar.has_indirect_call)
1800+                                                    callees, contains an indirect call
1801+    ===================================== ========= ========================================= ===============================================================================
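
The propagation shown in the Example column can be modelled with a short
Python sketch. This is only an illustration of how such expressions resolve,
not the compiler's or assembler's implementation. The ``foo`` expressions are
taken from the table; the leaf values for ``bar`` and ``baz`` and the helper
``resolve`` are assumptions invented for the sketch.

```python
# Hypothetical model of how the .set expressions from the table resolve.
# A symbol maps either to an integer constant or to an (operator, operands)
# pair; operands may be constants, symbol names, or nested expressions.
SYMBOLS = {
    # Expressions copied from the table's Example column.
    "foo.num_vgpr": ("max", [32, "bar.num_vgpr", "baz.num_vgpr"]),
    "foo.private_seg_size": (
        "add",
        [16, ("max", ["bar.private_seg_size", "baz.private_seg_size"])],
    ),
    "foo.uses_vcc": ("or", [0, "bar.uses_vcc"]),
    # Assumed leaf values for the callees (not from the documentation).
    "bar.num_vgpr": 40,
    "baz.num_vgpr": 24,
    "bar.private_seg_size": 8,
    "baz.private_seg_size": 32,
    "bar.uses_vcc": 1,
}


def resolve(expr):
    """Recursively evaluate a constant, a symbol name, or an expression."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return resolve(SYMBOLS[expr])
    op, operands = expr
    values = [resolve(operand) for operand in operands]
    if op == "max":
        return max(values)
    if op == "add":
        return sum(values)
    if op == "or":
        return int(any(values))
    raise ValueError(f"unknown operator: {op}")


if __name__ == "__main__":
    for name in ("foo.num_vgpr", "foo.private_seg_size", "foo.uses_vcc"):
        print(name, "=", resolve(name))
```

With these assumed leaf values, ``foo.num_vgpr`` resolves to the larger of its
own count and its callees' counts, and ``foo.private_seg_size`` adds the local
stack size to the largest callee requirement.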
1802+
1803+ Furthermore, three symbols are emitted that describe the compilation unit's
1804+ worst case (i.e., maxima) ``num_vgpr``, ``num_agpr``, and ``numbered_sgpr``.
1805+ These may be referenced and used by the aforementioned symbolic expressions.
1806+ The three symbols are ``amdgcn.max_num_vgpr``, ``amdgcn.max_num_agpr``, and
1807+ ``amdgcn.max_num_sgpr``.
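
As a rough illustration of what "worst case" means for these three symbols,
the sketch below takes the maximum of each register count over a made-up set
of functions. The function names, the counts, and the helper
``module_maxima`` are assumptions for illustration; the compiler's actual
computation may differ in detail.

```python
# Hypothetical per-function register counts; names and values are invented.
FUNCTIONS = {
    "foo": {"num_vgpr": 32, "num_agpr": 35, "numbered_sgpr": 21},
    "bar": {"num_vgpr": 40, "num_agpr": 0, "numbered_sgpr": 30},
    "baz": {"num_vgpr": 24, "num_agpr": 8, "numbered_sgpr": 12},
}


def module_maxima(functions):
    """Return the per-compilation-unit maxima keyed by the emitted symbol names."""
    return {
        "amdgcn.max_num_vgpr": max(f["num_vgpr"] for f in functions.values()),
        "amdgcn.max_num_agpr": max(f["num_agpr"] for f in functions.values()),
        "amdgcn.max_num_sgpr": max(f["numbered_sgpr"] for f in functions.values()),
    }


if __name__ == "__main__":
    for symbol, value in module_maxima(FUNCTIONS).items():
        print(symbol, "=", value)
```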
1808+
17601809.. _amdgpu-elf-code-object:
17611810
17621811ELF Code Object