Commit 073725b

Merge pull request godotengine#7647 from Calinou/update-performance
2 parents 6e94674 + df9bf74 commit 073725b

5 files changed: +135 -118 lines changed

tutorials/performance/cpu_optimization.rst

Lines changed: 14 additions & 16 deletions
@@ -1,12 +1,10 @@
-:article_outdated: True
-
 .. _doc_cpu_optimization:
 
 CPU optimization
 ================
 
 Measuring performance
-=====================
+---------------------
 
 We have to know where the "bottlenecks" are to know how to speed up our program.
 Bottlenecks are the slowest parts of the program that limit the rate that
@@ -18,7 +16,7 @@ lead to small performance improvements.
 For the CPU, the easiest way to identify bottlenecks is to use a profiler.
 
 CPU profilers
-=============
+-------------
 
 Profilers run alongside your program and take timing measurements to work out
 what proportion of time is spent in each function.
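
As a rough illustration of what a per-function profile looks like, here is a minimal sketch that uses Python's standard-library ``cProfile`` and ``pstats`` modules as a stand-in. It is not Godot's profiler, and ``cheap_step``, ``expensive_step`` and ``frame`` are hypothetical functions invented for the example.

```python
import cProfile
import io
import pstats


def cheap_step():
    # Hypothetical inexpensive piece of per-frame work.
    return sum(range(100))


def expensive_step():
    # Hypothetical expensive piece of per-frame work (the bottleneck).
    return sum(i * i for i in range(200_000))


def frame():
    # Hypothetical "frame" that calls both steps once.
    cheap_step()
    expensive_step()


# Record timing data while the frame runs.
profiler = cProfile.Profile()
profiler.enable()
frame()
profiler.disable()

# Collect the five entries with the highest cumulative time into a string.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In the printed table, ``expensive_step`` should account for nearly all of the time spent inside ``frame``, which is the kind of signal a profiler is used to find.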
@@ -31,7 +29,7 @@ slow down your project significantly.
 After profiling, you can look back at the results for a frame.
 
 .. figure:: img/godot_profiler.png
-   .. figure:: img/godot_profiler.png
+   :align: center
    :alt: Screenshot of the Godot profiler
 
    Results of a profile of one of the demo projects.
@@ -51,7 +49,7 @@ For more info about using Godot's built-in profiler, see
 :ref:`doc_debugger_panel`.
 
 External profilers
-~~~~~~~~~~~~~~~~~~
+------------------
 
 Although the Godot IDE profiler is very convenient and useful, sometimes you
 need more power, and the ability to profile the Godot engine source code itself.
@@ -87,7 +85,7 @@ batching, which greatly speeds up 2D rendering by reducing bottlenecks in this
 area.
 
 Manually timing functions
-=========================
+-------------------------
 
 Another handy technique, especially once you have identified the bottleneck
 using a profiler, is to manually time the function or area under test.
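
The start/stop timing idea can be sketched as follows. This is a minimal illustration in Python, not code from the changed file: ``time_call`` and ``update_enemies`` are hypothetical names, and taking the best of several runs is an assumption made here to reduce noise. In a Godot project, the same pattern would read the engine's own clock (for example ``Time.get_ticks_usec()`` in GDScript) before and after the call.

```python
import time


def time_call(func, *args, repeats=5):
    """Return the shortest wall-clock duration of func(*args), in microseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter_ns()
        func(*args)
        elapsed_us = (time.perf_counter_ns() - start) / 1000
        best = min(best, elapsed_us)
    return best


def update_enemies(count):
    # Hypothetical stand-in for the function under test.
    return sum(i * i for i in range(count))


best_us = time_call(update_enemies, 10_000)
print(f"update_enemies() took {best_us:.1f} microseconds (best of 5 runs)")
```

Timing the same call before and after a change gives a direct comparison, provided both measurements are taken under similar conditions.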
@@ -115,7 +113,7 @@ time them as you go. This will give you crucial feedback as to whether the
 optimization is working (or not).
 
 Caches
-======
+------
 
 CPU caches are something else to be particularly aware of, especially when
 comparing timing results of two different versions of a function. The results
@@ -148,7 +146,7 @@ rendering and physics. Still, you should be especially aware of caching when
 writing GDExtensions.
 
 Languages
-=========
+---------
 
 Godot supports a number of different languages, and it is worth bearing in mind
 that there are trade-offs involved. Some languages are designed for ease of use
@@ -159,7 +157,7 @@ language you choose. If your project is making a lot of calculations in its own
 code, consider moving those calculations to a faster language.
 
 GDScript
-~~~~~~~~
+^^^^^^^^
 
 :ref:`GDScript <toc-learn-scripting-gdscript>` is designed to be easy to use and iterate,
 and is ideal for making many types of games. However, in this language, ease of
@@ -168,7 +166,7 @@ calculations, consider moving some of your project to one of the other
 languages.
 
 C#
-~~
+^^
 
 :ref:`C# <toc-learn-scripting-C#>` is popular and has first-class support in Godot. It
 offers a good compromise between speed and ease of use. Beware of possible
@@ -177,13 +175,13 @@ common approach to workaround issues with garbage collection is to use *object
 pooling*, which is outside the scope of this guide.
 
 Other languages
-~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^
 
 Third parties provide support for several other languages, including `Rust
 <https://github.com/godot-rust/gdext>`_.
 
 C++
-~~~
+^^^
 
 Godot is written in C++. Using C++ will usually result in the fastest code.
 However, on a practical level, it is the most difficult to deploy to end users'
@@ -192,7 +190,7 @@ GDExtensions and
 :ref:`custom modules <doc_custom_modules_in_cpp>`.
 
 Threads
-=======
+-------
 
 Consider using threads when making a lot of calculations that can run in
 parallel to each other. Modern CPUs have multiple cores, each one capable of
@@ -211,7 +209,7 @@ debugger doesn't support setting up breakpoints in threads yet.
 For more information on threads, see :ref:`doc_using_multiple_threads`.
 
 SceneTree
-=========
+---------
 
 Although Nodes are an incredibly powerful and versatile concept, be aware that
 every node has a cost. Built-in functions such as ``_process()`` and
@@ -236,7 +234,7 @@ You can avoid the SceneTree altogether by using Server APIs. For more
 information, see :ref:`doc_using_servers`.
 
 Physics
-=======
+-------
 
 In some situations, physics can end up becoming a bottleneck. This is
 particularly the case with complex worlds and large numbers of physics objects.

tutorials/performance/general_optimization.rst

Lines changed: 23 additions & 20 deletions
@@ -1,12 +1,10 @@
-:article_outdated: True
-
 .. _doc_general_optimization:
 
 General optimization tips
 =========================
 
 Introduction
-~~~~~~~~~~~~
+------------
 
 In an ideal world, computers would run at infinite speed. The only limit to
 what we could achieve would be our imagination. However, in the real world, it's
@@ -48,7 +46,7 @@ But in reality, there are several different kinds of performance problems:
 Each of these are annoying to the user, but in different ways.
 
 Measuring performance
-=====================
+---------------------
 
 Probably the most important tool for optimization is the ability to measure
 performance - to identify where bottlenecks are, and to measure the success of
@@ -57,19 +55,24 @@ our attempts to speed them up.
 There are several methods of measuring performance, including:
 
 - Putting a start/stop timer around code of interest.
-- Using the Godot profiler.
-- Using external third-party CPU profilers.
-- Using GPU profilers/debuggers such as
-  `NVIDIA Nsight Graphics <https://developer.nvidia.com/nsight-graphics>`__
-  or `apitrace <https://apitrace.github.io/>`__.
-- Checking the frame rate (with V-Sync disabled).
+- Using the :ref:`Godot profiler <doc_the_profiler>`.
+- Using :ref:`external CPU profilers <doc_using_cpp_profilers>`.
+- Using external GPU profilers/debuggers such as
+  `NVIDIA Nsight Graphics <https://developer.nvidia.com/nsight-graphics>`__,
+  `Radeon GPU Profiler <https://gpuopen.com/rgp/>`__ or
+  `Intel Graphics Performance Analyzers <https://www.intel.com/content/www/us/en/developer/tools/graphics-performance-analyzers/overview.html>`__.
+- Checking the frame rate (with V-Sync disabled). Third-party utilities such as
+  `RivaTuner Statistics Server <https://www.guru3d.com/files-details/rtss-rivatuner-statistics-server-download.html>`__
+  (Windows) or `MangoHud <https://github.com/flightlessmango/MangoHud>`__
+  (Linux) can also be useful here.
+- Using an unofficial `debug menu add-on <https://github.com/godot-extended-libraries/godot-debug-menu>`.
 
 Be very aware that the relative performance of different areas can vary on
 different hardware. It's often a good idea to measure timings on more than one
 device. This is especially the case if you're targeting mobile devices.
 
 Limitations
-~~~~~~~~~~~
+^^^^^^^^^^^
 
 CPU profilers are often the go-to method for measuring performance. However,
 they don't always tell the whole story.
@@ -87,7 +90,7 @@ As a result of these limitations, you often need to use detective work to find
 out where bottlenecks are.
 
 Detective work
-~~~~~~~~~~~~~~
+--------------
 
 Detective work is a crucial skill for developers (both in terms of performance,
 and also in terms of bug fixing). This can include hypothesis testing, and
@@ -119,7 +122,7 @@ Once you know which of the two halves contains the bottleneck, you can
 repeat this process until you've pinned down the problematic area.
 
 Profilers
-=========
+---------
 
 Profilers allow you to time your program while running it. Profilers then
 provide results telling you what percentage of time was spent in different
@@ -133,7 +136,7 @@ and lead to slower performance.
 For more info about using Godot's built-in profiler, see :ref:`doc_the_profiler`.
 
 Principles
-==========
+----------
 
 `Donald Knuth <https://en.wikipedia.org/wiki/Donald_Knuth>`__ said:
 
@@ -163,7 +166,7 @@ optimization is (by definition) undesirable, performant software is the result
 of performant design.
 
 Performant design
-~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^
 
 The danger with encouraging people to ignore optimization until necessary, is
 that it conveniently ignores that the most important time to consider
@@ -178,7 +181,7 @@ will often run many times faster than a mediocre design with low-level
 optimization.
 
 Incremental design
-~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^
 
 Of course, in practice, unless you have prior knowledge, you are unlikely to
 come up with the best design the first time. Instead, you'll often make a series
@@ -195,7 +198,7 @@ structures and algorithms for *cache locality* of data and linear access, rather
 than jumping around in memory.
 
 The optimization process
-~~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^
 
 Assuming we have a reasonable design, and taking our lessons from Knuth, our
 first step in optimization should be to identify the biggest bottlenecks - the
@@ -212,7 +215,7 @@ The process is thus:
 3. Return to step 1.
 
 Optimizing bottlenecks
-~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^
 
 Some profilers will even tell you which part of a function (which data accesses,
 calculations) are slowing things down.
@@ -237,10 +240,10 @@ positive effect will be outweighed by the negatives of more complex code, and
 you may choose to leave out that optimization.
 
 Appendix
-========
+--------
 
 Bottleneck math
-~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^
 
 The proverb *"a chain is only as strong as its weakest link"* applies directly to
 performance optimization. If your project is spending 90% of the time in
