Commit df9bf74

Update performance optimization pages for Godot 4.x
1 parent 3c2412d commit df9bf74

5 files changed: 135 additions and 118 deletions

tutorials/performance/cpu_optimization.rst

Lines changed: 14 additions & 16 deletions
@@ -1,12 +1,10 @@
-:article_outdated: True
-
 .. _doc_cpu_optimization:

 CPU optimization
 ================

 Measuring performance
-=====================
+---------------------

 We have to know where the "bottlenecks" are to know how to speed up our program.
 Bottlenecks are the slowest parts of the program that limit the rate that
@@ -18,7 +16,7 @@ lead to small performance improvements.
 For the CPU, the easiest way to identify bottlenecks is to use a profiler.

 CPU profilers
-=============
+-------------

 Profilers run alongside your program and take timing measurements to work out
 what proportion of time is spent in each function.
@@ -31,7 +29,7 @@ slow down your project significantly.
 After profiling, you can look back at the results for a frame.

 .. figure:: img/godot_profiler.png
-.. figure:: img/godot_profiler.png
+   :align: center
    :alt: Screenshot of the Godot profiler

    Results of a profile of one of the demo projects.
@@ -51,7 +49,7 @@ For more info about using Godot's built-in profiler, see
 :ref:`doc_debugger_panel`.

 External profilers
-~~~~~~~~~~~~~~~~~~
+------------------

 Although the Godot IDE profiler is very convenient and useful, sometimes you
 need more power, and the ability to profile the Godot engine source code itself.
@@ -98,7 +96,7 @@ batching, which greatly speeds up 2D rendering by reducing bottlenecks in this
 area.

 Manually timing functions
-=========================
+-------------------------

 Another handy technique, especially once you have identified the bottleneck
 using a profiler, is to manually time the function or area under test.
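
The manual-timing technique described in the context lines above can be sketched as follows. This is an illustration only, written in Python, and is not part of this commit or of the Godot documentation; the helper `time_call` and the sample workload `build_squares` are hypothetical names. In a Godot project the same pattern would typically read the clock with `Time.get_ticks_usec()` before and after the code under test.

```python
import time


def time_call(func, *args, repeats=5):
    """Run func(*args) several times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        elapsed = time.perf_counter() - start
        # Keep the fastest run: it is the one least disturbed by other processes.
        best = min(best, elapsed)
    return best


def build_squares(n):
    """Hypothetical workload standing in for the function under test."""
    return [i * i for i in range(n)]


best = time_call(build_squares, 100_000)
print(f"best of 5 runs: {best * 1000:.3f} ms")
```

Repeating the measurement and keeping the best run is one way to reduce noise; averaging several runs is an equally reasonable choice when typical rather than best-case cost matters.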
@@ -126,7 +124,7 @@ time them as you go. This will give you crucial feedback as to whether the
 optimization is working (or not).

 Caches
-======
+------

 CPU caches are something else to be particularly aware of, especially when
 comparing timing results of two different versions of a function. The results
@@ -159,7 +157,7 @@ rendering and physics. Still, you should be especially aware of caching when
 writing GDExtensions.

 Languages
-=========
+---------

 Godot supports a number of different languages, and it is worth bearing in mind
 that there are trade-offs involved. Some languages are designed for ease of use
@@ -170,7 +168,7 @@ language you choose. If your project is making a lot of calculations in its own
 code, consider moving those calculations to a faster language.

 GDScript
-~~~~~~~~
+^^^^^^^^

 :ref:`GDScript <toc-learn-scripting-gdscript>` is designed to be easy to use and iterate,
 and is ideal for making many types of games. However, in this language, ease of
@@ -179,7 +177,7 @@ calculations, consider moving some of your project to one of the other
 languages.

 C#
-~~
+^^

 :ref:`C# <toc-learn-scripting-C#>` is popular and has first-class support in Godot. It
 offers a good compromise between speed and ease of use. Beware of possible
@@ -188,13 +186,13 @@ common approach to workaround issues with garbage collection is to use *object
 pooling*, which is outside the scope of this guide.

 Other languages
-~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^

 Third parties provide support for several other languages, including `Rust
 <https://github.com/godot-rust/gdext>`_.

 C++
-~~~
+^^^

 Godot is written in C++. Using C++ will usually result in the fastest code.
 However, on a practical level, it is the most difficult to deploy to end users'
@@ -203,7 +201,7 @@ GDExtensions and
 :ref:`custom modules <doc_custom_modules_in_cpp>`.

 Threads
-=======
+-------

 Consider using threads when making a lot of calculations that can run in
 parallel to each other. Modern CPUs have multiple cores, each one capable of
@@ -222,7 +220,7 @@ debugger doesn't support setting up breakpoints in threads yet.
 For more information on threads, see :ref:`doc_using_multiple_threads`.

 SceneTree
-=========
+---------

 Although Nodes are an incredibly powerful and versatile concept, be aware that
 every node has a cost. Built-in functions such as `_process()` and
@@ -247,7 +245,7 @@ You can avoid the SceneTree altogether by using Server APIs. For more
 information, see :ref:`doc_using_servers`.

 Physics
-=======
+-------

 In some situations, physics can end up becoming a bottleneck. This is
 particularly the case with complex worlds and large numbers of physics objects.

tutorials/performance/general_optimization.rst

Lines changed: 23 additions & 20 deletions
@@ -1,12 +1,10 @@
-:article_outdated: True
-
 .. _doc_general_optimization:

 General optimization tips
 =========================

 Introduction
-~~~~~~~~~~~~
+------------

 In an ideal world, computers would run at infinite speed. The only limit to
 what we could achieve would be our imagination. However, in the real world, it's
@@ -48,7 +46,7 @@ But in reality, there are several different kinds of performance problems:
 Each of these are annoying to the user, but in different ways.

 Measuring performance
-=====================
+---------------------

 Probably the most important tool for optimization is the ability to measure
 performance - to identify where bottlenecks are, and to measure the success of
@@ -57,19 +55,24 @@ our attempts to speed them up.
 There are several methods of measuring performance, including:

 - Putting a start/stop timer around code of interest.
-- Using the Godot profiler.
-- Using external third-party CPU profilers.
-- Using GPU profilers/debuggers such as
-  `NVIDIA Nsight Graphics <https://developer.nvidia.com/nsight-graphics>`__
-  or `apitrace <https://apitrace.github.io/>`__.
-- Checking the frame rate (with V-Sync disabled).
+- Using the :ref:`Godot profiler <doc_the_profiler>`.
+- Using :ref:`external CPU profilers <doc_using_cpp_profilers>`.
+- Using external GPU profilers/debuggers such as
+  `NVIDIA Nsight Graphics <https://developer.nvidia.com/nsight-graphics>`__,
+  `Radeon GPU Profiler <https://gpuopen.com/rgp/>`__ or
+  `Intel Graphics Performance Analyzers <https://www.intel.com/content/www/us/en/developer/tools/graphics-performance-analyzers/overview.html>`__.
+- Checking the frame rate (with V-Sync disabled). Third-party utilities such as
+  `RivaTuner Statistics Server <https://www.guru3d.com/files-details/rtss-rivatuner-statistics-server-download.html>`__
+  (Windows) or `MangoHud <https://github.com/flightlessmango/MangoHud>`__
+  (Linux) can also be useful here.
+- Using an unofficial `debug menu add-on <https://github.com/godot-extended-libraries/godot-debug-menu>`.

 Be very aware that the relative performance of different areas can vary on
 different hardware. It's often a good idea to measure timings on more than one
 device. This is especially the case if you're targeting mobile devices.

 Limitations
-~~~~~~~~~~~
+^^^^^^^^^^^

 CPU profilers are often the go-to method for measuring performance. However,
 they don't always tell the whole story.
@@ -87,7 +90,7 @@ As a result of these limitations, you often need to use detective work to find
 out where bottlenecks are.

 Detective work
-~~~~~~~~~~~~~~
+--------------

 Detective work is a crucial skill for developers (both in terms of performance,
 and also in terms of bug fixing). This can include hypothesis testing, and
@@ -119,7 +122,7 @@ Once you know which of the two halves contains the bottleneck, you can
 repeat this process until you've pinned down the problematic area.

 Profilers
-=========
+---------

 Profilers allow you to time your program while running it. Profilers then
 provide results telling you what percentage of time was spent in different
@@ -133,7 +136,7 @@ and lead to slower performance.
 For more info about using Godot's built-in profiler, see :ref:`doc_the_profiler`.

 Principles
-==========
+----------

 `Donald Knuth <https://en.wikipedia.org/wiki/Donald_Knuth>`__ said:

@@ -163,7 +166,7 @@ optimization is (by definition) undesirable, performant software is the result
 of performant design.

 Performant design
-~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^

 The danger with encouraging people to ignore optimization until necessary, is
 that it conveniently ignores that the most important time to consider
@@ -178,7 +181,7 @@ will often run many times faster than a mediocre design with low-level
 optimization.

 Incremental design
-~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^

 Of course, in practice, unless you have prior knowledge, you are unlikely to
 come up with the best design the first time. Instead, you'll often make a series
@@ -195,7 +198,7 @@ structures and algorithms for *cache locality* of data and linear access, rather
 than jumping around in memory.

 The optimization process
-~~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^

 Assuming we have a reasonable design, and taking our lessons from Knuth, our
 first step in optimization should be to identify the biggest bottlenecks - the
@@ -212,7 +215,7 @@ The process is thus:
 3. Return to step 1.

 Optimizing bottlenecks
-~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^

 Some profilers will even tell you which part of a function (which data accesses,
 calculations) are slowing things down.
@@ -237,10 +240,10 @@ positive effect will be outweighed by the negatives of more complex code, and
 you may choose to leave out that optimization.

 Appendix
-========
+--------

 Bottleneck math
-~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^

 The proverb *"a chain is only as strong as its weakest link"* applies directly to
 performance optimization. If your project is spending 90% of the time in
