Update performance optimization pages for Godot 4.x

Hugo Locurcio
2023-07-12 00:43:12 +02:00
parent 3c2412dd8f
commit df9bf74ca1
5 changed files with 135 additions and 118 deletions


@@ -1,12 +1,10 @@
:article_outdated: True
.. _doc_gpu_optimization:
GPU optimization
================
Introduction
~~~~~~~~~~~~
------------
The demand for new graphics features and progress almost guarantees that you
will encounter graphics bottlenecks. Some of these can be on the CPU side, for
@@ -25,16 +23,16 @@ more difficult to take measurements. In many cases, the only way of measuring
performance is by examining changes in the time spent rendering each frame.
Draw calls, state changes, and APIs
===================================
-----------------------------------
.. note:: The following section is not relevant to end-users, but is useful to
provide background information that is relevant in later sections.
Godot sends instructions to the GPU via a graphics API (OpenGL, OpenGL ES or
Vulkan). The communication and driver activity involved can be quite costly,
especially in OpenGL and OpenGL ES. If we can provide these instructions in a
way that is preferred by the driver and GPU, we can greatly increase
performance.
Godot sends instructions to the GPU via a graphics API (Vulkan, OpenGL, OpenGL
ES or WebGL). The communication and driver activity involved can be quite
costly, especially in OpenGL, OpenGL ES and WebGL. If we can provide these
instructions in a way that is preferred by the driver and GPU, we can greatly
increase performance.
Nearly every API command in OpenGL requires a certain amount of validation to
make sure the GPU is in the correct state. Even seemingly simple commands can
@@ -44,17 +42,21 @@ as much as possible so they can be rendered together, or with the minimum number
of these expensive state changes.
2D batching
~~~~~~~~~~~
^^^^^^^^^^^
In 2D, the costs of treating each item individually can be prohibitively high -
there can easily be thousands of them on the screen. This is why 2D *batching*
is used. Multiple similar items are grouped together and rendered in a batch,
via a single draw call, rather than making a separate draw call for each item.
In addition, this means state changes, material and texture changes can be kept
to a minimum.
is used with OpenGL-based rendering methods. Multiple similar items are grouped
together and rendered in a batch, via a single draw call, rather than making a
separate draw call for each item. In addition, this means state changes,
material and texture changes can be kept to a minimum.
Vulkan-based rendering methods do not use 2D batching yet. Since draw calls are
much cheaper with Vulkan compared to OpenGL, there is less of a need to have 2D
batching (although it can still be beneficial in some cases).
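The grouping idea behind 2D batching can be sketched in a few lines of Python. This is only an illustration, not Godot's renderer code: the ``Item`` type, its fields, and the file names are hypothetical, and the sketch assumes that only adjacent items sharing a texture and material are merged so that draw order is preserved.

```python
from itertools import groupby
from typing import NamedTuple


class Item(NamedTuple):
    """Hypothetical 2D item; the field names are illustrative only."""
    name: str
    texture: str
    material: str


def count_draw_calls(items: list[Item]) -> int:
    """Count draw calls when runs of consecutive items that share a
    texture and material are merged into one batch (assumption: draw
    order must be kept, so only neighbours can be merged)."""
    return sum(1 for _ in groupby(items, key=lambda i: (i.texture, i.material)))


items = [
    Item("coin_1", "atlas.png", "default"),
    Item("coin_2", "atlas.png", "default"),
    Item("coin_3", "atlas.png", "default"),
    Item("player", "hero.png", "default"),
    Item("coin_4", "atlas.png", "default"),
]

unbatched = len(items)            # one draw call per item
batched = count_draw_calls(items)  # one draw call per run of similar items
print(unbatched, batched)
```

In this sketch the item using a different texture splits the run of coins, which is why sharing textures between neighbouring items tends to help batching.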
3D batching
~~~~~~~~~~~
^^^^^^^^^^^
In 3D, we still aim to minimize draw calls and state changes. However, it can be
more difficult to batch together several objects into a single draw call. 3D
@@ -62,7 +64,7 @@ meshes tend to comprise hundreds or thousands of triangles, and combining large
meshes in real-time is prohibitively expensive. The cost of joining them quickly
exceeds any benefits as the number of triangles grows per mesh. A much better
alternative is to **join meshes ahead of time** (static meshes in relation to each
other). This can either be done by artists, or programmatically within Godot.
other). This can be done by artists, or programmatically within Godot using an add-on.
There is also a cost to batching together objects in 3D. Several objects
rendered as one cannot be individually culled. An entire city that is off-screen
@@ -75,8 +77,8 @@ numbers of distant or low-poly objects.
For more information on 3D specific optimizations, see
:ref:`doc_optimizing_3d_performance`.
Reuse Shaders and Materials
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reuse shaders and materials
^^^^^^^^^^^^^^^^^^^^^^^^^^^
The Godot renderer is a little different to what is out there. It's designed to
minimize GPU state changes as much as possible. :ref:`StandardMaterial3D
@@ -94,12 +96,12 @@ possible. Godot's priorities are:
that are enabled or disabled with a check box) even if they have different
parameters.
If a scene has, for example, ``20,000`` objects with ``20,000`` different
materials each, rendering will be slow. If the same scene has ``20,000``
objects, but only uses ``100`` materials, rendering will be much faster.
If a scene has, for example, 20,000 objects, each with a different
material, rendering will be slow. If the same scene has 20,000
objects, but only uses 100 materials, rendering will be much faster.
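To get a feel for why material reuse matters, the following rough model counts how often a renderer would have to switch materials while drawing 20,000 objects that share 100 materials. It is a simplified sketch, not Godot's actual sorting logic: materials are represented by plain integer IDs, and the function name is made up for this example.

```python
def count_material_binds(material_ids):
    """Count how many times the active material changes while drawing
    objects in the given order (the first object always needs a bind)."""
    binds = 0
    current = None
    for material in material_ids:
        if material != current:
            binds += 1
            current = material
    return binds


# 20,000 objects sharing 100 materials, submitted in interleaved order.
objects = [i % 100 for i in range(20_000)]

interleaved_binds = count_material_binds(objects)
sorted_binds = count_material_binds(sorted(objects))
print(interleaved_binds, sorted_binds)
```

With only 100 materials, drawing the objects grouped by material needs 100 switches. If every object had its own material, no ordering could bring the count below 20,000.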
Pixel cost versus vertex cost
=============================
-----------------------------
You may have heard that the lower the number of polygons in a model, the faster
it will be rendered. This is *really* relative and depends on many factors.
@@ -152,15 +154,17 @@ Pay attention to the additional vertex processing required when using:
- Skinning (skeletal animation)
- Morphs (shape keys)
- Vertex-lit objects (common on mobile)
.. Not implemented in Godot 4.x yet. Uncomment when this is implemented.
- Vertex-lit objects (common on mobile)
Pixel/fragment shaders and fill rate
====================================
------------------------------------
In contrast to vertex processing, the costs of fragment (per-pixel) shading have
increased dramatically over the years. Screen resolutions have increased (the
increased dramatically over the years. Screen resolutions have increased: the
area of a 4K screen is 8,294,400 pixels, versus 307,200 for an old 640×480 VGA
screen, that is 27x the area), but also the complexity of fragment shaders has
screen. That is 27 times the area! Also, the complexity of fragment shaders has
exploded. Physically-based rendering requires complex calculations for each
fragment.
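The resolution figures above can be checked with a short calculation. The 60 frames per second target and the "one fragment per pixel, no overdraw" simplification are assumptions made for this example only.

```python
uhd_pixels = 3840 * 2160  # 4K UHD resolution
vga_pixels = 640 * 480    # VGA resolution

area_ratio = uhd_pixels / vga_pixels

# Assumption: 60 FPS and exactly one shaded fragment per pixel per frame.
# Overdraw and transparency increase the real number.
fragments_per_second = uhd_pixels * 60

print(uhd_pixels, vga_pixels, area_ratio, fragments_per_second)
```

Even under these optimistic assumptions, the fragment shader runs close to half a billion times per second at 4K, so every instruction saved per fragment adds up.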
@@ -190,7 +194,7 @@ their material to decrease the shading cost.
you can reasonably afford to use.**
Reading textures
~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^
The other factor in fragment shaders is the cost of reading textures. Reading
textures is an expensive operation, especially when reading from several
@@ -203,7 +207,7 @@ mobiles.
algorithms that require as few texture reads as possible.**
Texture compression
~~~~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^^^^
By default, Godot compresses textures of 3D models when imported using video RAM
(VRAM) compression. Video RAM compression isn't as efficient in size as PNG or
@@ -227,9 +231,8 @@ textures with transparency (only opaque), so keep this in mind.
will negatively affect their appearance, without improving performance
significantly due to their low resolution.
Post-processing and shadows
~~~~~~~~~~~~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Post-processing effects and shadows can also be expensive in terms of fragment
shading activity. Always test the impact of these on different hardware.
@@ -241,7 +244,7 @@ possible. Smaller or distant OmniLights/SpotLights can often have their shadows
disabled with only a small visual impact.
Transparency and blending
=========================
-------------------------
Transparent objects present particular problems for rendering efficiency. Opaque
objects (especially in 3D) can be essentially rendered in any order and the
@@ -266,7 +269,7 @@ very expensive. Indeed, in many situations, rendering more complex opaque
geometry can end up being faster than using transparency to "cheat".
Multi-platform advice
=====================
---------------------
If you are aiming to release on multiple platforms, test *early* and test
*often* on all your platforms, especially mobile. Developing a game on desktop
@@ -278,7 +281,7 @@ to use the Compatibility rendering method for both desktop and mobile platforms
where you target both.
Mobile/tiled renderers
======================
----------------------
As described above, GPUs on mobile devices work in dramatically different ways
from GPUs on desktop. Most mobile devices use tile renderers. Tile renderers