mirror of https://github.com/godotengine/godot-docs.git (synced 2026-01-04 14:11:02 +03:00)
Update performance optimization pages for Godot 4.x
@@ -1,12 +1,10 @@
-:article_outdated: True
-
 .. _doc_gpu_optimization:
 
 GPU optimization
 ================
 
 Introduction
-~~~~~~~~~~~~
+------------
 
 The demand for new graphics features and progress almost guarantees that you
 will encounter graphics bottlenecks. Some of these can be on the CPU side, for
@@ -25,16 +23,16 @@ more difficult to take measurements. In many cases, the only way of measuring
 performance is by examining changes in the time spent rendering each frame.
 
 Draw calls, state changes, and APIs
-===================================
+-----------------------------------
 
 .. note:: The following section is not relevant to end-users, but is useful to
           provide background information that is relevant in later sections.
 
-Godot sends instructions to the GPU via a graphics API (OpenGL, OpenGL ES or
-Vulkan). The communication and driver activity involved can be quite costly,
-especially in OpenGL and OpenGL ES. If we can provide these instructions in a
-way that is preferred by the driver and GPU, we can greatly increase
-performance.
+Godot sends instructions to the GPU via a graphics API (Vulkan, OpenGL, OpenGL
+ES or WebGL). The communication and driver activity involved can be quite
+costly, especially in OpenGL, OpenGL ES and WebGL. If we can provide these
+instructions in a way that is preferred by the driver and GPU, we can greatly
+increase performance.
 
 Nearly every API command in OpenGL requires a certain amount of validation to
 make sure the GPU is in the correct state. Even seemingly simple commands can
@@ -44,17 +42,21 @@ as much as possible so they can be rendered together, or with the minimum number
 of these expensive state changes.
 
 2D batching
-~~~~~~~~~~~
+^^^^^^^^^^^
 
 In 2D, the costs of treating each item individually can be prohibitively high -
 there can easily be thousands of them on the screen. This is why 2D *batching*
-is used. Multiple similar items are grouped together and rendered in a batch,
-via a single draw call, rather than making a separate draw call for each item.
-In addition, this means state changes, material and texture changes can be kept
-to a minimum.
+is used with OpenGL-based rendering methods. Multiple similar items are grouped
+together and rendered in a batch, via a single draw call, rather than making a
+separate draw call for each item. In addition, this means state changes,
+material and texture changes can be kept to a minimum.
+
+Vulkan-based rendering methods do not use 2D batching yet. Since draw calls are
+much cheaper with Vulkan compared to OpenGL, there is less of a need to have 2D
+batching (although it can still be beneficial in some cases).
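Editorial illustration of the batching idea described in the added text above. This is a minimal Python sketch, not Godot's renderer code: `Item`, `batch_items`, and the texture/material names are hypothetical, and it assumes that only neighbouring items in draw order can be merged, so that the visual result stays the same.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Item:
    """Hypothetical 2D item; only the state that affects batching is modeled."""
    name: str
    texture: str
    material: str


def batch_items(items):
    """Group consecutive items sharing (texture, material) into batches.

    Draw order is preserved, so only neighbours in the list are merged.
    """
    def state_key(item):
        return (item.texture, item.material)

    return [list(group) for _, group in groupby(items, key=state_key)]


items = [
    Item("coin_1", "atlas.png", "default"),
    Item("coin_2", "atlas.png", "default"),
    Item("coin_3", "atlas.png", "default"),
    Item("player", "hero.png", "default"),
    Item("coin_4", "atlas.png", "default"),
]

batches = batch_items(items)
# One draw call per batch instead of one per item.
print(f"{len(items)} items -> {len(batches)} draw calls")
```

In this toy scene the item drawn with a different texture splits the run of coins into two batches, which is why keeping items with identical state adjacent in draw order tends to matter.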
 
 3D batching
-~~~~~~~~~~~
+^^^^^^^^^^^
 
 In 3D, we still aim to minimize draw calls and state changes. However, it can be
 more difficult to batch together several objects into a single draw call. 3D
@@ -62,7 +64,7 @@ meshes tend to comprise hundreds or thousands of triangles, and combining large
 meshes in real-time is prohibitively expensive. The costs of joining them quickly
 exceeds any benefits as the number of triangles grows per mesh. A much better
 alternative is to **join meshes ahead of time** (static meshes in relation to each
-other). This can either be done by artists, or programmatically within Godot.
+other). This can be done by artists, or programmatically within Godot using an add-on.
 
 There is also a cost to batching together objects in 3D. Several objects
 rendered as one cannot be individually culled. An entire city that is off-screen
@@ -75,8 +77,8 @@ numbers of distant or low-poly objects.
 For more information on 3D specific optimizations, see
 :ref:`doc_optimizing_3d_performance`.
 
-Reuse Shaders and Materials
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Reuse shaders and materials
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 The Godot renderer is a little different to what is out there. It's designed to
 minimize GPU state changes as much as possible. :ref:`StandardMaterial3D
@@ -94,12 +96,12 @@ possible. Godot's priorities are:
 that are enabled or disabled with a check box) even if they have different
 parameters.
 
-If a scene has, for example, ``20,000`` objects with ``20,000`` different
-materials each, rendering will be slow. If the same scene has ``20,000``
-objects, but only uses ``100`` materials, rendering will be much faster.
+If a scene has, for example, 20,000 objects with 20,000 different
+materials each, rendering will be slow. If the same scene has 20,000
+objects, but only uses 100 materials, rendering will be much faster.
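Editorial illustration of the 20,000-object example above. This Python sketch uses a deliberately simplified model in which the only cost counted is how often the bound material changes while drawing in order. It is not how Godot's renderer is implemented; `count_material_binds` and the round-robin scene layout are assumptions made for the example.

```python
def count_material_binds(material_ids):
    """Count how often the bound material changes while drawing in order."""
    binds = 0
    current = None
    for material in material_ids:
        if material != current:
            binds += 1
            current = material
    return binds


OBJECTS = 20_000
MATERIALS = 100

# Hypothetical scene: objects cycle through the 100 materials in turn.
shared_scene = [i % MATERIALS for i in range(OBJECTS)]

# Every object has its own material: one bind per object.
unique_binds = count_material_binds(range(OBJECTS))
# Shared materials drawn in arbitrary order: still one bind per object here.
unsorted_binds = count_material_binds(shared_scene)
# Shared materials drawn sorted by material: one bind per material.
sorted_binds = count_material_binds(sorted(shared_scene))

print(unique_binds, unsorted_binds, sorted_binds)
```

Under this model, sharing materials only pays off once objects using the same material are drawn together, which is the kind of reordering a renderer that minimizes state changes aims for.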
 
 Pixel cost versus vertex cost
-=============================
+-----------------------------
 
 You may have heard that the lower the number of polygons in a model, the faster
 it will be rendered. This is *really* relative and depends on many factors.
@@ -152,15 +154,17 @@ Pay attention to the additional vertex processing required when using:
 
 - Skinning (skeletal animation)
 - Morphs (shape keys)
-- Vertex-lit objects (common on mobile)
+
+.. Not implemented in Godot 4.x yet. Uncomment when this is implemented.
+   - Vertex-lit objects (common on mobile)
 
 Pixel/fragment shaders and fill rate
-====================================
+------------------------------------
 
 In contrast to vertex processing, the costs of fragment (per-pixel) shading have
-increased dramatically over the years. Screen resolutions have increased (the
+increased dramatically over the years. Screen resolutions have increased: the
 area of a 4K screen is 8,294,400 pixels, versus 307,200 for an old 640×480 VGA
-screen, that is 27x the area), but also the complexity of fragment shaders has
+screen. That is 27 times the area! Also, the complexity of fragment shaders has
 exploded. Physically-based rendering requires complex calculations for each
 fragment.
 
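Editorial illustration: the pixel counts quoted in the hunk above can be checked with a few lines of Python. The sketch assumes "4K" means 3840×2160, which matches the 8,294,400 figure. The 60 frames per second target and the single-shade-per-pixel simplification are hypothetical and only serve to give a sense of scale.

```python
def pixel_count(width, height):
    """Number of pixels in a width x height framebuffer."""
    return width * height


uhd_4k = pixel_count(3840, 2160)  # assumed meaning of "4K"
vga = pixel_count(640, 480)
ratio = uhd_4k / vga

# Hypothetical target: shade every pixel exactly once, 60 times per second.
TARGET_FPS = 60
fragments_per_second = uhd_4k * TARGET_FPS

print(f"4K: {uhd_4k:,} pixels, VGA: {vga:,} pixels, ratio: {ratio:.0f}x")
print(f"At {TARGET_FPS} FPS: {fragments_per_second:,} fragment invocations per second")
```

Overdraw, transparency, and post-processing passes shade some pixels more than once, so real fragment workloads are typically higher than this lower bound.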
@@ -190,7 +194,7 @@ their material to decrease the shading cost.
 you can reasonably afford to use.**
 
 Reading textures
-~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^
 
 The other factor in fragment shaders is the cost of reading textures. Reading
 textures is an expensive operation, especially when reading from several
@@ -203,7 +207,7 @@ mobiles.
 algorithms that require as few texture reads as possible.**
 
 Texture compression
-~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^
 
 By default, Godot compresses textures of 3D models when imported using video RAM
 (VRAM) compression. Video RAM compression isn't as efficient in size as PNG or
@@ -227,9 +231,8 @@ textures with transparency (only opaque), so keep this in mind.
 will negatively affect their appearance, without improving performance
 significantly due to their low resolution.
 
-
 Post-processing and shadows
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 Post-processing effects and shadows can also be expensive in terms of fragment
 shading activity. Always test the impact of these on different hardware.
@@ -241,7 +244,7 @@ possible. Smaller or distant OmniLights/SpotLights can often have their shadows
 disabled with only a small visual impact.
 
 Transparency and blending
-=========================
+-------------------------
 
 Transparent objects present particular problems for rendering efficiency. Opaque
 objects (especially in 3D) can be essentially rendered in any order and the
@@ -266,7 +269,7 @@ very expensive. Indeed, in many situations, rendering more complex opaque
 geometry can end up being faster than using transparency to "cheat".
 
 Multi-platform advice
-=====================
+---------------------
 
 If you are aiming to release on multiple platforms, test *early* and test
 *often* on all your platforms, especially mobile. Developing a game on desktop
@@ -278,7 +281,7 @@ to use the Compatibility rendering method for both desktop and mobile platforms
 where you target both.
 
 Mobile/tiled renderers
-======================
+----------------------
 
 As described above, GPUs on mobile devices work in dramatically different ways
 from GPUs on desktop. Most mobile devices use tile renderers. Tile renderers