mirror of
https://github.com/godotengine/godot-docs.git
synced 2025-12-31 17:49:03 +03:00
Merge pull request #7647 from Calinou/update-performance
@@ -1,12 +1,10 @@
|
||||
:article_outdated: True
|
||||
|
||||
.. _doc_cpu_optimization:
|
||||
|
||||
CPU optimization
|
||||
================
|
||||
|
||||
Measuring performance
|
||||
=====================
|
||||
---------------------
|
||||
|
||||
We have to know where the "bottlenecks" are to know how to speed up our program.
|
||||
Bottlenecks are the slowest parts of the program that limit the rate that
|
||||
@@ -18,7 +16,7 @@ lead to small performance improvements.
|
||||
For the CPU, the easiest way to identify bottlenecks is to use a profiler.
|
||||
|
||||
CPU profilers
|
||||
=============
|
||||
-------------
|
||||
|
||||
Profilers run alongside your program and take timing measurements to work out
|
||||
what proportion of time is spent in each function.
|
||||
@@ -31,7 +29,7 @@ slow down your project significantly.
|
||||
After profiling, you can look back at the results for a frame.
|
||||
|
||||
.. figure:: img/godot_profiler.png
|
||||
.. figure:: img/godot_profiler.png
|
||||
:align: center
|
||||
:alt: Screenshot of the Godot profiler
|
||||
|
||||
Results of a profile of one of the demo projects.
|
||||
@@ -51,7 +49,7 @@ For more info about using Godot's built-in profiler, see
|
||||
:ref:`doc_debugger_panel`.
|
||||
|
||||
External profilers
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
------------------
|
||||
|
||||
Although the Godot IDE profiler is very convenient and useful, sometimes you
|
||||
need more power, and the ability to profile the Godot engine source code itself.
|
||||
@@ -87,7 +85,7 @@ batching, which greatly speeds up 2D rendering by reducing bottlenecks in this
|
||||
area.
|
||||
|
||||
Manually timing functions
|
||||
=========================
|
||||
-------------------------
|
||||
|
||||
Another handy technique, especially once you have identified the bottleneck
|
||||
using a profiler, is to manually time the function or area under test.
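
As a minimal sketch of manual timing in GDScript, assuming Godot 4's ``Time``
singleton: ``update_enemies()`` is a hypothetical placeholder for the code under
test, not a function from this page.

.. code-block:: gdscript

    extends Node

    # Hypothetical stand-in for the function you want to measure.
    func update_enemies() -> void:
        pass

    func _ready() -> void:
        # Read the engine's microsecond tick counter before and after the call.
        var time_start := Time.get_ticks_usec()
        update_enemies()
        var time_end := Time.get_ticks_usec()
        print("update_enemies() took %d microseconds" % (time_end - time_start))

A single measurement can be noisy, so it may help to run the call several times
and compare averages.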
|
||||
@@ -115,7 +113,7 @@ time them as you go. This will give you crucial feedback as to whether the
|
||||
optimization is working (or not).
|
||||
|
||||
Caches
|
||||
======
|
||||
------
|
||||
|
||||
CPU caches are something else to be particularly aware of, especially when
|
||||
comparing timing results of two different versions of a function. The results
|
||||
@@ -148,7 +146,7 @@ rendering and physics. Still, you should be especially aware of caching when
|
||||
writing GDExtensions.
|
||||
|
||||
Languages
|
||||
=========
|
||||
---------
|
||||
|
||||
Godot supports a number of different languages, and it is worth bearing in mind
|
||||
that there are trade-offs involved. Some languages are designed for ease of use
|
||||
@@ -159,7 +157,7 @@ language you choose. If your project is making a lot of calculations in its own
|
||||
code, consider moving those calculations to a faster language.
|
||||
|
||||
GDScript
|
||||
~~~~~~~~
|
||||
^^^^^^^^
|
||||
|
||||
:ref:`GDScript <toc-learn-scripting-gdscript>` is designed to be easy to use and iterate,
|
||||
and is ideal for making many types of games. However, in this language, ease of
|
||||
@@ -168,7 +166,7 @@ calculations, consider moving some of your project to one of the other
|
||||
languages.
|
||||
|
||||
C#
|
||||
~~
|
||||
^^
|
||||
|
||||
:ref:`C# <toc-learn-scripting-C#>` is popular and has first-class support in Godot. It
|
||||
offers a good compromise between speed and ease of use. Beware of possible
|
||||
@@ -177,13 +175,13 @@ common approach to workaround issues with garbage collection is to use *object
|
||||
pooling*, which is outside the scope of this guide.
|
||||
|
||||
Other languages
|
||||
~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^
|
||||
|
||||
Third parties provide support for several other languages, including `Rust
|
||||
<https://github.com/godot-rust/gdext>`_.
|
||||
|
||||
C++
|
||||
~~~
|
||||
^^^
|
||||
|
||||
Godot is written in C++. Using C++ will usually result in the fastest code.
|
||||
However, on a practical level, it is the most difficult to deploy to end users'
|
||||
@@ -192,7 +190,7 @@ GDExtensions and
|
||||
:ref:`custom modules <doc_custom_modules_in_cpp>`.
|
||||
|
||||
Threads
|
||||
=======
|
||||
-------
|
||||
|
||||
Consider using threads when making a lot of calculations that can run in
|
||||
parallel to each other. Modern CPUs have multiple cores, each one capable of
|
||||
@@ -211,7 +209,7 @@ debugger doesn't support setting up breakpoints in threads yet.
|
||||
For more information on threads, see :ref:`doc_using_multiple_threads`.
|
||||
|
||||
SceneTree
|
||||
=========
|
||||
---------
|
||||
|
||||
Although Nodes are an incredibly powerful and versatile concept, be aware that
|
||||
every node has a cost. Built-in functions such as ``_process()`` and
|
||||
@@ -236,7 +234,7 @@ You can avoid the SceneTree altogether by using Server APIs. For more
|
||||
information, see :ref:`doc_using_servers`.
|
||||
|
||||
Physics
|
||||
=======
|
||||
-------
|
||||
|
||||
In some situations, physics can end up becoming a bottleneck. This is
|
||||
particularly the case with complex worlds and large numbers of physics objects.
|
||||
|
||||
@@ -1,12 +1,10 @@
|
||||
:article_outdated: True
|
||||
|
||||
.. _doc_general_optimization:
|
||||
|
||||
General optimization tips
|
||||
=========================
|
||||
|
||||
Introduction
|
||||
~~~~~~~~~~~~
|
||||
------------
|
||||
|
||||
In an ideal world, computers would run at infinite speed. The only limit to
|
||||
what we could achieve would be our imagination. However, in the real world, it's
|
||||
@@ -48,7 +46,7 @@ But in reality, there are several different kinds of performance problems:
|
||||
Each of these is annoying to the user, but in different ways.
|
||||
|
||||
Measuring performance
|
||||
=====================
|
||||
---------------------
|
||||
|
||||
Probably the most important tool for optimization is the ability to measure
|
||||
performance - to identify where bottlenecks are, and to measure the success of
|
||||
@@ -57,19 +55,24 @@ our attempts to speed them up.
|
||||
There are several methods of measuring performance, including:
|
||||
|
||||
- Putting a start/stop timer around code of interest.
|
||||
- Using the Godot profiler.
|
||||
- Using external third-party CPU profilers.
|
||||
- Using GPU profilers/debuggers such as
|
||||
`NVIDIA Nsight Graphics <https://developer.nvidia.com/nsight-graphics>`__
|
||||
or `apitrace <https://apitrace.github.io/>`__.
|
||||
- Checking the frame rate (with V-Sync disabled).
|
||||
- Using the :ref:`Godot profiler <doc_the_profiler>`.
|
||||
- Using :ref:`external CPU profilers <doc_using_cpp_profilers>`.
|
||||
- Using external GPU profilers/debuggers such as
|
||||
`NVIDIA Nsight Graphics <https://developer.nvidia.com/nsight-graphics>`__,
|
||||
`Radeon GPU Profiler <https://gpuopen.com/rgp/>`__ or
|
||||
`Intel Graphics Performance Analyzers <https://www.intel.com/content/www/us/en/developer/tools/graphics-performance-analyzers/overview.html>`__.
|
||||
- Checking the frame rate (with V-Sync disabled). Third-party utilities such as
|
||||
`RivaTuner Statistics Server <https://www.guru3d.com/files-details/rtss-rivatuner-statistics-server-download.html>`__
|
||||
(Windows) or `MangoHud <https://github.com/flightlessmango/MangoHud>`__
|
||||
(Linux) can also be useful here.
|
||||
- Using an unofficial `debug menu add-on <https://github.com/godot-extended-libraries/godot-debug-menu>`__.
|
||||
|
||||
Be very aware that the relative performance of different areas can vary on
|
||||
different hardware. It's often a good idea to measure timings on more than one
|
||||
device. This is especially the case if you're targeting mobile devices.
|
||||
|
||||
Limitations
|
||||
~~~~~~~~~~~
|
||||
^^^^^^^^^^^
|
||||
|
||||
CPU profilers are often the go-to method for measuring performance. However,
|
||||
they don't always tell the whole story.
|
||||
@@ -87,7 +90,7 @@ As a result of these limitations, you often need to use detective work to find
|
||||
out where bottlenecks are.
|
||||
|
||||
Detective work
|
||||
~~~~~~~~~~~~~~
|
||||
--------------
|
||||
|
||||
Detective work is a crucial skill for developers (both in terms of performance,
|
||||
and also in terms of bug fixing). This can include hypothesis testing, and
|
||||
@@ -119,7 +122,7 @@ Once you know which of the two halves contains the bottleneck, you can
|
||||
repeat this process until you've pinned down the problematic area.
|
||||
|
||||
Profilers
|
||||
=========
|
||||
---------
|
||||
|
||||
Profilers allow you to time your program while running it. Profilers then
|
||||
provide results telling you what percentage of time was spent in different
|
||||
@@ -133,7 +136,7 @@ and lead to slower performance.
|
||||
For more info about using Godot's built-in profiler, see :ref:`doc_the_profiler`.
|
||||
|
||||
Principles
|
||||
==========
|
||||
----------
|
||||
|
||||
`Donald Knuth <https://en.wikipedia.org/wiki/Donald_Knuth>`__ said:
|
||||
|
||||
@@ -163,7 +166,7 @@ optimization is (by definition) undesirable, performant software is the result
|
||||
of performant design.
|
||||
|
||||
Performant design
|
||||
~~~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^^^
|
||||
|
||||
The danger with encouraging people to ignore optimization until necessary, is
|
||||
that it conveniently ignores that the most important time to consider
|
||||
@@ -178,7 +181,7 @@ will often run many times faster than a mediocre design with low-level
|
||||
optimization.
|
||||
|
||||
Incremental design
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Of course, in practice, unless you have prior knowledge, you are unlikely to
|
||||
come up with the best design the first time. Instead, you'll often make a series
|
||||
@@ -195,7 +198,7 @@ structures and algorithms for *cache locality* of data and linear access, rather
|
||||
than jumping around in memory.
|
||||
|
||||
The optimization process
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Assuming we have a reasonable design, and taking our lessons from Knuth, our
|
||||
first step in optimization should be to identify the biggest bottlenecks - the
|
||||
@@ -212,7 +215,7 @@ The process is thus:
|
||||
3. Return to step 1.
|
||||
|
||||
Optimizing bottlenecks
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Some profilers will even tell you which part of a function (which data accesses,
|
||||
calculations) are slowing things down.
|
||||
@@ -237,10 +240,10 @@ positive effect will be outweighed by the negatives of more complex code, and
|
||||
you may choose to leave out that optimization.
|
||||
|
||||
Appendix
|
||||
========
|
||||
--------
|
||||
|
||||
Bottleneck math
|
||||
~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^
|
||||
|
||||
The proverb *"a chain is only as strong as its weakest link"* applies directly to
|
||||
performance optimization. If your project is spending 90% of the time in
|
||||
|
||||
@@ -1,12 +1,10 @@
|
||||
:article_outdated: True
|
||||
|
||||
.. _doc_gpu_optimization:
|
||||
|
||||
GPU optimization
|
||||
================
|
||||
|
||||
Introduction
|
||||
~~~~~~~~~~~~
|
||||
------------
|
||||
|
||||
The demand for new graphics features and progress almost guarantees that you
|
||||
will encounter graphics bottlenecks. Some of these can be on the CPU side, for
|
||||
@@ -25,16 +23,16 @@ more difficult to take measurements. In many cases, the only way of measuring
|
||||
performance is by examining changes in the time spent rendering each frame.
|
||||
|
||||
Draw calls, state changes, and APIs
|
||||
===================================
|
||||
-----------------------------------
|
||||
|
||||
.. note:: The following section is not relevant to end-users, but is useful to
|
||||
provide background information that is relevant in later sections.
|
||||
|
||||
Godot sends instructions to the GPU via a graphics API (OpenGL, OpenGL ES or
|
||||
Vulkan). The communication and driver activity involved can be quite costly,
|
||||
especially in OpenGL and OpenGL ES. If we can provide these instructions in a
|
||||
way that is preferred by the driver and GPU, we can greatly increase
|
||||
performance.
|
||||
Godot sends instructions to the GPU via a graphics API (Vulkan, OpenGL, OpenGL
|
||||
ES or WebGL). The communication and driver activity involved can be quite
|
||||
costly, especially in OpenGL, OpenGL ES and WebGL. If we can provide these
|
||||
instructions in a way that is preferred by the driver and GPU, we can greatly
|
||||
increase performance.
|
||||
|
||||
Nearly every API command in OpenGL requires a certain amount of validation to
|
||||
make sure the GPU is in the correct state. Even seemingly simple commands can
|
||||
@@ -44,17 +42,21 @@ as much as possible so they can be rendered together, or with the minimum number
|
||||
of these expensive state changes.
|
||||
|
||||
2D batching
|
||||
~~~~~~~~~~~
|
||||
^^^^^^^^^^^
|
||||
|
||||
In 2D, the costs of treating each item individually can be prohibitively high -
|
||||
there can easily be thousands of them on the screen. This is why 2D *batching*
|
||||
is used. Multiple similar items are grouped together and rendered in a batch,
|
||||
via a single draw call, rather than making a separate draw call for each item.
|
||||
In addition, this means state changes, material and texture changes can be kept
|
||||
to a minimum.
|
||||
is used with OpenGL-based rendering methods. Multiple similar items are grouped
|
||||
together and rendered in a batch, via a single draw call, rather than making a
|
||||
separate draw call for each item. In addition, this means state changes,
|
||||
material and texture changes can be kept to a minimum.
|
||||
|
||||
Vulkan-based rendering methods do not use 2D batching yet. Since draw calls are
|
||||
much cheaper with Vulkan compared to OpenGL, there is less of a need to have 2D
|
||||
batching (although it can still be beneficial in some cases).
|
||||
|
||||
3D batching
|
||||
~~~~~~~~~~~
|
||||
^^^^^^^^^^^
|
||||
|
||||
In 3D, we still aim to minimize draw calls and state changes. However, it can be
|
||||
more difficult to batch together several objects into a single draw call. 3D
|
||||
@@ -62,7 +64,7 @@ meshes tend to comprise hundreds or thousands of triangles, and combining large
|
||||
meshes in real-time is prohibitively expensive. The cost of joining them quickly
|
||||
exceeds any benefits as the number of triangles grows per mesh. A much better
|
||||
alternative is to **join meshes ahead of time** (static meshes in relation to each
|
||||
other). This can either be done by artists, or programmatically within Godot.
|
||||
other). This can be done by artists, or programmatically within Godot using an add-on.
|
||||
|
||||
There is also a cost to batching together objects in 3D. Several objects
|
||||
rendered as one cannot be individually culled. An entire city that is off-screen
|
||||
@@ -75,8 +77,8 @@ numbers of distant or low-poly objects.
|
||||
For more information on 3D specific optimizations, see
|
||||
:ref:`doc_optimizing_3d_performance`.
|
||||
|
||||
Reuse Shaders and Materials
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Reuse shaders and materials
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The Godot renderer is a little different to what is out there. It's designed to
|
||||
minimize GPU state changes as much as possible. :ref:`StandardMaterial3D
|
||||
@@ -94,12 +96,12 @@ possible. Godot's priorities are:
|
||||
that are enabled or disabled with a check box) even if they have different
|
||||
parameters.
|
||||
|
||||
If a scene has, for example, ``20,000`` objects with ``20,000`` different
|
||||
materials each, rendering will be slow. If the same scene has ``20,000``
|
||||
objects, but only uses ``100`` materials, rendering will be much faster.
|
||||
If a scene has, for example, 20,000 objects with 20,000 different
|
||||
materials each, rendering will be slow. If the same scene has 20,000
|
||||
objects, but only uses 100 materials, rendering will be much faster.
|
||||
|
||||
Pixel cost versus vertex cost
|
||||
=============================
|
||||
-----------------------------
|
||||
|
||||
You may have heard that the lower the number of polygons in a model, the faster
|
||||
it will be rendered. This is *really* relative and depends on many factors.
|
||||
@@ -152,15 +154,17 @@ Pay attention to the additional vertex processing required when using:
|
||||
|
||||
- Skinning (skeletal animation)
|
||||
- Morphs (shape keys)
|
||||
- Vertex-lit objects (common on mobile)
|
||||
|
||||
.. Not implemented in Godot 4.x yet. Uncomment when this is implemented.
|
||||
- Vertex-lit objects (common on mobile)
|
||||
|
||||
Pixel/fragment shaders and fill rate
|
||||
====================================
|
||||
------------------------------------
|
||||
|
||||
In contrast to vertex processing, the costs of fragment (per-pixel) shading have
|
||||
increased dramatically over the years. Screen resolutions have increased (the
|
||||
increased dramatically over the years. Screen resolutions have increased: the
|
||||
area of a 4K screen is 8,294,400 pixels, versus 307,200 for an old 640×480 VGA
|
||||
screen, that is 27x the area), but also the complexity of fragment shaders has
|
||||
screen. That is 27 times the area! Also, the complexity of fragment shaders has
|
||||
exploded. Physically-based rendering requires complex calculations for each
|
||||
fragment.
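
As a quick check of the figures above, assuming "4K" refers to a 3840×2160
display:

.. math::

   3840 \times 2160 = 8{,}294{,}400, \qquad
   640 \times 480 = 307{,}200, \qquad
   \frac{8{,}294{,}400}{307{,}200} = 27
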
|
||||
|
||||
@@ -190,7 +194,7 @@ their material to decrease the shading cost.
|
||||
you can reasonably afford to use.**
|
||||
|
||||
Reading textures
|
||||
~~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^^
|
||||
|
||||
The other factor in fragment shaders is the cost of reading textures. Reading
|
||||
textures is an expensive operation, especially when reading from several
|
||||
@@ -203,7 +207,7 @@ mobiles.
|
||||
algorithms that require as few texture reads as possible.**
|
||||
|
||||
Texture compression
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
By default, Godot compresses textures of 3D models when imported using video RAM
|
||||
(VRAM) compression. Video RAM compression isn't as efficient in size as PNG or
|
||||
@@ -227,9 +231,8 @@ textures with transparency (only opaque), so keep this in mind.
|
||||
will negatively affect their appearance, without improving performance
|
||||
significantly due to their low resolution.
|
||||
|
||||
|
||||
Post-processing and shadows
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Post-processing effects and shadows can also be expensive in terms of fragment
|
||||
shading activity. Always test the impact of these on different hardware.
|
||||
@@ -241,7 +244,7 @@ possible. Smaller or distant OmniLights/SpotLights can often have their shadows
|
||||
disabled with only a small visual impact.
|
||||
|
||||
Transparency and blending
|
||||
=========================
|
||||
-------------------------
|
||||
|
||||
Transparent objects present particular problems for rendering efficiency. Opaque
|
||||
objects (especially in 3D) can be essentially rendered in any order and the
|
||||
@@ -266,7 +269,7 @@ very expensive. Indeed, in many situations, rendering more complex opaque
|
||||
geometry can end up being faster than using transparency to "cheat".
|
||||
|
||||
Multi-platform advice
|
||||
=====================
|
||||
---------------------
|
||||
|
||||
If you are aiming to release on multiple platforms, test *early* and test
|
||||
*often* on all your platforms, especially mobile. Developing a game on desktop
|
||||
@@ -278,7 +281,7 @@ to use the Compatibility rendering method for both desktop and mobile platforms
|
||||
where you target both.
|
||||
|
||||
Mobile/tiled renderers
|
||||
======================
|
||||
----------------------
|
||||
|
||||
As described above, GPUs on mobile devices work in dramatically different ways
|
||||
from GPUs on desktop. Most mobile devices use tile renderers. Tile renderers
|
||||
|
||||
@@ -1,5 +1,3 @@
|
||||
:article_outdated: True
|
||||
|
||||
.. _doc_performance:
|
||||
|
||||
Performance
|
||||
@@ -9,7 +7,7 @@ Introduction
|
||||
------------
|
||||
|
||||
Godot follows a balanced performance philosophy. In the performance world,
|
||||
there are always trade-offs, which consist of trading speed for usability
|
||||
there are always tradeoffs, which consist of trading speed for usability
|
||||
and flexibility. Some practical examples of this are:
|
||||
|
||||
- Rendering large amounts of objects efficiently is easy, but when a
|
||||
|
||||
@@ -1,5 +1,3 @@
|
||||
:article_outdated: True
|
||||
|
||||
.. meta::
|
||||
:keywords: optimization
|
||||
|
||||
@@ -9,7 +7,7 @@ Optimizing 3D performance
|
||||
=========================
|
||||
|
||||
Culling
|
||||
=======
|
||||
-------
|
||||
|
||||
Godot will automatically perform view frustum culling in order to prevent
|
||||
rendering objects that are outside the viewport. This works well for games that
|
||||
@@ -17,7 +15,7 @@ take place in a small area, however things can quickly become problematic in
|
||||
larger levels.
|
||||
|
||||
Occlusion culling
|
||||
~~~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^^^
|
||||
|
||||
Walking around a town for example, you may only be able to see a few buildings
|
||||
in the street you are in, as well as the sky and a few birds flying overhead. As
|
||||
@@ -29,23 +27,14 @@ than what is visible.
|
||||
|
||||
Things aren't quite as bad as they seem, because the Z-buffer usually allows the
|
||||
GPU to only fully shade the objects that are at the front. This is called *depth
|
||||
prepass* and is enabled by default in Godot when using the GLES3 renderer.
|
||||
However, unneeded objects are still reducing performance.
|
||||
prepass* and is enabled by default in Godot when using the Forward+ or
|
||||
Compatibility rendering methods. However, unneeded objects are still reducing
|
||||
performance.
|
||||
|
||||
One way we can potentially reduce the amount to be rendered is to take advantage
|
||||
of occlusion. As of Godot 3.2.2, there is no built in support for occlusion in
|
||||
Godot. However, with careful design you can still get many of the advantages.
|
||||
|
||||
For instance, in our city street scenario, you may be able to work out in advance
|
||||
that you can only see two other streets, ``B`` and ``C``, from street ``A``.
|
||||
Streets ``D`` to ``Z`` are hidden. In order to take advantage of occlusion, all
|
||||
you have to do is work out when your viewer is in street ``A`` (perhaps using
|
||||
Godot Areas), then you can hide the other streets.
|
||||
|
||||
This is a manual version of what is known as a "potentially visible set". It is
|
||||
a very powerful technique for speeding up rendering. You can also use it to
|
||||
restrict physics or AI to the local area, and speed these up as well as
|
||||
rendering.
|
||||
One way we can potentially reduce the amount to be rendered is to **take advantage
|
||||
of occlusion**. Godot 4.0 and later offers a new approach to occlusion culling
|
||||
using occluder nodes. See :ref:`doc_occlusion_culling` for instructions on
|
||||
setting up occlusion culling in your scene.
|
||||
|
||||
.. note::
|
||||
|
||||
@@ -54,15 +43,8 @@ rendering.
|
||||
from seeing too far away, which would decrease performance due to the lost
|
||||
opportunities for occlusion culling.
|
||||
|
||||
Other occlusion techniques
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
There are other occlusion techniques such as portals, automatic PVS, and
|
||||
raster-based occlusion culling. Some of these may be available through add-ons
|
||||
and may be available in core Godot in the future.
|
||||
|
||||
Transparent objects
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
-------------------
|
||||
|
||||
Godot sorts objects by :ref:`Material <class_Material>` and :ref:`Shader
|
||||
<class_Shader>` to improve performance. This, however, can not be done with
|
||||
@@ -76,7 +58,7 @@ For more information, see the :ref:`GPU optimizations <doc_gpu_optimization>`
|
||||
doc.
|
||||
|
||||
Level of detail (LOD)
|
||||
=====================
|
||||
---------------------
|
||||
|
||||
In some situations, particularly at a distance, it can be a good idea to
|
||||
**replace complex geometry with simpler versions**. The end user will probably
|
||||
@@ -85,13 +67,30 @@ in the far distance. There are several strategies for replacing models at
|
||||
varying distance. You could use lower poly models, or use transparency to
|
||||
simulate more complex geometry.
|
||||
|
||||
Godot 4 offers several ways to control level of detail:
|
||||
|
||||
- An automatic approach on mesh import using :ref:`doc_mesh_lod`.
|
||||
- A manual approach configured in the 3D node using :ref:`doc_visibility_ranges`.
|
||||
- :ref:`Decals <doc_using_decals>` and :ref:`lights <doc_lights_and_shadows>`
|
||||
can also benefit from level of detail using their respective
|
||||
**Distance Fade** properties.
|
||||
|
||||
While they can be used independently, these approaches are most effective when
|
||||
used together. For example, you can set up visibility ranges to hide particle
|
||||
effects that are too far away from the player to notice. At the same time, you
|
||||
can rely on mesh LOD to render the particle effect's meshes with less
detail at a distance.
|
||||
|
||||
Visibility ranges are also a good way to set up *impostors* for distant geometry
|
||||
(see below).
|
||||
|
||||
Billboards and imposters
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The simplest version of using transparency to deal with LOD is billboards. For
|
||||
example, you can use a single transparent quad to represent a tree at distance.
|
||||
This can be very cheap to render, unless of course, there are many trees in
|
||||
front of each other. In which case transparency may start eating into fill rate
|
||||
front of each other. In this case, transparency may start eating into fill rate
|
||||
(for more information on fill rate, see :ref:`doc_gpu_optimization`).
|
||||
|
||||
An alternative is to render not just one tree, but a number of trees together as
|
||||
@@ -106,7 +105,7 @@ significantly. This can be complex to get working, but may be worth it depending
|
||||
on the type of project you are making.
|
||||
|
||||
Use instancing (MultiMesh)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
If several identical objects have to be drawn in the same place or nearby, try
|
||||
using :ref:`MultiMesh <class_MultiMesh>` instead. MultiMesh allows the drawing
|
||||
@@ -114,32 +113,46 @@ of many thousands of objects at very little performance cost, making it ideal
|
||||
for flocks, grass, particles, and anything else where you have thousands of
|
||||
identical objects.
|
||||
|
||||
Also see the :ref:`Using MultiMesh <doc_using_multimesh>` doc.
|
||||
See also the :ref:`Using MultiMesh <doc_using_multimesh>` documentation.
|
||||
|
||||
Bake lighting
|
||||
=============
|
||||
-------------
|
||||
|
||||
Lighting objects is one of the most costly rendering operations. Realtime
|
||||
lighting, shadows (especially multiple lights), and GI are especially expensive.
|
||||
They may simply be too much for lower power mobile devices to handle.
|
||||
lighting, shadows (especially multiple lights), and
|
||||
:ref:`global illumination <doc_introduction_to_global_illumination>` are especially
|
||||
expensive. They may simply be too much for lower power mobile devices to handle.
|
||||
|
||||
**Consider using baked lighting**, especially for mobile. This can look fantastic,
|
||||
but has the downside that it will not be dynamic. Sometimes, this is a trade-off
|
||||
but has the downside that it will not be dynamic. Sometimes, this is a tradeoff
|
||||
worth making.
|
||||
|
||||
In general, if several lights need to affect a scene, it's best to use
|
||||
:ref:`doc_using_lightmap_gi`. Baking can also improve the scene quality by adding
|
||||
indirect light bounces.
|
||||
See :ref:`doc_using_lightmap_gi` for instructions on using baked lightmaps. For
|
||||
best performance, you should set lights' bake mode to **Static** as opposed to
|
||||
the default **Dynamic**, as this will skip real-time lighting on meshes that
|
||||
have baked lighting.
|
||||
|
||||
The downside of lights with the **Static** bake mode is that they can't cast
|
||||
shadows onto meshes with baked lighting. This can make scenes with outdoor
|
||||
environments and dynamic objects look flat. A good balance between performance
|
||||
and quality is to keep **Dynamic** for the :ref:`class_DirectionalLight3D` node,
|
||||
and use **Static** for most (if not all) omni and spot lights.
|
||||
|
||||
Animation and skinning
|
||||
======================
|
||||
----------------------
|
||||
|
||||
Animation and vertex animation such as skinning and morphing can be very
|
||||
expensive on some platforms. You may need to lower the polycount considerably
|
||||
for animated models or limit the number of them on screen at any one time.
|
||||
for animated models, or limit the number of them on screen at any given time.
|
||||
You can also reduce the animation rate for distant or occluded meshes, or pause
|
||||
the animation entirely if the player is unlikely to notice the animation being
|
||||
stopped.
|
||||
|
||||
The :ref:`class_VisibleOnScreenEnabler3D` and :ref:`class_VisibleOnScreenNotifier3D`
|
||||
nodes can be useful for this purpose.
|
||||
|
||||
Large worlds
|
||||
============
|
||||
------------
|
||||
|
||||
If you are making large worlds, there are different considerations than what you
|
||||
may be familiar with from smaller games.
|
||||
@@ -149,6 +162,8 @@ move around the world. This can prevent memory use from getting out of hand, and
|
||||
also limit the processing needed to the local area.
|
||||
|
||||
There may also be rendering and physics glitches due to floating point error in
|
||||
large worlds. You may be able to use techniques such as orienting the world
|
||||
around the player (rather than the other way around), or shifting the origin
|
||||
periodically to keep things centred around ``Vector3(0, 0, 0)``.
|
||||
large worlds. This can be resolved using :ref:`doc_large_world_coordinates`.
|
||||
If using large world coordinates is not an option, you may be able to use techniques
|
||||
such as orienting the world around the player (rather than the other way
|
||||
around), or shifting the origin periodically to keep things centred around
|
||||
``Vector3(0, 0, 0)``.
|
||||
|
||||