Sunday, May 27, 2018

Async Shadow Mapping With DirectX 12


This post is about rendering shadows with a DirectX 12 native plugin in Unity 2017.4.3f1.
As the title says, the shadows are rendered entirely on another thread and aren't batched.
The demo renders 10,000 GameObjects on the screen.
It also uses GPU instancing (for textures, not only colors) to save performance.
(Instancing has been in place for years.)

After enabling indirect drawing, shadow rendering got about 0.5 ms faster.
This is the power of DX12!

Why is multithreaded rendering hard and limited in D3D11?

In D3D11, we submit work through contexts.


  • Immediate context - used for executing work; a D3D11 device has only one immediate context.
  • Deferred context - used for recording commands, which are not executed immediately.
Neither context is thread-safe.
We can use deferred contexts to record work (draw, copy, map, etc.) on other threads.
But eventually we need to synchronize our threads and use the immediate context to execute the commands recorded by the deferred contexts.

In the best case, we can divide our work into chunks and synchronize at certain points to submit them. But we can't truly submit work asynchronously.

New rendering models in D3D12.

In D3D12, we no longer use contexts.
Instead, Microsoft introduces a new model for work submission:
  • CommandAllocator - An allocator for command lists. (not thread-safe)
  • CommandList - An interface for recording work. (not thread-safe)
  • CommandQueue - An interface for executing the work recorded by command lists. (thread-safe)
A CommandList is similar to a deferred context: we can create a separate command list on each thread and record rendering work there.

But this time, we execute our commands through a CommandQueue.
And a CommandQueue can be used from any thread.
This makes async rendering possible. The only thing we need to take care of is submitting our work correctly.
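The difference from D3D11 can be sketched without any graphics API at all. The following is only a model in standard C++, not D3D12 code: `CommandListModel`, `CommandQueueModel` and `RecordAndSubmit` are names I made up for illustration, and the mutex stands in for whatever makes a real command queue safe to call from any thread. Each worker records into its own list and submits it by itself, with no hand-off to a single "immediate" thread.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a command list: owned and recorded by one thread only.
struct CommandListModel {
    std::vector<std::string> commands;
    void Draw(int objectId) { commands.push_back("draw " + std::to_string(objectId)); }
};

// Hypothetical stand-in for a command queue: Execute may be called from any thread.
class CommandQueueModel {
public:
    void Execute(CommandListModel list) {
        std::lock_guard<std::mutex> lock(mutex_);
        submitted_.push_back(std::move(list));
    }
    std::size_t SubmittedLists() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return submitted_.size();
    }
    std::size_t SubmittedCommands() const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t total = 0;
        for (const CommandListModel& list : submitted_) total += list.commands.size();
        return total;
    }
private:
    mutable std::mutex mutex_;
    std::vector<CommandListModel> submitted_;
};

// Every worker records its own list and submits it itself.
void RecordAndSubmit(CommandQueueModel& queue, int threadCount, int drawsPerThread) {
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        workers.emplace_back([&queue, t, drawsPerThread] {
            CommandListModel list;                    // one list per thread, never shared
            for (int i = 0; i < drawsPerThread; ++i)
                list.Draw(t * drawsPerThread + i);
            queue.Execute(std::move(list));           // submission from the worker thread
        });
    }
    // Joining here only lets the caller inspect the result afterwards;
    // submission itself did not wait for the other workers.
    for (std::thread& worker : workers) worker.join();
}
```

Calling `RecordAndSubmit(queue, 4, 100)` leaves four lists in the queue, in whatever order the workers finished. In a real renderer that ordering is exactly the part we need to care about.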

D3D12 also provides Bundles, which let us record work only once.
We can then play the bundles back from a command list.


Resource Binding in D3D12.

In D3D11, resource binding is done almost automatically.
In D3D12, resource binding is separated from the management tasks.
This time, we need to make sure the resources that the GPU is going to use are resident.
And we can't touch a resource on the CPU side while the GPU is using it.

To access resources, D3D12 introduces descriptors.
A descriptor is similar to a pointer, and it is referenced through both a CPU handle and a GPU handle.
  • CPU Handles - For immediate use, such as copy operations.
  • GPU Handles - Not for immediate use; they are consumed at GPU execution time.
With descriptors, we can access our resources efficiently.
We can even build a descriptor table to index our shader resources dynamically.

For example, we may want to use textures like this:

                 
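// Minimal interpolator struct, assumed here so the snippet is complete.
struct v2f
{
    float4 pos : SV_POSITION;
    float2 uv : TEXCOORD0;
};

Texture2D DiffuseMap[128] : register(t0);
SamplerState sampler_DiffuseMap;
uint _TexIndex; // texture index chosen at run time

float4 PS(v2f i) : SV_Target
{
    return DiffuseMap[_TexIndex].Sample(sampler_DiffuseMap, i.uv);
}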
         


This is possible with D3D12.
In D3D11, dynamic indexing of shader resources is limited. The index can only be a literal; otherwise the shader compiler turns the lookup into waterfall code that selects the shader resource view.

Dynamic indexing provides unprecedented flexibility and unlocks new rendering techniques.
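To show how the CPU and GPU handles above relate to a slot in a descriptor table, here is a small model in standard C++. It is a sketch under assumptions, not the D3D12 API: `DescriptorHeapModel` and its members are hypothetical names, and the addresses are plain integers. The one thing it borrows from the real API is the idea that the stride between descriptors is not a constant we may hard-code; in D3D12 it is queried with `ID3D12Device::GetDescriptorHandleIncrementSize`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a descriptor heap: two base addresses plus a fixed stride.
struct DescriptorHeapModel {
    std::uint64_t cpuStart;       // base of the CPU-side handle range
    std::uint64_t gpuStart;       // base of the GPU-side handle range
    std::uint32_t incrementSize;  // bytes per descriptor, queried from the device in real code
    std::uint32_t capacity;       // number of descriptors in the heap

    // Handle for immediate CPU-side use (for example as a copy destination).
    std::uint64_t CpuHandle(std::uint32_t index) const {
        assert(index < capacity);
        return cpuStart + static_cast<std::uint64_t>(index) * incrementSize;
    }

    // Handle that is only meaningful when the GPU executes the command list.
    std::uint64_t GpuHandle(std::uint32_t index) const {
        assert(index < capacity);
        return gpuStart + static_cast<std::uint64_t>(index) * incrementSize;
    }
};
```

With a heap of 128 descriptors, slot `n` of the model corresponds to `DiffuseMap[n]` in the shader above, provided the table is bound starting at `GpuHandle(0)`.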

CPU/GPU Synchronization

In D3D12, we need to synchronize the CPU and GPU ourselves.
D3D12 provides Fences for this.
Combined with a ring buffer implementation, fences let us synchronize the CPU and GPU efficiently.

In D3D11, synchronization is almost automatic. But it usually waits for the GPU to finish all of its jobs, which is not very efficient.
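The fence plus ring buffer idea can be modeled in a few lines of standard C++. This is only a sketch: `FenceModel` and `FrameRing` are hypothetical names, and the "GPU" is just an integer that the caller advances by hand. The point is that the CPU waits only when the specific slot it wants to reuse is still in flight, instead of waiting for all GPU work.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a fence: the GPU raises `completed` as it finishes work.
struct FenceModel {
    std::uint64_t completed = 0;
};

// Ring of N frame slots. Each slot remembers the fence value that marks
// the end of the last GPU work that used its resources.
template <std::size_t N>
class FrameRing {
public:
    // True if the next slot can be reused without waiting for the GPU.
    bool CanBegin(const FenceModel& fence) const {
        return fence.completed >= slotFence_[next_];
    }

    // Uses the next slot for a frame, tags it with a new fence value
    // and returns that value (the value the GPU will signal when done).
    std::uint64_t Submit(const FenceModel& fence) {
        assert(CanBegin(fence));
        slotFence_[next_] = ++lastSignaled_;
        next_ = (next_ + 1) % N;
        return lastSignaled_;
    }

private:
    std::array<std::uint64_t, N> slotFence_{};  // 0 means "never used"
    std::uint64_t lastSignaled_ = 0;
    std::size_t next_ = 0;
};
```

With `FrameRing<3>`, the CPU can submit three frames before it has to look at the fence at all. The fourth submission only needs fence value 1 to be completed, not all three.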

Powerful Indirect Drawing

Indirect drawing in D3D12 (ExecuteIndirect) is basically an enhanced version of D3D11's DrawInstancedIndirect/DrawIndexedInstancedIndirect.
With this technique, a million draw calls are possible.
By packing draw calls into an indirect argument buffer, the CPU overhead is reduced significantly.

But it seems that only 32-bit index buffers work properly for now.
I posted a question on the MSDN forum but still haven't gotten an answer.
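To make the indirect argument buffer more concrete, here is a CPU-side sketch in standard C++. `DrawIndexedArgs` mirrors the five fields of `D3D12_DRAW_INDEXED_ARGUMENTS` in the same order. Everything else is an assumption for illustration: `MeshRange` and `BuildArgumentBuffer` are hypothetical names, and packing all per-instance data back to back (so that the start instance is a running sum) is just one possible layout, not necessarily the one my demo uses.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Same five fields, in the same order, as D3D12_DRAW_INDEXED_ARGUMENTS.
struct DrawIndexedArgs {
    std::uint32_t indexCountPerInstance;
    std::uint32_t instanceCount;
    std::uint32_t startIndexLocation;
    std::int32_t  baseVertexLocation;
    std::uint32_t startInstanceLocation;
};
static_assert(sizeof(DrawIndexedArgs) == 20, "five tightly packed 32-bit fields");

// Hypothetical description of one mesh inside shared vertex/index buffers.
struct MeshRange {
    std::uint32_t indexCount;
    std::uint32_t firstIndex;
    std::int32_t  baseVertex;
    std::uint32_t instanceCount;
};

// Builds one argument record per mesh. The resulting array is what would be
// uploaded to the GPU and consumed by a single indirect call.
std::vector<DrawIndexedArgs> BuildArgumentBuffer(const std::vector<MeshRange>& meshes) {
    std::vector<DrawIndexedArgs> args;
    args.reserve(meshes.size());
    std::uint32_t firstInstance = 0;  // assumed layout: instance data packed back to back
    for (const MeshRange& mesh : meshes) {
        args.push_back({mesh.indexCount, mesh.instanceCount, mesh.firstIndex,
                        mesh.baseVertex, firstInstance});
        firstInstance += mesh.instanceCount;
    }
    return args;
}
```

The CPU cost here is filling a small array once. The per-draw work that D3D11 would spend on individual draw calls moves into this buffer.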

D3D12 in Unity

Before Unity 2017.3.0, we could only use DX12 in the editor through the command-line argument -force-d3d12.
But that wasn't safe. I originally created my demo with Unity 5.5, and whenever I modified a line of CG shader code and made the editor recompile it, Unity would crash.

Since 2017.3.0, we can use DX12 in the Unity editor. But it is still experimental.
Unity hasn't fully implemented D3D12.
For example, the target level of CG shaders is still capped at 5.0, while native D3D12 supports Shader Model 5.1.

If we want to utilize D3D12 now, we must write a native plugin.
Through the native plugin interface, we can do native D3D12 rendering.

My demo requires Unity 2017.3.0 or above. (That version also provides 32-bit index buffers, which I need for indirect drawing.)

Summary

D3D12 provides more control over the rendering pipeline, but it also increases the complexity of development. It's recommended to use D3D12 if you are skilled with D3D11.

D3D12 aims to reduce CPU overhead. If your project isn't CPU-bound, moving to D3D12 won't make a significant difference.

Lastly, there are some games implemented with D3D12.
Not all developers have done well with D3D12. (Some games even perform worse than with D3D11.)
But some of these games do well and are impressive. (For example, Gears of War 4.)

Since D3D12 is a huge change, developers need to spend some time optimizing for it. (The same goes for Vulkan.)
I'm looking forward to the development of the new APIs. One day we can enjoy more impressive games!

The project is available on my GitHub:
https://github.com/SquallLiu99/Async-Shadow-Mapping