Commands with z == 0 are not sorted.
Only commands with z != 0 are sorted, using `sort` instead of
`stable_sort`, since it is faster.
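
A minimal sketch of the idea, not the actual Renderer code: the `Command` and `RenderQueue` types below are hypothetical stand-ins, and the split into two vectors is an assumption made for illustration. Commands with z == 0 keep their submission order, and only the others are sorted by depth. Note that `std::sort` does not guarantee the relative order of commands with equal z.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a render command; only the depth matters here.
struct Command {
    float z;
    int id;  // submission order, kept only for illustration
};

// Hypothetical queue: z == 0 commands are stored unsorted,
// z != 0 commands are sorted by depth before drawing.
struct RenderQueue {
    std::vector<Command> zeroZ;
    std::vector<Command> nonZeroZ;

    void push(const Command& c) {
        (c.z == 0.0f ? zeroZ : nonZeroZ).push_back(c);
    }

    void sort() {
        // std::sort instead of std::stable_sort: usually faster, but the
        // order of commands with equal z is unspecified.
        std::sort(nonZeroZ.begin(), nonZeroZ.end(),
                  [](const Command& a, const Command& b) { return a.z < b.z; });
    }
};
```

Pushing commands with z values 0, 2, 0, -1 and calling `sort()` leaves the two z == 0 commands in submission order and orders the other two as -1, 2.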
Instead of a 64-bit int key built from viewport, opaque and depth,
the sort key is now a 32-bit float holding only the depth.
This saves time because:
- the 32-bit float no longer has to be converted into a 24-bit int
- keys are shorter
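
To illustrate the difference, here is a sketch of the two key styles. The bit layout of the 64-bit key (viewport in the upper 32 bits, opaque flag at bit 24, depth in the low 24 bits) and the assumption that depth is already normalized to [0, 1] are made up for this example; the function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical old-style key. The depth must first be quantized from a
// 32-bit float to a 24-bit int, then packed with the other fields.
inline std::uint64_t makeKey64(std::uint32_t viewport, bool opaque, float depth01) {
    const std::uint64_t depth24 =
        static_cast<std::uint64_t>(depth01 * 16777215.0f);  // 2^24 - 1
    return (static_cast<std::uint64_t>(viewport) << 32)
         | (static_cast<std::uint64_t>(opaque ? 1 : 0) << 24)
         | depth24;
}

// New-style key: the depth is used as-is, with no conversion or packing.
inline float makeKeyFloat(float depth) {
    return depth;
}
```

With the float key, comparing two commands is a single float comparison and building the key is free.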
QuadCommand no longer stores a copy of the quads.
Instead, it stores only a reference to them and the MV (model-view) matrix.
Later, when the Renderer copies the quads to the queue, it
converts them to world coordinates.
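
A rough sketch of this flow, assuming simplified types: `Vec3`, `Quad`, `Mat4` and `appendToBatch` are hypothetical names, the `QuadCommand` here is a stripped-down stand-in for the real class, quads carry positions only, and the matrix is assumed to be column-major.

```cpp
#include <array>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Quad { std::array<Vec3, 4> v; };  // simplified: positions only

// Column-major 4x4 matrix (assumed layout).
struct Mat4 {
    float m[16];
    Vec3 transformPoint(const Vec3& p) const {
        return { m[0] * p.x + m[4] * p.y + m[8]  * p.z + m[12],
                 m[1] * p.x + m[5] * p.y + m[9]  * p.z + m[13],
                 m[2] * p.x + m[6] * p.y + m[10] * p.z + m[14] };
    }
};

// Hypothetical command: refers to the caller's quads instead of copying them.
struct QuadCommand {
    const Quad* quads;
    std::size_t quadCount;
    Mat4 modelView;
};

// Hypothetical renderer step: the only copy happens here, and each vertex
// is transformed by the MV matrix while it is copied into the batch.
inline void appendToBatch(const QuadCommand& cmd, std::vector<Quad>& batch) {
    for (std::size_t i = 0; i < cmd.quadCount; ++i) {
        Quad out = cmd.quads[i];
        for (Vec3& p : out.v) {
            p = cmd.modelView.transformPoint(p);
        }
        batch.push_back(out);
    }
}
```

One consequence of storing a reference in this sketch is that the referenced quads must stay alive and unchanged until the command has been processed.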