Wednesday, February 6, 2013

Polycount vs Vertex count

Junior artists ask me this question often enough that I should address it. There is some discussion about it on Polycount.com, but I want to make it very clear. This article is written mainly for artists.

How should I measure the cost of my models: by poly count or by triangle count?

Short answer: VERTEX COUNT!!! The poly or triangle count says very little about the cost of your model.

Long answer: For GPUs, all that matters is how many vertices need to be processed, and how much of the screen the triangles will cover.
It's true that 4 vertices may form 2 triangles, or up to 4 triangles (i.e. overlapping each other), and the latter will make the GPU reference the same 4 vertices more times. However, modern GPUs have caches large enough to store the result of processing a vertex, which makes this problem negligible. Any decent exporter (or even the modeling application) will arrange the vertices in a cache-friendly way. It's not something the artist has to worry about.
If the exporter isn't rearranging the vertices, introduce your tools programmer to the wonderful AMD Tootle.
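For the programmers reading along, here is a minimal sketch of why the cache makes the "same 4 vertices, more triangles" case cheap. It assumes a simple FIFO cache, which is only a rough model (real GPUs differ and don't document the details), and the function name `vertex_shader_runs` is made up for this example.

```python
from collections import deque


def vertex_shader_runs(indices, cache_size):
    """Count vertex shader invocations for an index buffer.

    Assumes a FIFO post-transform cache holding `cache_size` vertices.
    """
    cache = deque(maxlen=cache_size)  # oldest entries fall out automatically
    runs = 0
    for index in indices:
        if index not in cache:
            runs += 1            # cache miss: the vertex must be processed
            cache.append(index)
    return runs


two_triangles = [0, 1, 2, 2, 1, 3]                      # 4 vertices, 2 triangles
four_triangles = [0, 1, 2, 0, 2, 3, 0, 1, 3, 1, 2, 3]   # 4 vertices, 4 triangles

print(vertex_shader_runs(two_triangles, 0))    # 6 (no cache at all)
print(vertex_shader_runs(two_triangles, 16))   # 4
print(vertex_shader_runs(four_triangles, 0))   # 12 (no cache at all)
print(vertex_shader_runs(four_triangles, 16))  # 4
```

With a cache in place, both layouts cost 4 vertex shader runs, so the cost follows the number of vertices rather than the number of triangles.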

As for N-Gons, they only exist in Maya/Max/Blender. They ease modeling and texturing for the artist, but when the model is exported to a game engine, everything is converted to triangles. Everything, even quads.
That's because all the GPU can process are triangles. GPUs that can natively render quads haven't been popular since the Sega Saturn.
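As an illustration of that conversion, here is a minimal sketch of the simplest way to triangulate an N-Gon, a "fan" around its first vertex. It assumes the polygon is convex; real exporters may well use something smarter (ear clipping, for instance) to handle concave shapes, and `triangulate_fan` is just a name picked for this example.

```python
def triangulate_fan(polygon):
    """Split a convex polygon (a list of vertex indices) into triangles.

    Every triangle shares the first vertex of the polygon.
    """
    first = polygon[0]
    return [(first, polygon[i], polygon[i + 1])
            for i in range(1, len(polygon) - 1)]


print(triangulate_fan([0, 1, 2, 3]))     # [(0, 1, 2), (0, 2, 3)]
print(triangulate_fan([0, 1, 2, 3, 4]))  # [(0, 1, 2), (0, 2, 3), (0, 3, 4)]
```

An N-Gon with N corners always becomes N - 2 triangles, and no new vertices are created in the process.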

What if I told you a cube can be made with 36 vertices, 24 vertices, or 8?

The vertex count that Max/Maya/Blender show you is not the vertex count that really matters, although sometimes it can serve as a rough estimate when multiplied by a factor (that factor depends on each model).
For example, Nathan's model from Distant Souls is 12,959 vertices in Blender (22,876 tris for those not yet used to it), but once exported it's 14,003 vertices.


So, why is that? Because every time there is a discontinuity between two triangles, the vertex must be duplicated. Normals are stored per vertex, not per face. A strong difference between the normals of two faces means the same vertex has two different normals, which means two duplicated vertices: same position, different normal.
Let's take a look at the following cube. It has 8 vertices and 6 faces (obviously!):
So, you would assume that's 8 vertices, right? Guess again. If exported correctly, it should be 24. If badly exported, it will be 36.
Let's see the bad case first, how it could be 36 vertices:


 

The exporter sees we're using flat shading/hard edges, so it just takes the easy route and makes three separate vertices for each triangle. Quick 'n dirty. The cube has 6 faces, 2 triangles per face. That means 12 triangles. Because there are 3 vertices per triangle, 12 x 3 = 36:
 6 faces x 2 triangles per face x 3 vertices per triangle = 36
 
But let's get a little smarter, we can reuse a few vertices:


 
As seen in the picture, the 2 triangles of each face share 2 vertices that have the same normal, hence we don't need to duplicate them. That leaves us with 4 vertices per face. The other faces don't share the same normal, so we still need to duplicate the vertices across faces. What we get is:
6 faces x 4 vertices per face = 24
And finally there is smooth shading. Due to the way the normals are placed, it gives a soft look, since it's trying to mimic the lighting of a round ball:

Here, every vertex has the same normal in every triangle that uses it, hence no need to duplicate anything. From a technical standpoint, it's the best case scenario. Not only are there fewer vertices, but they're also shared, which allows the GPU to better utilize the post vertex cache (a cache that reuses the output of the vertex shader for a different triangle, so that the shader doesn't have to process the vertex again).
From an artistic standpoint, whether this is good will depend on the look you're trying to achieve.
Anyway, the result is 8 vertices total.
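The three cube cases above can be double checked with a short script. This is only a sketch of the idea, not how any particular exporter works: it assumes a vertex is the pair (position, normal), ignores UVs, and merges two triangle corners only when that pair is identical. The helper names (`cube_faces`, `count_vertices`) are invented for this example.

```python
def cube_faces():
    """Return six (normal, [four corner positions]) pairs for a unit cube."""
    faces = []
    for axis in range(3):
        for side in (0, 1):
            normal = [0, 0, 0]
            normal[axis] = 1 if side else -1
            u, v = [a for a in range(3) if a != axis]
            quad = []
            for du, dv in ((0, 0), (1, 0), (1, 1), (0, 1)):
                p = [0, 0, 0]
                p[axis], p[u], p[v] = side, du, dv
                quad.append(tuple(p))
            faces.append((tuple(normal), quad))
    return faces


def count_vertices(corners):
    """Corners with identical attributes collapse into one vertex."""
    return len(set(corners))


faces = cube_faces()

# Flat shading: every triangle corner carries the normal of its face.
# (Winding order is ignored here, since it doesn't affect the count.)
flat_corners = []
for normal, (a, b, c, d) in faces:
    for position in (a, b, c, a, c, d):   # two triangles per quad
        flat_corners.append((position, normal))

# Smooth shading: one normal per position, the (unnormalized) sum of the
# normals of the faces touching it.
smooth_normal = {}
for normal, quad in faces:
    for position in quad:
        previous = smooth_normal.get(position, (0, 0, 0))
        smooth_normal[position] = tuple(x + y for x, y in zip(previous, normal))
smooth_corners = [(position, smooth_normal[position]) for position, _ in flat_corners]

print(len(flat_corners))               # 36: no sharing at all (bad exporter)
print(count_vertices(flat_corners))    # 24: flat shading, shared per face
print(count_vertices(smooth_corners))  # 8: smooth shading
```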
In a real world scenario, practice tells us even this smoothly shaded cube won't be 8 vertices, probably 12-16. This is because unwrapping the UV will generate a discontinuity at some point unless you use an extremely distorted UV mapping:
 
When exporting this cube using smooth shading with this UV layout, it contains 14 vertices. Still better than 24.
Green: Duplicated vertices.
Yellow: The original vertices that presented a discontinuity.

Causes for vertex duplication / discontinuities?

There are many factors that can cause a discontinuity. For example:
  • Using flat shading. Use smooth shading instead when possible.
    • Called “hard edges” in Maya (use smooth edges)
    • Using multiple, faceted “smoothing groups” in 3DS Max
    • Called “Flat shading” in Blender
  • Too many UV islands. If the vertex has U=0.5 for one face and U=0.7 for another face, it can't have two U values at the same time, so it has to be duplicated.
    • Note that multiple UV sets are fine as long as they all stay continuous (which gets harder to keep in harmony the more sets you use)
  • Using different Material Ids. Try to batch everything into one material.
Polycount already covers this in excellent detail. There is no need to keep repeating what is said there.
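To make the rule concrete for programmers, here is a minimal sketch of what an exporter conceptually does: a vertex is the whole bundle of attributes, and two triangle corners are merged only if every attribute matches. The layout of that bundle as (position, normal, uv, material) and the names `build_vertex_buffer` and `quad` are assumptions for this example, not the format of any real exporter.

```python
def build_vertex_buffer(corners):
    """Merge identical corners; return (unique vertices, index buffer)."""
    lookup = {}
    vertices, indices = [], []
    for corner in corners:
        if corner not in lookup:
            lookup[corner] = len(vertices)
            vertices.append(corner)
        indices.append(lookup[corner])
    return vertices, indices


UP = (0.0, 0.0, 1.0)  # every corner shares this normal (smooth shading)


def quad(x0, uv_offset, material):
    """Two triangles covering a 1x1 quad that starts at x = x0."""
    points = [(x0, 0.0), (x0 + 1.0, 0.0), (x0 + 1.0, 1.0), (x0, 1.0)]
    a, b, c, d = [((x, y, 0.0), UP, (x + uv_offset, y), material)
                  for x, y in points]
    return [a, b, c, a, c, d]


# Two quads side by side, sharing the edge at x = 1.
continuous = quad(0.0, 0.0, 0) + quad(1.0, 0.0, 0)
uv_seam = quad(0.0, 0.0, 0) + quad(1.0, 0.5, 0)        # second quad on another UV island
two_materials = quad(0.0, 0.0, 0) + quad(1.0, 0.0, 1)  # second quad uses another material

print(len(build_vertex_buffer(continuous)[0]))     # 6
print(len(build_vertex_buffer(uv_seam)[0]))        # 8
print(len(build_vertex_buffer(two_materials)[0]))  # 8
```

In the last two cases the two vertices on the shared edge get duplicated, even though their position and normal never changed.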

Common misconception: Why is the GPU so dumb?

A question that confuses many junior artists is: why is the GPU so dumb? In the example given above (let's forget about UVs) it's clearly smarter to just store 8 vertices with 6 face normals. Why does the GPU need 24 vertices? Why store per vertex normals? The first reaction I get is “That's insane”.
The answer is simply that this is how GPUs work. GPUs are all about raw power. In simple terms, the main difference between a CPU and a GPU is that one tries to be smart, while the other tries to brute force everything.
Think of an analogy: Suppose the Hulk wants to enter a house. The smart thing to do is knock on the door, wait until someone opens it, then close it again. But he's the Hulk. It's clearly easier for him to just smash the whole wall. You wouldn't expect the Hulk not to do that. Even something as simple as opening a door or knocking would be hard for him. The same goes for GPUs.
3DS Max, Maya & Blender will all internally store 8 vertices and 6 face normals. This is because it's storage efficient. Imagine those packages saving a file that is 3 to 6 times its current size. Also, from an artist's point of view, it is much easier to just work with face normals, so the internal representation matches the artist's normal workflow.

But here's one little secret: When they have to display the model in the viewport, they have to convert it on the fly. That perfect 8-vertex cube with 6 face normals gets converted to 24 vertices. That's why you may have noticed that modifying one single vertex in an object with 10 million vertices is so painfully slow, even on high end machines. It's also another big reason why modeling packages such as Maya/Max/Blender can't match the framerate of a game engine. They either convert the whole model on the fly, or sacrifice real time performance for incremental updates so that editing doesn't become so slow.


This is going to change in the future, right?

Well, it's been like this for more than 20 years, so don't get your hopes up. You can try teaching the Hulk to knock on doors. He will keep preferring to break the wall.
There is progress on alternatives to rasterization for real time applications, mainly using Compute Shaders, and sometimes they include good ol' raytracing. But there's nothing groundbreaking so far (the use cases are limited right now), so an 8-vertex cube with hard edges may become feasible in ?? years.

It's worth mentioning that there is a way to use hard edges with just 8 vertices (though it still needs duplicates due to UVs) in Direct3D & OpenGL, which is simply called “Flat shading” (of course!). This feature has been there since the dawn of time.
The reason the tools programmer doesn't ever want you to know about it is:
  • It's all or nothing: either everything is flat shaded, or everything is smooth shaded. You can't mix.
  • The order in which vertices are submitted to the GPU becomes very important to achieve correct lighting; otherwise the wrong normal will be used. And it's a major PITA for the exporter to guarantee the correct order. Sometimes there is not enough information and it becomes practically (but not theoretically) impossible to export the right order, unless we resort to duplicating a few vertices, which is what we were trying to avoid. It's easy for standard primitives (box, sphere, pyramid), but analyzing the triangle indexing of complex surfaces to get the right order is challenging.
If accurate lighting can be sacrificed, or the programmer who wrote the exporter is a genius, and you don't need smooth triangles in your mesh, then flat shading is a feasible alternative. Note that UVs and other causes of vertex discontinuity will still force vertices to be duplicated.
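To give an idea of what the exporter is up against, here is a rough sketch for the 8-vertex cube. With flat shading, a single vertex of each triangle (the "provoking" vertex) supplies the normal for the whole triangle; as far as I know OpenGL uses the last submitted vertex by default and Direct3D the first. The brute force search below tries to pick one such vertex per triangle so that no vertex needs two different normals. It is just an illustration under those assumptions (the function names are made up, and a real exporter would need something far more scalable than backtracking).

```python
def assign_provoking_vertices(triangles, face_normals):
    """Pick one vertex per triangle so that no vertex carries two normals.

    Returns the chosen vertex for each triangle, or None if impossible.
    """
    owner = {}    # vertex index -> the single normal stored on that vertex
    chosen = []   # provoking vertex picked for each triangle so far

    def solve(t):
        if t == len(triangles):
            return True
        for v in triangles[t]:
            if owner.get(v, face_normals[t]) != face_normals[t]:
                continue  # v already carries the normal of another face
            newly_owned = v not in owner
            owner[v] = face_normals[t]
            chosen.append(v)
            if solve(t + 1):
                return True
            chosen.pop()          # undo and try the next candidate
            if newly_owned:
                del owner[v]
        return False

    return chosen if solve(0) else None


def rotate_to_last(triangle, v):
    """Rotate the indices (keeping the winding) so that v is submitted last."""
    i = triangle.index(v)
    return triangle[i + 1:] + triangle[:i + 1]


# 8-vertex cube, vertex index = x + 2*y + 4*z. Winding is not made
# consistent here because it doesn't matter for this experiment.
quads = {
    (-1, 0, 0): (0, 2, 6, 4), (1, 0, 0): (1, 3, 7, 5),
    (0, -1, 0): (0, 1, 5, 4), (0, 1, 0): (2, 3, 7, 6),
    (0, 0, -1): (0, 1, 3, 2), (0, 0, 1): (4, 5, 7, 6),
}
triangles, normals = [], []
for normal, (a, b, c, d) in quads.items():
    triangles += [(a, b, c), (a, c, d)]
    normals += [normal, normal]

picked = assign_provoking_vertices(triangles, normals)
ordered = [rotate_to_last(t, v) for t, v in zip(triangles, picked)]
for triangle, normal in zip(ordered, normals):
    print(triangle, "takes its normal", normal, "from vertex", triangle[-1])
```

The cube works out, but the search is exponential in the worst case and on an arbitrary mesh it can simply fail, which is when the exporter has to fall back to duplicating vertices.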

 

Conclusion

So here you go. This explanation covers why the vertex count of the final exported model is far more important than the vertex count or the poly count that Maya/Max/Blender will show you. A good pipeline workflow should allow you to quickly export & iterate so that you can experiment with the vertex count. Be sure that your model is always “export ready” and have a tool ready that can show you the vertex count.
It's nearly impossible for the human brain to calculate how many more vertices will need to be duplicated, at least for more complex models. The best way is to “just try” and follow the hints: use smooth normals, avoid UV seams, avoid UV islands, use as few material Ids as possible, etc. A rule of thumb is that your vertex count will be somewhere between “vertex count” (according to Maya/Max/Blender) and “Number of tris * 3”, since there's no excuse for having more than three vertices per triangle. If there are N-Gons, you'll need to triangulate them to get an estimate.
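The rule of thumb is trivial to put into a script. Here it is applied to the numbers of Nathan's model quoted earlier; the function name is just something picked for this example.

```python
def exported_vertex_bounds(app_vertex_count, triangle_count):
    """Return the (lowest, highest) exported vertex count the rule of thumb allows."""
    return app_vertex_count, triangle_count * 3


low, high = exported_vertex_bounds(12959, 22876)
exported = 14003  # what the exporter actually produced

print(low, high)                 # 12959 68628
print(low <= exported <= high)   # True
print(round(exported / low, 3))  # 1.081
```

In this case the exported model ends up about 8% above the count shown in Blender, far closer to the lower bound than to the upper one.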

1 comment:

  1. Thanks for explaining this. I really like the Hulk analogy. I can understand hardware optimised for pure number crunching, and that being more "creative" costs more processing. This explained to me that in my voxel game, I NEED to have 4 verts to each face (and indeed, I saw the strange "smooth shading" when I tried to share the faces amongst 8 verts), so when I encounter Unity's "65000 vert per mesh" limit I can't share verts, I have to limit how many cubes are put into a single mesh.
