I just significantly increased my frame-rate by efficiently hiding off-screen objects


The Tall Man


Joined: 16th Nov 2013

Location: Earth

Posted: 13th Dec 2013 07:10 Edited at: 2nd Jan 2014 01:23


I have tripled my frame rate by hiding all objects, except those that are within the camera's immediate view!

I had noticed that objects behind walls or other objects in front of me would slow down my games' performance, even though they were not on-screen, and that I could increase performance by hiding them. I've read that the DarkGDK code or DirectX is supposed to do this automatically, but in my tests it either doesn't, or is extremely ineffective at it.

So I just created some functions that allow me to show only the objects that are within the camera's view at its present position, and hide all the rest. And I'm getting much better performance! - Now I can be much more liberal in the numbers and details of the objects I put into my games!

I've made too many customizations (including to the source code) to share the actual code here; it would be a bit too involved. But the way it works is this...

How it's done efficiently at run-time.
In FPS Creator, the maps are basically quantized to a grid. Every block is about 3 or 4 meters square, or cubed. I've written my own map creation class that also works in blocks. And I've taken that idea a bit further by quantizing the player's/camera's position - that is, I determine which integer grid block the player's position is currently in. Then I have a 2D array, indexed by integers x and z, where each entry points to a NULL-terminated list of sObject*'s (what you get when you call dbGetObject()), in which I directly modify the bVisible flag (the same thing that dbHideObject() and dbShowObject() modify). All objects start out hidden. Then whenever the player's position moves into another grid block, the function hides the presently visible objects from the list at the previous grid position, and then shows all the ones from the list at the current grid position.
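The block-swapping logic described above could be sketched roughly as follows. This is not the original code (which was not shared). `Object`, `VisibilityGrid`, `BlockAt` and `Update` are hypothetical stand-ins, and `Object` only mimics the one sObject flag that matters here.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for DarkGDK's sObject; only the flag this sketch touches.
struct Object { bool bVisible = false; };

struct VisibilityGrid {
    int width, depth;                         // grid size in blocks
    float blockSize;                          // world units per block (assumed uniform)
    std::vector<std::vector<Object*>> lists;  // one visible-object list per block
    int currentBlock = -1;                    // block whose list is currently showing

    VisibilityGrid(int w, int d, float size)
        : width(w), depth(d), blockSize(size), lists(w * d) {
        assert(w > 0 && d > 0 && size > 0.0f);
    }

    // Quantize a world position to a block index; -1 when outside the map.
    int BlockAt(float x, float z) const {
        int bx = static_cast<int>(std::floor(x / blockSize));
        int bz = static_cast<int>(std::floor(z / blockSize));
        if (bx < 0 || bz < 0 || bx >= width || bz >= depth) return -1;
        return bz * width + bx;
    }

    // Call once per frame. Flags are only touched when the block changes.
    // Returns true if a swap happened.
    bool Update(float camX, float camZ) {
        int block = BlockAt(camX, camZ);
        if (block == currentBlock) return false;
        if (currentBlock >= 0)
            for (Object* o : lists[currentBlock]) o->bVisible = false;  // hide old list
        if (block >= 0)
            for (Object* o : lists[block]) o->bVisible = true;          // show new list
        currentBlock = block;
        return true;
    }
};
```

Hiding the old list before showing the new one means an object that appears in both lists simply ends up visible, so no comparison between the two lists is needed.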

2D Grid Array of Lists - 2 Memory Allocations
The 2D array is done with one memory allocation, using a memory management class I wrote some time ago. And all the sObject* lists share a single allocation of their own - so it doesn't drive the Windows memory manager crazy to have a variable that's an sObject****.
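One way to picture the "all lists in one allocation" layout is sketched below. It is only an illustration under assumptions: `PackedLists`, `AddList` and `SetVisible` are made-up names, std::vector stands in for the custom memory manager, and the grid stores offsets into the pool rather than raw pointers so the sketch stays valid while the pool grows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Object { bool bVisible = false; };  // hypothetical stand-in for sObject

struct PackedLists {
    std::vector<Object*> pool;       // every list back to back, each ending in nullptr
    std::vector<std::size_t> start;  // start[block] = offset of that block's list in pool

    // Append the list for the next block and terminate it.
    void AddList(const std::vector<Object*>& objects) {
        start.push_back(pool.size());
        for (Object* o : objects) {
            assert(o != nullptr);    // nullptr is reserved as the terminator
            pool.push_back(o);
        }
        pool.push_back(nullptr);
    }

    // Walk one block's list up to its terminator, set the flag, return the count.
    int SetVisible(std::size_t block, bool visible) {
        assert(block < start.size());
        int count = 0;
        for (std::size_t i = start[block]; pool[i] != nullptr; ++i) {
            pool[i]->bVisible = visible;
            ++count;
        }
        return count;
    }
};
```

An empty list costs a single terminator entry, which is why adding more lists per block is cheap in this layout.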

Quote: "Insert: January 1, 2014: You could make this even more efficient. Instead of each list containing all objects visible from that grid block, note that consecutive grid blocks would share most of their list contents in common. You could instead have 8 lists per block (1 for each surrounding block) that would store only the changes in the hide/show status of objects. So each of the 8 lists would actually consist of 2 lists: 1 of objects to hide, and 1 of objects to unhide. Although you'd have 16 times as many lists, each list would be far less than 1/16 the size, so it would not only reduce the already very low overhead at run-time, but it would use even less memory as well (it doesn't use much already). And remember that all the lists share a single allocation, and the grid array used to access a "list" merely stores a pointer to the spot in the single list allocation where that particular NULL-terminated list begins (each list being separated by a single NULL, so 16 instead of 1 is free). Keep in mind also that moving from block A to block B, and moving from block B to block A, share identical lists, so there's no need to copy the actual lists. Just swap the hide/show designations.

Additional: With the lists being this efficient and the run-time overhead being virtually non-existent, you could sacrifice some of those without significant cost, to show/hide individual polygons rather than individual object/meshes (if DirectX would efficiently support this). This way a high-poly object would be half the processing: the side facing you."
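The hide/show change lists from the quoted insert could be derived from two full visibility lists like this. It is a sketch, not the poster's code: `BlockDelta` and `MakeDelta` are hypothetical names, and the lists are assumed to hold sorted object numbers.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

struct BlockDelta {
    std::vector<int> hide;  // objects visible from the old block only
    std::vector<int> show;  // objects visible from the new block only
};

// Both inputs must be sorted lists of object numbers.
BlockDelta MakeDelta(const std::vector<int>& from, const std::vector<int>& to) {
    assert(std::is_sorted(from.begin(), from.end()));
    assert(std::is_sorted(to.begin(), to.end()));
    BlockDelta delta;
    std::set_difference(from.begin(), from.end(), to.begin(), to.end(),
                        std::back_inserter(delta.hide));
    std::set_difference(to.begin(), to.end(), from.begin(), from.end(),
                        std::back_inserter(delta.show));
    return delta;
}
```

The delta for moving from B to A is the same pair of lists with hide and show swapped, which matches the remark that the two directions can share storage.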

Keeping the UpdateFOV function and dbHideObject()/dbShowObject() efficiently independent
But of course, while using this UpdateFOV function, I still want to be able to hide and show objects independently of it, given certain events in the game and so forth. So in the source code, I changed the bool bVisible to a BYTE bHidden. dbHideObject() and dbShowObject() affect only bit 0, and my FOV function uses bit 1. An object only shows if all bits are 0, in other words, if (!bHidden).
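The two-bit scheme can be illustrated with a small standalone sketch. The constant names, `SetHidden` and `IsDrawn` are invented for this example, and `Object` is a stand-in for the modified sObject, not the real DarkGDK structure.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint8_t HIDDEN_BY_GAME = 1 << 0;  // bit 0: dbHideObject()/dbShowObject()
constexpr std::uint8_t HIDDEN_BY_FOV  = 1 << 1;  // bit 1: the FOV update function

struct Object { std::uint8_t bHidden = 0; };     // hypothetical stand-in for sObject

// Set or clear one reason for hiding without disturbing the other.
inline void SetHidden(Object& o, std::uint8_t reason, bool hidden) {
    assert(reason == HIDDEN_BY_GAME || reason == HIDDEN_BY_FOV);
    if (hidden) o.bHidden |= reason;
    else        o.bHidden &= static_cast<std::uint8_t>(~reason);
}

// The object is drawn only when no bit is set.
inline bool IsDrawn(const Object& o) { return !o.bHidden; }
```

With this layout an object hidden by game logic stays hidden even when the FOV pass would show it, and the reverse.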

Seeing Which Objects Are Visible From Each Block
Generating the list of visible objects for each grid block was the trickier part. There is much room here for improvement, but I wanted to program this quickly, and the results are great! For now, what I've done is that before run-time, I create a camera that renders to an image at 320x240 (it just needs to detect which objects are in the camera's view). I had to create a second image that's lockable so I can access the data, then tell DirectX to copy the image the camera renders to the second image. Then I can access the data of that second image, which contains what the camera saw. The way I identify which objects the camera is seeing: I use a memblock to create a bunch of 1x1 images where the color is set to (object+2)*50, and I temporarily texture each object with the 1x1 image whose color is unique to that object. There are no lights and ambient is set to 100%, so I can check each pixel to be sure that ((pixel % 50) == 0). Be sure all your objects are showing first: unhide any that are hidden, or they won't ever be visible in your game.
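The color-ID round trip might look something like the sketch below. Several things here are assumptions rather than details from the post: pixels are treated as 32-bit values whose top byte is alpha and gets masked off, and colors below 2*50 are treated as background. `ColourForObject` and `ObjectFromPixel` are made-up names.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t ID_STEP  = 50;          // spacing between object colors
constexpr std::uint32_t RGB_MASK = 0x00FFFFFF;  // assumption: top byte is alpha

// Color for the 1x1 texture of a given object number: (object + 2) * 50.
inline std::uint32_t ColourForObject(int object) {
    assert(object >= 0);
    return (static_cast<std::uint32_t>(object) + 2) * ID_STEP;
}

// Returns the object number a pixel encodes, or -1 for background
// or for pixels that are not an exact multiple of 50.
inline int ObjectFromPixel(std::uint32_t pixel) {
    std::uint32_t rgb = pixel & RGB_MASK;
    if (rgb % ID_STEP != 0) return -1;          // the (pixel % 50) == 0 check
    std::uint32_t slot = rgb / ID_STEP;
    if (slot < 2) return -1;                    // below the +2 offset: no object
    return static_cast<int>(slot - 2);
}
```

Each decoded object number would then be tallied into the per-block histogram described in the next section.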

Compiling the list of objects
Now here's the slow part

I position the camera in each block in 25 different equidistant positions (5x5), at 4 angles each (90 degrees apart). And the camera's FOV is set at 100 degrees, so there is a little overlap - not too much because high FOVs lose lots of their resolution. I don't take the camera all the way ~to~ the block's boundaries, because there may be walls ~at~ those boundaries. So I take it to within 90% of the edges of the block. And I compile a histogram of objects present in the frame. Then once all 100 shots have been taken for that block, I can read back the histogram to see which objects were visible from somewhere in that block.
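The 100 camera placements per block could be generated along these lines. This is a sketch under one stated assumption: "within 90% of the edges" is read as spreading the samples across the central 90% of the block. `CameraShot` and `ShotsForBlock` are hypothetical names, and the rendering itself is left out.

```cpp
#include <cassert>
#include <vector>

struct CameraShot { float x, z, yawDegrees; };

// Sample positions for one block: perSide x perSide points, 4 headings each.
std::vector<CameraShot> ShotsForBlock(int bx, int bz, float blockSize,
                                      int perSide = 5, float coverage = 0.9f) {
    assert(perSide >= 2 && blockSize > 0.0f);
    std::vector<CameraShot> shots;
    const float margin = blockSize * (1.0f - coverage) * 0.5f;  // gap kept from each edge
    const float step   = blockSize * coverage / (perSide - 1);  // spacing between samples
    for (int iz = 0; iz < perSide; ++iz)
        for (int ix = 0; ix < perSide; ++ix)
            for (int a = 0; a < 4; ++a)                         // 0, 90, 180, 270 degrees
                shots.push_back({bx * blockSize + margin + ix * step,
                                 bz * blockSize + margin + iz * step,
                                 a * 90.0f});
    return shots;
}
```

With the defaults this yields 5 x 5 x 4 = 100 shots, matching the count in the text. Passing perSide = 2 gives the 2x2 variant mentioned later in the thread.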

If you did the actual math instead, based on the camera and vertices and all the connections, I'm sure you could do it quickly during pre-game time, without a big delay. But I didn't want to invest my time right now in that direction.

Saving to a file which can be quickly loaded in the future
Now, all this is done ~before~ the game's run-time. And I'm saving the results to a file, which I associate with the map it was created from. So it's only slow once. The results saved to the file are the object numbers (not the sObject*'s).
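The post does not describe the file format, so the following is purely an illustrative guess: a text layout with one line of object numbers per grid block. `BlockLists`, `SaveLists` and `LoadLists` are hypothetical names. The functions take streams, so a std::ofstream or std::ifstream opened on the map's companion file can be passed in.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using BlockLists = std::vector<std::vector<int>>;  // object numbers per block

// Write one line per block; an empty line means no visible objects.
void SaveLists(std::ostream& out, const BlockLists& lists) {
    for (const std::vector<int>& list : lists) {
        for (std::size_t i = 0; i < list.size(); ++i) {
            assert(list[i] >= 0);                  // object numbers are non-negative
            if (i) out << ' ';
            out << list[i];
        }
        out << '\n';
    }
}

// Read the lines back in block order.
BlockLists LoadLists(std::istream& in) {
    BlockLists lists;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::vector<int> list;
        for (int object; fields >> object; ) list.push_back(object);
        lists.push_back(list);
    }
    return lists;
}
```

After loading, each object number would be turned back into an sObject* with dbGetObject(), since pointers are not stable between runs.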

The kind of map this is ideal with, and how to adapt from single-object .dbo's
As I said above, the map class I've created is how I'm creating my maps. Each unique floor, and each unique ceiling are each one object. Each wall is one object. I don't have the entire barrage of static objects all grouped as "1 object" as is done with a .dbo file. This class assembles the walls by creating planes. Each plane is a new object, which can be shown or hidden at will.

If you're starting from a .dbo file or some other grouping of objects into one "object", chances are each object has a ton of meshes, each being a unique wall or other object, each with its own texture. I haven't checked into this, but it would make sense. You have access to those meshes and their textures via the sObject* that you get back from dbGetObject(). So you should be able to cycle through your meshes and textures and create a new object from each one, which can then be used more efficiently with this technique. On second thought, I just looked, and the meshes do each have their own bVisible flag, so you could try it per mesh within each object (except for characters or local objects or those that really are single objects in their use).

The speed increase I'm getting is very significant and well worth the effort!

To Access the meshes and textures within an object:

+ Code Snippet

sObject* objectPtr = dbGetObject(object);
if (objectPtr)
{
   // Cycle through all the meshes within that object
   int meshIndex = objectPtr->iMeshCount;
   while (meshIndex--)
   {
      // Cycle through all the textures within each mesh
      int textureIndex = objectPtr->ppMeshList[meshIndex]->dwTextureCount;
      while (textureIndex--)
      {
         objectPtr->ppMeshList[meshIndex]->pTextures[textureIndex].dwMinState = D3DTEXF_ANISOTROPIC; // Switch filtering to anisotropic.
      }
   }
}

Edit #1 - Pre-Transformation
I think I saw a transform flag in the objects or meshes. I'm thinking if I pre-transform all my static objects, then set those flags to false, after doing some reading about how DirectX renders, I might speed things up significantly, yet again! Although I ~think~ .dbo files already take advantage of this, unless you change its scale or position or something like that.

Quote: "Insert: January 1, 2014: But regardless - any object transformations only need to be done if/when something changes, not for every object and every frame. Even if a transformation relates to the camera angle, this won't necessarily change each frame either. And the extra processing time would be available to do other lower-priority things in the background."

Edit #2 - Adapting To Include Dynamic Objects
The above idea could be adapted for dynamic objects too. So no more having to limit which ones are dynamic for fear of them slowing things down. Another such list could be created that marks which grid blocks are visible from each grid block. The way to do that would be to temporarily place a cube in each grid block that matches its dimensions. If the cube is visible to the camera, then so is that grid block. But you'd have to do one cube at a time or else they would end up blocking each other! During game-time, any dynamic object within that block would then be known to be in or out of the camera's FOV from within any grid block. To figure out whether a dynamic object is within a grid block, have a radius for it on hand (precomputed with the rest of the FOV pre-computations). Then during the game, with the position and radius, you could quickly compute all the grid blocks the object is intersecting - and that is only necessary while it's moving. And it doesn't have to be perfect like we'd all like collision physics to be, to get the benefit here.
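A rough sketch of the dynamic-object check described above follows. It assumes a precomputed block-to-block visibility table and approximates the object's footprint by the square of blocks its radius touches. `BlockVisibility`, `Mark` and `MightBeVisible` are invented names, not anything from DarkGDK.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct BlockVisibility {
    int width, depth;        // grid size in blocks
    float blockSize;         // world units per block
    std::vector<bool> sees;  // sees[from * blockCount + to], filled before run-time

    BlockVisibility(int w, int d, float size)
        : width(w), depth(d), blockSize(size), sees(w * d * w * d, false) {
        assert(w > 0 && d > 0 && size > 0.0f);
    }

    int Index(int bx, int bz) const { return bz * width + bx; }
    void Mark(int from, int to) { sees[from * width * depth + to] = true; }

    // True if any block overlapped by the object's bounding square
    // is visible from the camera's block. Deliberately conservative.
    bool MightBeVisible(int cameraBlock, float x, float z, float radius) const {
        assert(cameraBlock >= 0 && cameraBlock < width * depth);
        const int x0 = std::max(0, static_cast<int>(std::floor((x - radius) / blockSize)));
        const int x1 = std::min(width - 1, static_cast<int>(std::floor((x + radius) / blockSize)));
        const int z0 = std::max(0, static_cast<int>(std::floor((z - radius) / blockSize)));
        const int z1 = std::min(depth - 1, static_cast<int>(std::floor((z + radius) / blockSize)));
        for (int bz = z0; bz <= z1; ++bz)
            for (int bx = x0; bx <= x1; ++bx)
                if (sees[cameraBlock * width * depth + Index(bx, bz)]) return true;
        return false;
    }
};
```

The check only needs to be repeated when the object or the camera crosses into a different block, which keeps the per-frame cost low.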

Q: ...you mean computer imagery was still based on the paradigm that the world was flat? Even into the 21st century??? Talk about doing something the hard way!

A: Yep! Back then people would render simple shapes with complex meshes of thousands of flat little triangles. Next to the bottleneck processors they used, it's the main reason why their computers were so slow. In the last days of the religious atmosphere of centralization and trade, corporate dogmas had people believing that flat was faster.


Rudolpho


Joined: 28th Dec 2005

Location: Sweden

Posted: 13th Dec 2013 11:54


Interesting take on this.
I believe frustum culling would be more efficient in the long run however. You could still divide your scene into segments and then cull those (as cubes of whatever fits) in order that you would then only need to carry out the frustum culling check for each object that belongs to the visible segments.
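For comparison, a per-segment frustum test of the kind suggested here is often done with a bounding sphere against six planes. The sketch below is generic and not tied to DarkGDK: `Plane`, `Sphere` and `SphereInFrustum` are hypothetical names, and extracting the planes from the camera matrices is left out.

```cpp
#include <array>
#include <cassert>

struct Plane  { float nx, ny, nz, d; };      // points inside satisfy n.p + d >= 0
struct Sphere { float x, y, z, radius; };

// A sphere is rejected as soon as it lies entirely behind any one plane.
bool SphereInFrustum(const std::array<Plane, 6>& planes, const Sphere& s) {
    assert(s.radius >= 0.0f);
    for (const Plane& p : planes) {
        const float distance = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (distance < -s.radius) return false;  // fully outside this plane
    }
    return true;                                 // inside or straddling the frustum
}
```

Testing one sphere per segment first, and only then the objects inside accepted segments, is the two-level scheme the post describes. Note that this only rejects things outside the view volume. It does not detect objects hidden behind walls.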

"Why do programmers get Halloween and Christmas mixed up?"

+ Code Snippet

Because Oct(31) = Dec(25)


The Tall Man


Joined: 16th Nov 2013

Location: Earth

Posted: 13th Dec 2013 18:43 Edited at: 14th Dec 2013 20:46


Yeah, I'd read a little bit about frustum culling, but it seemed that it was still repeated for each frame. Combining the two would certainly improve things a little bit, though I don't think by much. It would be the equivalent of using smaller, more precise blocks (so 2 or 3 fewer objects might be visible at a time), which would use more system (not video) memory. Although this really doesn't use that much memory as it stands now, so there is room for making the blocks smaller. And actually with smaller blocks, 3x3 checks could be done instead of the 5x5. Or maybe even 2x2 - that would double the speed of my laughably inefficient camera-views process.

Hmmmmm..... Perhaps I should double the grid resolution. I could reuse identical entries of the sObject* lists (like when multiple blocks see the exact same list of objects).

About angle of view, and not wasting processing power on objects behind the camera: I've noticed that DarkGDK already seems to do that pretty well, so at least I don't have to take that into account as well. I can let it do its job there. It's just the objects in front, specifically those blocked by others, that need the fixing.

Most games I've played would gain a massive performance boost if they would implement something like this, and they don't for some strange reason...

Edit: - Update
I've now doubled the resolution of my grid. So each block I create with my map class is divided into 4 blocks for the FOV lists. I also lowered the resolution of the camera output to 256x128, and an aspect ratio of 1. But I think I can do better with the camera, I'm gonna create a little function to allow me to set the FOV of the X and Y independently instead of with an aspect ratio. I'm now using a 2x2 matrix (for camera positions within each grid mini-block), instead of the 5x5 I was using before.

I've also just learned how to lift the 60fps cap of my frame rate by disabling the v-sync, and I can now see that with a huge map and lots of mesh objects all showing with the dbShowObject() command, since adding these new functions my frame-rate has approximately tripled!!!

Btw, here is the function to set a camera's FOV independently for X and Y:

+ Code Snippet

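#define _USE_MATH_DEFINES // needed for M_PI with Visual C++
#include <cmath>

// Set a camera's horizontal and vertical FOV (both in degrees) independently.
void SetCameraFOV (int camera, double fovX, double fovY)
{
   // aspect = tan(fovX / 2) / tan(fovY / 2); degrees * pi / 360 is the half-angle in radians.
   float aspectRatio = (float)(tan(fovX * M_PI / 360.0) / tan(fovY * M_PI / 360.0));

   dbSetCameraFOV(camera, (float)fovY);
   dbSetCameraAspect(camera, aspectRatio);
}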

Edit: Another Update:
I ended up deciding to go with the same resolution as my map blocks, rather than doubling it. It just makes more sense and is easier to manage.
