Program Announcements / jGfx - Rendering and Effects Plugin for DBPro

revenant chaos
Valued Member
10
Years of Service
User Offline
Joined: 21st Mar 2007
Location: Robbinsdale, MN
Posted: 8th Aug 2017 17:50 Edited at: 10th Nov 2017 01:24
Hi Everyone, I would like to present my newest plugin: jGfx. It combines the commandsets of my previous ID3DXEffect and MRT plugins, plus adds commandsets for dynamic cubemap textures, geometry instancing and postfilter cameras. There are no help files yet, but the download comes with a keywords file and a few example projects.

Shader_ Commandset Features:
*) Get/Set effect constants using pointers or by value
*) Supports all 32-bit numerical HLSL datatypes and arrays
*) Uses parameter handles to decrease string-based constant lookups

MRT_ Commandset Features:
*) Provides concurrent rendering to up to 4 target images with a single sync

CubeTex_ Commandset Features:
*) Provides creation of dynamic cubemaps using any supported image format
*) No hard-coded limit to the number of dynamic cubemaps in existence
*) Supports cubemap resolutions which are greater than the screen height
*) Allows rendering individual cubemap faces
*) Can save rendered cubemaps to file

GeoInst_ Commandset Features:
*) Provides hardware accelerated rendering of mesh instances by interleaving vertexdata with an array of instance data
*) Supports user-defined element definition for storing custom data on a per-instance basis

EfxCam_ Commandset Features:
*) Performs off-screen quad renders without being slowed by dbpro object counts
*) Mimics the native camera commands for ease of use



Update 11/2/2017:
================================================
*) Added commands for getting/setting shader vector arrays using arrays of vector pointers
*) Added a command for setting shader techniques by name
*) Added commands for sorting the GeoInst render queue based on user-designated priority values
*) Added MRT capabilities to CubeTex and EfxCam command sets



New Commands (10):
--------------------------------
Shader_SetVectorPointerArray pEfx, ptr_VecPtrArr, VecStride, Count
Shader_GetVectorPointerArray pEfx, ptr_VecPtrArr, VecStride, Count
Shader_SetTechniqueName pEfx, NameStr$

GeoInst_SetRenderPriority pGeoInst, Priority
Priority = GeoInst_GetRenderPriority(pGeoInst)
Index = GeoInst_GetRenderIndex(pGeoInst)
GeoInst_SortRenderQueue

CubeTex_SyncMRT pCubeTex, pCubeTexB, pCubeTexC, pCubeTexD, CamID
CubeTex_SyncFacesMRT pCubeTex, pCubeTexB, pCubeTexC, pCubeTexD, CamID, dwFaceMask

EfxCam_SetMRTImage FxCamID, TargetIndex, ImageID [, D3DFMT]



Command List (154 Total):
+ Code Snippet

All comments, questions, and suggestions are welcome.

EVOLVED
14
Years of Service
User Offline
Joined: 9th Feb 2003
Location: unknown
Posted: 9th Aug 2017 14:06
This is a fantastic plugin, revenant. Finally we can have a fully deferred rendering pipeline.

The current Advanced Lighting (AL) system has been converted to use your plugin here: AdvancedLightingMRT.zip
revenant chaos
Posted: 10th Aug 2017 05:14
Thanks EVOLVED, I'm glad to see this is being put to good use.
revenant chaos
Posted: 10th Aug 2017 11:32
Updated the first post with a new version.
Kuper
9
Years of Service
User Offline
Joined: 25th Feb 2008
Playing: Planescape:Torment
Posted: 10th Aug 2017 14:04
Good job! The plugin is really great!
I want to use your object instance system instead of DBPro's. It works fine on both my PC and my laptop,
but will it work on others? You have the "CheckDeviceSupport" commands, so is there a chance that things will not
work properly for someone?
revenant chaos
Posted: 10th Aug 2017 14:20
GeoInst_CheckDeviceSupport() returns true if the device supports concurrent data streams, and it probably wasn't necessary. Even old DX8 GPUs supported up to 8 streams, so it is very unlikely that any hardware in use today couldn't support it.
revenant chaos
Posted: 13th Aug 2017 23:21
Updated the first post with a new version.
Chris Tate
DBPro Master
9
Years of Service
Recently Online
Joined: 29th Aug 2008
Location: London, England
Posted: 22nd Aug 2017 12:28
I look forward to trying this out
Kuper
Posted: 22nd Aug 2017 19:41
I tried to use the GeoInst system with objects which have multiple UV texture stages and a custom diffuse color, and this looks to be impossible.
GeoInst uses TEXCOORD1-TEXCOORD4 to build the matrix and TEXCOORD5 for special per-instance data. COLOR0 is not in use, but it still does not work.
So I guess the only solution is to move the object UV data to stages TEXCOORD6 and TEXCOORD7 manually?
That also requires enabling up to 8 UV stages with the "Convert Object FVF" function, which will increase memory
usage. When you load complex level geometry this can be critical.
Maybe it would be better to move the GeoInst data to stages TEXCOORD3 - TEXCOORD7, which often are not in use at all?

revenant chaos
Posted: 23rd Aug 2017 07:42 Edited at: 23rd Aug 2017 11:45
Quote: "GeoInst uses TEXCOORD1-TEXCOORD4 to build the matrix and TEXCOORD5 for special per-instance data. COLOR0 is not in use, but it still does not work."

The TEXCOORD indices used for passing instance data (world matrix and/or custom data) are not rigid and will be shifted to accommodate a mesh's existing UV stages. If the object uses TEXCOORD0 and 1, then the world matrix will be passed using TEXCOORDs 2-5. The plugin uses DBPro's tangent and binormal generation, which is known to cause issues with vertex colors. Does your shader make use of normal mapping?

Quote: "So I guess the only solution is to move the object UV data to stages TEXCOORD6 and TEXCOORD7 manually?"
That won't work. The world matrix (and custom data) are stored within a separate vertex buffer which is interleaved with the mesh's vertex buffer at render time. If you add TEXCOORDs 6 and 7 to the source object's vertex data, it will consume all 8 TEXCOORDs and there will be no remaining elements for passing instance data.
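
The slot shifting described above can be sketched in a few lines of Python. This is only an illustration, not the plugin's code: it assumes one TEXCOORD slot per matrix row and per custom element, takes the limit of 8 TEXCOORDs from this post, and `assign_instance_slots` is a hypothetical helper name.

```python
MAX_TEXCOORDS = 8       # limit mentioned in the post (TEXCOORD0..7)
WORLD_MATRIX_ROWS = 4   # assumption: one float4 matrix row per TEXCOORD slot


def assign_instance_slots(mesh_uv_stages, custom_elements=0):
    """Return (matrix_slots, custom_slots) placed after the mesh's own UV stages."""
    needed = mesh_uv_stages + WORLD_MATRIX_ROWS + custom_elements
    if needed > MAX_TEXCOORDS:
        raise ValueError(f"{needed} TEXCOORDs needed, only {MAX_TEXCOORDS} available")
    first = mesh_uv_stages
    matrix_slots = list(range(first, first + WORLD_MATRIX_ROWS))
    custom_start = first + WORLD_MATRIX_ROWS
    custom_slots = list(range(custom_start, custom_start + custom_elements))
    return matrix_slots, custom_slots


# Mesh with TEXCOORD0 and 1: the matrix lands in TEXCOORDs 2-5.
print(assign_instance_slots(2))      # ([2, 3, 4, 5], [])
# Mesh with only TEXCOORD0 plus one custom element: matrix in 1-4, custom data in 5.
print(assign_instance_slots(1, 1))   # ([1, 2, 3, 4], [5])
```

Under these assumptions, a mesh that already occupies all 8 TEXCOORDs raises an error because nothing is left for the instance data.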

I've attached an example which demonstrates geometry instancing for objects with 2 texture stages and vertex diffuse.

revenant chaos
Posted: 23rd Aug 2017 09:56 Edited at: 23rd Aug 2017 11:44
[Orig Post Deleted]
It turns out the "bug" I had been experiencing with FVF formats was just me being silly...
0x100 = 256
0x200 = 512
0x300 = 768
0x100||0x200 = 768 = 0x300
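
For readers who hit the same surprise: in a Direct3D 9 FVF code the texture coordinate count is a 4-bit number stored in bits 8-11, not a set of independent flags, so OR-ing two counts produces a third count. A small Python sketch of that arithmetic (the constant values are the standard D3D9 ones; `fvf_tex` and `tex_count` are hypothetical helper names):

```python
# Direct3D 9 FVF constants (standard values from d3d9types.h).
D3DFVF_XYZ = 0x002
D3DFVF_NORMAL = 0x010
D3DFVF_TEXCOUNT_MASK = 0xF00
D3DFVF_TEXCOUNT_SHIFT = 8


def fvf_tex(count):
    """Return the FVF bits declaring `count` texture coordinate sets (0-8)."""
    if not 0 <= count <= 8:
        raise ValueError("a D3D9 FVF supports 0 to 8 texcoord sets")
    return count << D3DFVF_TEXCOUNT_SHIFT


def tex_count(fvf):
    """Extract the number of texcoord sets from an FVF code."""
    return (fvf & D3DFVF_TEXCOUNT_MASK) >> D3DFVF_TEXCOUNT_SHIFT


# OR-ing the codes for 1 set and 2 sets gives the code for 3 sets.
combined = fvf_tex(1) | fvf_tex(2)
print(hex(combined), tex_count(combined))   # 0x300 3

# XYZ | NORMAL | TEX1 declares exactly one texcoord set.
fvf = D3DFVF_XYZ | D3DFVF_NORMAL | fvf_tex(1)
print(hex(fvf), tex_count(fvf))             # 0x112 1
```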
Kuper
Posted: 23rd Aug 2017 18:35
Quote: "I've attached an example which demonstrates geometry instancing for objects with 2 texture stages and vertex diffuse."

Thanks a lot! It works fine for me, even with the SetCustomData function.
The conflict between DIFFUSE and the binormal can be avoided by writing the binormal data to some of the UV channels, which can be done manually
with DBPro.

Quote: "It turns out the "bug" I had been experiencing with FVF formats was just me being silly..."

Yep, it behaves like a sum. All my life I did this the wrong way.

PS. This works also:

`dont generate TEXCOORD3 at all
Convert Object FVF Object,0x002||0x010||0x100
Lock Vertexdata for limb Object,0
VtxCount = Get VertexData Vertex Count()-1
for i=0 to VtxCount
   u#=get vertexdata u(i,0)
   v#=get vertexdata v(i,0)
   set vertexdata uv i,0,u#*0.125,v# //TEXCOORD0
   set vertexdata uv i,1,u#*0.25,v# //TEXCOORD1
   set vertexdata uv i,2,u#*0.5,v# //TEXCOORD2
   set vertexdata uv i,3,u#*2,v# //TEXCOORD3
next i
Unlock Vertexdata

So DBPro generates the FVF automatically when you start to edit UV data which does not
exist yet.

revenant chaos
Posted: 23rd Aug 2017 19:24
Quote: "PS. This works also:"
Quote: "So DBPro generates the FVF automatically when you start to edit UV data which does not exist yet."

Actually, it seems those calls to set TEXCOORD1-3 would simply do nothing. If you apply a shader which requires those additional stages, it will not work unless the FVF was set up properly.
Kuper
Posted: 23rd Aug 2017 20:28
I also found this small bug:
the "disable object zwrite" command doesn't work =(
It is useful for backdrops and anything else which must always be drawn behind everything else.
(I mean that disable object zwrite is applied to an object which is NOT instanced.)
revenant chaos
Posted: 23rd Aug 2017 22:23
Hmm.. I hadn't noticed that issue, thanks for bringing it to my attention. Could you check if the problem exists when using BlitzTerrain as well? At the moment I don't have a clue what could be causing this problem, but in the meantime a possible workaround is to achieve the same effect using zbias:
+ Code Snippet
Kuper
Posted: 24th Aug 2017 10:41 Edited at: 24th Aug 2017 10:46
Your snippet works excellently as well!
So the bug becomes unnoticeable, IMHO.

Yes, disable object zwrite doesn't work with BlitzTerrain.
revenant chaos
Posted: 24th Aug 2017 11:47
Quote: "Your snippet works excellently as well! So the bug becomes unnoticeable, IMHO."

Awesome, I'm glad it works.

Quote: "Yes, disable object zwrite doesn't work with BlitzTerrain."

I am both relieved and troubled to hear the bug also exists with BlitzTerrain. Relieved because I think it means I am not doing something incorrectly in my render code, but troubled because that leads me to think the problem occurs when external renderers are set up with DBPro. What other undiscovered problems may crop up? On the other hand, BT has been around for quite some time and has been used by many, so hopefully this small issue is the exception. Thank you for testing, Kuper.
revenant chaos
Posted: 27th Aug 2017 04:31
Updated the first post with a new version.
Kuper
Posted: 29th Aug 2017 19:19
Is it possible to create two float4 custom elements with GeoInst?
I've tried something like this:

GeoInst_DeclareCustomElements pGInst,1
GeoInst_AddCustomElement pGInst, 4
GeoInst_DeclareCustomElements pGInst,2
GeoInst_AddCustomElement pGInst, 4
GeoInst_FinishCustomElements pGInst

but it doesn't work
revenant chaos
Posted: 29th Aug 2017 21:03
Try this:
+ Code Snippet
Kuper
Posted: 31st Aug 2017 18:17 Edited at: 31st Aug 2017 20:18
Thanks again, man!
PS
It looks like instanced objects don't work correctly with alpha blending.
In the example image: left is instanced, right is a standard DBPro plain.
Blue is the color of the camera backdrop, so the alpha channel on instanced objects blends with it rather than with the background
object. However, blending between the instanced objects themselves looks correct.

revenant chaos
Posted: 1st Sep 2017 02:07
Could you post a small example for me to experiment with?
Kuper
Posted: 1st Sep 2017 12:11

Attachments

Login to view attachments
revenant chaos
Posted: 2nd Sep 2017 04:14 Edited at: 2nd Sep 2017 04:18
I had a look around the DBPro source and it seems the news isn't good. Correctly handling certain render states (such as alpha blending) requires usage of DBPro's object manager, but the object manager wasn't written to be utilized by third-party renderers. External render functions are called once per sync, but it seems proper support would require DBPro to call external render functions once for each phase of its rendering process. Unless I am mistaken (and I hope someone can correct me), it looks like third-party renderers are limited to alphatest/clip based transparency. Ghosting also seems to be affected.
Kuper
Posted: 2nd Sep 2017 18:40
So the only solution is that all objects must be created with GeoInst to have proper blending...
Kuper
Posted: 19th Oct 2017 21:09 Edited at: 20th Oct 2017 01:31
I found that alpha blending does not work perfectly with GeoInstances either.
Something is wrong with the z-sorting, I guess.

I hope there is some way to solve this.
Maybe a way to set the z-sorting manually?
revenant chaos
Posted: 20th Oct 2017 21:37 Edited at: 20th Oct 2017 21:38
Hi Kuper,
Geometry instances are rendered as one gigantic object using a single draw call. Unfortunately this doesn't play well with alpha blending as polygons which are rendered first cause future polygons to fail their z-depth test.

+ Code Snippet

Instances are drawn sequentially by their index so you could manually sort them, but updating geoinstances involves manipulating a vertex buffer which can be costly to perform each loop and would still not solve all potential problems. Have you considered using shader-based batch rendering?
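
To make the ordering problem concrete, here is a rough single-pixel simulation in Python. It is a simplified model, not the plugin's or Direct3D's actual code: it assumes z-test and z-write are both enabled, standard "source over" blending, and grayscale colors; `render_pixel` is a hypothetical name.

```python
def render_pixel(fragments, background):
    """Draw fragments for one pixel in submission order.

    Each fragment is (depth, color, alpha); a smaller depth is nearer the camera.
    """
    color = background
    nearest = float("inf")
    for depth, frag_color, alpha in fragments:
        if depth >= nearest:
            continue  # fails the z-test because something nearer was already written
        color = alpha * frag_color + (1.0 - alpha) * color
        nearest = depth  # z-write
    return color


backdrop = 0.25
near_glass = (1.0, 1.0, 0.5)   # 50% transparent white polygon, close to the camera
far_wall = (5.0, 0.0, 1.0)     # opaque black polygon behind it

# Near polygon drawn first: the wall is rejected, so the glass blends with the backdrop.
print(render_pixel([near_glass, far_wall], backdrop))   # 0.625
# Back-to-front order: the glass blends with the wall, as intended.
print(render_pixel([far_wall, near_glass], backdrop))   # 0.5
```

The first result matches the symptom reported earlier in the thread, where transparent instances showed the backdrop color instead of the object behind them.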
Kuper
Posted: 20th Oct 2017 22:57
@revenant chaos
Quote: "Have you considered using shader-based batch rendering?"

I am just playing with the little demo which I uploaded a while ago, so I use the common shader (the one you provide for geo instances).
Quote: "so you could manually sort them"
- How can I do this? I am trying to finish my GUI system. I use your plugin for instances
because it is pretty fast, but I need to sort everything (windows, buttons, etc.) at least once.
revenant chaos
Posted: 22nd Oct 2017 02:41
You could use a UDT array to store each instance's properties along with a depth variable, then use MatrixUtils' Sort Array command to sort by depth in descending order. Once everything is sorted write each element's data into the instances (geoinst index = array element index+1).
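
The sorting step described above could look roughly like this in Python (shown instead of DBPro code so it runs on its own). The `Element` type and `back_to_front` helper are hypothetical; only the "sort by depth descending, instance index = array index + 1" rule comes from the post.

```python
from dataclasses import dataclass


@dataclass
class Element:
    name: str
    depth: float   # distance from the camera; larger means farther away


def back_to_front(elements):
    """Map GeoInst instance indices (1-based) to elements, farthest first."""
    ordered = sorted(elements, key=lambda e: e.depth, reverse=True)
    return {array_index + 1: element for array_index, element in enumerate(ordered)}


gui = [Element("button", 1.0), Element("window", 3.0), Element("panel", 2.0)]
for instance_index, element in back_to_front(gui).items():
    # In a real program this is where each instance's data would be rewritten.
    print(instance_index, element.name)
# 1 window
# 2 panel
# 3 button
```

Because instances are drawn sequentially by index, the farthest element ends up in instance 1 and is drawn first.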
Kuper
Posted: 22nd Oct 2017 21:19 Edited at: 22nd Oct 2017 21:22
Thanks, I'll try this,
but I found that z-sorting does not work between different GeoInst object groups (I mean the GeoInst groups themselves, not the instances within each).
GeoInst A and B here were created in sequence.
revenant chaos
Posted: 23rd Oct 2017 00:07 Edited at: 23rd Oct 2017 00:08
I'm not really sure what to do about that one. I could add functions to query and modify the order in which geoinstances are rendered, but I'm not sure how user friendly it would be, and it would only work in cases where all of GeoInstA's geometry can be guaranteed to be in front of all of GeoInstB's geometry or vice versa. Maybe this article could help you; the two-pass rendering technique it describes might be what you are looking for.
Kuper
Posted: 23rd Oct 2017 01:35
Thanks for the answer!
Quote: "I could add functions to query and modify the order in which geoinstances are rendered, but I'm not sure how user friendly it would be"

each instance's ZDepth = distance from the instance to the render camera
then GeoInst_SetZDepth pGInst, InstanceId, ZDepth
something like that, I guess
Z-sorting could slow down your super fast instance rendering, but if it is not updated every sync that can be avoided.
revenant chaos
Posted: 23rd Oct 2017 03:45 Edited at: 23rd Oct 2017 03:57
Quote: "each instance's ZDepth = distance from the instance to the render camera
then GeoInst_SetZDepth pGInst, InstanceId, ZDepth
something like that, I guess"
GeoInst groups are rendered as individual objects, so that would not solve any problems when it comes to sorting polygons across different GeoInst groups. When I said I could add functions to query and modify the order in which geoinstances are rendered, I meant the GeoInst groups themselves (not the individual instances within each group). Adding the ability to specify per-instance depth would substantially complicate the library, yet would not provide anything that cannot already be achieved by ordering indices by depth.

In your screen shots ObjectA and ObjectB are being rendered using two separate draw calls; I could allow user control over the order in which those draw calls occur. Properly rendering all instances back-to-front would require each individual instance to be rendered using a separate draw call, but that is exactly what geometry instancing was designed to avoid.

It seems your best option might be to use a single geoinst group combined with a texture atlas. You could pass UV offset and scale values as custom elements, then use them to control which parts of the atlas are mapped to each instance.
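
As a rough illustration of the atlas idea, the per-instance UV offset and scale for a regular grid atlas could be computed as below. This is a sketch under assumptions: the atlas is an evenly divided grid, and `atlas_cell` and `map_uv` are hypothetical names (the `map_uv` step is what a vertex shader would do with the custom element).

```python
def atlas_cell(column, row, columns, rows):
    """Return (offset, scale) selecting one cell of a columns x rows texture atlas."""
    scale = (1.0 / columns, 1.0 / rows)
    offset = (column * scale[0], row * scale[1])
    return offset, scale


def map_uv(uv, offset, scale):
    """Remap a mesh UV in the 0..1 range into the chosen atlas cell."""
    return (uv[0] * scale[0] + offset[0], uv[1] * scale[1] + offset[1])


# Instance data for the cell in column 1, row 2 of a 4x4 atlas.
offset, scale = atlas_cell(1, 2, 4, 4)
print(offset, scale)                      # (0.25, 0.5) (0.25, 0.25)
print(map_uv((0.0, 0.0), offset, scale))  # (0.25, 0.5)
print(map_uv((1.0, 1.0), offset, scale))  # (0.5, 0.75)
```

Offset and scale together are four floats, so they would fit in a single float4 custom element per instance.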
Chris Tate
Posted: 30th Oct 2017 19:24
Can't wait to try this plugin out, I could do with a post rendering camera which works with anti-aliasing; hopefully this plug-in will contain the solution.
Kuper
Posted: 30th Oct 2017 22:09
@Chris Tate
I think the only way to achieve true anti-aliasing is to use the STYX plugin.
revenant chaos
Posted: 31st Oct 2017 04:15
Hi Chris,
I'm afraid Kuper is correct. The EfxCam commandset doesn't add anything which isn't already possible using the native camera commands, it merely provides a faster way to render post-filter screen quads. DBPro's native cameras slow down as the object count increases, regardless of how many objects actually render to those cameras. The advantage of using EfxCams is that they are rendered without iterating through dbpro's object list, in exchange for being limited to rendering a single object per camera. I'm not sure how/if anti-aliasing could be implemented, but EfxCams should be useful for speeding up existing shader based solutions.
Chris Tate
Posted: 31st Oct 2017 13:47 Edited at: 31st Oct 2017 13:48
Quote: "@Chris Tate
I think the only way to achieve true anti-aliasing is to use the STYX plugin."


In my game I have been using the multi-sampling parameter of the display mode. It's good enough for my needs; the only problem is that it appears to disregard multisampling when you render to an image, which is a pain. For this reason I have not paid much attention to post rendering: given the basic cartoon look I am going for, I would rather have anti-aliasing than post rendering.

I have a feeling it does not work on all GPUs because I am sure it didn't work on my old system. I might consider using Styx as an alternative option for certain machines.

Here is the code from my project; where the Antialias variable was set to the median of the amount available on the GPU. My edges are nice and smooth, by no means any pun intended.

+ Code Snippet
revenant chaos
Posted: 2nd Nov 2017 19:07 Edited at: 2nd Nov 2017 19:09
Uploaded a new version, see first post for details.
Chris Tate
Posted: 30th Nov 2017 05:51 Edited at: 30th Nov 2017 05:51
I commenced with the implementation of the plugin in my game, starting with GeoInstancing. Looks good thus far.

A performance test yielded some interesting results on my hardware; it turns out the procedure I am already using works the best on my machine when the vertex count per object is high, which involves the use of the standard instance/clone object features.

Using the GeoInstance functions provides a considerable performance boost when the objects are not that detailed; but with vertex-dense objects there was no benefit to using the GeoInstance functions, and it ran much slower. I am not sure if that is because of the diffuse shader applied to the GeoInstances.

However, standard objects have an impact on performance whether they are excluded or not. That is not the case with GeoInstances: excluded GeoInstances have no performance impact, similar to when regular instances are excluded, but much better.

Changing the detail of the spheres in the snippet below showcases the difference. Setting the mode to 2 in the code snippet below tests the GeoInstances functionality. 0 for standard objects.

+ Code Snippet

I also noticed that there was no difference in performance in the PostFilters demos whether using standard cameras or EfxCams on my machine.
revenant chaos
Posted: 30th Nov 2017 15:45 Edited at: 30th Nov 2017 16:07
Hi Chris,
The geoinst shader is executing many more instructions on a much larger dataset than the FFP modes. Try using the geoinst diffuse shader (your example uses the normal mapping shader), and apply a simple diffuse shader to the DBPro objects for a more accurate comparison. Due to geometry instancing's increased data requirements, instance polycounts can have a greater impact than with traditional methods.

Geometry instancing works by creating a second vertex buffer with 1 vertex per instance, where each vertex stores a world matrix and/or custom data. At render time, this additional vertex buffer is bound to the D3D device (as a second data stream) along with the object's vertex and index buffers. As the object is being drawn, the gpu interleaves the instance stream with the object's vertex data. The object's data is iterated once for each instance vertex, where each object vertex is appended with the current instance vertex's data. Similar to static batching, this allows multiple copies of a mesh to be rendered within a single draw call, but still provides a semi-efficient method of transforming instances when needed.
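
A CPU-side emulation may help picture the two streams. The following Python sketch is only a model of the behaviour described above, not real Direct3D code; per-instance offsets stand in for the 4x4 world matrices to keep it short, and `draw_instanced` and `vertex_shader` are hypothetical names.

```python
def draw_instanced(vertices, indices, instances):
    """Emulate one instanced draw call.

    The indexed mesh is walked once per instance, and every fetched vertex
    is paired with that instance's data before reaching the vertex shader.
    """
    stream = []
    for instance_data in instances:      # stream 1: one entry per instance
        for index in indices:            # stream 0: the mesh, via its index buffer
            stream.append((vertices[index], instance_data))
    return stream


def vertex_shader(position, offset):
    """Stand-in for the shader: apply the per-instance transform to one vertex."""
    return tuple(p + o for p, o in zip(position, offset))


triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
indices = [0, 1, 2]
offsets = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]

transformed = [vertex_shader(v, inst) for v, inst in draw_instanced(triangle, indices, offsets)]
print(len(transformed))   # 6
print(transformed[3])     # (5.0, 0.0, 0.0)
```

Two instances of a three-index mesh yield six shader invocations from what would be a single draw call on the GPU.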

This decreases stalls in the graphics pipeline, but increases the amount of data which is processed per-vertex. Each vertex is interpreted as having its own 4x4 world matrix; that is equivalent to a standard object having 8 float2 texcoords, and that is without considering the object's actual texcoord data. When using many high-poly instances the amount of data processed by the GPU becomes tremendous. This demand grows even higher when combined with normal mapping; DBPro adds tangent and binormal vectors to each vertex, resulting in each vertex containing two matrices (a 4x4 and a 3x3) in addition to position, color and texcoord data.

The plugin's world matrix system builds 4x4 transformation matrices for each instance. In cases where full transformation matrices are not required, custom data can be used to reduce the amount of data which is processed for each vertex. For example; position, single-axis rotation, and uniform scaling could be represented using 2x float3 custom elements. This would reduce the amount of per-vertex transformation data from 64bytes to 24bytes, and should allow you to push vertex counts a bit higher.
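
The byte counts above follow from 4 bytes per float, and the compact form can still be expanded into a transform per vertex. The Python sketch below assumes one possible layout for the two float3 elements (position, then Y-axis angle, uniform scale and an unused slot); that layout and the `transform_compact` name are illustrative assumptions, not the plugin's format.

```python
import math

FLOAT_BYTES = 4
full_matrix_bytes = 4 * 4 * FLOAT_BYTES   # 4x4 world matrix
compact_bytes = 2 * 3 * FLOAT_BYTES       # two float3 custom elements
print(full_matrix_bytes, compact_bytes)   # 64 24


def transform_compact(vertex, position, rot_scale):
    """Scale, rotate about the Y axis, then translate one vertex.

    position  = (x, y, z)
    rot_scale = (angle_y_radians, uniform_scale, unused)  # assumed layout
    """
    angle_y, scale, _unused = rot_scale
    x, y, z = vertex
    c, s = math.cos(angle_y), math.sin(angle_y)
    rx = (x * c + z * s) * scale
    rz = (z * c - x * s) * scale
    return (rx + position[0], y * scale + position[1], rz + position[2])


print(transform_compact((1.0, 2.0, 3.0), (10.0, 0.0, 0.0), (0.0, 2.0, 0.0)))
# (12.0, 4.0, 6.0)
```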

GeoInst Notes:
- Geometry instances are not culled on the cpu. All (non-excluded) instances are sent through the graphics pipeline, leaving the GPU to clip off-screen geometry. Using large numbers of high-poly instances will greatly increase demands placed on the gpu.
- D3D9's geometry instancing requires indexed vertex data. If GeoInst_SetMesh is called on a limb which contains non-indexed vertexdata, the plugin will create a redundant index buffer (1 index per vertex, vertices are listed in sequential order) to use when rendering. It is highly recommended to use welded geometry in conjunction with geometry instancing.
- When it comes to performance, geometry instancing and LOD appear to be opposite sides of the same coin. LOD is beneficial in GPU-bound scenarios, while geometry instancing is beneficial when CPU bound. Both techniques tend to reduce performance in the opposite cases, and may not have predictable effects on performance when both or neither of the CPU and GPU are being stressed.
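
The index buffer note can be illustrated with a short Python sketch. It shows the two cases in general terms: a redundant sequential index buffer for unwelded data, and welding that merges duplicate vertices. The helper names are hypothetical and this is not the plugin's implementation.

```python
def sequential_indices(vertices):
    """Redundant index buffer for non-indexed data: one index per vertex, in order."""
    return list(range(len(vertices)))


def weld(vertices):
    """Merge identical vertices and return (unique_vertices, indices)."""
    unique = []
    lookup = {}
    indices = []
    for vertex in vertices:
        if vertex not in lookup:
            lookup[vertex] = len(unique)
            unique.append(vertex)
        indices.append(lookup[vertex])
    return unique, indices


# A quad stored as two unwelded triangles (6 vertices, 2 of them duplicated).
quad = [(0, 0, 0), (1, 0, 0), (1, 1, 0),
        (0, 0, 0), (1, 1, 0), (0, 1, 0)]

print(sequential_indices(quad))    # [0, 1, 2, 3, 4, 5]
unique, indices = weld(quad)
print(len(unique), indices)        # 4 [0, 1, 2, 0, 2, 3]
```

With welded data the vertex stream shrinks from 6 to 4 entries here, which matters more as every mesh vertex is processed once per instance.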



In the past I had problems with the EfxCam example where the IDE would compile "Post Filters_jGfx.exe", but would then execute "Post Filters.exe" instead. Could you compare the exes and try running them directly?
Chris Tate
Posted: 30th Nov 2017 19:26
Quote: "The geoinst shader is executing many more instructions on a much larger dataset than the FFP modes. Try using the geoinst diffuse shader"


Thanks. Using the shader on regular objects indeed caused a slow down. The GeoInstances performed better with low and high poly counts.

Quote: "The object's data is iterated once for each instance vertex, where each object vertex is appended with the current instance vertex's data."

Quote: "When using many high-poly instances the amount of data processed by the GPU becomes tremendous"


Understood. I have created an option system in my level parse to use different instancing methods per object class; so I am attempting to have all the floors, walls, rocks, windows and slopes use GeoInstancing. For the medium detailed objects, I will decide on a per object basis; using a complex shader would be best applied to regular cloned/instanced objects.

I am considering using GeoInstancing for the grass and trees, using one mesh for branches, and another mesh for the leaf and grass planes and meshes. I noticed there is an issue with alpha blending. I will try my luck with multithreaded sorting, using a dll to do the distance checks, and DBP to update progressively.

Quote: "In cases where full transformation matrices are not required, custom data can be used to reduce the amount of data which is processed for each vertex. For example; position, single-axis rotation, and uniform scaling could be represented using 2x float3 custom elements."


Sounds good. So the system applies the transform to each vertex, finalized by the shader extract below:

matrix World = float4x4(IN.WorldRow1, IN.WorldRow2, IN.WorldRow3, IN.WorldRow4);
OUT.OPos=mul(IN.Pos,mul(World,ViewProj));
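
For anyone less familiar with HLSL: `float4x4(r1, r2, r3, r4)` builds a matrix from four rows, and `mul(vector, matrix)` treats the vector as a row vector. The Python sketch below mirrors the extract with small made-up matrices (the values are arbitrary, chosen only so the arithmetic is easy to follow) and checks that multiplying by the combined World*ViewProj matrix gives the same result as transforming in two steps.

```python
def mat_mul(a, b):
    """4x4 matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def vec_mul(v, m):
    """Row vector times matrix, like HLSL mul(vector, matrix)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


# World matrix assembled from four rows; the translation sits in the fourth row.
world = [[1, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [5, 0, 0, 1]]
# Arbitrary stand-in for a combined view/projection matrix.
view_proj = [[2, 0, 0, 0],
             [0, 2, 0, 0],
             [0, 0, 1, 0],
             [0, 0, -1, 1]]
pos = [1, 2, 3, 1]

combined = vec_mul(pos, mat_mul(world, view_proj))       # as written in the extract
two_step = vec_mul(vec_mul(pos, world), view_proj)       # world first, then view/projection
print(combined)   # [12, 4, 2, 1]
print(two_step)   # [12, 4, 2, 1]
```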



Change of subject, and I do realise that this may not be the intended use, but what about static vertex position data? The performance would be further improved if there is no need for scale/rotation transformation whatsoever; is that correct? If so, that might be a better option for static objects like floors and walls: there is no need to keep transforming them if they can't move, and some form of baking their transforms into the data would suffice. Glass windows could also be static; then I could exclude them when broken.

This opens up my mind now; most objects in the game do not need dynamic scale, rotation or positions; just a static location. In some cases with a little animation. Most of the objects in the game level have no need to be transformed once set in place.

I do not wish to put much emphasis on optimization prematurely, but it does not seem like much work on my part to simply replace one instancing method with another. I will look into the details of this aspect of the tool as an option for improving performance further down the road. I will have to focus on the function and aesthetics for now.

Thanks for the timely advice. I did try both Post Filter executables, both yielding the same results, FPS is exactly the same: 150.

A basic GPU: NVidia GeForce GT 740, 4gb video ram (2gb shared)
revenant chaos
Posted: 30th Nov 2017 22:46 Edited at: 30th Nov 2017 22:48
Quote: "I noticed there is an issue with alpha blending."

Yep, alpha blending isn't supported within external renderers. Even if manual sorting is used, alpha-blended geoinstances will never play nicely with other alpha blended objects. You can still use clip-based transparency within the pixel shader for 1-bit transparency.

Quote: "what about static vertex position data? The performance would be further improved if there is no need for scale/rotation transformation whatsoever; is that correct?"

That technique is often referred to as static batch rendering, and can possibly render large amounts of geometry more efficiently than geoinstancing. In exchange for higher memory usage, static batching allows each instance's geometry to be completely unique; for example, grass instance vertices can be manipulated to match terrain height. Similar to geometry instancing, static batching often leads to processing lots of off-screen geometry. In my experience the performance gains from pre-transforming vertices into world space are negligible, but I still see it as a good idea because it makes for slightly simpler vertex shaders.

In DBPro the easiest way to assemble a static batch is to construct a multi-limb object containing your static instances, then re-create the object from mesh. This will unweld and combine all limbs into a single gigantic vertex buffer which can exceed DBPro's typical limit of 65535 vertices. Keep in mind that some of DBP's commands aren't fond of objects which exceed 65535 vertices; for example shaders must be loaded with the extra DoNotGenerateExtraData flag set to true (normal mapping doesn't work without extra work-arounds). DBPro's vertexdata commands only work up to vertex 65535, extra work is required to access vertices higher than that. These objects still work with dbpro's collision commands, and also have full support for dbpro's render states.
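
Conceptually, a static batch is just every copy of the mesh pre-transformed into world space and appended to one vertex list. The Python sketch below shows that idea with translation-only placements; it does not use DBPro's limb or mesh commands, and `build_static_batch` and the sample data are hypothetical.

```python
DBPRO_VERTEX_LIMIT = 65535   # typical per-object limit mentioned in the post


def build_static_batch(mesh, placements):
    """Bake each placement into a copy of the mesh and append it to one vertex list."""
    batch = []
    for px, py, pz in placements:
        for x, y, z in mesh:
            batch.append((x + px, y + py, z + pz))   # pre-transformed to world space
    return batch


# 36 unwelded vertices per copy (dummy positions), placed 2000 times along X.
mesh = [(0.0, 0.0, 0.0)] * 36
placements = [(float(i), 0.0, 0.0) for i in range(2000)]

batch = build_static_batch(mesh, placements)
print(len(batch))                        # 72000
print(len(batch) > DBPRO_VERTEX_LIMIT)   # True
```

A batch like this one exceeds 65535 vertices, which is the situation where the caveats above about shaders and the vertexdata commands apply.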

Quote: "I did try both Post Filter executables, both yielding the same results, FPS is exactly the same: 150."

It sounds like your computer is GPU bound with that particular application, so the fps isn't affected by the decreased CPU usage. You may be able to see the difference by enabling v-sync and watching the CPU usage while running each application. Another option may be to simply decrease the screen resolution.
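
A very simplified model shows why the frame rate would not move in that case. It assumes the CPU and GPU work fully in parallel, so the slower of the two sets the frame time; the millisecond figures are made up for illustration, apart from the 150 fps reported above.

```python
def fps(cpu_ms, gpu_ms):
    """Frames per second when the slower processor determines the frame time."""
    return 1000.0 / max(cpu_ms, gpu_ms)


gpu_ms = 1000.0 / 150.0   # GPU time per frame matching the reported 150 fps

# GPU bound: halving the (hypothetical) CPU cost changes nothing.
print(round(fps(4.0, gpu_ms)), round(fps(2.0, gpu_ms)))   # 150 150

# CPU bound (for example at a much lower resolution): the same saving doubles the fps.
print(round(fps(4.0, 1.0)), round(fps(2.0, 1.0)))         # 250 500
```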
Chris Tate
Posted: 1st Dec 2017 00:34
Quote: "Yep, alpha blending isn't supported within external renderers. Even if manual sorting is used, alpha-blended geoinstances will never play nicely with other alpha blended objects. You can still use clip-based transparency within the pixel shader for 1-bit transparency."


Understood

Quote: "That technique is often referred to as static batch rendering, and can possibly render large amounts of geometry more efficiently than geoinstancing.

In DBPro the easiest way to assemble a static batch is to construct a multi-limb object containing your static instances, then re-create the object from mesh. "


Interesting. My levels have lots of walls and floors as separate objects; it is impossible for me to test collision against them all at once, and maintain solid performance. I resorted to grouping the objects together into collision objects based on their distance from each other, using a single mesh object.

So it seems I should do something similar with nearby identical static objects for use in rendering. The snippet provided demonstrates how much better it is to have all the spheres in a single mesh, although the benefit is lost if I attempt to render off-screen parts of the mesh, as you indicated.

I'll take advantage of the frustum culling by grouping, and use a single mesh to improve iterated procedures. The Sparky's collision plugin already has a complex option to group vertices together, but there are also other functions which need to check objects in proximity. I am using PhysX in combination with Sparky's collision; I just use Sparky's for raycasting because it provides more useful information than is available in DBP's PhysX plugins.

Quote: "DBPro's typical limit of 65535 vertices. Keep in mind that some of DBP's commands aren't fond of objects which exceed 65535 vertices; for example shaders must be loaded with the extra DoNotGenerateExtraData flag set to true (normal mapping doesn't work without extra work-arounds). DBPro's vertexdata commands only work up to vertex 65535, extra work is required to access vertices higher than that. These objects still work with dbpro's collision commands, and also have full support for dbpro's render states."


That's good to know. I am going to have to reuse lots of prefabs to save memory and keep the vertex count as low as possible.

Quote: "It sounds like your computer is gpu bound with that particular application, so the fps isn't affected by the decreased cpu usage. You may be able to see the difference by enabling v-sync and watching the cpu usage while running each application. Another option may be to simply decrease the screen resolution."


I tried that. Although both apps perform at the same FPS, the jGfx version's CPU usage was 14% and the normal one's was 25%, probably on the same CPU core. That was only with VSync on; otherwise their usage and performance were identical.
revenant chaos
Valued Member
10
Years of Service
User Offline
Joined: 21st Mar 2007
Location: Robbinsdale, MN
Posted: 1st Dec 2017 02:19
Quote: "So it seems I should do something similar with nearby identical static objects for use in rendering."

With static batching there is no need to limit grouping to identical objects; nearby objects which share the same material can be combined into the same batch.
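One way to decide which objects end up in the same batch is to bucket them by material and by a coarse grid cell, so each batch stays spatially compact enough to be culled as a unit. The sketch below is in Python rather than DBPro and is only an assumption about how such grouping could be done; the object tuples, the `cell_size` parameter and the function name are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical sketch: choose batch membership by material and grid cell.
# Objects are (name, material, (x, y, z)); none of this is jGfx or DBPro API.


def group_into_batches(objects, cell_size):
    """Return {(material, cell_x, cell_z): [object names]}."""
    batches = defaultdict(list)
    for name, material, (x, _y, z) in objects:
        # Objects share a batch only if they share a material and a grid cell.
        key = (material, math.floor(x / cell_size), math.floor(z / cell_size))
        batches[key].append(name)
    return dict(batches)


level_objects = [
    ("rock_a", "stone", (5.0, 0.0, 5.0)),
    ("floor_a", "stone", (20.0, 0.0, 30.0)),
    ("rock_b", "stone", (150.0, 0.0, 5.0)),
    ("crate_a", "wood", (6.0, 0.0, 6.0)),
    ("rock_c", "stone", (-5.0, 0.0, 5.0)),
]
batches = group_into_batches(level_objects, cell_size=100.0)
```

A larger `cell_size` gives fewer, bigger batches (fewer draw calls, more off-screen geometry processed); a smaller one gives the opposite trade-off.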

Quote: "I am using PhysX in combination with sparkys collision"

The official release version of Sparky's doesn't work with objects containing over 65535 vertices, but Rudolpho's recompiled version he distributes with DBP9Ex does (and also works with DBP7.7).
Chris Tate
DBPro Master
9
Years of Service
Recently Online
Joined: 29th Aug 2008
Location: London, England
Posted: 1st Dec 2017 04:39
Quote: "With static batching there is no need to limit grouping to identical objects, nearby objects which share the same material can be combined into the same batch."


It was my intention to do just so; having a batch for different types of rocks and floors that share the same material.

However, reading your explanation got me thinking about texture maps. I am now thinking that it might be better for me to use atlas textures (proper texture maps) for the static geometry, so that static buildings could have their static geometry share one texture, and thus be rendered in a single batch.

The 'lazy' method I've used thus far is to have one material per image, which is paradise when working in an image editor, since it keeps the texturing work easy to accomplish on my own. It might be worth getting into proper texture maps, though: they require more work, but with a clamped/wrapped UV coordinate system, all materials for a building could be contained in one image.

I can use vertex data to store attributes about the material. A single image with a multi-functional shader might take up more processing on its own, but in a world of many entities, I have a feeling it might lead to desirable results.


I'm not too fussed about pixel resolution; most GPUs can manage large textures, and there are lots of UV mapping and blending techniques I can use to sharpen the result.

I have these wrap value functions from millennia ago that I wrote for wrapping the UV coordinates in a range different from the usual 0.0 to 1.0, good for selecting different bricks on a brick pattern randomly.

Not very pretty, needs to be revised, but should work. Would have been nice to have an intrinsic wrap function in HLSL.

+ Code Snippet
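For readers without the snippet to hand, the idea of wrapping a coordinate into an arbitrary range can be sketched as follows. This is written in Python rather than HLSL and is not the code from the post; `wrap_value` and `tile_uv` are hypothetical names, and the atlas layout (a regular grid of equally sized tiles) is an assumption. The same arithmetic could be expressed in a shader with the `frac` intrinsic.

```python
# Hypothetical sketch of range wrapping for UV coordinates; not the original snippet.


def wrap_value(value, low, high):
    """Wrap value into the half-open range [low, high)."""
    span = high - low
    if span <= 0:
        raise ValueError("high must be greater than low")
    # Python's % is non-negative for a positive divisor, so negatives wrap too.
    return low + (value - low) % span


def tile_uv(u, v, column, row, columns, rows):
    """Map a tiling (u, v) into one tile of an assumed regular atlas grid."""
    local_u = wrap_value(u, 0.0, 1.0)
    local_v = wrap_value(v, 0.0, 1.0)
    return ((column + local_u) / columns, (row + local_v) / rows)


wrapped = wrap_value(1.25, 0.0, 1.0)
brick_uv = tile_uv(1.5, 0.25, column=1, row=0, columns=4, rows=2)
```

Picking `column` and `row` per face (for example from a value stored in the vertex data) would select a different brick or material region of the same image.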


Crucial points regarding the collision function.

I do not think I'll need to worry about the vertex limit because I am taking the semi 'Minecraft approach' of building with blocks; not quite voxel, but very modular. As the players move from one location to another, these building blocks get instanced/cloned or loaded from disk on demand. Any large object files will need to be streamed or preloaded before start of play. Not all that smooth, but sufficient.

I am also aware that the old collision plugin does not work with object IDs above 65535. I have chosen to work within the limit to avoid having to change too much old code, which is pointless at this phase.
Chris Tate
DBPro Master
9
Years of Service
Recently Online
Joined: 29th Aug 2008
Location: London, England
Posted: 10th Dec 2017 01:48
While learning how to use your system, I was thinking about how to go about rendering vehicles and aircraft. I was thinking that your batch rendering example would be a good way to proceed for grouping all moving, destructible parts into a single unit, then using instancing and custom properties to render all instances of the same vehicle class with different textures.

Does that sound like a good idea? Or do you have a better idea? So long as my object count is minimal, DBP's under-optimized internal object management system can be kept out of the way. Any alpha blended elements like windows would still need to be dealt with separately.
revenant chaos
Valued Member
10
Years of Service
User Offline
Joined: 21st Mar 2007
Location: Robbinsdale, MN
Posted: 10th Dec 2017 09:01
That sounds like a wonderful idea. Using shader arrays to store each piece's transforms would solve the limitation of one mesh (limb) per GeoInst, but it might be a good idea to render vehicle tires using a separate geoinst to avoid needing to pass their rotations through custom data.
