And this whole topic is why NON-developers should not discuss APIs.
DirectX10 is one hell of a lot quicker than DirectX9. What people seem to forget when developing between the two is that it isn't a 1:1 relationship, either performance-wise or development-wise.
What most Dx9-Gen developers seem to do is just convert a Dx9 pipeline to the Dx10 calls, and pile on new effects and more polygons because of a quicker pipeline.
The end result is you get a better looking version of the Dx9 engine but at quite a damn sight slower performance!
I mean, what is forgotten is that a card like, say, the 8800 handles 24 million polygons... no matter what API you're using, that is its technical maximum. You can't go beyond that no matter how few draw calls are made.
What's more, Dx10-gen cards (particularly NVIDIA's) seem to pile on more video RAM. That has really confused me, given that DirectX10 is orientated more towards streaming memory. So developers are still using the much slower separate memory management approach of loading everything first.
Yes, it can handle more shaders... but if you want to see a performance increase you can't just add more. Yet this is what developers are doing.
Hell, look at FPSCreator X9 and X10. Lee has shown the performance difference between the two concerning shaders, but performance of the engine overall is much lower because he's doing a crapload more.
I mean, in FPSCreator X9 with the 8800GTX you get around 1,500fps with shaders activated. With X10 you get around 500fps. So from that standpoint you can say Dx10 is slower!
But what you're forgetting is that the engine now has water, bloom, full dynamic lighting, soft particles, soft shadow-mapping, etc., plus models with 5x more polygons!
So yeah, you're dropping to around 33% of the speed, but you've gained so much graphical fluff. I mean, for god's sake, there is no longer a need for lightmapping at all... the lighting is all real-time via shaders.
That is a HUGE change in not only how the engine works, but also the work it needs to do. While FPSCreator X10 isn't the best example of what Dx10 itself is actually capable of, the fact is that compared to its X9 engine it shows an increase in visual quality with a performance drop that will still have it running at a reasonable speed on the lowest-performance cards that can run it.
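To put the FPSCreator figures above in perspective, it helps to look at time per frame rather than raw fps. This is only a rough sketch using the approximate numbers quoted in this post (around 1,500fps and around 500fps on an 8800GTX); the 60fps target and the function name are my own assumptions for illustration, not anything from the engine.

```python
def frame_time_ms(fps):
    """Milliseconds spent on one frame at a given frame rate."""
    return 1000.0 / fps

# Approximate figures quoted in the post (8800GTX, shaders on).
x9_fps = 1500.0
x10_fps = 500.0

x9_ms = frame_time_ms(x9_fps)
x10_ms = frame_time_ms(x10_fps)

# Extra time per frame spent on the added X10 work
# (water, bloom, dynamic lighting, heavier models, ...).
extra_ms = x10_ms - x9_ms

# Relative speed of X10 compared to X9.
speed_ratio = x10_fps / x9_fps

# Assumed target: a 60fps game has this much time per frame to spend.
budget_60_ms = frame_time_ms(60.0)

print(f"X9 frame time:   {x9_ms:.2f} ms")
print(f"X10 frame time:  {x10_ms:.2f} ms")
print(f"Extra per frame: {extra_ms:.2f} ms")
print(f"X10 runs at {speed_ratio:.0%} of X9's frame rate")
print(f"60fps budget:    {budget_60_ms:.2f} ms per frame")
```

Seen that way, the "drop to a third" is a bit over a millisecond of extra work per frame, which is small next to what a 60fps target allows.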
For years it has pissed me off how Dx9.0c developers seem to refuse to use the streaming memory system Microsoft added, as it greatly increases performance. You wouldn't believe how much performance is lost by relying on the processor for transfers. It also means low-end systems always end up losing out even if their hardware can physically handle what is going on!
At the end of the day, PC (Windows) developers are just plain lazy. While APIs like Dx10 are being designed to help, it's not up to the API or graphics card to fix crappy engines.