Quote: "There isn't any, particularly (apart from speed differences & that can go for chips from the same manufacturer, as well) to the 'end user' (i.e. People who haven't got a clue how it does it, just that it does do it!)."
Actually, yes... there is a BIG difference.
If you were to test how many FLOPS (FLoating-point OPerations per Second) two processors can manage and they achieved the same score on Whetstone, that doesn't mean you can run the same code for rotating a cube in DirectX on both and expect the exact same performance, even with the exact same GPU.
There are some very fundamental differences between processors in how they do things, which make a very large difference in what they can achieve in a real-time situation.
Quote: "PS3 = not a PC!
I use a PC & want to know which version of PC physics wil be the choice between 2008-2011."
No, the Playstation 3 isn't a PC (although technically, with the software provided, Sony will probably get it classed as one in Europe) .. but given that the Playstation 3 has an Ageia PPU built in to the system, games WILL support the PhysX API in order to take full advantage of it and allow graphics and AI to be processed independently of the physics.
When games are then ported between systems, PhysX will be used again so that the engine doesn't have to be recoded for each platform. On the x86-compatible and PPC-compatible platforms such as Windows, Mac OS X and AmigaOS 4, you can expect to see greater performance enhancements from having a PPU installed.
Now take into account that while HavokFX can use Shader Model 3, which all the new consoles use, it only pays off if developers are willing to seriously sacrifice graphics in order to have physics .. and honestly, do you believe many developers will do that?
The only platforms HavokFX makes sense on are SLI or CrossFire systems with 1 or more GPUs available. Costs aside, only a minority of gamers have such systems... and more to the point, even fewer have Quantum Effects, which can be programmed without ANY physics API; however, these are done using a new technology which will no doubt be different to ATI's.. so they will not be used in a mainstream fashion for a good while yet.
So basically, if the Playstation 3 becomes even as popular as the Xbox 360 is currently, PhysX will become one of the most used physics APIs.. although yes, Xbox 360 and Windows games using DirectX will possibly use the physics API Microsoft are rumoured to be working on, this is only likely with titles that are exclusive to the DirectX-capable platforms.
Quote: "Multicore vs. Cell - point taken, as is RISC vs. an Intel (or AMD) core. I wasn't suggesting a core was as capable of as many FLOPS as a dedicated GPU/PPU RISC chip, speed for speed. Just that it'll be another option for the end user."
There's nothing to say it can't achieve the same FLOPS figure.
For example, here are two made-up chips.
R-x86 CISC 10MHz / 8 VMX Registers / 4-Cycle Per Clock (CPU)
R-PZX RISC 7.5MHz / 32 VMX Registers / 1-Cycle Per Clock (PPU)
(Both will have 2D Floating Point Vector Units to make this easier to explain through example)
Note: Every Op takes 4-Cycles.
CPU @ 10MHz = 40 Ops per Register per Loop = 240 Ops/Second
PPU @ 7.5MHz = 7.5 Ops Per Register per Loop = 240 Ops/Second
So we've established that they have the exact same performance.
(before anyone quotes this all and picks faults, remember this is overly simplified to make a point)
Now both have Floating-Point Units with the 4 basic operations: Add, Multiply, Subtract and Divide.
However, because the PPU is specialised for physics operations it also has Dot Product, Cross Product, Square Root, and Normalise Operations as well. This is because it's specialised, and not designed for general purpose tasks.
So while they can both do 240 operations per second, for the CPU to do what the PPU does.. say Dot Product for example.. it has to be built from the existing operations:
REG[0] = REG[1][1] * REG[2][1] + REG[1][2] * REG[2][2]
So while the PPU only takes 1 Loop to achieve this, the CPU takes 3 Loops (two Multiplies and one Add). Now let's assume that your compiler automatically transfers FP operations entirely to the PPU when present. This code:
VAR1 AS VECTOR = {4.5, 7.5}
VAR2 AS VECTOR = {5.5, 2.5}
VAR3 AS VECTOR
VAR3 = Multiply(VAR1, VAR2)
VAR1 = Dot Product(VAR3, VAR2)
VAR1 = Add(VAR1, VAR1)
VAR3 = Dot Product(VAR1, VAR2)
Would take 4 of the 240 Ops on the PPU, while on the CPU it would take 8 of the 240, because each Dot Product costs the CPU 3 Ops instead of 1. Both are well within the operation limit they have per second, so you wouldn't see any performance difference here; but you'll notice it takes the CPU twice as many resources to run the same code. This means it has less headroom, so it can't get as much real work done as the PPU can, despite technically having the same FLOPS performance.
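For anyone who wants to play with the numbers, here is the same counting as a rough Python sketch. The cost tables and names (`PPU_COST`, `CPU_COST`, `ops_used` and so on) are made up for this illustration, just like the two chips; it only tallies ops under the simplifications above and says nothing about real hardware.

```python
# Hypothetical cost tables for the two made-up chips: ops needed per instruction.
BASIC_OPS = {"Add": 1, "Subtract": 1, "Multiply": 1, "Divide": 1}

# Assumption from the example: the PPU has Dot Product as one native operation.
PPU_COST = {**BASIC_OPS, "Dot Product": 1}

# Assumption from the example: the CPU builds a 2D Dot Product from
# two Multiplies and one Add, so it costs 3 ops.
CPU_COST = {**BASIC_OPS, "Dot Product": 3}

OPS_PER_SECOND = 240  # the budget both chips share in the example

# The four-instruction program from the example.
PROGRAM = ["Multiply", "Dot Product", "Add", "Dot Product"]


def dot2_from_basic_ops(a, b):
    """2D dot product using only Multiply and Add; returns (result, ops spent)."""
    x = a[0] * b[0]  # op 1: Multiply
    y = a[1] * b[1]  # op 2: Multiply
    return x + y, 3  # op 3: Add


def ops_used(program, cost_table):
    """Total ops a chip spends running the program once."""
    return sum(cost_table[instruction] for instruction in program)


def runs_per_second(program, cost_table, budget=OPS_PER_SECOND):
    """How many times the program fits into one second's op budget."""
    return budget // ops_used(program, cost_table)


if __name__ == "__main__":
    for name, table in (("PPU", PPU_COST), ("CPU", CPU_COST)):
        used = ops_used(PROGRAM, table)
        print(f"{name}: {used} of {OPS_PER_SECOND} ops, "
              f"{runs_per_second(PROGRAM, table)} runs per second")
```

Under these assumptions the same program fits into the PPU's budget twice as many times per second as it fits into the CPU's, even though both chips are rated at 240 Ops/Second.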
Obviously this subject is actually far more complex than my example above, but in essence it is the same principle behind the performance differences, and it is why specialised hardware will ALWAYS perform better than generalised hardware in a given area.
I'm not going to say if I honestly believe the PPU is worth it, nor will I say that 'yes, this is how the industry will go' ... because no one truly knows what technology will be used in years to come or to what degree.
What I will say is that PhysX and the PPU are not just fads that will come and go. Thanks to the Playstation 3, the technology is only likely to grow; but only if the Playstation 3 itself performs well, and provided that Sony don't force too many games to be exclusive to their console.
Something else to remember is that the graphics processor will always focus on graphics first and other FP operations second, just as the central processor will always focus on providing a broader processing ability and not specialising in anything.
It's the culmination of how all of these technologies work together that will honestly make a difference. Remember that PhysX doesn't just offload extra work to the PPU, but runs ALL of its floating-point operations on it, freeing up the CPU entirely.
The same goes for GPUs and shaders; however, this doesn't mean that they will accelerate ANY FP ops that are done outside of these APIs.
Hence why it's so important to actually have chips that complement each other rather than bottlenecking. No matter how powerful the RAM, GPU, CPU or PPU gets, unless they can all perform together, one of them will always stop the others from performing to the best of their abilities.
Have you ever noticed that when you've upgraded your processor, your graphics card seems to improve in performance as well? This is because at lower resolutions your processor is actually forcing your graphics card to slow down to its level; this is also why you experience a jittery-frame effect in a number of games.
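A rough way to picture that bottleneck, as a Python sketch: it assumes the CPU and GPU work on a frame in parallel and the slower of the two sets the pace, which is a big simplification. The function name and all of the millisecond figures are made up purely for illustration.

```python
def frames_per_second(cpu_ms, gpu_ms):
    """Frame rate if CPU and GPU overlap and the slower one sets the pace."""
    return 1000.0 / max(cpu_ms, gpu_ms)


# Hypothetical per-frame costs in milliseconds (made-up numbers).
OLD_CPU_MS = 20.0
NEW_CPU_MS = 10.0
GPU_LOW_RES_MS = 8.0    # the GPU has little to do at low resolution
GPU_HIGH_RES_MS = 25.0  # the GPU is the slow part at high resolution

if __name__ == "__main__":
    print("low res, old CPU: ", frames_per_second(OLD_CPU_MS, GPU_LOW_RES_MS))
    print("low res, new CPU: ", frames_per_second(NEW_CPU_MS, GPU_LOW_RES_MS))
    print("high res, old CPU:", frames_per_second(OLD_CPU_MS, GPU_HIGH_RES_MS))
    print("high res, new CPU:", frames_per_second(NEW_CPU_MS, GPU_HIGH_RES_MS))
```

With these numbers the CPU upgrade doubles the frame rate at low resolution, where the graphics card was waiting on the processor, and changes nothing at high resolution, where the graphics card is the limit.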
Personally I'll be happy when each part of the industry sits down and designs its hardware around the others to provide the best overall performance, rather than just trying to improve its own despite the lacking abilities of the rest. As it stands, it's honestly just segregating each aspect, forcing them more and more to be performed separately in order to get the best out of each piece of hardware.