I’ve been really interested in GPU computing for the last year or so. I think it’s a fascinating area. What makes GPUs so good at computing is their ability to run huge numbers of tasks in parallel. CPUs are still needed for conditional stuff like “if x is below 100, set y to 20”, but any algorithm that can run in parallel could, in theory, be computed much faster on a GPU. How much faster, you might ask? Roughly 10-20 times faster for most applications. That’s the main idea behind Nvidia’s push for GPU computing.
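To make that a bit more concrete, here’s a minimal sketch of what a data-parallel GPU job looks like in CUDA (Nvidia’s programming environment, which comes up in one of the quotes below): every element of a big array gets its own GPU thread, and thousands of them run at once. The kernel name and array size are made up for the example; this is just an illustration, not code from any of the benchmarks I mention.

```
#include <cstdio>
#include <cuda_runtime.h>

// Each GPU thread handles exactly one array element -- thousands of these
// run concurrently, which is where the GPU speedup comes from.
__global__ void addArrays(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)              // guard threads that land past the end of the array
        c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1 << 20;                    // ~1M elements, arbitrary size
    size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);             // unified memory keeps the example short
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);

    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads; // enough blocks to cover every element
    addArrays<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);              // expect 3.0

    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

The same idea works in OpenCL on an ATI card; the point is simply that when every element can be processed independently, the GPU wins.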
To get better speed when previewing my scenes, I decided to take a look at the new graphics cards on the market. I don’t really have time to explain everything I learned (I’ll provide some links at the end of the post if you’re interested in the details), so here’s the gist:
Nvidia made an awesome GPU called the GeForce GTX 680; however, they really, really fucked its compute ability in order to promote their professional lineup of cards such as Quadro and Tesla. The card seems to be great for gaming, based on the benchmarks I read. But, unfortunately, it performs really poorly on GPU-compute-intensive tasks such as 3D image rendering. How poorly? Orders of magnitude worse than the ATI 7970.
I’ve always bought Nvidia GPUs in the past and have been pretty happy with them, especially on the gaming side. ATI’s reputation among game developers is not too good (Rage and Skyrim, anyone?), that much I know, but it seems that the 7970 is a really great card in all areas, including GPU computing, an area that Nvidia pretty much single-handedly dominated before.
So, based on my research, I’ll be getting an ATI 7970 (probably an Asus one) soon.
Here are a few interesting quotes from some of the articles/benchmarks I’ve read:
Regardless of the reason, it is becoming increasingly evident that NVIDIA has sacrificed compute performance to reach their efficiency targets for GK104, which is an interesting shift from a company that was so gung-ho about compute performance, and a slightly concerning sign that NVIDIA may have lost faith in the GPU Computing market for consumer applications.
This time around, at the event introducing GeForce GTX 680 to press from around the world, the company refused to discuss compute, joking that it took a lot of heat for pushing the subject with Fermi and didn’t want to go there again.
The more complete story is that it doesn’t want to go there…yet. Sandra 2012 just showed us that the GeForce GTX 680 trails AMD’s Radeon HD 7900 cards in 32-bit math. And it gets absolutely decimated in 64-bit floating-point operations, as Nvidia purposely protects its profitable professional graphics business by artificially capping performance.
Wow! The claim of beating the HD 7970 vanishes into thin air, it seems. Nvidia’s new GPU is beaten by the Radeon HD 7970 by an order of magnitude here in double-precision floating point, and by nearly a factor of two in ordinary single-precision floating point. One is speechless here. Even the Radeon HD 7870, with its restricted double-precision floating point, still outperforms the GTX 680 by a noticeable margin in this department, as you can see here. Only the Radeon HD 7850 is substantially slower.
One might ask, why bother? Well, GPU compute performance can’t rely on tweaked drivers, application-detection workarounds and similar shortcuts. It is pure, raw processing ability that defines a GPU’s usability for general-purpose computing. After all, Nvidia created the GPGPU market and the CUDA programming environment. This situation not only badly hurts its prestige in this area but also forces the need for a, say, GK110 ‘real high-end Kepler’ follow-on to be delivered soon. Not to mention, Nvidia’s compute-optimised cards like Tesla sell for thousands apiece even though they are based on essentially the same dies as high-end consumer GPUs, so GPU compute matters.