Nvidia GeForce GTX 480 review and testing: test configuration, tools, and testing methodology

Nvidia Geforce GTX 480:

Description of the video card and results of synthetic tests

Note that the card requires additional power through two connectors, one 8-pin and one 6-pin. The latter poses no problem, since all modern power supplies already have such "tails," but powering the card through the 8-pin connector requires a special adapter, which should be supplied with retail video cards.

The chip was received in the fourth week of this year, that is, at the end of January.

About the cooling system.

Nvidia Geforce GTX 480 1536MB PCI-E

Fundamentally, the cooler does not differ from previous solutions in the GTX family: a cylindrical fan drives air through the radiator and exhausts the heat outside the system unit. However, due to the new product's excessive power consumption, and therefore heat output, the cooling system has been improved with heat pipes to enhance heat dissipation. As we can see, the central radiator with heat pipes cools only the core, while the memory chips are cooled by a plate pressed against them, located under the casing.

The potential of coolers of this type has probably been exhausted as far as handling a very hot core quietly goes, so we have to say that the cooling system turned out to be noisy. Even in 2D mode the fan runs at 44% of maximum, whereas previously this figure was around 20-25%, and the noise becomes noticeable above 50%. So the cooler operates on the verge of audibility even at idle! Under load it is worse still: the cooling system gradually raises the turbine speed, reaching an average of 70-80% when the card operates in 3D mode.

We conducted a temperature study using the EVGA Precision utility (author A. Nikolaychuk AKA Unwinder) and obtained the following results:

Nvidia Geforce GTX 480 1536MB PCI-E

And this is not surprising, because the core temperature reaches 95 degrees, and even that high figure is achieved at the cost of very noisy operation of the cooling system. So lovers of the most advanced and fastest three-dimensional gaming graphics will have to forget what silence is while running games or any tests. Even in 2D, when the card is loaded with complex content (such as Flash or video), the cooler is already quite audible.

Equipment.

This is a reference sample, so there is no retail bundle or packaging.

Now let's move on to the tests. First, we will show the test bench configuration.

Installation and drivers

Test bench configuration:

  • Computer based on Intel Core i7 920 (Socket LGA1366)
    • Intel Core i7 920 processor (2667 MHz);
    • Asus P6T Deluxe motherboard based on the Intel X58 chipset;
    • 3 GB DDR3 SDRAM Corsair, 1066 MHz;
    • WD Caviar SE WD1600JD hard drive, 160 GB, SATA;
    • Tagan TG900-BZ power supply, 900 W.
  • operating system Windows 7 32-bit; DirectX 11;
  • Dell 3007WFP monitor (30");
  • drivers: ATI CATALYST 10.3; Nvidia 197.17.

VSync is disabled.

Synthetic tests

The synthetic test packages we use can be downloaded here:

  • D3D RightMark Beta 4 (1050), with a description at http://3d.rightmark.org.
  • D3D RightMark Pixel Shading 2 and D3D RightMark Pixel Shading 3, tests of pixel shaders versions 2.0 and 3.0.
  • RightMark3D 2.0 with a brief description.

Since we do not have our own synthetic DirectX 11 tests, we had to use examples from various SDK packages and demo programs. First, there are HDRToneMappingCS11.exe and NBodyGravityCS11.exe from the DirectX SDK (February 2010).

We also took two examples from both manufacturers: Nvidia and AMD, so that there would be no claims of bias from anyone. The examples DetailTessellation11.exe and PNTriangles11.exe were taken from the ATI Radeon SDK (they are also in the DX SDK, by the way). Well, Nvidia presented two demo programs: Realistic Character Hair and Realistic Water Terrain, which should soon become available for download on the company’s website.

Synthetic tests were carried out on the following video cards:

  • GeForce GTX 480 with standard parameters (further GTX 480)
  • GeForce GTX 295 with standard parameters (further GTX 295)
  • GeForce GTX 285 with standard parameters (further GTX 285)
  • Radeon HD 5970 with standard parameters (further HD 5970)
  • Radeon HD 5870 with standard parameters (further HD 5870)

To compare the results of the new Geforce GTX 480 model, these particular video cards were chosen for the following reasons: Radeon HD 5870 and HD 5970 are the most productive single-chip and dual-chip models from the competing company AMD, with prices closest to the GTX 480. With Nvidia’s solutions, everything is even simpler: Geforce GTX 285 is the most powerful single-chip card on a GPU of the last generation, by which we will judge architectural changes, and GTX 295 is the most powerful dual-chip card from Nvidia until the release of new solutions.

Direct3D 9: Pixel Filling tests

The test determines the peak texture sampling performance (texel rate) in FFP mode for a different number of textures applied to one pixel:

Our test is a little outdated, and the video cards in it do not reach theoretically possible values, but it still correctly shows the peak texturing speed of video cards relative to each other. As usual, the synthetic results do not reach the peak values; it turns out that the GTX 480 selects up to 40 texels per clock cycle from 32-bit textures with bilinear filtering in this test, which is one and a half times lower than the theoretical figure of 60 filtered texels.
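The gap between theory and measurement quoted above can be reproduced with simple arithmetic (a rough sketch; the 60 TMUs and 700 MHz core clock are the GTX 480's published reference specifications, and the 40 texels per clock figure comes from the test above):

```python
# Rough estimate of theoretical vs. measured texturing rate for the GTX 480.
# Assumed published specs: 60 TMUs, 700 MHz core clock.
TMUS = 60
CORE_CLOCK_MHZ = 700

theoretical_gtexels = TMUS * CORE_CLOCK_MHZ / 1000        # peak Gtexels/s
measured_texels_per_clock = 40                            # from the test above
measured_gtexels = measured_texels_per_clock * CORE_CLOCK_MHZ / 1000

shortfall = theoretical_gtexels / measured_gtexels        # the ~1.5x gap noted
print(f"theoretical: {theoretical_gtexels:.0f} Gtex/s, "
      f"measured: {measured_gtexels:.0f} Gtex/s, ratio: {shortfall:.1f}x")
```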

This is not even enough to match the GTX 285, which fetches texture data 5-7% faster, let alone catch the competing HD 5870, which is more than one and a half times faster in almost all modes, judging by our DX9 synthetics. The dual-chip Nvidia card clearly fell victim to driver problems, while the HD 5970 is even faster than the HD 5870.

The difference between the GTX 480 and GTX 285 is almost always the same, except in cases with a small number of textures, where the limitation in bandwidth has a greater effect. And the HD 5870 isn't that far ahead in these tests. But with 4-8 textures, the difference becomes larger, which hints at the lack of texturing speed of the GF100 in order to always be ahead of the competitor in legacy gaming applications. Let's look at the same results in the fill rate test:

The second synthetic test shows fill rate, and in it we see the same situation, but taking into account the number of pixels written to the frame buffer. The maximum result remains with the AMD solutions, which have more TMUs and achieve higher efficiency in our synthetic test. In the cases with 0-3 overlaid textures the difference between the solutions is much smaller, since in those modes performance is limited first of all by memory bandwidth.
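Which limit binds in a pure fill test can be estimated on the back of an envelope (a sketch; the ROP count, clocks, and the assumption of an 8-byte read-modify-write per blended 32-bit pixel are ours, based on the GTX 480's published specifications):

```python
# Sketch: which limit binds in a fill test, ROPs or memory bandwidth?
# Assumed GTX 480 figures: 48 ROPs, 700 MHz core, 384-bit GDDR5 at 3696 MT/s.
rops, core_mhz = 48, 700
bus_bits, mem_mts = 384, 3696

rop_bound_gpix = rops * core_mhz / 1000            # peak pixels/s from the ROPs
bandwidth_gbs = bus_bits / 8 * mem_mts / 1000      # GB/s
bytes_per_pixel = 8                                # 32-bit read + write (blending assumed)
bw_bound_gpix = bandwidth_gbs / bytes_per_pixel

which = "bandwidth" if bw_bound_gpix < rop_bound_gpix else "ROPs"
print(f"ROP bound: {rop_bound_gpix:.1f} Gpix/s, "
      f"BW bound: {bw_bound_gpix:.1f} Gpix/s -> limited by {which}")
```

Under these assumptions the bandwidth bound sits well below the ROP bound, which is consistent with the bandwidth limitation observed with few textures.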

Direct3D 9: Pixel Shaders tests

The first group of pixel shaders that we are considering is very simple for modern video chips; it includes various versions of pixel programs of relatively low complexity: 1.1, 1.4 and 2.0, found in older games.

The tests are very, very simple for modern architectures and do not show all the capabilities of modern GPUs, but they are interesting for assessing the balance between texture samples and mathematical calculations, especially when changing architectures, which happened this time for Nvidia.

In these tests, performance is limited mainly by the speed of the texture units, but it also reflects block efficiency and texture-data caching as in real tasks. Let's see how the architectural changes affected performance compared to the GT200. It is clearly visible that the architecture has changed: the new GTX 480 performs better than a single-chip card based on the previous architecture. Moreover, in most tests the GTX 480 catches up with the dual-chip GTX 295, which is not bad in itself.

Memory bandwidth in these tests only slightly limits the new solutions, and the speed is dependent on texturing, which prevents the GF100-based card from performing even at the level of the Radeon HD 5870, let alone AMD's dual-chip solution. Video cards based on Nvidia chips clearly lag behind in this set of tests, which is a wake-up call for our other tests, where texturing speed is important. Let's look at the results of somewhat more complex intermediate pixel programs:

In the tests of pixel shaders version 2.a, things look even worse against the competitors' speed. The procedural water rendering test "Water," which is highly dependent on texturing speed, uses deeply nested dependent texture fetches, and the cards are always ranked by texturing speed, adjusted for varying TMU efficiency.

Cards based on RV870 chips show maximum results, but the speed of the GTX 480 was somewhere between single-chip and dual-chip models on GPUs of the previous architecture. It’s a little weak, of course, but at least it’s faster than the GTX 285, which indicates a more efficient use of the available TMUs.

The results of the second test are almost the same, although it is more computationally intensive and was always better suited for the AMD architecture with more compute units. Modern AMD solutions are far ahead here, especially the dual-chip version.

The GTX 480 outperforms the GTX 285 by only 25% and lags behind the dual-chip model by almost the same amount. This clearly points to the GTX 480 being limited by a number of TMUs that is low for a new-generation architecture. Our fears are confirmed: this is the main drawback of the GF100 architecture.

Direct3D 9: pixel shader tests Pixel Shaders 2.0

These DirectX 9 pixel shader tests are more complex than the previous ones, they are close to what we now see in multi-platform games, and are divided into two categories. Let's start with the simpler version 2.0 shaders:

  • Parallax Mapping: a texture mapping method familiar from most modern games, described in detail in the article.
  • Frozen Glass: a complex procedural frozen-glass texture with controllable parameters.

There are two variants of each shader: one weighted toward mathematical calculations, the other toward sampling values from textures. Let's first consider the math-intensive options, which are more promising from the point of view of future applications:

These are universal tests that depend on both the speed of the ALU units and the texturing speed; the overall balance of the chip is important in them. It can be seen that the performance of video cards in the “Frozen Glass” test is limited not only by mathematics, but also by the speed of texture samples. The situation in it is similar to what we saw a little higher in “Cook-Torrance”, but the new GTX 480 this time is much closer to the dual-chip GTX 295 based on the GPU of the old Nvidia architecture. On the other hand, even the single-chip HD 5870 is still far ahead.

In the second test, "Parallax Mapping," the results are again very similar to the previous ones, though this time the HD 5870 did not pull as far ahead of the Nvidia cards as in the first test. We'll see what happens next; games are usually more multifaceted than synthetics and do not rely so obviously on texturing alone. Still, for such legacy tasks the number of texture units in the GF100 is clearly insufficient. Let's consider the same tests, modified to favor texture samples over mathematical calculations, to confirm our intermediate conclusions:

The picture is somewhat similar, but AMD cards clearly cope better with texture samples, especially the dual-chip HD 5970 is good here! Today's hero in the form of the GTX 480 again shows an average result between the GTX 285 and GTX 295, since here the emphasis of performance on the speed of texture units is even more clearly visible, and the number of them in the GF100 is still clearly insufficient for the new powerful graphics architecture.

But these were outdated tasks, with an emphasis on texturing, and not particularly complex. Now we'll take a look at the results of two more pixel shader tests version 3.0, the most complex of our pixel shader tests for Direct3D 9, which are much more indicative of modern exclusive games on PC. The tests differ in that they place a greater load on both the ALU and texture modules; both shader programs are complex and long, and include a large number of branches:

  • Steep Parallax Mapping: a much "heavier" variant of the parallax mapping technique, also described in the article.
  • Fur: a procedural shader that renders fur.

Finally! This is a completely different matter. Both PS 3.0 tests are very complex, do not depend at all on memory bandwidth and texturing, they are purely mathematical, but with a large number of transitions and branches, which the new GF100 architecture seems to cope with very well.

In these tests, the GTX 480 shows its real strength and outperforms all solutions except the new dual-chip one from its competitor. Moreover, the GTX 295 is almost twice as slow in these most complex tests, and the GTX 285 is even three times slower! The results were clearly influenced by the new GPU's architectural changes to improve computing efficiency.

So, with the new GF100 architecture we note a very large performance increase in the most complex PS 3.0 tests, in which the most important thing is not peak mathematical power, which AMD's solutions have, but the efficiency of executing complex shader programs with jumps and branches. The doubled mathematical power compared to the GT200 also played its part. A very good result, because overtaking an AMD architecture solution with a larger number of ALU execution units is worth a lot.

Direct3D 10: PS 4.0 pixel shader tests (texturing, loops)

The second version of RightMark3D included two familiar PS 3.0 tests for Direct3D 9, which were rewritten for DirectX 10, as well as two more new tests. The first pair added the ability to enable self-shadowing and shader supersampling, which further increases the load on video chips.

These tests measure the performance of pixel shaders running in cycles, with a large number of texture samples (in the heaviest mode, up to several hundred samples per pixel) and a relatively small ALU load. In other words, they measure the speed of texture samples and the efficiency of branches in the pixel shader.

The first pixel shader test is Fur. At the lowest settings it uses 15 to 30 texture samples from the height map and two samples from the main texture. The "High" effect detail mode increases the number of samples to 40-80, enabling "shader" supersampling raises it to 60-120 samples, and the "High" mode together with SSAA gives the maximum "heaviness" of 160 to 320 samples from the height map.

Let's first check the modes without supersampling enabled; they are relatively simple, and the ratio of results in the “Low” and “High” modes should be approximately the same.

Performance in this test depends both on the number and efficiency of TMU blocks, and on the fill rate with bandwidth to a lesser extent. The results in “High” are approximately one and a half times lower than in “Low”, as it should be according to theory. In Direct3D 10 tests of procedural fur rendering with a large number of texture samples, Nvidia solutions are traditionally strong, but the latest AMD architecture has already come close to them.

The GTX 480 is almost a third faster than the GTX 285 but falls short of the GTX 295, which we also saw in the DX9 tests. This speaks more to the influence of fill rate and memory bandwidth, where the new Nvidia solution has an advantage over the previous series' single-chip card. Relative to the two RV870-based cards, the GF100 is roughly on par. Let's look at the same test with shader supersampling enabled, which quadruples the work; perhaps in this situation something will change and memory bandwidth and fill rate will have less effect:

Enabling supersampling theoretically increases the load by four times, and this time the GeForce GTX 480 loses ground, oddly enough. And both Radeons are getting a little stronger. The difference between the GTX 480 and GTX 285 is very small, which most likely indicates an emphasis on texturing. Or bandwidth, which did not increase too much for the GTX 480 compared to the GTX 285. The impact of ALU performance and efficient branch execution is clearly not visible in this test.

The second test, which measures the performance of complex pixel shaders with loops with a large number of texture samples, is called Steep Parallax Mapping. At low settings it uses 10 to 50 texture samples from the height map and three samples from the main textures. When you enable heavy mode with self-shadowing, the number of samples doubles, and supersampling quadruples this number. The most complex test mode with supersampling and self-shadowing selects from 80 to 400 texture values, that is, eight times more than the simple mode. Let's first check simple options without supersampling:
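The fetch-count scaling just described multiplies out as follows (a quick check using only the figures from the test description above):

```python
# Sketch of the texture-fetch load scaling in the Steep Parallax Mapping
# test, using the figures from the description above.
base = (10, 50)            # height-map fetches per pixel, simple mode
SELF_SHADOW = 2            # enabling self-shadowing doubles the fetch count
SUPERSAMPLE = 4            # shader supersampling quadruples it

heavy = tuple(n * SELF_SHADOW * SUPERSAMPLE for n in base)
print(f"simple: {base[0]}-{base[1]} fetches, heaviest: {heavy[0]}-{heavy[1]}")
# Combined, the heaviest mode is 8x the simple one: 80-400 fetches per pixel.
```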

This test is more interesting from a practical point of view, since varieties of parallax mapping have been used in games for a long time, and heavy variants, like our steep parallax mapping, are used in many projects, for example, in Crysis and Lost Planet. In addition, in our test, in addition to supersampling, you can enable self-shadowing, which approximately doubles the load on the video chip; this mode is called “High”.

The diagram almost completely repeats the previous one, showing similar results even in absolute numbers. In the updated D3D10 version of the test without supersampling, the GTX 480 copes with the task a little better than the single-chip top of the previous generation, but lags behind the dual-chip GTX 295 card. Also, the new GF100 video card is slightly ahead of its rival HD 5870, the dual-chip version of which becomes the winner in absolute terms.

Let's see what difference turning on supersampling will make; it always causes a slightly larger drop in speed on Nvidia cards.

When supersampling and self-shadowing are enabled, the task becomes more difficult; enabling both options together increases the load on the cards by almost eight times, causing a large drop in performance. The difference between the speed indicators of several video cards has changed, the inclusion of supersampling has the same effect as in the previous case - AMD cards have clearly improved their performance compared to the Nvidia solution.

Both dual-chip cards remain ahead of the GTX 480, and this time the new solution is slightly behind its direct competitor, the HD 5870. It seems the same will hold in gaming tests: in some places the GTX 480 will be far ahead, in others a little behind. However, the GF100-based card at least outperforms its predecessor, noticeably in the easy mode and just slightly in the heavy one. Unfortunately, the architectural changes in Nvidia's new GPU did not provide much of an advantage in these tests.

Direct3D 10: PS 4.0 Pixel Shader Tests (Compute)

The next couple of pixel shader tests contain a minimum number of texture fetches to reduce the performance impact of the TMU units. They use a large number of arithmetic operations, and they measure precisely the mathematical performance of video chips, the speed of execution of arithmetic instructions in a pixel shader.

The first math test is Mineral: a complex procedural texturing test that uses only two texture samples and 65 sin and cos instructions.

In the math tests we should see big changes, since the GF100 features twice the ALU power of the GT200. However, AMD solutions should theoretically be even faster in our synthetics, since in computationally heavy tasks the current AMD architecture has a clear advantage over Nvidia's. This is confirmed here too: although the new GTX 480 has narrowed the gap between the Nvidia and AMD cards, it remains more than one-and-a-half-fold.
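Both theoretical gaps can be checked with peak-FLOPS arithmetic (a sketch using commonly published ALU counts and shader clocks; the GT200's hard-to-use co-issued MUL is deliberately ignored, counting multiply-add throughput only):

```python
# Peak MAD throughput sketch: ALUs x shader clock x 2 FLOPs (multiply-add).
# Assumed published reference figures; GT200's co-issued MUL is ignored.
cards = {
    "GTX 480 (GF100)": (480, 1401),   # ALUs, shader clock in MHz
    "GTX 285 (GT200)": (240, 1476),
    "HD 5870 (RV870)": (1600, 850),
}
gflops = {name: alus * mhz * 2 / 1000 for name, (alus, mhz) in cards.items()}
for name, gf in gflops.items():
    print(f"{name}: {gf:.0f} GFLOPS")

print(f"GF100 / GT200: {gflops['GTX 480 (GF100)'] / gflops['GTX 285 (GT200)']:.1f}x")
print(f"RV870 / GF100: {gflops['HD 5870 (RV870)'] / gflops['GTX 480 (GF100)']:.1f}x")
```

Under these assumptions the GF100 has roughly twice the GT200's ALU power, while the RV870 keeps about a twofold theoretical lead over the GF100, consistent with the gaps discussed here.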

But the comparison with the GTX 285 and GTX 295 turned out to be interesting. This time Nvidia failed to achieve either a two-fold difference from the previous single-chip card or overtake the old dual-chip card of the previous generation. The conclusion is confirmed that this test does not completely depend on the speed of the ALU, but the results cannot be attributed to the difference in bandwidth. The GF100 achieved only a 38% increase compared to the GTX 285, which is very strange and very, very small, as it seems to us.

Let's look at the second shader calculation test, called Fire. It is heavier on the ALUs: there is only one texture fetch, and the number of sin and cos instructions has been doubled to 130. Let's see what has changed with the increased load:

In the second test, rendering speed is limited almost exclusively by the performance of the shader units, but the difference between the GTX 285 and GTX 480 is still too small: only 58%, although theoretically it should be close to twofold. At least the new solution caught up with the dual-chip GTX 295 this time, unlike in the previous test. However, competitors like the Radeon HD 5870, and even more so the HD 5970, show much higher speeds in this test.

Let's summarize the math D3D10 tests. All Nvidia video cards are far behind, even the new GF100 is almost twice as slow as its competitor in peak synthetic tasks! And all this despite the fact that the GTX 480 is theoretically almost twice as fast as the single-chip version of the GTX 285. Reality shows a much lower figure, and it was not possible to even get close to AMD cards using simple mathematical tests from Nvidia.

In general, the outcome of extreme mathematical calculations remains unchanged this time as well: AMD's solutions hold a clear and undeniable advantage, and the release of the GTX 400 line has not changed that. Now let's look at the geometry shader test results, where the new solution should be stronger than anything else.

Direct3D 10: geometry shader tests

The RightMark3D 2.0 package has two geometry shader speed tests. The first, called "Galaxy," uses a technique similar to "point sprites" from previous versions of Direct3D. It animates a particle system on the GPU: from each point, a geometry shader creates four vertices that form a particle. Similar algorithms should be widely used in future DirectX 10 games.

Changing the balancing in geometry shader tests does not affect the final rendering result, the final image is always exactly the same, only the methods of processing the scene change. The “GS load” parameter determines in which shader the calculations are performed: vertex or geometry. The number of calculations is always the same.
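What the Galaxy shader does per particle can be illustrated with a minimal CPU-side sketch (hypothetical illustrative code, not taken from the test; it shows only the point-to-quad expansion, four vertices per input point):

```python
# Minimal sketch of point-sprite expansion as done in a geometry shader:
# each input point becomes four vertices forming a screen-aligned quad.
def expand_point(x, y, z, half_size):
    """Return the four corner vertices of a quad centred on the point."""
    return [
        (x - half_size, y - half_size, z),
        (x + half_size, y - half_size, z),
        (x - half_size, y + half_size, z),
        (x + half_size, y + half_size, z),
    ]

particles = [(0.0, 0.0, 1.0), (2.0, 3.0, 1.5)]
vertices = [v for p in particles for v in expand_point(*p, half_size=0.1)]
print(f"{len(particles)} points -> {len(vertices)} vertices")
```

Each point yields exactly four vertices, which is the fixed amplification the Galaxy test asks of the geometry shader stage.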

Let's look at the first version of the Galaxy test, with calculations in the vertex shader, for three levels of geometric complexity:

The ratio of speeds for different geometric complexity of scenes is approximately the same for all solutions, performance corresponds to the number of points, with each step the FPS drops by about two times. The task for modern video cards is not particularly difficult, and performance in general is limited by the speed of geometry processing and is not limited by memory bandwidth.

This is where the new GPU shows its true strength. The Geforce GTX 480 in all modes shows results close to the competitor's dual-chip solution, being one and a half times faster than both the HD 5870 and the dual-chip card based on the GT200. Excellent result! As expected, the GF100's execution of geometry shaders is very, very efficient, about 2.5 times faster than the GT200 can. Let's see if the situation changes when we transfer part of the calculations to the geometry shader:

No, the numbers did not change much when the load changed in this test. All cards in this test do not notice changes in the GS load parameter, which is responsible for transferring part of the calculations to the geometry shader, and show results similar to the previous diagram. Let's see what will change in the next test, which assumes a large load on geometry shaders.

“Hyperlight” is the second geometry shader test, demonstrating the use of several techniques at once: instancing, stream output, and buffer loading. It uses dynamic geometry creation by drawing into two buffers, as well as a new Direct3D 10 capability, stream output. The first shader generates the directions of the rays and the speed and direction of their growth; this data is placed in a buffer, which the second shader uses for drawing. For each ray point, 14 vertices are built in a circle, up to a million output points in total.
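The geometry amplification involved can be gauged with trivial arithmetic (a sketch based only on the figures in the description above):

```python
# Geometry amplification in the Hyperlight test: 14 vertices are emitted
# around each ray point, with up to a million output points in total.
VERTS_PER_POINT = 14
max_points = 1_000_000

max_vertices = max_points * VERTS_PER_POINT
print(f"up to {max_vertices:,} vertices emitted by the geometry shader")
# On the order of 14 million emitted vertices at the heaviest setting,
# which is exactly the kind of load GF100's reworked geometry pipeline targets.
```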

A new type of shader programs is used to generate “rays”, and with the “GS load” parameter set to “Heavy” also to draw them. That is, in the “Balanced” mode, geometry shaders are used only to create and “grow” rays, the output is carried out using “instancing”, and in the “Heavy” mode, the geometry shader is also involved in output. First we look at the easy mode:

Both dual-chip configurations, the GeForce GTX 295 and the Radeon HD 5970, performed poorly in this test, as they usually do; apparently, this test simply does not work with the AFR multi-chip rendering method. Otherwise, the relative results in the different modes correspond to the load: in all cases performance scales well and stays close to the theoretical parameters, according to which each subsequent "Polygon count" level should be slower by less than a factor of two.

In this test, the performance of the new GeForce GTX 480 is only slightly faster than the Radeon HD 5870 in hard mode, but in easy mode the difference is more noticeable. Comparing the GTX 480 with the GTX 285 based on the previous generation GPU is generally ridiculous; the new video chip turns out to be about twice as fast.

The numbers should change in the next diagram, in a test with more active use of geometry shaders. It will also be interesting to compare the results obtained in “Balanced” and “Heavy” modes with each other.

It's time to once again be amazed at the GF100's geometry processing capabilities and the speed of execution of geometry shaders. This is exactly the result for which global changes were made to the GF100 graphics pipeline. Although the performance of geometry shaders has been well improved in both the GT200 and RV870, the GF100 simply tears them to pieces in this task.

The new GTX 480 is almost twice as fast in this test as the Radeon HD 5870 and up to 2.75 times faster than its single-chip predecessor, the GTX 285. Nvidia engineers tried to improve the efficiency of the previous geometry processing architecture, and they clearly succeeded. All previous solutions are simply not capable of performing geometry shaders as efficiently. What will happen in the tessellation tests, which should show an even greater difference, based on theory? But let's not look too far ahead.

Direct3D 10: texture fetching speed from vertex shaders

The Vertex Texture Fetch tests measure the speed of a large number of texture fetches from the vertex shader. The tests are similar in essence and the ratio between the cards’ results in the “Earth” and “Waves” tests should be approximately the same. Both tests are based on texture sampling data, the only significant difference is that the Waves test uses conditional branches, while the Earth test does not.

Let's look at the first "Earth" test, first in the "Effect detail Low" mode:

Previous research has shown that the results of this test are affected by both texturing speed and memory bandwidth, but the difference between the solutions is very small. The GTX 480 shows results similar to the dual-chip GTX 295, is slightly ahead of the HD 5870, but is a little behind the strongest card in this test, the Radeon HD 5970, in all modes. The results are clearly strange... Let's look at performance in the same test with an increased number of texture samples:

The relative positions of the cards on the diagram have changed slightly, as seen from the somewhat worse results of almost all cards, except for the GTX 480 under review today, which has lost almost no performance compared to the light mode of the same test. This is the effect of the increased efficiency of the texture units and, especially, the caching subsystem. The new GF100 card is now the fastest at medium and high polygon counts, and on par with the dual-chip cards in the simplest mode.

Let's look at the results of the second test of texture fetches from vertex shaders. The Waves test has a smaller number of samples, but it uses conditional jumps. The number of bilinear texture samples in this case is up to 14 (“Effect detail Low”) or up to 24 (“Effect detail High”) per vertex. The complexity of the geometry changes similarly to the previous test.

Interestingly, the results in the Waves test are not similar to what we saw in the previous charts. The advantage of AMD products has increased somewhat, and now the GTX 480 shows performance similar to the HD 5870 and GeForce GTX 295, slightly losing to its competitor in heavy mode. The previous top-end Nvidia solution on a single chip is left behind, the new model of the GeForce GTX 400 family is ahead of it, although not significantly. Let's consider the second version of the same test:

Again, there are almost no changes, although as the conditions increased in complexity, the results of the latest Nvidia GPU in the second vertex sample test became slightly better relative to the speed of AMD video cards. The advantage over the HD 5870, although small, is there, and the new single-chip card coped with the GeForce GTX 295, with the exception of the easiest mode.

3DMark Vantage: Feature tests

In this review we have again decided to include synthetic benchmarks from the 3DMark Vantage suite. Although the package is no longer new, its feature tests support D3D10 and are interesting in that they differ from ours. When analyzing the results of the new Nvidia solution in this package, we may be able to draw some new and useful conclusions that eluded us in the RightMark family of tests.

Feature Test 2: Color Fill

Fill rate test. Uses a very simple pixel shader that does not limit performance. The interpolated color value is written to an off-screen buffer (render target) using alpha blending. The 16-bit off-screen buffer of the FP16 format is used, which is most often used in games that use HDR rendering, so this test is quite timely.

The performance numbers in this test don't match what we've seen in our similar tests, even considering the different formats: ours uses an 8-bit per-component integer buffer, while the Vantage test uses 16-bit floating-point. Vantage's numbers do not show the performance of ROP units, but rather the approximate value of memory bandwidth. For dual-chip cards, everything is somewhat more complicated; the GTX 295 shows a lower figure than it should.

The test results roughly correspond to theoretical figures, and depend on the memory bus width, its type and frequency. The GTX 285 shows a good result due to the use of 512-bit memory, and the GTX 480 is not too ahead of it due to the fact that the GDDR5 memory does not operate at a particularly high frequency, and the memory bus width corresponds to 384-bit. Well, the Radeon HD 5870 is also somewhere nearby, although it only has a 256-bit memory bus, but GDDR5 is quite fast.
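The bandwidth figures behind this clustering are easy to reproduce (a sketch using the bus widths named above and assumed reference memory clocks: GDDR3 at 2484 MT/s effective, GDDR5 at 3696 and 4800 MT/s effective):

```python
# Memory bandwidth sketch for the cards discussed. Bus widths are from the
# text above; the effective transfer rates are assumed reference clocks.
cards = {
    "GTX 285": (512, 2484),   # bus width (bits), effective rate (MT/s)
    "GTX 480": (384, 3696),
    "HD 5870": (256, 4800),
}
bandwidth = {name: bits / 8 * mts / 1000 for name, (bits, mts) in cards.items()}
for name, gbs in bandwidth.items():
    print(f"{name}: {gbs:.0f} GB/s")
```

Under these assumptions all three cards land within roughly 15% of each other, which would explain why their FP16 fill results cluster together despite very different bus widths.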

Despite using GDDR5 memory with higher bandwidth, the new Nvidia solution, together with the HD 5870, shows results only slightly above the GTX 285 with its 512-bit bus and GDDR3 memory. This points to a potential performance limitation when rendering to FP16-format buffers, which is widespread in modern games.

Feature Test 3: Parallax Occlusion Mapping

One of the most interesting feature tests, since a similar technique is already used in games. It draws a single quadrilateral (more precisely, two triangles) using the Parallax Occlusion Mapping technique, which simulates complex geometry. Fairly resource-intensive ray tracing operations and a high-resolution depth map are used, and the surface is shaded with the heavy Strauss lighting model. This is a test of a very complex and heavy pixel shader, containing numerous texture samples during ray tracing, dynamic branching and complex Strauss lighting calculations.

The test differs from the others in that it depends not on shader power, branching efficiency or texture fetch speed alone, but on a little of everything. To achieve high speed, a proper balance of GPU and video memory subsystems is important, and branching efficiency in shaders also strongly affects the result.

Unfortunately, the GTX 480 shows a mediocre result in this test, only 23% faster than the previous single-chip solution, the GTX 285. The Nvidia card introduced today lags behind both the dual-chip GTX 295 and its main competitor, the Radeon HD 5870, while the dual-chip HD 5970 remains completely out of reach.

It is not entirely clear what had such a negative impact on the results of this test. Perhaps the low texture sampling speed is to blame, since texture fetches are used heavily here; the branching efficiency of the GF100 is quite high, as our version 3.0 pixel shader tests proved. Nvidia solutions have always been effective in this test, yet the HD 5870 outperforms even the new GTX 480. Perhaps the GF100 will show its best side in the physics simulation tests?

Feature Test 4: GPU Cloth

The test is interesting because it calculates physical interactions (cloth simulation) on the video chip. Vertex simulation is used, combining the work of vertex and geometry shaders over several passes, and stream out transfers vertices from one simulation pass to the next. Thus, the test measures the execution performance of vertex and geometry shaders and the stream out speed.

You can immediately discount the performance of the dual-chip cards: their results clearly correspond to the speed of their single-chip counterparts (each chip in the HD 5970 and GTX 295 operates at a lower frequency than in the HD 5870 and GTX 285). The rendering speed here depends on geometry processing performance and geometry shader execution. Even the GTX 285 performs well in this test, only slightly behind the HD 5870, and the new GTX 480 again shows its strengths.

In this test the GF100 is almost twice as fast as the previous solution, which matches the doubled shader power of the new chip well. The advantage over the competing Radeon HD 5870 is just as impressive. Overall, today's hero can be granted the status of leader in geometry shader execution and geometry processing speed in general, just as theory predicts.

Feature Test 5: GPU Particles

A test of physically simulated effects based on particle systems calculated on the video chip. Vertex simulation is used again, with each vertex representing a single particle, and stream out serves the same purpose as in the previous test. Several hundred thousand particles are calculated, all animated separately, and their collisions with a height map are computed as well. As in one of our RightMark3D 2.0 tests, particles are rendered with a geometry shader that creates four vertices from each point to form a particle. But most of all the test loads shader units with vertex calculations; stream out is tested as well.

Here the result is even stronger. In Vantage's synthetic fabric and particle simulation tests, which use geometry shaders, the new GF100 chip simply leaves all its rivals in the dust. This time it is almost three times faster than the previous Nvidia GPU, while the competing Radeon HD 5870 performs about half as well as the GTX 480 in the particle simulation test.

The multi-chip results again match the single-chip ones for both the AMD and Nvidia cards: the multi-chip rendering method clearly does not work here, since the calculation results of the current frame are used in the next one, which prevents the next frame from being computed before rendering of the current one is complete. This is an obvious weakness of dual-chip cards; they cannot work efficiently when a frame uses data from the previous one.

Feature Test 6: Perlin Noise

The last feature test of the Vantage package is a mathematically intensive one: it calculates several octaves of the Perlin noise algorithm in a pixel shader. Each color channel uses its own noise function to put more stress on the video chip. Perlin noise is a standard algorithm often used in procedural texturing and involves a lot of math.
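To illustrate what "several octaves" means computationally, here is a minimal sketch of octave summation. It uses a simple hash-based value noise as a stand-in for true Perlin gradient noise, so the hash constants and the 1D simplification are our assumptions, not the Vantage shader itself:

```python
import math

def value_noise_1d(x: float, seed: int = 0) -> float:
    """Pseudo-random values at integer lattice points, smoothly
    interpolated between them (a simplified stand-in for Perlin
    gradient noise; the hash constants are arbitrary)."""
    def hash01(i: int) -> float:
        n = (i * 374761393 + seed * 668265263) & 0xFFFFFFFF
        n = ((n ^ (n >> 13)) * 1274126177) & 0xFFFFFFFF
        return (n & 0xFFFF) / 0xFFFF

    i = math.floor(x)
    f = x - i
    t = f * f * (3.0 - 2.0 * f)  # smoothstep fade curve
    return hash01(i) * (1.0 - t) + hash01(i + 1) * t

def octave_noise(x: float, octaves: int = 4) -> float:
    """Sum several octaves: each one doubles the frequency and
    halves the amplitude, normalized back to the [0, 1] range."""
    total, amp, freq, norm = 0.0, 1.0, 1.0, 0.0
    for _ in range(octaves):
        total += amp * value_noise_1d(x * freq)
        norm += amp
        amp *= 0.5
        freq *= 2.0
    return total / norm
```

Every octave adds a full hash, interpolation and weighting pass per sample, which is why evaluating this per pixel, per color channel, is almost pure ALU load: exactly the kind of work this feature test is designed to stress.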

This mathematical feature test from the Futuremark package shows the pure performance of video chips in extreme tasks. The performance shown in it corresponds well to theory and partially matches what we saw above in our own mathematical tests from RightMark 2.0, but here the difference between the solutions is even greater.

So, in this mathematical test the GTX 480 based on the new GF100 finally outperformed the GTX 285, by exactly 50%, in line with theory. But the gap between the new solution and the HD 5870 is too large: the AMD card is 1.7 times faster, and we are not even considering the dual-chip HD 5970...

In general, AMD video cards naturally outperform their Nvidia competitors in this test, although the new solution based on the Nvidia GF100 graphics processor was still able to close much of the distance. As a reminder, this math test is fairly straightforward and designed to show performance close to the theoretical peak. In more complex computational tests, such as physics calculations, a different picture emerges; but simple, intensive math runs much faster on AMD cards.

Direct3D 11: Compute and Geometry Shaders

To test new solutions from Nvidia and AMD in tasks that use the capabilities of DirectX 11, we used samples from development kits (SDKs) from Microsoft, AMD and Nvidia, as well as some demo programs from these companies.

First, let's look at tests that use the new shader type: compute shaders. Their appearance is one of the most important innovations in the latest versions of the DirectX API; they are used for various tasks: post-processing, simulations and so on. The first test shows an example of HDR rendering with tone mapping from the DirectX SDK, with post-processing done in either pixel or compute shaders.

We must concede a clear victory to AMD's single-chip solution over the new Nvidia Geforce GTX 480 in this test. The board announced today, based on the new GF100 chip, lags behind the competing Radeon HD 5870 in both pixel shader and compute shader modes, and the lag is quite noticeable: up to one and a half times. The dual-chip HD 5970 runs only one GPU in this test, so its result is even lower than the HD 5870's.

The second compute shader test is also taken from Microsoft's DirectX SDK and shows a computational N-body gravity problem, simulating a dynamic particle system that is subject to physical forces such as gravity.

And in this computing test Nvidia's new solution again loses to its closest competitor, the Radeon HD 5870, this time by about 25%, which is still quite a lot. The dual-chip HD 5970 once again cannot show its full capabilities and is limited to one of the two GPUs installed on the board.

The next test is a demo program from Nvidia called Realistic Character Hair. It does not use purely synthetic compute or geometry shader code, but a combination of geometry shaders, compute shaders and tessellation, so it is somewhat closer to real tasks than the pure synthetics of the first two tests.

In this test the new Nvidia GPU shows an excellent result, significantly ahead of the single-chip Radeon HD 5870 and the dual-chip HD 5970, whose second GPU again sat idle. What is interesting is not only the 1.5-1.8x performance difference between the single-chip cards, but also their different behavior when hardware tessellation is enabled.

In this case the new Geforce GTX 480 based on the GF100 chip speeds up by 15% when tessellation is enabled, while the AMD solution based on the RV870 slows down by almost 5%. In other words, tessellation benefits Nvidia's solution but not AMD's. Apparently the different organization of the geometry pipeline is responsible, and it is to its performance that we now turn.

Direct3D 11: Tessellation Performance

Hardware tessellation is considered the most important innovation in Direct3D 11. We looked at it in great detail in our theoretical article about the Nvidia GF100. There are several different schemes for subdividing graphics primitives (tessellation), for example Phong tessellation, PN Triangles and Catmull-Clark subdivision.

Tessellation is already used in the first DirectX 11 games, such as STALKER: Call of Pripyat, DiRT 2, Aliens vs Predator and Metro 2033. In some of them tessellation is applied to character models (all the FPS titles listed), in others to simulating a realistic water surface (DiRT 2). STALKER: Call of Pripyat uses the PN Triangles scheme, while Metro 2033 uses Phong tessellation. These methods are relatively quick and easy to integrate into the game development process and existing engines, which is what has been done.

Our first tessellation test will be the Detail Tessellation example from the ATI Radeon SDK. It actually shows not only tessellation but also two different bump mapping techniques: regular normal mapping and parallax occlusion mapping. So let's compare the DirectX 11 solutions from Nvidia and AMD under different conditions:

The first conclusion suggests itself: the per-pixel parallax occlusion mapping technique (middle bars in the diagram) runs less efficiently than tessellation (lower bars) on both the GeForce GTX 480 and the Radeon HD 5870. That is, simulating geometry with pixel calculations delivers lower performance than real geometry rendered with tessellation. This says a lot about the prospects of tessellation in places where parallax mapping is currently used.

Next, about the performance of the GTX 480 and the AMD cards relative to each other. The dual-chip HD 5970 is ahead of the single-chip options, which is understandable. The GTX 480 is 5-15% ahead of the HD 5870: more with tessellation enabled, less with per-pixel calculations. Accordingly, we expect that in games which only support DX9 or DX10 the difference between the GTX 480 and HD 5870 will also be smaller than in DX11 games with tessellation.

Our second tessellation performance test is another developer example from the ATI Radeon SDK: PN Triangles. In fact, both examples are also included in the DX SDK, so many game developers will base their code on them. We tested this example with different tessellation factors to understand how much changing it affects overall performance.

This example is perhaps the first place where we see the true geometric power of the GF100 architecture. Yes, this is only a synthetic test, and such extreme subdivision factors are unlikely to be used at first. But that is exactly what synthetics are for: to help assess the prospects of solutions in future tasks.

And the Geforce GTX 480 perfectly demonstrates here what the GF100 is capable of in tessellation tasks. The single chip is several times faster than the competitor's dual-chip card: the advantage over the HD 5970 reaches four times, and the single-chip HD 5870 is beaten by a simply devastating score. Essentially, the GF100 can sustain several notches heavier tessellation than the RV870. This is what it means to have an architecture designed specifically around the capabilities of the new API's tessellation.

Now let's look at another Nvidia demo program, Realistic Water Terrain, also known as Island. Incidentally, the author of this program is Timofey Cheblokov aka Smalltim, well known to 3D enthusiasts. His Island demo uses tessellation and displacement mapping to render realistic-looking water and terrain, and it looks just great:

Island is not a pure synthetic tessellation test; it contains rather complex pixel and compute shaders, so the performance difference may be smaller than in the previous case, but that brings it closer to reality.

We tested the demo at four different tessellation factors; here the setting is called Dynamic Tessellation LOD. At the lowest subdivision factor the GF100 card is only slightly ahead of AMD's single-chip offering, and even behind the HD 5970, but as the subdivision factor and the resulting scene complexity increase, the performance of the GTX 480 drops far less than the rendering speed of the competing solutions.

As a result, we again have a situation where the GF100 chip of the new Nvidia graphics architecture delivers tessellation performance similar to the RV870's at a much higher scene complexity. At the maximum LOD factor of 100 in this program, the GTX 480 shows the same performance as the Radeon HD 5870 does at a factor of only 25, that is, with several times more triangles (28 million versus 4 million in this case). It is simply a huge difference!

Conclusions on synthetic tests

Based on the results of synthetic tests of the new Nvidia Geforce GTX 480, built on the GF100 graphics processor, as well as the results of other video cards from both major GPU makers, we can conclude that this is a very powerful Nvidia graphics architecture with significantly improved performance and capabilities. The new GF100-based video card has become one of the fastest single-chip cards.

The increased number of geometry processing units and their parallel operation have significantly improved tessellation and geometry shader performance. In synthetic tessellation tasks Nvidia's new solution simply has no equal: even a dual-chip card does not help the competitor, and in single-GPU comparisons the GF100-based solution outperforms the best RV870-based card by up to 4-6 times in such tests. Until the competitor releases an architecture specifically enhanced for efficient geometry processing, this situation will not change.

Judging performance in 3D applications without tessellation, we can assume that game tests will mirror our synthetic ones: in some places the GeForce GTX 480 will be ahead of its competitor, in others slightly behind. There should be no big losses, since no game is completely limited by mathematical calculations or texture fetch performance, the only parameters of the GF100 architecture about which we have questions.

In synthetic tests of tessellation, geometry shaders and physics calculations (the fabric and particle simulations in the Vantage package, which also use geometry shaders), the new Nvidia GF100 chip is significantly stronger than the others, as it is in other computing tests with complex programs. But straightforward math, such as the purely computational tests from RightMark or Vantage, was, as expected, lost to the AMD solutions, and by a sizable margin. It turns out that the GF100 has moved closer to the CPU in its features and become even more universal (recall C++ support and CPU-like caching), but compared to the RV870 it has somewhat less raw number-crunching power, which has always distinguished GPUs from CPUs.

The relatively low peak computational and texturing performance we noted in this article leads to a lag behind the competitor in some synthetic tests, but overall the GTX 480 showed very decent results, which should be confirmed in the next part of our material, where you will find tests of Nvidia's latest solution in the most modern gaming applications.

We expect the gaming results to roughly match the conclusions drawn from the synthetic tests, although the correspondence will not be exact: rendering speed in games often depends on several characteristics of a video card at once and leans much more heavily on fill rate and memory bandwidth than synthetics do. We think the GeForce GTX 480 should be slightly ahead of its single-chip competitor, the Radeon HD 5870, in games without tessellation, and certainly ahead in tests that use it.

Six months after the release of the RV870 "Cypress" graphics processor and the ATI Radeon HD 5800 line of video cards based on it, NVIDIA was finally able to please its fans with the release of the new "Fermi" architecture and the first two video cards, the GeForce GTX 480 and GTX 470, designed to answer the Radeon HD 5800. For many, the tedious wait is over, and now, drawing on the mass of articles and reviews, we can begin to decide what to buy for modern games with DirectX 11 support. Fortunately, there is still time before the video cards go on open sale, since the first batches will not reach retail chains before April 12.

Although we received the card a few days before the official announcement, we preferred not to rush the article but to conduct the most detailed and thorough testing possible. That is why we present this article only a few days after the announcement of the new video cards; we hope the completeness of the material will let you forgive us this slight delay.

So, welcome - GeForce GTX 480!

NVIDIA GeForce GTX 480 specifications compared to competitors

The technical characteristics of the NVIDIA GeForce GTX 480 are presented in the table in comparison with current price competitors and the previous GeForce GTX 285 video card:

There is now more than enough information about the Fermi architecture on the Internet; in addition, you can familiarize yourself with the official documentation (2.74 MB). Therefore, we will move straight on to the review of the video card.

Review of the NVIDIA GeForce GTX 480 1.5 GB video card

The reference NVIDIA GeForce GTX 480 was provided to us for testing in OEM configuration, that is, without packaging or accessories. To anyone familiar with the appearance of the GeForce GTX 260-285, the new product will not seem original:


Perhaps the five nickel-plated copper heat pipes extending upward from the GPU may attract attention:


Otherwise, the reference GeForce GTX 480 does not stand out in any way. The printed circuit board is 267 mm long, which distinguishes the new product from its competitor, the ATI Radeon HD 5870, which is 282 mm long and does not fit into every system case.

The video card is equipped with a PCI-E 2.0 interface, two Dual Link DVI-I outputs and one HDMI connector, next to a grille through which some of the air heated by the GPU is exhausted outside the case:


At the other end of the board there is an opening which, contrary to what one might assume, does not admit airflow to the turbine. On top of the video card are the six- and eight-pin power connectors, as well as two MIO connectors for running two GeForce GTX 480 cards in SLI mode or three in 3-Way SLI:


Note that next to these connectors there is another grille through which some of the hot air exits the cooling system, thus remaining inside the system case. Considering the very high power consumption of the GeForce GTX 480 (a stated 250 W) and the resulting heat output, this is an unpleasant fact. But apparently the engineers who designed this cooling system had no other choice.

The plastic casing of the video card cooling system is held on by latches, which can be easily unfastened:


Using alternative coolers on the GeForce GTX 480 together with the standard plate that cools the power circuitry is impossible: the mounting holes near the graphics processor contain bushings on which the base of a cooler would rest, leaving a 3-4 mm gap between the cooler base and the GPU heat spreader.

The heatsink and the plate that cools the components of the graphics card's printed circuit board are easy to unscrew, after which you can examine the board in full:


All of the video card's memory chips are located on the front side of the printed circuit board. The power circuitry of the GeForce GTX 480 consists of a six-phase GPU supply based on the CHL8266 controller and a two-phase supply for the memory chips:


The GPU die, packing an incredible 3.2 billion transistors, is covered with a marked heat spreader:


Judging by the markings, the GPU belongs to the third revision (GF100-375-A3) and was produced in the 4th week of 2010. The GPU contains 480 universal shader processors, 60 texture units and 48 raster operation units (ROPs). The nominal frequency of the GPU's geometry domain is 700 MHz, and the shader domain runs at twice that: 1401 MHz. In 2D mode the GPU frequencies drop to 51/101 MHz. The other characteristics can be seen above in the specifications table.
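From these figures one can estimate the chip's peak single-precision throughput. The sketch below assumes each of the 480 shader processors retires one fused multiply-add (two FLOPs) per shader clock, which is the usual way peak FP32 numbers are quoted, not a figure from the article itself:

```python
def peak_fp32_gflops(shader_cores: int, shader_mhz: float,
                     flops_per_clock: int = 2) -> float:
    """Peak FP32 throughput in GFLOPS, assuming one FMA
    (2 FLOPs) per core per shader clock."""
    return shader_cores * shader_mhz * flops_per_clock / 1000

# GTX 480: 480 shader processors at the 1401 MHz shader clock
print(peak_fp32_gflops(480, 1401))  # 1344.96, i.e. roughly 1.34 TFLOPS
```

This roughly doubled shader throughput relative to the previous generation is exactly what the geometry shader and cloth simulation results in the first part of the article reflect.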

The reference NVIDIA GeForce GTX 480 is equipped with 12 GDDR5 video memory chips with a total capacity of 1.5 GB, located on the front side of the printed circuit board. The chips were released by Samsung and are labeled K4G10325FE-HC04:


According to the memory specifications, its nominal access time is 0.4 ns and its theoretical effective frequency is 5000 MHz. Despite this, the GeForce GTX 480's memory frequency is only 3696 MHz, which gives hope for successful overclocking. To reduce heat generation and save energy, the effective memory frequency drops to 270 MHz when the card switches to 2D mode. The memory bus is 384 bits wide, which yields an impressive bandwidth of 177.4 GB/s.
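The quoted 177.4 GB/s follows directly from the bus width and effective frequency; a quick sketch of the arithmetic:

```python
def memory_bandwidth_gb_s(bus_width_bits: int, effective_mhz: float) -> float:
    """Peak memory bandwidth: bytes transferred per effective clock
    times the effective (data-rate) frequency."""
    return bus_width_bits / 8 * effective_mhz / 1000

# GTX 480 as shipped: 384-bit bus, 3696 MHz effective GDDR5
print(memory_bandwidth_gb_s(384, 3696))  # 177.408, matching the 177.4 GB/s above
# At the chips' rated 5000 MHz the same bus would deliver 240 GB/s
print(memory_bandwidth_gb_s(384, 5000))  # 240.0
```

The second figure shows how much headroom the 0.4 ns Samsung chips leave on the table, which is why memory overclocking looks promising here.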

The new version of the GPU-Z utility reports the characteristics of the GeForce GTX 480 almost completely:


Let's move on to studying the video card cooling system and checking its efficiency. The key element of the stock GeForce GTX 480 cooler is the GPU heatsink:


It consists of five copper heat pipes 6 mm in diameter that form part of the base (direct-contact technology). The pipes pass through aluminum fins approximately 0.35 mm thick with an inter-fin spacing of just over 1.5 mm; the spacing between the pipes at the base is also 1.5 mm. It should be noted that the heatsink area of the cooler is very modest. The entire heatsink structure is nickel-plated.

Between the GPU and the HDT base of the cooler there is a thick gray thermal interface, applied in excess:


For coolers with direct-contact technology, the quantity and quality of the thermal interface matter more for achieving maximum efficiency than they do for coolers with a classic base. Looking ahead, we note that removing the standard thermal paste and replacing it with the thinnest possible layer of Arctic Cooling MX-3 reduced the peak GPU temperature by 3 °C; in 2D mode the temperature did not change.

The second component of the GeForce GTX 480 cooling system is a metal plate with a turbine installed on it.


The plate contacts the video memory chips and the power elements of the printed circuit board through thermal pads. The turbine's rotation speed (its peak power draw is 21 W, by the way) is adjusted automatically by the video card depending on temperature. An interesting quirk is that the speed ramps up smoothly but drops very sharply once the load is removed. After the noise the GeForce GTX 480 cooler generates at near-maximum speeds, this sharp drop gives the impression that the turbine has switched off, although in fact it has not. In 2D mode, when the card's frequencies are reduced significantly, the turbine runs at 44-46% of its power. We will cover its noise level in one of the following sections of today's material; for now, let's check how effective the stock GeForce GTX 480 cooler is.

To load and warm up the video card we used the very resource-intensive Firefly Forest test from the semi-synthetic 3DMark 2006 package at a resolution of 2560x1600 with 16x anisotropic filtering. The GPU temperature and turbine power (in %) were monitored with MSI Afterburner version 1.5.1, which does not yet fully support the GeForce GTX 480. The room temperature during testing was 25 °C. Testing was carried out in a closed system case, whose configuration is listed in the testing methodology section, and before the card was disassembled, that is, with the standard thermal interface.

So, let's look at the temperatures of the GeForce GTX 480 in automatic turbine mode and at maximum power:


Automatic adjustment of maximum speed


Obviously, the video card turned out to be very hot. Even under a load as light as a 3DMark 2006 test, the GPU temperature quickly reached 95 °C, but then, thanks to an increase in turbine speed to 70-78% (~3600 rpm), it dropped to 91-92 °C and did not change for the rest of the test. Manually setting the turbine to maximum power (~4780 rpm) keeps the GPU temperature below 68 °C. The cooler's efficiency depends very strongly on turbine speed, which above all points to an insufficient heatsink area.

We also checked the efficiency of the reference GeForce GTX 480 cooler with FurMark version 1.8.0 (with a renamed exe), launched in full-screen mode at 2560x1600 with 16x anisotropic filtering enabled in the GeForce drivers. In automatic mode we saw the same picture as in 3DMark 2006, the only difference being that the peak temperature first reached 98 °C and then, after the turbine speed automatically rose to 4150 rpm, dropped to the same 91-92 °C. At the maximum turbine speed the following results were obtained:


As a result, the GPU temperature reached 86 °C. As you can see, the new video card is very hot, and its cooling system is noisy in 3D mode. However, potential owners of the GeForce GTX 480 should not be too upset: top products from NVIDIA and ATI have never been known for low temperatures or noise levels. Moreover, alternative coolers will soon appear that beat the stock cooler by up to 30 °C while running incomparably quieter (remember the Arctic Cooling Accelero Xtreme GTX 280 or Thermalright's products). The real question is different: how justified is it to buy a 500-dollar product that requires replacing the stock cooling system and, most likely, voiding the warranty? Another option is to wait for GeForce GTX 480 cards with alternative coolers to appear.

To check the overclocking potential of the GeForce GTX 480, we used the EVGA Precision v1.9.2 utility:



It is clear that with such a temperature regime and a still "raw" graphics processor, no impressive overclocking results can be expected. And so it proved: the GPU frequency could only be raised by 45 MHz, to a final 745 MHz (+6.4%), without losing stability or picture quality. The 0.4 ns memory chips, however, were frankly pleasing, running stably at an effective 4780 MHz (+29.3%):
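The percentages above are simple frequency ratios; for the record:

```python
def overclock_gain_percent(stock_mhz: float, oc_mhz: float) -> float:
    """Relative frequency increase, in percent."""
    return (oc_mhz / stock_mhz - 1) * 100

# GTX 480 stock vs. achieved overclock, figures from this review
print(f"GPU:    +{overclock_gain_percent(700, 745):.1f}%")    # +6.4%
print(f"Memory: +{overclock_gain_percent(3696, 4780):.1f}%")  # +29.3%
```

Note that the overclocked memory lands just short of the chips' rated 5000 MHz, consistent with the 0.4 ns Samsung parts described earlier.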


We are not 100% sure that the memory test from the latest version of OCCT works correctly with the GeForce GTX 480, but it does exercise almost all of the card's 1.5 GB of memory:



Overclocking the video memory did not affect the temperatures of the video card's printed circuit board or graphics processor, which is quite logical.

To conclude the review of the new video card, let us remind you that the recommended price of the NVIDIA GeForce GTX 480 is $499, with worldwide sales due to start on April 12.

Test configuration, tools and testing methodology

All tests were carried out inside a closed system unit case, the configuration of which consisted of the following components:

Motherboard: ASUS P6T Deluxe (Intel X58 Express, LGA 1366, BIOS 2101);
Central processor: Intel Core i7-920, 2.67 GHz (Bloomfield, C0, 1.2 V, 4x256 KB L2, 8 MB L3);
Cooling system: Xigmatek Balder SD1283 (with two Thermalright TR-FDB fans at 1100 rpm);
Thermal interface: Arctic Cooling MX-2;
RAM: DDR3 3x2 GB Wintec AMPX 3AXH1600C8WS6GT (1600 MHz / 8-8-8-24 / 1.65 V);
System disk: SSD OCZ Agility EX (SATA-II, 60 GB, SLC, Indillinx, firmware v1.31);
Disk for games and tests: Western Digital VelociRaptor (SATA-II, 300 GB, 10000 rpm, 16 MB, NCQ) in a Scythe Quiet Drive 3.5" box;
Archive disk: Western Digital Caviar Green WD10EADS (SATA-II, 1000 GB, 5400 rpm, 32 MB, NCQ);
Case: Antec Twelve Hundred (front wall: three Noiseblocker NB-Multiframe S-Series MF12-S1 at 900 rpm; rear: two Scythe SlipStream 120 at 900 rpm; top: standard 200 mm fan at 400 rpm);
Control and monitoring panel: Zalman ZM-MFC2;
Power supply: Zalman ZM1000-HP 1000 W, 140 mm fan;
Monitor: 30" Samsung 305T Plus.

In order to reduce the processor dependence of the video cards in some modes of individual games included in the testing, the 45-nm quad-core processor was overclocked to 4.0 GHz using a multiplier of 21 with the Load-Line Calibration function activated, the voltage being raised to 1.3725 V in the motherboard BIOS:



The RAM operated with timings of 7-7-7-14-1T at a voltage of 1.64 V. All other motherboard BIOS parameters related to overclocking the processor or memory were left unchanged (in the "Auto" position).

For comparison with the NVIDIA GeForce GTX 480, we included a Leadtek WinFast GTX 285 and an XFX GeForce GTX 295 2x896 MB:




Among the video cards with ATI GPUs, the testing included a Radeon HD 5870 1 GB and a dual-processor Radeon HD 5970 2x1 GB:




Now let's move on to the software and tools. Testing, which started on March 23, 2010, was conducted under the Microsoft Windows 7 Ultimate x64 operating system with all critical updates as of that date and with the following drivers:

Intel Chipset Drivers 9.1.1.1025 WHQL for the motherboard chipset;
DirectX End-User Runtimes libraries, February 2010 release;
ATI Catalyst 10.3 video card drivers for ATI GPUs;
NVIDIA video card drivers: GeForce/ION Driver 197.17 beta for the GeForce GTX 480 and GeForce/ION Driver 197.25 beta for the other NVIDIA cards;
physics acceleration drivers: NVIDIA PhysX System Software 9.10.0129.

Video cards were tested in games at two resolutions: 1920x1080 and 2560x1600. In our opinion, testing such powerful video cards at lower resolutions has no practical value and would only increase the volume of testing while letting the platform limit the cards' performance.

Two graphics quality modes were used: "High Quality + AF16x" (maximum texture quality in the drivers with 16x anisotropic filtering enabled) and "High Quality + AF16x + AA 4(8)x" (16x anisotropic filtering plus 4x full-screen anti-aliasing (MSAA), or 8x if the average frame rate remained high enough for comfortable play). Anisotropic filtering and full-screen anti-aliasing were enabled directly in the game settings or in their configuration files; if a game lacked these settings, they were changed in the Catalyst or GeForce driver control panel. Vertical sync was forcibly disabled in the driver control panels.

All games were updated with the latest patches at the start of preparing this article. Ultimately, the test list consisted of two semi-synthetic packages, one tech demo and 21 games, including the latest releases. Here is the test list with a brief description of the methodology (the games are arranged in order of release):

3DMark 2006 (DirectX 9/10) - build 1.2.0, default settings and 2560x1600 with AF16x and AA8x;
3DMark Vantage (DirectX 10) - version 1.0.2.1, “Performance” settings profiles (only basic tests were carried out);
Unigine Heaven Demo (DirectX 11) - version 2.0, maximum quality settings, “extreme” tessellation;
World In Conflict (DirectX 10) - version 1.0.1.0(b34), graphics quality profile “Very High”, “Water Reflection Clouds” - On, test built into the game;
Crysis (DirectX 10) - version 1.2.1, “Very High” settings profile, double cycle of “Assault Harbor” demo recording from Crysis Benchmark Tool version 1.0.0.5;
Unreal Tournament 3 (DirectX 9) - version 2.1, maximum graphics settings in the game (level 5), Motion Blur and Hardware Physics activated, the FlyBy scene at the “vCTF-Corruption” level was tested (two consecutive cycles), using HardwareOC UT3 Bench v1.5.0.0;
Lost Planet Extreme Condition: Colonies Edition (DirectX 10) - version 1.0, graphics level “Maximum quality”, HDR Rendering DX10, test built into the game, results are shown for the first scene (ARENA1);
Far Cry 2 (DirectX 10) - version 1.03, settings profile “Ultra High”, double test cycle “Ranch Small” from Far Cry 2 Benchmark Tool v1.0.0.1;
Call of Duty 5: World at War (DirectX 9) - game version 1.6, graphics and texture settings set to “Extra” level, “Breach” demo recording at the level of the same name;
BattleForge: Lost Souls (DirectX 11) - version 1.2 (03/19/2010), maximum graphics quality settings, shadows enabled, SSAO technology enabled, double run of the test built into the game;
Stormrise (DirectX 10.1) - version 1.0.0.0, maximum quality settings for effects and shadows, “Ambient occlusion” disabled, double run of the demo scene on the mission “$mn_sp05”;
Tom Clancy's H.A.W.X. (DirectX 10) - version 1.03, maximum graphics quality settings, HDR, DOF and Ambient occlusion techniques activated, built-in test (double run);
Call of Juarez: Bound in Blood (DirectX 10.1) - version 1.0.1.0, maximum graphics quality settings, Shadow map size = 1024, 110-second demo recording at the very beginning of the “Miners Massacre” level;
Wolfenstein MP (OpenGL 2.0) - version 1.3, maximum graphics settings, own demo recording “d2” at the “Manor” level;
Batman: Arkham Asylum (Direct3D 9) - version 1.1, maximum detail, maximum “physics”, double run of the test built into the game;
Resident Evil 5 (DirectX 10.1) - version 1.0, the variable benchmark was tested with maximum graphics settings without motion blur; the result was taken as the average value of the third scene of the test, as the most resource-intensive;
S.T.A.L.K.E.R.: Call of Pripyat (DirectX 11) - version 1.6.02, settings profile “Improved dynamic lighting DX11” with additional manual setting of all parameters to the maximum, tested our own demo recording “cop03” at the “Backwater” level;
Borderlands (DirectX 9) - game version 1.2.1, testing “timedemo1_p” with maximum quality settings;
Left 4 Dead 2 (DirectX 9) - game version 2.0.1.1, maximum quality, demo recording “d333” was tested (two passes) on the “Swamp Fever” map, stage “Swamp”;
Colin McRae: DiRT 2 (DirectX 9/11) - game version 1.1, built-in test consisting of two laps along the London circuit with maximum graphics quality settings;
Wings Of Prey (DirectX 9) - game version 1.0.2.1, “Ultra High” texture quality and other maximum graphics quality settings, tested a two-minute demo recording at the “Escort” level from the “Battle of Britain” campaign;
Warhammer 40,000: Dawn of War II - Chaos Rising (DirectX 10.1) - version 2.1.0.4679, graphics settings in the game menu are set to the “Ultra” level, three or four runs of the test built into the game;
Metro 2033 (DirectX 10/11) - version 1.0, maximum quality settings, a 160-second scripted scene at the “Chaser” level was used for the test, double sequential pass;
Just Cause 2 (DirectX 11) - version 1.0.0.1, maximum quality settings, Background Blur and GPU Water Simulation techniques disabled, double sequential pass of the Dark Tower demo.

A more detailed description of the video card testing methods and of the graphics settings in some of the listed games can be found in a specially created thread of our conference, where you can also participate in the discussion and improvement of these techniques.

Where a game made it possible to record the minimum number of frames per second, this was also reflected in the diagrams. Each test was carried out twice, and the better of the two values obtained was taken as the final result, but only if the difference between them did not exceed 1%. If the difference between test runs exceeded 1%, testing was repeated at least once more to obtain a correct result.
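As a sketch, the run-selection rule described above could look like the following helper (an illustrative reconstruction with names of our own choosing, not the actual test harness):

```python
def final_result(run_benchmark, max_spread=0.01, max_retries=3):
    """Take the better of two runs, but only if they agree within 1%;
    otherwise keep re-running until two consecutive results converge."""
    results = [run_benchmark(), run_benchmark()]
    for _ in range(max_retries):
        best, worst = max(results[-2:]), min(results[-2:])
        if (best - worst) / best <= max_spread:
            return best
        results.append(run_benchmark())  # runs disagreed: test again
    return max(results)  # fall back to the best value observed
```

With a stable benchmark the first two runs already agree and the better value is returned immediately; a noisy run simply triggers extra passes.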

Video card performance test results and their analysis

In the diagrams, the results of the ATI Radeon HD 5970 and Radeon HD 5870 video cards are highlighted in red, the hero of today's article, the GeForce GTX 480, is in NVIDIA's traditional green, and the GeForce GTX 295 and GTX 285 are marked in blue-green. Overclocked video cards were not tested in today's article, as that deserves an article of its own.

Let's look at the test results and analyze them.

3DMark 2006



In the first semi-synthetic test, the GeForce GTX 480 is only slightly ahead of its main competitor, the Radeon HD 5870, but both video cards are faster than the dual-processor GeForce GTX 295. The Radeon HD 5970 in high-quality mode and a resolution of 2560x1600 is far ahead of all other test participants, including GeForce GTX 480.

3DMark Vantage



The situation is slightly different in 3DMark Vantage, but only in the “Performance” settings profile, where the GeForce GTX 480 is almost 2000 3D “parrots” ahead of the Radeon HD 5870. As the load increases, the video cards demonstrate the same performance.

Unigine Heaven Demo 2.0

Since the GeForce GTX 285 and GTX 295 video cards do not support DirectX 11, for a correct comparison with the GeForce GTX 480, all these cards were previously tested in DirectX 10 mode with tessellation disabled:



The GeForce GTX 480 results in the Heaven demo are like a balm for the soul of potential buyers of this video card. Indeed, the new product demonstrates excellent performance and, in the most difficult mode, significantly outperforms two competitors from its own camp.

Now let's check how good the GeForce GTX 480 is in DirectX 11 mode with tessellation activated (the results of video cards in DirectX 10 are in italics):


As you can see, the GeForce GTX 480 easily competes with the dual-processor Radeon HD 5970 and leaves its direct competitor, the Radeon HD 5870, far behind. If tessellation comes to prevail in games in the near future, the GeForce GTX 480 looks a much more attractive purchase than the ATI card. However, this is more like reading tea leaves, so let's leave that to the fanatics and move on to the gaming tests.

World in Conflict


In the game World in Conflict, the GeForce GTX 480 turns out to be faster than the Radeon HD 5870 and even successfully competes with the dual-processor GeForce GTX 295, surpassing the latter in the minimum number of frames per second. The performance of the dual-chip Radeon HD 5970 video card is beyond the reach of all other participants in today's testing, but, again, this video card does not have superiority in the minimum number of frames per second, which is so important for comfortable gaming.

Crysis


Many expected that with the release of “Fermi” Crysis would finally be conquered by a single-processor video card, but this turned out to be far from the case. Moreover, the GeForce GTX 480 demonstrates only a very slight advantage over the Radeon HD 5870, which will also surprise many who were awaiting this new product. Still, in the most difficult quality mode the GTX 480 provides a higher minimum number of frames per second and, in addition, easily deals with the GeForce GTX 295.

Unreal Tournament 3


Unreal Tournament 3 continues to paint “oil paintings”, ranking video cards by performance in almost the same way as they are positioned by cost. The GeForce GTX 480 is faster than the Radeon HD 5870, but slower than both dual-processor video cards.

Lost Planet Extreme Condition: Colonies Edition


The maximum that the GeForce GTX 480 can do in the game Lost Planet is to outpace the Radeon HD 5870 in three out of four test modes and compete on an equal footing with the GeForce GTX 295. The dual-processor Radeon HD 5970 video card is still the fastest.

Far Cry 2


It is impossible not to note the rather confident performance of the new video card in the game Far Cry 2, where the GeForce GTX 480 not only manages to leave behind the Radeon HD 5870 everywhere, but also competes on an equal footing with the “aircraft carrier” Radeon HD 5970 in high-quality graphics mode using full-screen anti-aliasing.

Call of Duty 5: World at War


To play Call of Duty 5: World at War, which is not at all demanding by modern standards, any of the video cards tested today is sufficient, even at 2560x1600 with MSAA8x. However, comparing the GeForce GTX 480 and Radeon HD 5870 with each other, we can speak of equal performance in this game.

BattleForge: Lost Souls


Despite the fact that BattleForge: Lost Souls is a pro-ATI game, the video card on the NVIDIA GF100 GPU loses to the Radeon HD 5870 only in the mode without graphics quality enhancement techniques. With anti-aliasing turned on, the GeForce GTX 480 manages to get ahead a little.

Stormrise


But in another game released under the close tutelage of ATI, GeForce GTX video cards don’t stand a chance, so they compete only with each other.

Tom Clancy's H.A.W.X.


In the flight simulator Tom Clancy's H.A.W.X., on the contrary, the GeForce GTX 480 crushes not only the Radeon HD 5870, but also the dual-processor flagship Radeon HD 5970. Note that in high-quality graphics mode, the advantage of the GTX 480 over the GTX 285 exceeds 2 times.

Call of Juarez: Bound in Blood



Call of Juarez: Bound in Blood, like BattleForge and Stormrise, is better suited to video cards based on ATI GPUs. However, this did not prevent the GeForce GTX 480 from finishing ahead of the Radeon HD 5870. It should be noted that Call of Juarez cannot be classified as a resource-intensive game, since even at maximum resolution a single GeForce GTX 285 or some Radeon HD 4870 or HD 5770 is quite enough.

Wolfenstein MP


The victory of ATI video cards in Wolfenstein is quite surprising, since NVIDIA has always worked better with OpenGL and, as a rule, has been the leader in such games. However, the GeForce GTX 480 is no match for the Radeon HD 5870 in this game.

Batman: Arkham Asylum


But NVIDIA has something to answer with in Batman: Arkham Asylum, namely hardware acceleration of PhysX physics effects. For video cards based on ATI GPUs to perform successfully in this game, you must either install a GeForce as a second video card or disable these effects altogether. True, the game then loses a lot graphically, so the first option is much more interesting, although it is more expensive both financially and in the time needed to get the Catalyst and GeForce drivers working together. In addition, it should be noted that the GeForce GTX 480 demonstrates simply brilliant performance in Batman: Arkham Asylum, surpassing even the dual-processor GeForce GTX 295.

Resident Evil 5


The GeForce GTX 480 leaves no chance for its competitor in the game Resident Evil 5. Moreover, as we see, the video card based on the new GF100 graphics processor is capable of successfully resisting both dual-processor giants in the form of the GeForce GTX 295 and Radeon HD 5970.

S.T.A.L.K.E.R.: Call of Pripyat

Once again, we need to remind you that the GeForce GTX 285 and GTX 295 video cards do not support DirectX 11, so before looking at the main diagram, for a correct comparison with the GeForce GTX 480, the latter was previously tested in DirectX 10 mode:



It cannot be said that the performance of the GeForce GTX 480 in S.T.A.L.K.E.R.: Call of Pripyat is at a high level. There is neither a significant advantage over the GTX 285 nor any lead over the GTX 295. And in DirectX 11, the new video card has nothing to brag about either:




Borderlands


In the game Borderlands, the battle between the Radeon HD 5870 and the GeForce GTX 480 is carried out with varying degrees of success. In modes without anti-aliasing, a video card based on an NVIDIA GPU is faster, and when MSAA8x is activated, Radeon comes out ahead, but only at a resolution of 1920x1080.

Left 4 Dead 2


If you like to mash endless zombies in the game Left 4 Dead 2, then it is better to opt for a video card with an ATI GPU, as they demonstrate higher performance than NVIDIA cards. The game is not resource-intensive, so any of the video cards tested today will be quite sufficient.

Colin McRae: DiRT 2

DirectX 11 is not supported by GeForce GTX 2xx video cards, so first we will compare the GeForce GTX 480 with them in fair DirectX 9 mode:



Next, let's look at the results in DirectX 11, where Radeon video cards are already participating (DX9 results for GTX 2xx are also present, but are in italics):


Although not by much, the GeForce GTX 480 is still faster than the Radeon HD 5870 in this game.

Wings Of Prey


CrossFireX technology does not work in the still new flight simulator Wings Of Prey, so the Radeon HD 5970 turned out to be the slowest video card in testing. At the same time, SLI technology works great, making the GeForce GTX 295 beyond the reach of other video cards. Well, the GeForce GTX 480 is slightly ahead of the Radeon HD 5870 in terms of the sum of tests in all modes.

Warhammer 40,000: Dawn of War II - Chaos Rising


Warhammer 40,000: Dawn of War II is one of the most processor-dependent games, so the results can only be compared in the maximum quality mode at 2560x1600 resolution, in which the GeForce GTX 480 is slightly faster than the Radeon HD 5870 in average frames per second, but slower in the minimum. In fact, by this indicator the GeForce GTX 480 turned out to be the worst video card in Warhammer 40,000: Dawn of War II, which more likely indicates insufficient optimization of the GeForce drivers, so it may well be fixed soon.

Metro 2033

Testing of video cards in the new game Metro 2033 was carried out at the very beginning of the “Chaser” level, in a scripted scene where the hero and two assistants ride a trolley and cannot move, which makes it possible to obtain highly repeatable results. Testing was conducted using FRAPS for 160 seconds after loading the level. Since at the maximum DirectX 11 graphics quality settings Metro 2033 could be tested on only one of the five video cards participating in the article, the DirectX 10 renderer was used for the tests:



The GeForce GTX 480 is slightly faster than the Radeon HD 5870, and both of these video cards are significantly inferior to the dual-processor Radeon HD 5970. Let's add that while CrossFireX did not work in Wings Of Prey, in Metro 2033 it was SLI that performed poorly, and in the next game, Just Cause 2, SLI turned out to be completely inoperative.

After all the tests were completed, the performance of the GeForce GTX 480 was additionally tested in Metro 2033 in DirectX 11 mode as well. Unfortunately, by that time we had already returned the Radeon HD 5870 and HD 5970, so a 1 GB ATI Radeon HD 5850, overclocked from its nominal 725/4000 MHz, acted as the competitor. For comparison, we used the same “Chaser” demo scene with maximum quality settings. The results were very interesting:



At a resolution of 1920x1080, the overclocked Radeon HD 5850 is not far behind the GeForce GTX 480, but at 2560x1600, a real “slideshow” begins on the Radeon, while the GeForce GTX 480 shows three times higher results. But this still does not allow the new video card to provide the player with a comfortable number of frames per second in this game. Moreover, even when running the test scene, I noticed some “blurring” of the picture on the GeForce, and decided to check it with screenshots (albeit at a different level, since you can’t take identical screenshots in “Chaser”). You can evaluate the difference in picture quality yourself (resolution 1920x1080):

ATI Radeon HD 5850 | NVIDIA GeForce GTX 480







It is easy to notice that the image quality is clearly higher on the Radeon video card: all textures are drawn very clearly, without smudging or blurring. Let me remind you that the graphics quality settings in the game were identical on both video cards: DirectX 11, “Very High”, AF16x, AAA, with “Advanced DOF” and “Tessellation” enabled. In addition, in the Catalyst and GeForce/ION drivers, texture filtering was set to the maximum-quality “High Quality” mode (the default is “Quality”). So could it be precisely this reduction in quality that gives the GeForce GTX 480 its higher performance in Metro 2033? The developers of Metro 2033 responded to this question promptly, in less than a day; here is their answer:

“No, the noticeable difference in display has nothing to do with the performance of the video cards. Indeed, in some anti-aliasing modes, NVIDIA and ATI video cards render images differently. We are now trying to understand how quickly this can be fixed. Once again, this has nothing to do with performance.”

Oles Shishkovtsov, 4A Games, Chief Technical Officer

Well, wait and see.

Just Cause 2


In the new game Just Cause 2, the GeForce GTX 480 trails the Radeon HD 5870 by about the same margin as it won by in Metro 2033.

Next, let's move on to analyzing the results in summary charts built only from game tests.

Performance comparison summary charts

In the first pair of summary diagrams, we invite you to evaluate the performance increase of the GeForce GTX 480 over the fastest single-processor video card of the previous NVIDIA generation, the GeForce GTX 285. Its performance in the diagrams is taken as 100%, and the GeForce GTX 480 results are shown as a percentage increase (in S.T.A.L.K.E.R.: Call of Pripyat and Colin McRae: DiRT 2, the results were compared in the DX10 and DX9 renderers, respectively):
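The construction of these diagrams reduces to a simple relative-gain formula; a small sketch with made-up frame rates (not our measured data):

```python
def percent_gain(fps_new, fps_base):
    """Relative performance gain over the baseline card (baseline = 100%)."""
    return (fps_new / fps_base - 1.0) * 100.0

# Hypothetical example: 60 fps on a GTX 480 against 45 fps on a GTX 285
# works out to a gain of one third, i.e. about 33%.
gain = percent_gain(60.0, 45.0)
```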



Well, here, in general, everything is clear and understandable. In modes without anti-aliasing, the GeForce GTX 480 is, on average, faster in gaming tests than the top-end video card of the previous generation GeForce GTX 285 by 30-37% depending on the resolution, and in modes with anti-aliasing by 46-48%. It should be noted here that with the release of the 5th series of Radeon video cards, the performance increase compared to the previous generation ATI video card was higher. The most pronounced performance increase was recorded in games such as Tom Clancy's H.A.W.X., Far Cry 2 and Metro 2033. And the least significant difference in the performance of these two video cards is in non-resource-intensive games such as Call of Duty 5: World at War or Wolfenstein.

The following diagrams are dedicated to the battle between the dual-processor GeForce GTX 295 and the new single-processor flagship GeForce GTX 480. The results of the GeForce GTX 295 are taken as the zero axis (in the games S.T.A.L.K.E.R.: Call of Pripyat and Colin McRae: DiRT 2, the results in DX10 and DX9 renders were again compared, respectively):



The GeForce GTX 480 is fighting against the previous generation dual-processor video card GeForce GTX 295 with varying degrees of success. In some games, a dual-chip video card is faster, and in others, a GeForce GTX 480. Here we also cannot help but recall the previous confrontation between the Radeon HD 5870 and the Radeon HD 4870 X2, when a video card on the new graphics processor was almost always faster than the dual-chip flagship of the previous generation. However, in the newest games Metro 2033 and Just Cause 2, the superiority of the GeForce GTX 480 is quite large due to the still non-functional SLI technology in these new products.

Now, based on the sum of gaming tests, let's evaluate the confrontation between the new GeForce GTX 480 and Radeon HD 5870. The results are presented in the following pair of diagrams, where the performance of the Radeon HD 5870 is taken as the zero axis, and the results of the GeForce GTX 480 are reflected in the form of deviations from it:



And here the scales tip one way or the other, with significant swings in the percentage gain or loss. In twelve games - World in Conflict, Crysis, Unreal Tournament 3, Lost Planet: Colonies, Far Cry 2, Tom Clancy's H.A.W.X., Call of Juarez: Bound in Blood (!), Batman: Arkham Asylum (without “doping” for the Radeon), Resident Evil 5, Borderlands, Colin McRae: DiRT 2 and Metro 2033 - the GeForce GTX 480 is ahead, while in five games - BattleForge, Stormrise, S.T.A.L.K.E.R.: Call of Pripyat, Left 4 Dead 2 and Just Cause 2 - the Radeon HD 5870 is faster. In the remaining four games there is either parity or a tiny advantage for one of the video cards. It can definitely be said that the overwhelming superiority of the GeForce GTX 480 over the Radeon HD 5870 that everyone was waiting for does not exist today.

Finally, the last couple of diagrams, according to which you can look at the gap between the GeForce GTX 480 and the fastest video card of our time - the Radeon HD 5970:



The non-working CrossFireX technology in Wings Of Prey and the Radeon's lack of PhysX support in Batman: Arkham Asylum allow the GeForce GTX 480 to outperform the dual-processor Radeon HD 5970 there. In Tom Clancy's H.A.W.X. and Resident Evil 5, the speed of the video cards is approximately the same, and in all other cases the Radeon HD 5970 is faster than the GeForce GTX 480.

Power consumption, heating of video cards and noise level

The power consumption of systems with different video cards was measured using a power supply specially modified for this purpose. The maximum load was created by running FurMark 1.8.0 in stability test mode at a resolution of 2560x1600 (with AF16x), as well as FurMark together with Linpack x64 (LinX 0.6.4, 4096 MB, 7 threads). Since these two programs generate the maximum load on the video subsystem and the central processor, respectively, we can find out the peak power consumption of the entire system and determine the power supply required for it (taking efficiency into account).
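For reference, a wattmeter at the wall measures the draw before PSU losses; a rough sketch of how the wall figure, the DC load, and a sensibly rated PSU relate (the efficiency and headroom figures are illustrative assumptions, not our measurements):

```python
def dc_load(wall_watts, psu_efficiency):
    """Actual DC power delivered to the components,
    given the reading at the wall and the PSU efficiency."""
    return wall_watts * psu_efficiency

def required_psu(wall_watts, psu_efficiency, headroom=1.2):
    """Minimum PSU rating with ~20% headroom over the peak DC load."""
    return dc_load(wall_watts, psu_efficiency) * headroom

# e.g. a hypothetical 550 W at the wall with an 85%-efficient PSU
# is ~467.5 W of DC load, suggesting a PSU rated around 560 W or more.
```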

The results obtained are shown in the diagram:






It is easy to see that a system with a GeForce GTX 480 consumes about 130 watts more at peak load than a system with a Radeon HD 5870, both when loaded with FurMark alone and with FurMark combined with Linpack x64. Moreover, a system with a GeForce GTX 480 manages to consume more than a system with a dual-processor Radeon HD 5970! This is a truly “gluttonous” video card. And yet the GeForce GTX 295 remains the leader in power consumption. In 2D mode, the difference in consumption between systems with the GeForce GTX 480 and Radeon HD 5870 is 26 watts, in the Radeon's favor. Speaking of benefits, by the way.

Let's say you are a “gaming maniac” who plays 8 hours a day for an entire month. And let's imagine that during this time the load on the video subsystem is always at its maximum, that is, the difference in power consumption between systems with the NVIDIA GeForce GTX 480 and the ATI Radeon HD 5870 is a constant 130 watts. Over a month, the system with the new NVIDIA video card will then “gobble up” 32.2 kWh more electricity than the system with the competing ATI card! I could not find the current average cost of a kilowatt-hour in the Russian Federation (it is approved by the constituent entities of the Russian Federation), so let's take Moscow as an example, where a kilowatt-hour on a single-tariff meter costs 2 rubles 42 kopecks. Thus, owners of systems with a GeForce GTX 480, compared to owners of systems with a Radeon HD 5870, end up with an “overspend” of as much as 78 rubles per month! The conditions under which this amount was obtained are, as you understand, unrealistic; in reality the figure should be at least four times smaller. But even if the full 78 rubles a month turned out to be true, ask yourself: is this really money that can serve as an argument when comparing video cards costing more than 18 thousand rubles?
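The arithmetic behind these figures is easy to check (the wattage, hours and tariff are those assumed in the scenario above, for a 31-day month):

```python
# The "gaming maniac" scenario: a constant 130 W consumption difference,
# 8 hours a day for a 31-day month, at Moscow's single-tariff rate.
extra_watts = 130
hours_per_day = 8
days = 31
rub_per_kwh = 2.42

extra_kwh = extra_watts * hours_per_day * days / 1000  # 32.24 kWh
extra_cost = extra_kwh * rub_per_kwh                   # ~78 rubles per month
```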

Now let's compare the temperature conditions of all tested video cards with automatic turbine operation. The load was created by 15 cycles of the Firefly Forest test from the semi-synthetic 3DMark 2006 package at a resolution of 2560x1600 with anisotropic filtering at 16x level. Tests were carried out in a closed system unit case at room temperature 25 °C. Let's look at the results:



There is a significant superiority of the Radeon HD 5870 over the GeForce GTX 480 in terms of GPU temperature both in idle mode and under load.

And the last thing left to do before moving on to the conclusions is to evaluate the noise level of the video cards. The noise level of the standard cooling systems of the reference video cards was measured using a CENTER-321 electronic sound level meter after one in the morning in a completely closed room of about 20 sq. m with double-glazed windows. The noise level of each cooler was measured outside the system unit case, when the only source of noise in the room was the cooler itself and its turbine. The sound level meter, fixed on a tripod, was always located strictly at one point at a distance of exactly 150 mm from the cooler fan rotor. The motherboard, with the video card and its cooling system installed, was placed at the very corner of the table on a polyurethane foam backing:


The lower measurement limit of the sound level meter is 29.8 dBA, and the subjectively comfortable (not to be confused with low) noise level of coolers when measured from such a distance is around 37 dBA. The rotation speed of the turbines of standard coolers was changed over the entire range of their operation using our controller by changing the supply voltage in steps of 0.5 V.

The obtained data on the noise level of the Radeon HD 5970 and HD 5870 turned out to differ within 0.1 dBA at each measurement point, so they are combined in the graph. It turned out to be impossible to measure the noise level of the GeForce GTX 295 on the stand, since this video card would have to be completely disassembled to get to the turbine connector. The measurement results are presented in the following graph (the dotted line shows the entire range of turbine speeds, the solid line shows the actual speed range when tested in 3DMark 2006, which we described just above):



The first thing to notice from the measurement results is that none of the standard cooling systems of the reference video cards is quiet. The second is that the noise level curve of the GeForce GTX 480 cooler passes under the curve of its competitor, the Radeon HD 5870. Could we conclude that the GeForce GTX 480 cooler is quieter than the standard Radeon HD 5870 cooler? Only if the two are compared at the same turbine rotation speeds, which in practice does not happen: in automatic operation the GeForce GTX 480 turbine varies from 2100 to 3600 rpm, while on the Radeon HD 5870 the speed range is only 1270-2040 rpm (see the temperature chart above). Due to the very high heat dissipation of the GF100 GPU, NVIDIA engineers had to raise the temperature limit in the video card BIOS and allow higher turbine speeds. The result is high noise at high temperatures. Alas, until new, less hot revisions of the graphics processor appear, the GeForce GTX 480 loses to the Radeon HD 5870 in this parameter as well.

Conclusion

So, what did we get six months after the release of the Radeon HD 5870? If we take all the tests together (which is most likely incorrect), the new GeForce GTX 480 is on average 5-15% faster than the Radeon HD 5870, depending on the test, quality mode and resolution. However, in individual tests there are both more impressive victories and defeats. Therefore, in our opinion, it is more correct to consider each game individually, which is what we did above in the section with the performance test results. By and large, we must admit that the GeForce GTX 480 has become the fastest single-processor video card.

On the other hand, this victory came at too great a price for NVIDIA. First of all, we do not mean high power consumption, excessive noise levels or unacceptable heat generation, but time. The time that has passed since the release of the Radeon HD 5870 is irretrievably lost. Today, the new top ATI/AMD video cards are not only sold en masse, but also already have six official versions of drivers, while the NVIDIA GeForce GTX 480 does not have a single (!) official one, and only one is in beta testing. Such a late release of a new video card could be justified only by a total and overwhelming superiority (+50% or more) in performance over its competitor, but NVIDIA has not been able to achieve this to date.

As for the high power consumption, at our prices per kilowatt-hour of electricity this argument for or against either product (ATI or NVIDIA) looks ridiculous, even funny. The problem of high noise and heat is solved by installing alternative cooling systems, which well-known brands have already begun to announce. NVIDIA also has some aces up its sleeve, such as PhysX and CUDA, as well as a projected superiority over its competitor in DirectX 11 games that use tessellation. It is unlikely that anyone would deny the possibility of further optimization of the GeForce drivers. Therefore, we assure you that we will not limit ourselves to this first acquaintance with the new video card, and will soon continue to study all its capabilities.

And the last thing none of us should forget: only healthy competition can help reduce prices for video cards and other high-tech products, so the undeniable leadership of any one manufacturer benefits only that manufacturer (or several of them, if they, say, collude), but certainly not you and me :-)

Thank you:
Russian representative office of NVIDIA and personally Irina Shekhovtsova,
Russian representative office of AMD and personally Kirill Kochetkov
for video cards provided for testing.

Other materials on this topic


Power consumption of video cards: spring 2010
Do we need PhysX? Testing EVGA GeForce GTX 275 CO-OP PhysX Edition
Metro 2033 and modern video cards

Just recently, the first reviews of the NVIDIA GeForce GTX 480 and GTX 470 series of video cards, based on testing official NVIDIA samples, were making headlines, but only now have these graphics accelerators begun to appear on store shelves. Of course, the question of whether the press samples match the retail ones remains open. This is especially reinforced by the manufacturer's decision to use a slightly cut-down version of the GF100 chip (a GPU based on the Fermi design) even in the flagship of the line, the GTX 480. But we will try to tell you about everything in order.

The Fermi architecture itself, used in the GTX 480 and GTX 470 video cards, was announced back in September 2009, and only six months later could users take advantage of its benefits. The declared cost of GF100-based video cards is $500 for the GTX 480 or $350 for the GTX 470, which is slightly higher than the single-chip flagships from AMD, although in our market these video cards will obviously be even more expensive. It is worth noting that the problems AMD has run into producing GPUs on TSMC's 40 nm process do not allow it to supply the market with the required number of high-performance products supporting DirectX 11. Considering that NVIDIA has left itself the option of disabling problematic parts of the GPU across the entire line (even the “top” chips do not use the full potential of the GF100), we can hope for a fuller supply of the market with video cards based on the GTX 470.

NVIDIA has defined the Fermi architecture as computational at its core, relegating the GPU's traditional role of accelerating 3D graphics in games to the background. The Fermi architecture is a consistent development of the Tesla line of computing cards used in performance-demanding systems. This is confirmed by support for error-correcting memory (ECC) and enhanced double-precision computing performance. The potential gains from running some technical tasks in parallel are enormous, and NVIDIA's investment in development software has given it a significant lead over AMD and Intel in this growing market.

NVIDIA Fermi (GF100)

The new flagship on the GF100 was planned to double the performance of a GT200-based graphics card such as the GTX 285. But theory does not always translate into practical results.

The GF100 chip itself has 512 CUDA cores (four Graphics Processing Clusters, each containing four Streaming Multiprocessors, each of which contains 32 CUDA cores). But only 480 CUDA cores were left enabled, which is 32 cores fewer than in the full GF100 design. This simplification was made by disabling one SM multiprocessor, apparently because fully functional chips could not be obtained in sufficient quantities.
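The core-count arithmetic above can be sketched as a quick check (a toy calculation based on the figures in this review, not vendor code):

```python
# GF100 shader core arithmetic, as described above.
GPCS = 4           # Graphics Processing Clusters
SMS_PER_GPC = 4    # Streaming Multiprocessors per cluster
CORES_PER_SM = 32  # CUDA cores per SM

full_chip = GPCS * SMS_PER_GPC * CORES_PER_SM
print(full_chip)   # 512 cores in a full GF100

# The GTX 480 ships with one SM disabled:
gtx480 = full_chip - CORES_PER_SM
print(gtx480)      # 480 cores
```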

In turn, each SM multiprocessor also contains its own texture units and a PolyMorph engine (fixed-function logic that accelerates geometry calculations). Consequently, the GTX 480 retains 60 of 64 texture units and 15 of 16 PolyMorph engines.

In the part of the GF100 pipeline that is independent of the GPC clusters, NVIDIA disabled nothing. All six ROP partitions remain. Each partition can output eight 32-bit integer pixels simultaneously, so we get 48 pixels per clock cycle. A full GF100 with all ROP partitions supports a 384-bit GDDR5 memory interface (that is, a 64-bit interface per partition). The GPU keeps exactly this configuration, and 256 MB of memory per partition gives us a total of 1.5 GB of GDDR5 memory (bandwidth is 177 GB/s at the 924 MHz memory clock).
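The quoted 177 GB/s figure can be verified from the bus width and memory clock (a sanity check using this review's numbers; GDDR5 transfers four bits per command-clock cycle per pin):

```python
# Memory bandwidth sanity check for the figures quoted above.
bus_width_bits = 384      # six 64-bit partitions
memory_clock_hz = 924e6   # real (command) clock
gddr5_multiplier = 4      # GDDR5 moves 4 bits per pin per command clock

effective_rate = memory_clock_hz * gddr5_multiplier  # 3696 MT/s
bandwidth_gbs = bus_width_bits / 8 * effective_rate / 1e9
print(round(bandwidth_gbs, 1))  # 177.4 GB/s
```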

All these cuts relative to the original chip are a consequence of NVIDIA's problems producing usable dies, but the need to bring new solutions to the Hi-End accelerator market forced the company to release at least cut-down versions of the Fermi-based GF100 graphics processor. Whatever the result, it exists and is worth testing and describing.

A production video card with a box design that is very characteristic of this manufacturer has arrived at our testing laboratory.

The packaging of the video card is designed in black and yellow colors. On the front side of the cardboard box the model of the video card, the amount of memory, its type and the bandwidth of the memory bus are indicated. There is also mention of support for proprietary NVIDIA PhysX technologies and the presence of an HDMI connector. In the upper right corner, the manufacturer draws attention to support for proprietary technologies: NVIDIA CUDA, NVIDIA PureVideo HD, NVIDIA SLI.

On the back of the box there is a small overview of the capabilities of this video card. The advantages of using technologies are described: NVIDIA 3D Vision Surround and PhysX.

Inside there is the video card itself and additional delivery components. Together with the graphics accelerator you can get the following:

  • Video card power adapter from two six-pin connectors to one eight-pin PCI Express;
  • Video card power adapter from two Molex connectors to one six-pin PCI Express;
  • Adapter from DVI to VGA;
  • Adapter from Mini-HDMI to HDMI;
  • User's manual;
  • Disk with software and drivers;
  • Demo disc describing all the new features of this video card.

I would like to note that the power adapters included in the package will clearly force the user to use a fairly powerful power supply with the appropriate connectors for connecting a video card. This may cause some problems when choosing a configuration. In general, the package should fully cover all the nuances of installing this video card in a modern system unit.

Printed circuit board

The video card itself is made on a dark PCB, the front side of which is covered by a cooling system with a dark plastic casing. Let us remind you that this video card supports the PCI Express 2.0 x16 bus, is compatible with DirectX 11 Shader Model 5.0 and OpenGL 3.2, and also supports NVIDIA technologies PureVideo HD Technology, NVIDIA 3D Vision Surround, NVIDIA PhysX Technology, NVIDIA CUDA Technology and NVIDIA SLI Technology.

The reverse side of the video card PCB looks much more modest. Here we can only note the GPU power system chip - the CHL8266 PWM controller using six phases. There are three transistors for each power phase (one in the upper arm and two in the lower arm). This approach allows for better heat removal from the elements of the power subsystem. The second uP6210AG chip is already well known to our readers from other video cards based on GPUs from NVIDIA. It provides two phases of power for the memory chips of this video card. Thus, in total we get a 6+2-phase power supply system for the video card.

Looking under the cooling system, we can immediately see that this video card is completely identical to the “reference” version. The card uses a 267 mm (10.5″) circuit board, about a centimeter shorter than accelerators based on the Radeon HD 5870, which helps it fit into almost any modern case.

For additional power (in addition to the PCI Express bus), one six-pin and one eight-pin plug are required. NVIDIA claims that such a card has a TDP of 250 W, which is significantly less than the Radeon HD 5970, which barely fits the 300 W ceiling set by the PCI-SIG. Therefore, for a “top” solution, NVIDIA recommends a power supply with a power of 600 W or higher.

The board occupies two slots on the rear panel of the case. For a sufficiently voluminous cooling system, the user will have to free up space inside the case.

The interface panel contains two DVI ports and one mini-HDMI. Plus, the second slot will be completely occupied by the exhaust grille, which ensures that heated air is blown out of the system unit.

Cooling system

Let's take a closer look at the video card's cooling system. It completely replicates the “reference” version, and NVIDIA engineers clearly tried to make it as efficient as possible, but because of the card's appetite for power, component temperatures still remain quite high.

Five heat pipes, an additional heat-dissipating casing and the aerodynamic design of the turbine itself add up to a thoroughly refined design. This is clearly the most efficient cooler of any reference design we've seen so far. The air driven by the side turbine passes through an aluminum radiator pierced by five copper heat pipes and exits outside the housing.

A distinctive feature of this design is that one side of the radiator sits directly on the surface of the card's casing, which clearly improves heat dissipation; however, because the cooling system gets quite hot, you can burn yourself by touching this part of the video card.

A notable innovation here is an additional heatsink plate. It covers the upper part of the board and removes heat from the memory chips and power-system transistors through a special thermal interface.

Let's move on to testing the cooling system. At maximum load, the GPU temperature reached an impressive 101 °C, which is nevertheless not considered critical for this GPU. At the same time, the cooling system worked at 92% and created a noticeable noise level.

In idle (2D) mode, the cooler operates at 44% of its maximum power, and even then its work is noticeable against the general background noise. The cooling system installed on this video card is reasonably effective, but the demands of the GPU clearly force it to strain to maintain acceptable temperatures. The noise of the cooling system depends directly on the load on the video card, and it cannot be called quiet.

Well, now let's move on to a detailed study of the card's technical characteristics. To begin with, here is a brief description in table form:

The NVIDIA GPU installed here is labeled GF100-375-A3.

The frequency diagram of the video card and other characteristics look like this:

This sample completely replicates the characteristics of the “reference” version of the NVIDIA accelerator. The GPU on the ZT-40101-10P operates at 701 MHz, with the shader domains at 1401 MHz. The video memory runs at 924 MHz real, or 3696 MHz effective.

The tested video card uses SAMSUNG GDDR5 memory chips with a total capacity of 1536 MB. The K4G10325FE-HC04 marking indicates that these chips have an access time of 0.4 ns, which corresponds to a real frequency of 1250 MHz (5000 MHz effective) and on paper leaves significant headroom for overclocking.
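The conversion from the 0.4 ns access time to the quoted frequencies can be sketched as follows (one common way of reading GDDR5 markings; the exact clock-domain naming varies between sources):

```python
# From GDDR5 chip access time to rated frequencies.
access_time_s = 0.4e-9            # from the -HC04 marking

wck = 1 / access_time_s           # 2.5 GHz write clock (WCK)
effective = 2 * wck               # data on both WCK edges -> 5 GT/s
real_clock = effective / 4        # GDDR5 command clock is 1/4 of that

print(round(effective / 1e6))     # 5000 (MHz effective)
print(round(real_clock / 1e6))    # 1250 (MHz real)
```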

Testing

  • CPU: Intel Core 2 Quad Q9550 (LGA775, 2.83 GHz, L2 12 MB) @ 3.8 GHz
  • Motherboards: NForce 790i-Supreme (LGA775, nForce 790i Ultra SLI, DDR3, ATX); GIGABYTE GA-EP45T-DS3R (LGA775, Intel P45, DDR3, ATX)
  • Coolers: Noctua NH-U12P (LGA775, 54.33 CFM, 12.6-19.8 dB); Thermalright SI-128 (LGA775) + VIZO Starlet UVLED120 (62.7 CFM, 31.1 dB)
  • Additional cooling: VIZO Propeller PCL-201 (+1 slot, 16.0-28.3 CFM, 20 dB)
  • RAM: 2x DDR3-1333 1024 MB Kingston PC3-10600 (KVR1333D3N9/1G)
  • Hard disk: Hitachi Deskstar HDS721616PLA380 (160 GB, 16 MB, SATA-300)
  • Power supplies: Seasonic M12D-850 (850 W, 120 mm, 20 dB); Seasonic SS-650JT (650 W, 120 mm, 39.1 dB)
  • Case: Spire SwordFin SP9007B (Full Tower) + Coolink SWiF 1202 (120x120x25, 53 CFM, 24 dB)
  • Monitor: Samsung SyncMaster 757MB (DynaFlat, 2048×1536@60 Hz, MPR II, TCO’99)

During testing, it became clear that the video card confirms its status as the most powerful single-chip graphics accelerator to date. The new NVIDIA GPU is somewhat faster than its competitors on AMD chips, but given its power consumption and operating temperature, which also entail increased noise, and looking at the price tag, it can hardly be called a balanced solution. In addition, doubts arise about the possibility of creating a dual-chip version based on the GF100 that could outperform AMD's dual-chip accelerator, the Radeon HD 5970.

Overclocking

Overclocking of this video card cannot be called outstanding either. We were almost unable to overclock the memory, even though the chips themselves are clearly rated above their operating frequency. The GPU itself, at a voltage of 1.05 V, could be overclocked to 770 MHz with a core temperature of 87 °C. Note that during overclocking the video card was in different conditions than during the cooling efficiency test: the side panel of the case was removed, a 120 mm fan was installed near the video card, which slightly improves cooling, and the cooler itself constantly worked at 100% rotation speed. Having a software mechanism for controlling the supply voltage, we continued our experiments. At 1.075 V, the GPU reached 784 MHz, with the temperature rising to 91 °C. The best result was achieved at 1.1 V, when the GPU was overclocked to 790 MHz, but its temperature under load increased to 99 °C.
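The overclocking runs above can be summarized relative to the stock 701 MHz GPU clock (a small tabulation of the numbers reported in this review):

```python
# Overclocking headroom from the experiments above,
# relative to the stock 701 MHz GPU clock.
stock_mhz = 701
runs = [  # (core voltage, achieved GPU clock, load temperature)
    (1.05, 770, 87),
    (1.075, 784, 91),
    (1.1, 790, 99),
]

for voltage, freq, temp in runs:
    gain = (freq / stock_mhz - 1) * 100
    print(f"{voltage} V -> {freq} MHz (+{gain:.1f} %), {temp} °C")
```

Even the best result is under 13% over stock, which matches the modest performance gains seen below.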

Let's see how manual overclocking affected performance:

[Benchmark table: each test was run at standard frequencies and on the overclocked card, with the productivity gain given in %. The numeric results did not survive; the tested packages and resolutions were:]

  • 3DMark: 3DMark Score, SM2.0 Score, HDR/SM3.0 Score, Performance
  • Serious Sam 2, Maximum Quality, AA4x/AF16x: 1600×1200, 2048×1536
  • Prey, Maximum Quality, AA4x/AF16x: 1600×1200, 2048×1536
  • Call Of Juarez, Maximum Quality, no AA/AF: 1280×1024, 1600×1200, 2048×1536
  • Call Of Juarez, Maximum Quality, AA4x/AF16x: 1280×1024, 1600×1200, 2048×1536
  • Crysis, Maximum Quality, no AA/AF: 1280×1024, 1600×1200, 2048×1536
  • Crysis, Maximum Quality, AA4x/AF16x: 1280×1024, 1600×1200, 2048×1536
  • Crysis Warhead, Maximum Quality, no AA/AF: 1280×1024, 1600×1200
  • Crysis Warhead, Maximum Quality, AA4x/AF16x: 1280×1024, 1600×1200
  • Far Cry 2, Maximum Quality, no AA/AF: 1280×1024, 1600×1200, 2048×1536
  • Far Cry 2, Maximum Quality, AA4x/AF16x: 1280×1024, 1600×1200, 2048×1536

The gain from overclocking is rather weak, and given the maximum operating temperatures of the video card even without overclocking, the feasibility of the latter becomes questionable, because you will have to work hard to increase the cooling efficiency of the GPU. And even at nominal frequencies, this “top” solution can easily provide decent gaming performance even for a demanding user.

Results

Video cards based on this NVIDIA graphics processor, including the tested ZT-40101-10P, turned out to be very fast single-chip solutions. The GF100 GPU with Fermi architecture used in them originally had 512 streaming cores, but due to problems obtaining the required number of chips in production, the “top” video cards use only 480 of them. Still, thanks to fairly high operating frequencies, these accelerators turned out to be generally faster than the competitor's single-chip cards based on the Radeon HD 5870, although the market leader remains the dual-chip solution from AMD, the Radeon HD 5970.

However, while the performance of single-chip “top” video cards on NVIDIA GPUs is superior to the corresponding solutions on AMD chips, power consumption is clearly not the strong point of NVIDIA's cards. Of course, for many enthusiasts this will not be a selection criterion, but in some cases this aspect is worth considering, because increased energy consumption does not merely raise the electricity bill slightly. Practically all the energy consumed by a graphics accelerator is dissipated as heat, which must be removed quickly to avoid overheating and failure of high-tech components; this in turn requires a more complex cooling system and increases its noise.

The NVIDIA GeForce GTX 480M is a top-end mobile video card built on the Fermi architecture. It has full support for DirectX 11 and is manufactured on TSMC's 40 nm process. With 352 cores, the GTX 480M is comparable to the desktop GTX 465, but at a lower frequency. The GeForce GTX 480M has 2 GB of fast dedicated GDDR5 video memory, so its performance should be at the level of the ATI Mobility Radeon HD 5870.

Also known as the GF100, the Fermi chip has been redesigned and now has 3 billion transistors (with all 512 shaders). Compared to the desktop HD 5870 with 2.15 billion transistors, or the Mobility Radeon HD 5870 with 1.04 billion, the GTX 480M looks quite impressive.

The mobile Fermi chip contains up to 352 shader cores with 32 raster operation units (ROPs) and 44 texture units. The memory bus is 256 bits wide, but thanks to fast GDDR5 memory this shouldn't be a bottleneck. Power consumption is 100 W TDP, including the MXM board and 2 GB of GDDR5. AMD typically quotes chip power consumption separately, so the figures can't be compared directly. The GTX 480M is only suitable for a large laptop with a good cooling system. Initially, only Clevo decided to install this card in its barebones, the 17″ D901F and 18″ X8100.

The performance of the Nvidia GeForce GTX 480M should be better than the ATI Mobility Radeon HD 5870 and on par with a mobile GeForce GTX 285M SLI setup or the desktop Radeon HD 4770. That would make the GTX 480M the fastest single mobile video card in the first quarter of 2010. Modern DirectX 10 games should run smoothly at high resolutions with good detail settings and anti-aliasing; only very demanding games like Crysis Warhead may require turning the detail down a bit. Thanks to hardware support for DirectX 11 (for example, tessellation), video cards built on the Fermi architecture should perform well in DirectX 11 games, of which more and more will appear.

Just like the GeForce 300M series, the GeForce GTX 480M supports PureVideo HD with the VP4 video processor. This means the card can fully decode HD video in H.264, VC-1, MPEG-2 and MPEG-4 ASP. With Flash 10.1, the graphics card can also accelerate Flash video playback. The cores of the Nvidia GeForce GTX 480M can be used for general-purpose computing via CUDA or DirectCompute; for example, HD video encoding can be done significantly faster on the GPU's shader cores than on a modern CPU. PhysX, also supported by mobile Fermi, allows physics effects to be computed in supported games (falling raindrops, dispersing fog, etc.).

Compared to desktop graphics cards, the GeForce GTX 480M roughly matches an overclocked Nvidia GeForce GTX 465 (607/1200 MHz) or a Radeon HD 5770.

More than six months ago, ATI Radeon 5xxx series video cards appeared on the video adapter market. They brought with them hardware support for DirectX 11 and Shader Model 5.0, tessellation, and many other goodies for those who like video games. Unfortunately (or fortunately...), the rival, NVIDIA, was unable to field a competitor in time, so AMD (more precisely, its ATI division, which develops graphics chips) reaped all the fruits of success, literally “flooding” the market with DirectX 11 video cards.

NVIDIA, which has never skimped on the PR of its products, did not disappoint this time either, deliberately feeding enthusiasts crumbs from the developers' table about the NVIDIA GF100, based on the Fermi microarchitecture. We first heard details about the structure of the GF100 chips a little over six months ago. Since then, in the depths of NVIDIA's secret laboratories, a competitor to the five-thousand series of ATI Radeon video cards was being created, one that simply had to live up to all the promises made earlier. And so, a miracle happened! A month ago, to the fireworks and fanfare of analysts, the GTX 480 and GTX 470 video cards were launched onto the world market. Did they live up to the long wait or not?


Structure and architecture of NVIDIA GF100

Today there are only two video adapters from NVIDIA on the market with DirectX 11 support. They are meant to set the course for the entire new line. The highest-end card at the moment is the GTX 480.

“Dreams come true...” It seems that is exactly what Yuri Antonov sang. But, as it turned out, they don't always. The GTX 480 was originally supposed to feature 512 “high-performance CUDA cores,” but for some reason NVIDIA was unable to implement its plan 100%. As a result, we see a reduction in the GTX 480's core count from 512 to 480 processors.



Manufacturer: NVIDIA
Series: GeForce GTX 400M
Code: Fermi
Streams: 352 - unified
Clock frequency: 425* MHz
Shader frequency: 850* MHz
Memory frequency: 1200* MHz
Memory bus width: 256 Bit
Memory type: GDDR5
Maximum memory: 2048 MB
Common memory: No
DirectX: DirectX 11, Shader 5.0
Energy consumption: 100 W
Transistors: 3000 million
Technology: 40 nm
Laptop size: big
Release date: 25.05.2010