AMD Radeon RX 7900 XTX, Radeon 7900 XT, and RDNA 3 Revealed

The AMD RDNA 3 architecture has finally been officially unwrapped, alongside the new $999 Radeon RX 7900 XTX and $899 Radeon RX 7900 XT graphics cards. These are set to go head-to-head with the best graphics cards, and AMD seems like it might have a legitimate shot at the top of the GPU benchmarks hierarchy. Here’s what we know.

First, most of the details align with what was already expected and covered in our earlier look at the AMD RDNA 3 architecture and RX 7000-series GPUs. RDNA 3 will use chiplets, with a main GCD (Graphics Compute Die) and up to six MCDs (Memory Cache Dies). In addition, there are a lot of under-the-hood changes to the architecture, including more Compute Units and a lot more GPU shaders compared to the previous generation.

Fundamentally, AMD continues to focus on power and energy efficiency and has targeted a 50% improvement in performance per watt with RDNA 3 compared to RDNA 2. We know Nvidia’s RTX 4090 and Ada Lovelace pushed far up the voltage and frequency curve, and as we showed in our RTX 4090 efficiency scaling, power limiting the RTX 4090 to 70% greatly boosted Nvidia’s efficiency. However, AMD apparently feels no need to dial the power use up to 11 at default.

Let’s start with a quick overview of the core specifications, comparing AMD’s upcoming GPUs with the top previous generation RDNA 2 card (the RX 6950 XT) and with Nvidia’s RTX 4090, RTX 4080, and RTX 3090 Ti.

| Graphics Card | RX 7900 XTX | RX 7900 XT | RX 6950 XT | RTX 4090 | RTX 4080 | RTX 3090 Ti |
|---|---|---|---|---|---|---|
| Architecture | Navi 31 | Navi 31 | Navi 21 | AD102 | AD103 | GA102 |
| Process Technology | TSMC N5 + N6 | TSMC N5 + N6 | TSMC N7 | TSMC 4N | TSMC 4N | Samsung 8N |
| Transistors (Billion) | 58 | 58 – 1MCD | 26.8 | 76.3 | 45.9 | 28.3 |
| Die size (mm^2) | 300 + 222 | 300 + 185 | 519 | 608.4 | 378.6 | 628.4 |
| SMs / CUs / Xe-Cores | 96 | 84 | 80 | 128 | 76 | 84 |
| GPU Cores (Shaders) | 12288 | 10752 | 5120 | 16384 | 9728 | 10752 |
| Tensor Cores | N/A | N/A | N/A | 512 | 304 | 336 |
| Ray Tracing “Cores” | 96 | 84 | 80 | 128 | 76 | 84 |
| Boost Clock (MHz) | 2300 | 2000 | 2310 | 2520 | 2505 | 1860 |
| VRAM Speed (Gbps) | 20? | 20? | 18 | 21 | 22.4 | 21 |
| VRAM (GB) | 24 | 20 | 16 | 24 | 16 | 24 |
| VRAM Bus Width (bits) | 384 | 320 | 256 | 384 | 256 | 384 |
| L2 / Infinity Cache (MB) | 96 | 80 | 128 | 72 | 64 | 6 |
| ROPs | 192 | 192 | 128 | 176 | 112 | 112 |
| TMUs | 384 | 336 | 320 | 512 | 304 | 336 |
| TFLOPS FP32 (Boost) | 56.5 | 43.0 | 23.7 | 82.6 | 48.7 | 40.0 |
| TFLOPS FP16 (FP8) | 113 | 86 | 47.4 | 661 (1321) | 390 (780) | 160 (320) |
| Bandwidth (GBps) | 960? | 800? | 576 | 1008 | 717 | 1008 |
| TDP (watts) | 355 | 300 | 335 | 450 | 320 | 450 |
| Launch Date | Dec 2022 | Dec 2022 | May 2022 | Oct 2022 | Nov 2022 | Mar 2022 |
| Launch Price | $999 | $899 | $1,099 | $1,599 | $1,199 | $1,999 |
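
The bandwidth row follows from the VRAM speed and bus width rows: peak bandwidth is the per-pin data rate multiplied by the bus width, divided by eight bits per byte. Here is a minimal Python sketch that reproduces the table's figures. The function name and the dictionary are ours, purely for illustration, and the 20 Gbps inputs for the two RX 7900 cards are the same unconfirmed guess marked with a question mark in the table.

```python
def memory_bandwidth_gbps(speed_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbps) x bus width (bits) / 8 bits per byte."""
    return speed_gbps * bus_width_bits / 8


# (VRAM speed in Gbps, bus width in bits), taken from the spec table.
cards = {
    "RX 7900 XTX": (20.0, 384),  # 20 Gbps is unconfirmed ("20?" in the table)
    "RX 7900 XT": (20.0, 320),   # 20 Gbps is unconfirmed ("20?" in the table)
    "RX 6950 XT": (18.0, 256),
    "RTX 4090": (21.0, 384),
    "RTX 4080": (22.4, 256),
    "RTX 3090 Ti": (21.0, 384),
}

for name, (speed, width) in cards.items():
    print(f"{name}: {memory_bandwidth_gbps(speed, width):.0f} GB/s")
```

If AMD ends up shipping faster or slower GDDR6 than 20 Gbps, the two RX 7900 results scale linearly with the speed value.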

AMD has two variants of the Navi 31 GPU coming out. The higher spec RX 7900 XTX card uses the fully enabled GCD and six MCDs, while the RX 7900 XT has 84 of the 96 Compute Units enabled and only uses five MCDs. The sixth MCD is technically still present on the cards, but it’s either a non-functional die or potentially even a dummy die. Either way, it will be fused off, and it’s not connected to the extra 4GB of GDDR6 memory, so there won’t be a way to re-enable the extra MCD.

Compared to the competition, the RX 7900 XTX still technically comes in behind the RTX 4090 in raw compute, and Nvidia has a lot more AI processing power with its tensor cores. But we also have to remember that the RX 6950 XT managed to keep up with the RTX 3090 Ti at 1080p and 1440p and was only about 5% behind at 4K. That’s despite having theoretically 40% less raw compute. So, when the RX 7900 XTX on paper has 32% less compute than the RTX 4090, we don’t actually know what that will mean in the real world of performance benchmarks.
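
For anyone who wants to check those percentages, the deficits come straight from the FP32 teraflops row in the table. This is a rough sketch with a helper name of our own choosing, and it only compares paper specs, not measured performance.

```python
def percent_less(a: float, b: float) -> float:
    """How much smaller a is than b, as a percentage of b."""
    return (1 - a / b) * 100


# FP32 TFLOPS values from the spec table.
rx_6950_xt, rtx_3090_ti = 23.7, 40.0
rx_7900_xtx, rtx_4090 = 56.5, 82.6

print(f"RX 6950 XT vs RTX 3090 Ti: {percent_less(rx_6950_xt, rtx_3090_ti):.1f}% less")
print(f"RX 7900 XTX vs RTX 4090:   {percent_less(rx_7900_xtx, rtx_4090):.1f}% less")
```

The two results land at roughly 41% and 32%, in line with the rounded figures quoted above.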

Also, note that AMD’s presentation says 61 teraflops while our figure is 56.5 teraflops. That’s because AMD’s RDNA 3 has a split clock domain for efficiency purposes. The front end (render outputs and texturing units, perhaps) runs at 2.5 GHz, while the shaders run at 2.3 GHz. We used the 2.3 GHz value since the teraflops come from the shaders. Of course, these are “Game Clocks,” which, at least with RDNA 2, were a conservative estimate of real-world clocks while running actual games. (That’s the same for Nvidia’s Ada Lovelace and Intel’s Arc Alchemist, which both tend to run 150–250 MHz higher than the stated boost clock values in our testing.)

AMD also lists a boost clock that is higher than the Game Clock, which is where the 61 teraflops figure comes from: the boost clock on the RX 7900 XTX is 2.5 GHz. But, again, we’ll need to test the hardware in a variety of games to see where the actual clocks land. With RDNA 2, we found the boost clocks were pretty consistently what we saw in games, maybe even a bit low, so consider the 56.5 teraflops figure a very conservative estimate.
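
The gap between the two teraflops numbers is simple arithmetic. Peak FP32 throughput is conventionally counted as shaders × 2 operations per clock (one fused multiply-add) × clock speed. The sketch below applies that convention to the 12,288-shader RX 7900 XTX at both clocks. The function name is ours, and the result is a theoretical peak, not a measurement.

```python
def fp32_tflops(shaders: int, clock_ghz: float) -> float:
    """Theoretical peak FP32 TFLOPS, counting one FMA (2 FLOPs) per shader per clock."""
    return shaders * 2 * clock_ghz / 1000


shaders = 12288  # RX 7900 XTX shader count from the spec table

print(f"At 2.3 GHz: {fp32_tflops(shaders, 2.3):.1f} TFLOPS")  # the figure we use
print(f"At 2.5 GHz: {fp32_tflops(shaders, 2.5):.1f} TFLOPS")  # matches AMD's 61 teraflops
```

The same formula reproduces the rest of the FP32 row in the table, for example 16,384 shaders at 2.52 GHz for the RTX 4090.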

Of course, the bigger deal isn’t how the RX 7900 XTX stacks up against the RTX 4090 but rather how it will compete with the RTX 4080. The XTX has more memory and memory bandwidth, plus 16% more compute. So even if the performance per clock on the RDNA 3 shaders dropped a bit (more on this in a second), AMD looks like it should be very competitive with Nvidia’s penultimate RTX 40-series part, especially since it costs $200 less.

With the high-level overview out of the way, let’s dig into some architectural details. Unfortunately, AMD is keeping some things under wraps, so we’re not entirely sure about the memory clocks right now, and we’ve asked for more information on other parts of the architecture. We’ll fill in the details as we get them, but some things might remain unconfirmed until the RDNA 3 launch date on December 13.

AMD RDNA 3 and Radeon RX 7900 XT / XTX Slide Deck

(Image credit: AMD)

AMD has said a lot about energy efficiency with the past two generations of RDNA architectures, and RDNA 3 continues that focus. AMD claims up to a 54% performance per watt improvement compared to RDNA 2, which in turn was 54% better PPW than RDNA. In the past three generations, AMD’s efficiency has skyrocketed — and that’s not just marketing speak.

If you look at the RX 6900 XT as an example, it’s basically double the performance of the previous generation RX 5700 XT at 1440p ultra. Meanwhile, it consumes 308W in our testing compared to 214W on the 5700 XT. So that’s a 38% improvement in efficiency, just picking the two fastest RDNA 2 and RDNA offerings at the time of launch.
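
The efficiency math here is just the performance ratio divided by the power ratio. The sketch below uses the two power figures quoted above and assumes an exact 2.0x performance ratio as a stand-in for "basically double". That ratio is our assumption, not a measured value, and the function name is ours.

```python
def perf_per_watt_gain(perf_ratio: float, new_watts: float, old_watts: float) -> float:
    """Fractional perf-per-watt improvement given a performance ratio and two power draws."""
    power_ratio = new_watts / old_watts
    return perf_ratio / power_ratio - 1


# 308W (RX 6900 XT) and 214W (RX 5700 XT) are the measured figures quoted above.
# perf_ratio=2.0 is an assumed round number standing in for "basically double".
gain = perf_per_watt_gain(perf_ratio=2.0, new_watts=308, old_watts=214)
print(f"Perf-per-watt improvement: {gain:.1%}")
```

With an exact 2.0x ratio the result comes out near 39%, so the 38% figure implies a measured performance ratio slightly under double.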

How does AMD continue to improve efficiency? Of course, a big part of the latest jump comes thanks to the move from TSMC N7 to N5 (7nm to 5nm), but the architectural updates also help.
