Intel Arc Discrete GPUs Are Almost Here
An ambitious goal for sure. It’s also one Intel is well positioned to execute on, thanks to its multi-pronged strategy and deep industry partnerships with hardware makers, OEMs, and system integrators. In other words, Intel is not a Johnny-come-lately to the graphics market by any stretch. It technically owns the largest share of the overall GPU market (more than AMD and NVIDIA combined) on the strength of its integrated graphics processors. Now begins Intel’s first earnest attempt at the discrete GPU sector since Larrabee, starting with its mobile solutions.
Intel is hitting the ground running with two different mobile A-Series SoCs: ACM-G10 and ACM-G11. The former is the larger of the two chips and packs up to 32 Xe cores, 32 ray tracing units, 16MB of L2 cache, a 256-bit wide memory bus, and support for PCIe 4.0 x16.
ACM-G11, meanwhile, is one-fourth the size and wields up to 8 Xe cores, 8 ray tracing units, 4MB of L2 cache, up to a 96-bit memory bus, and 8 lanes of PCIe 4.0. This is what is arriving in laptops first, under Intel’s Arc 3 branding, with more powerful ACM-G10-based solutions not far behind (later this summer) — those will be the Arc 5 and Arc 7.
These two SoCs form the foundation for five graphics solutions across three segmented performance tiers: Arc 3, Arc 5, and Arc 7. It’s a lot to juggle at first glance, and that’s where the consumer branding comes into play. It’s similar to what Intel has done on the CPU side with its Core i3, Core i5, Core i7, and Core i9 branding, each with its own set of processor models. In this case, Arc 3 is the tier aimed at “Enhanced Gaming,” Arc 5 is the “Advanced Gaming” tier, and Arc 7 is for “High Performance Gaming.”
And so it goes here on the GPU side. Intel’s Arc 3 GPUs are built around Intel’s ACM-G11 SoC, while Arc 5 and Arc 7 solutions are both based on ACM-G10. The two Arc 3 solutions launching today include A350M and A370M, both of which have made a few laps in the rumor circuit ahead of today’s official reveal. Forget all the leaks, though, because we now have concrete specs to share.
A370M arrives on the mobile scene with 8 Xe cores, 8 ray tracing units, 4GB of GDDR6 memory linked to a 64-bit memory bus, and a 1,550MHz graphics clock. Graphics power is rated at 35-50W. A350M is a lower-power solution (25-35W) with 6 Xe cores, 6 ray tracing units, the same memory allocation and bus width, and a 1,150MHz graphics clock.
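That shared 64-bit bus pins down peak memory bandwidth once a GDDR6 data rate is known. Intel hasn’t stated the memory speed for these parts, so the quick sketch below assumes a common 14 Gbps GDDR6 data rate purely for illustration:

```python
# Peak memory bandwidth = (bus width in bits / 8) * per-pin data rate (Gbps).
# The 14 Gbps GDDR6 data rate is an ASSUMPTION for illustration; Intel has
# not published the actual memory speed for the A350M/A370M.
def peak_bandwidth_gbps(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Return peak memory bandwidth in GB/s."""
    return bus_width_bits / 8 * data_rate_gbps

# Both A350M and A370M use a 64-bit bus per Intel's published specs.
print(peak_bandwidth_gbps(64, 14.0))  # 112.0 GB/s at an assumed 14 Gbps
```

Whatever the final data rate turns out to be, the narrow bus makes clear these are entry-level parts by bandwidth standards.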
A Closer Look At The Intel Arc GPU Architecture
We’ve covered Xe-HPG and its architecture at a high level previously—we recommend checking out our Architecture Day 2021 coverage for some additional context. We’ll reiterate some of the info here, but have some additional details to share as well.
Intel segments its Arc discrete GPUs into cores and slices. The Xe cores are the foundation of the design and are grouped together into render slices. This first wave of Arc mobile GPUs features up to 8 render slices, with 4 Xe cores per slice. There is also 1 ray tracing unit per core (4 per slice), which equates to 32 cores and 32 ray tracing units in a fully-enabled ACM-G10. The smaller ACM-G11 will have only 8 each.
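The hierarchy above reduces to simple multiplication. Here’s a quick sketch using the figures Intel provided (the two-slice count for ACM-G11 is inferred from its 8 cores at 4 cores per slice):

```python
# Arc's GPU hierarchy per Intel's figures: render slices contain Xe cores,
# and each Xe core carries one ray tracing unit.
CORES_PER_SLICE = 4
RT_UNITS_PER_CORE = 1

def totals(render_slices: int) -> tuple[int, int]:
    """Return (Xe cores, ray tracing units) for a given slice count."""
    cores = render_slices * CORES_PER_SLICE
    rt_units = cores * RT_UNITS_PER_CORE
    return cores, rt_units

print(totals(8))  # fully-enabled ACM-G10: (32, 32)
print(totals(2))  # fully-enabled ACM-G11: (8, 8) -- slice count inferred
```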
Each core is outfitted with 16 256-bit vector engines and 16 1024-bit Matrix Engines. There is 192KB of shared L1 cache per Xe core, which can be dynamically partitioned as L1 cache or Shared Local Memory (SLM) depending on the workload.
The Xe-HPG vector engines have an improved ALU design with a dedicated FP execution port and a shared Int/EM execution port. Also on board is a dedicated XMX Matrix engine, which is particularly well suited for AI-related workloads. The XMX Matrix engine is capable of 128 FP16/BF16 ops/clock, 256 Int8 ops/clock, or 512 Int4/Int2 ops/clock.
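Taken at face value, those rates imply some hefty theoretical matrix throughput. The back-of-the-envelope sketch below assumes the quoted ops/clock figures are per XMX engine (16 per Xe core) and uses the A370M’s rated graphics clock; real sustained clocks vary dynamically, so treat this as an upper bound:

```python
# Theoretical peak XMX matrix throughput, ASSUMING the quoted ops/clock
# figures are per XMX engine, with 16 XMX engines per Xe core.
XMX_PER_CORE = 16
OPS_PER_CLOCK = {"fp16": 128, "bf16": 128, "int8": 256, "int4": 512}

def peak_tops(xe_cores: int, clock_ghz: float, dtype: str) -> float:
    """Peak matrix throughput in TOPS (tera-operations per second)."""
    return xe_cores * XMX_PER_CORE * OPS_PER_CLOCK[dtype] * clock_ghz / 1000

# A370M: 8 Xe cores at its rated 1,550MHz graphics clock.
print(round(peak_tops(8, 1.55, "int8"), 1))  # ~50.8 INT8 TOPS (theoretical)
```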
The GPUs will be manufactured on TSMC’s N6 process node, which is a marginal improvement over N7 in terms of transistor density. All told though, accounting for architectural improvements in Xe-HPG and the more advanced process, Intel is claiming up to a 1.5X performance-per-watt uplift versus Xe-LP.
Intel’s discrete Arc GPUs also feature a class-leading media engine, which supports all major codecs and is the first of its kind to support hardware encode acceleration for AV1. Back at Architecture Day, Intel talked about an AI-accelerated video enhancement technology capable of high-quality, hardware-accelerated upscaling of low-resolution video content to 4K resolution, and through a collaboration with Topaz Labs, that tech will be supported in an upcoming release of the company’s Video Enhance AI application. You can see it in action here…
The AV1 acceleration in Arc’s media engine is a clear advantage over competing solutions. AV1 is capable of producing higher-quality video at compression levels similar to H.265, or similar-quality video at even higher compression. That means AV1 encoding can deliver higher-quality output at lower bitrates, which is ideal for game streaming, or reduce the storage space necessary to archive video.
As you can see in the demo above, AV1 encoding produces much better-looking output than existing codecs.
Although this is a new feature exclusive to Intel at the moment, many ISVs already support the technology. FFmpeg, HandBrake, Premiere Pro, XSplit, and DaVinci Resolve all support the media engine in Arc, with more sure to follow.
Intel Arc GPU Flexible Power Optimizations
Intel notes that all of the SoCs use dynamic clocks along the frequency/voltage curve, based on power consumption, temperatures, and utilization at any given moment. Additionally, the rated graphics clock is roughly the average clock delivered within the target TDP while running a typical workload (games and other applications).
There’s also a symbiotic relationship at play with Intel’s mobile GPUs. Arriving on the heels of Alder Lake in mobile form, these Arc A-series GPUs complete Intel’s modern laptop platform and work intelligently with Intel’s 12th Gen Core CPUs.
What this does is manage workloads between the CPU, integrated Xe graphics, and the discrete Arc graphics. Depending on the demands of the workload, Intel’s platform can shift power where it’s needed. That might be the CPU or GPU, or it can strike an optimal balance depending on the workload that’s running at the time.
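Intel hasn’t published the algorithm behind this power shifting, but the general idea can be illustrated with a toy model that reallocates a shared platform budget according to relative demand. To be clear, this is not Intel’s Dynamic Power Share implementation, just a sketch of the concept:

```python
# Toy illustration of demand-proportional power sharing between a CPU and
# GPU under a fixed platform budget. This is NOT Intel's actual Dynamic
# Power Share algorithm -- purely a conceptual sketch.
def share_power(budget_w: float, cpu_demand: float, gpu_demand: float,
                floor_w: float = 5.0) -> tuple[float, float]:
    """Split budget_w proportionally to demand, with a minimum CPU floor."""
    total = cpu_demand + gpu_demand
    if total == 0:
        return budget_w / 2, budget_w / 2  # idle: split evenly
    cpu_w = max(floor_w, budget_w * cpu_demand / total)
    gpu_w = budget_w - cpu_w
    return cpu_w, gpu_w

# A GPU-bound game: most of the budget flows to the discrete GPU.
print(share_power(80.0, cpu_demand=0.2, gpu_demand=0.8))  # (16.0, 64.0)
```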
How Will Intel’s First Arc GPUs Perform?
Specs and features aside, what does this all amount to in terms of gaming performance? We’ll know for sure when we have a chance to test Intel’s Arc solutions for ourselves, but in the meantime we can look at Intel’s performance claims.
The Arc 3 series is designed to be a cut above integrated graphics. According to Intel, a laptop outfitted with a Core i7-12700H processor and Arc A370M GPU can top the 60 frames per second threshold at 1080p in many games where integrated graphics could come up short. Some examples include Doom Eternal (63 fps) and Strange Brigade (69 fps) at high quality settings, and Hitman 3 (62 fps), Destiny 2 (66 fps), and Wolfenstein: Youngblood (78 fps) at medium settings.
Competitive esports titles are typically less demanding, and in those types of games, Intel claims the same laptop configuration can approach and exceed triple-digit framerates at 1080p. As highlighted above, Intel’s benchmarks show the A370M paired with a Core i7-12700H hitting 94 fps in Fortnite and 105 fps in GTA V at medium settings, and 105 fps in Rocket League and 115 fps in Valorant at high settings.
Of course, Intel isn’t only targeting gamers with its discrete GPUs, but content creators and professionals as well. This is where Deep Link really comes into play. On a laptop outfitted with a Core i7-12800H processor and an Arc A370M GPU, Intel claims up to a 2.4x performance uplift (Adobe Premiere Pro) over the same laptop without a discrete GPU.
The final piece of all this is a commitment to polished driver releases and software. To that end, Intel is introducing Arc Control, an all-in-one software experience that streamlines various tasks and monitoring tools. It serves up real-time performance metrics like temperatures and utilization, acts as a dashboard for broadcasting to third-party platforms, and makes fetching driver updates easy and seamless (Intel is committing to day-0 driver releases for major titles, by the way). There are also performance tuning controls, though Intel is reserving those dials for the desktop. Arc Control will be quickly accessible via a hotkey-driven overlay, similar to what AMD has done with its driver and NVIDIA offers with GeForce Experience. It will also support 12th Gen integrated graphics engines, so both the iGPU and dGPU can be managed from within a single interface on Arc-equipped laptops.
New Features And Tools Coming With Intel Arc
Intel claims XeSS, its AI-based upscaling technology, can deliver up to a 2X performance boost with Arc’s built-in XMX Matrix engines, but it can also work on legacy and competing GPUs that support the DP4a instruction. Intel notes that about 15 games supporting XeSS are already in the pipeline, with more on the way.
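For the unfamiliar, DP4a is a GPU instruction that computes a dot product of four packed 8-bit integers and accumulates the result into a 32-bit value, which is exactly the kind of math low-precision inference leans on. A scalar sketch of what the instruction does per lane:

```python
# Scalar model of the DP4a operation: a dot product of four packed 8-bit
# integers accumulated into a 32-bit value. Real GPUs do this in a single
# instruction per lane; this just demonstrates the arithmetic.
def dp4a(a: list[int], b: list[int], c: int) -> int:
    """Return c + (a . b) for two 4-element int8 vectors."""
    assert len(a) == len(b) == 4
    assert all(-128 <= x <= 127 for x in a + b)  # int8 range check
    return c + sum(x * y for x, y in zip(a, b))

print(dp4a([1, 2, 3, 4], [5, 6, 7, 8], 10))  # 5 + 12 + 21 + 32 + 10 = 80
```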
Arc’s display engine is also leading edge. It features support for HDMI 2.0b and DisplayPort 1.4, but the design is also DP 2.0 10G ready. The display engine can handle 2 x 8K60 HDR displays or 4 x 4K120 HDR displays, with refresh rates up to 360Hz at lower resolutions. The display engine also supports adaptive refresh rates, i.e. Adaptive Sync.
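Those two maximum configurations are internally consistent: they describe the same raw pixel throughput, which a quick calculation confirms (ignoring blanking intervals and any link compression):

```python
# Raw pixel throughput of the display engine's two maximum configurations.
# Ignores blanking intervals and compression, so these are rough figures.
def pixel_rate(width: int, height: int, hz: int, displays: int) -> float:
    """Total pixels per second across all displays, in gigapixels/s."""
    return width * height * hz * displays / 1e9

print(round(pixel_rate(7680, 4320, 60, 2), 2))   # 2 x 8K60  -> ~3.98 Gpx/s
print(round(pixel_rate(3840, 2160, 120, 4), 2))  # 4 x 4K120 -> ~3.98 Gpx/s
```

Both work out to roughly 3.98 gigapixels per second, so the engine’s ceiling is the same either way; it’s just carved up differently.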
Intel, however, also disclosed a couple of new display sync modes, dubbed Speed Sync and Smooth Sync. Vertical Sync, or V-Sync, is a legacy technology that synchronizes a GPU’s output to a display’s refresh rate, which was historically 60Hz. Enabling V-Sync ensures what is being output is in sync with a display’s capabilities and there will be no display output-related visual anomalies due to the GPU and monitor being out of sync. But enabling V-Sync typically introduces a significant input latency penalty, which is a big no-no for fast-twitch and most competitive games.
Disabling V-Sync, and letting a GPU output frames as fast as it can eliminates that latency, but can in turn introduce screen tearing if the GPU is outputting frames faster than a monitor can display them. Both Speed Sync and Smooth Sync aim to eliminate or minimize screen tearing using different methods.
Speed Sync works by outputting only completed frames to the display. This means there will be no tearing and the GPU can run at full speed, but partial frames will be discarded. With Smooth Sync, however, the GPU behaves as if V-Sync is disabled, but the hard lines at the boundaries where screen tearing occurs are dithered and blended between adjacent edges. The screen tearing is technically still there, but with the hard edges blended and smoothed out, it is much less visually jarring. Because Smooth Sync does some processing on the vast majority of frames being output to the display, it incurs a very slight performance penalty—somewhere in the neighborhood of 1%.
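Intel hasn’t detailed the exact filter Smooth Sync uses, but the basic idea of blending across a tear boundary can be sketched in one dimension. Here each “row” of the screen is a single brightness value, and a band of rows around the tear line is linearly blended between the old and new frame (a simplification of whatever dithering Intel actually applies):

```python
# Toy 1-D illustration of the Smooth Sync concept: when a new frame replaces
# the old one partway down the screen, blend a band of rows around the tear
# line instead of switching abruptly. NOT Intel's actual filter.
def smooth_tear(old_rows, new_rows, tear_row, band=2):
    out = []
    for i in range(len(old_rows)):
        if i < tear_row - band:
            out.append(new_rows[i])   # fully new frame above the band
        elif i > tear_row + band:
            out.append(old_rows[i])   # fully old frame below the band
        else:
            # Linear blend across the band straddling the tear line.
            t = (i - (tear_row - band)) / (2 * band)
            out.append((1 - t) * new_rows[i] + t * old_rows[i])
    return out

# Old frame is all-black (0.0), new frame all-white (1.0), tear at row 4.
rows = smooth_tear([0.0] * 8, [1.0] * 8, tear_row=4, band=2)
print(rows)  # [1.0, 1.0, 1.0, 0.75, 0.5, 0.25, 0.0, 0.0]
```

Instead of a hard 1.0-to-0.0 edge at row 4, the transition is spread over several rows, which is precisely why the tear becomes less visually jarring.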
In addition to Dynamic Power Share mentioned earlier, Intel’s Deep Link also enables some other new features, namely Hyper Encode and Hyper Compute. We’ve talked about Hyper Encode before. It essentially allows compatible applications to leverage the media engines incorporated into the iGPU and dGPU simultaneously to improve video encoding performance.
Hyper Encode works by breaking a workload down into 15-30 frame batches, dispatching them to the media engines, and then stitching the results back together. A similar-sounding but very different feature called Hyper Compute likewise distributes compute workloads across the iGPU and dGPU simultaneously to increase performance.
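The batch-dispatch-stitch flow can be sketched in a few lines. The 15-30 frame batch size comes from Intel; the simple round-robin dispatch between the two media engines is an assumption for illustration:

```python
# Toy sketch of the Hyper Encode flow: split a frame sequence into batches,
# alternate batches between the iGPU and dGPU media engines, then stitch the
# results back in order. Round-robin dispatch is an ASSUMPTION; Intel hasn't
# detailed its scheduling policy.
def hyper_encode(frames, batch_size=15):
    batches = [frames[i:i + batch_size]
               for i in range(0, len(frames), batch_size)]
    # Alternate batches between the two media engines.
    assignments = [("iGPU" if n % 2 == 0 else "dGPU", batch)
                   for n, batch in enumerate(batches)]
    # "Encode" (identity here) and stitch back together in original order.
    stitched = [frame for _, batch in assignments for frame in batch]
    return assignments, stitched

assignments, stitched = hyper_encode(list(range(40)), batch_size=15)
print([(engine, len(batch)) for engine, batch in assignments])
# [('iGPU', 15), ('dGPU', 15), ('iGPU', 10)]
print(stitched == list(range(40)))  # True: frame order survives stitching
```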
Keep in mind that this is just the beginning. Intel is starting at the bottom and working its way up with Arc, in terms of performance targets. We’re eager to see how the initial product offerings fare, and of course what the higher end SKUs deliver later this summer. Stay tuned. In addition, we’re going to have Intel’s Tom Petersen on our 2.5 Geeks livestream this Thursday at 5:00pm ET (2:00pm PT) to chat about this launch and Arc in general, so be sure to stop by!