
AV1 Video Encode at 1W Per Stream
AMD this morning is launching a brand new dedicated media accelerator and video encode card for data centers, and the first to be released under the AMD brand: the Alveo MA35D. The card is a successor to an earlier line of Xilinx cards that AMD picked up as part of their Xilinx acquisition, vaulting them into the market for dedicated video encode cards. The latest generation Alveo media accelerator card, in turn, promises significant performance benefits over its predecessor, quadrupling the maximum number of simultaneous video streams while also adding AV1 and 8K resolution encode support.
Like its predecessor, the Alveo U30, the MA35D is a pure video encode card designed for data centers. That’s to say that its ASICs are designed solely for real-time/interactive video encoding, with Xilinx looking to do one thing and do it very well. This design strategy is in notable contrast to competing products from Intel (GPU Flex Series) and NVIDIA (T4 & L4), which are GPU-based products and leverage the flexibility of their GPUs along with their integrated video encoders in order to function as video encode cards, gaming cards, or other roles assigned to them. The MA35D, by comparison, is a relatively straightforward product that’s designed to more optimally and efficiently do video encoding by focusing on just that.
As this is a product line inherited by AMD as part of their Xilinx acquisition and developed by the resulting Adaptive & Embedded Computing Group, the Alveo MA35D is both new for AMD and familiar at the same time. Previous data center video encode products released by AMD have been based on their GPU lineup, so while this is the latest such video encode card for the ex-Xilinx team, this is the first time AMD proper has released a dedicated video encode card in this fashion, making it a prime example of the kind of new market opportunities AMD was looking for in acquiring Xilinx.
The target market for the card is, like its predecessor, the data center market. AMD’s principal customers are live streaming services and other interactive video services (think Twitch, cloud gaming, video conferencing, etc.), all of whom need to encode large numbers of video streams in real time in a server environment. So like AMD’s EPYC processors, this is very much a server part aimed at a select group of businesses.
Diving into the Alveo MA35D hardware itself, AMD is touting a significant generational upgrade over its predecessor. Whereas the Alveo U30 was an H.264 and H.265 encode card that could encode up to 8 1080p streams, the Alveo MA35D expands this considerably to 32 1080p streams. Meanwhile, support for the latest-generation AV1 codec has been added, joining the existing H.264 and H.265 options, and the maximum stream resolution has been increased from 4K to 8K, itself another quadrupling in pixel count.
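As a quick sanity check on the two "quadrupling" claims, the arithmetic can be sketched in a few lines of Python. The stream counts come from the figures above; the frame sizes are an assumption, since AMD only says "4K" and "8K", and the sketch takes those to mean the common UHD resolutions of 3840x2160 and 7680x4320.

```python
# Stream counts are the figures quoted for the U30 and MA35D.
U30_MAX_1080P_STREAMS = 8
MA35D_MAX_1080P_STREAMS = 32

# Assumed UHD frame sizes; the article itself only says "4K" and "8K".
UHD_4K = (3840, 2160)
UHD_8K = (7680, 4320)


def pixels(resolution):
    """Return the number of pixels in a (width, height) frame."""
    width, height = resolution
    return width * height


def scale_factor(old, new):
    """Return how many times larger `new` is than `old`."""
    return new / old


stream_scale = scale_factor(U30_MAX_1080P_STREAMS, MA35D_MAX_1080P_STREAMS)
pixel_scale = scale_factor(pixels(UHD_4K), pixels(UHD_8K))

print(f"1080p stream count: {stream_scale:.0f}x")
print(f"Maximum pixels per frame: {pixel_scale:.0f}x")
```

Both factors come out to 4x under those assumptions: doubling the width and the height of the frame quadruples the pixel count.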
At the heart of the card is AMD’s unnamed video encode ASIC, which they’re calling their Video Processing Unit (VPU). The MA35D contains two VPUs, each with their own 8GB pool of LPDDR5 memory and a PCIe 5.0 x4 connection back to the host processor. The VPU is being built on a 5nm process, though unusually AMD is not disclosing the fab being used, which makes us think it’s a Samsung 5nm process (ed: at this point, if someone is using TSMC, they’re usually bragging about it).
Under the hood, each VPU contains four video encode blocks, augmented with the various accessory blocks needed to make it a fully functional chip. Two of the encode blocks are full-featured, supporting H.264, H.265, and AV1, while the other two blocks are solely for AV1, underscoring the additional computational complexity of the new codec. Other blocks on the VPU include video decoder blocks for transcoding, memory controllers, management controllers, a bitrate scaler, composition engines, and a 22 TOPS throughput AI processor to further improve the card’s video encode quality.
As for the video encode blocks themselves, AMD’s engineers were quick to note that, despite the overlapping similarities between this part and AMD’s GPU efforts, the VPU’s video encode blocks are a unique design, and not pulled from AMD’s GPU video encode blocks. While I wouldn’t be surprised to see AMD eventually merge encoder IP across the product lines, for the current generation product the Alveo MA35D’s VPUs were in development before the Xilinx acquisition ever closed, so the former Xilinx team finished what they started. This means the VPUs are bound to come with their own set of quirks, but also, there’s a certain degree of pride from the Alveo team that they’ve built the better video encoder.
The VPU also marks the transition of the Alveo video encoder family to a fully ASIC-based product. Xilinx, of course, is best known for their programmable FPGAs, and while the previous Alveo U30’s processors used hard logic for their video encode blocks, that was combined with an FPGA fabric network. So that product was still a mix of ASIC and FPGA design. The MA35D’s VPUs, on the other hand, are tried and true ASICs with no FPGA elements, allowing the company to fully exploit the power efficiency benefits of using fixed function logic for a dedicated product.
And power efficiency is the other major gain over the older U30 card, and what AMD considers a significant edge over their competition as well. The formal TDP of the card is 50 Watts, but in practice AMD is finding that the typical power consumption of the card is closer to about 35 Watts, or a hair over 1W per stream for 1080p60. This is a 66% reduction in per-stream power consumption versus the U30, which was at a bit over 3W for a single 1080p stream.
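The per-stream math is easy to reproduce. The sketch below assumes the roughly 35W typical figure is spread across the card's full load of 32 1080p streams, and it uses a flat 3.0W for the U30 as a floor, since AMD only says "a bit over 3W"; the function names are purely illustrative.

```python
# Figures quoted by AMD; the 32-stream load is an assumption about how
# the "hair over 1W per stream" number was derived.
MA35D_TYPICAL_POWER_W = 35.0
MA35D_MAX_1080P_STREAMS = 32
# "A bit over 3W" per stream; 3.0 is used here as a lower bound.
U30_WATTS_PER_STREAM_FLOOR = 3.0


def watts_per_stream(card_power_w, streams):
    """Average power drawn per stream, in watts."""
    return card_power_w / streams


def reduction_percent(old_value, new_value):
    """Percentage reduction going from old_value to new_value."""
    return (1.0 - new_value / old_value) * 100.0


def implied_baseline(new_value, claimed_reduction_percent):
    """Old value that would yield the claimed reduction."""
    return new_value / (1.0 - claimed_reduction_percent / 100.0)


ma35d_w = watts_per_stream(MA35D_TYPICAL_POWER_W, MA35D_MAX_1080P_STREAMS)
floor_reduction = reduction_percent(U30_WATTS_PER_STREAM_FLOOR, ma35d_w)
implied_u30_w = implied_baseline(ma35d_w, 66.0)

print(f"MA35D: {ma35d_w:.2f} W per stream")
print(f"Reduction vs. a 3.0 W baseline: {floor_reduction:.1f}%")
print(f"U30 baseline implied by a 66% reduction: {implied_u30_w:.2f} W")
```

Under these assumptions the MA35D lands at about 1.09W per stream. Against a flat 3.0W baseline that is a reduction of a little under 64%, so the quoted 66% figure implies a U30 number closer to 3.2W per stream, which squares with "a bit over 3W."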
Meanwhile, new to the Alveo MA35D and its VPU is an AI acceleration block. Unlike GPU-based products, this is not for quasi-related AI tasks like image recognition; rather, AMD is using the AI accelerator to feed additional data into their video encoder to further improve their encoding quality. Rated for 22 TOPS of performance, the AI processor exists to evaluate streams on a frame-by-frame basis, and then use that analysis to adjust the encode parameters used by the rest of the chip.
Using both region-of-interest encoding and artifact detection, the AI processor essentially allows the MA35D to get away with lower bitrates than a more naïve video encode strategy. Region-of-interest encoding allows for parts of a video to receive higher quality encoding (text, faces, etc.), while artifact detection can catch when the encoder is being fed blocky or otherwise degraded images, which are actually harder to encode, and remove or correct those artifacts before a frame is sent off for encoding.
All told, AMD is making some fairly aggressive image quality claims with the Alveo MA35D; H.264 and H.265 image quality should be similar to the x264 Medium and x265 Medium presets respectively, while the card’s AV1 encoding quality should be comparable to AV1 slow. These comparisons are based on VMAF scores, and what settings it takes to achieve similar scores. Or to frame things on a bitrate basis, using AV1 AMD says the MA35D can deliver the same image quality as the Alveo U30 in H.264 mode at 55% of the bitrate (a 1.8x efficiency improvement).
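To see how the 55% bitrate figure maps to the 1.8x efficiency claim, the conversion is just a reciprocal. A minimal sketch follows; the 6000 kbps H.264 stream is a hypothetical example chosen for illustration and is not a number from AMD.

```python
# AMD's claim: AV1 on the MA35D matches U30 H.264 quality at 55% of the bitrate.
AV1_BITRATE_FRACTION = 0.55


def efficiency_gain(bitrate_fraction):
    """Convert 'same quality at X of the bitrate' into an efficiency multiplier."""
    return 1.0 / bitrate_fraction


def equivalent_bitrate_kbps(h264_kbps, bitrate_fraction=AV1_BITRATE_FRACTION):
    """Bitrate needed for the same quality, given the claimed fraction."""
    return h264_kbps * bitrate_fraction


gain = efficiency_gain(AV1_BITRATE_FRACTION)
# Hypothetical 6000 kbps H.264 stream, used purely as an illustration.
av1_kbps = equivalent_bitrate_kbps(6000)

print(f"Efficiency gain: {gain:.2f}x")
print(f"6000 kbps H.264 -> about {av1_kbps:.0f} kbps AV1 at the claimed ratio")
```

The reciprocal of 0.55 is roughly 1.82, which AMD rounds down to 1.8x.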
Finally, though secondary to the video encode capabilities of the MA35D, it’s interesting to note that the management processors in the VPU have shifted from Arm to RISC-V. Whereas the U30’s processors used quad-core Cortex-A53 cores, the MA35D VPU uses a pair of quad-core RISC-V cores, though AMD doesn’t specify whose. The RISC-V architecture has been quietly pushing out Arm for management controllers such as these, and this is another example of that transition in action.
With two VPUs, the complete Alveo MA35D card is still small enough that it comes in a half-height half-length form factor. With a 50W TDP, the card is entirely powered via the PCIe slot, and uses a PCIe x8 connector (which gets bifurcated down to x4 for each VPU). And, as is typical for data center accelerator cards, the MA35D is passively cooled.
According to AMD, the Alveo is sampling to partners now. The company expects to begin production shipments in the third quarter of the year, with a suggested retail price of $1595.