Linux support for Xe2 and power management will take time to mature, https://www.phoronix.com/forums/forum/linux-graphics-x-org-d...
Xe SR-IOV improves VM graphics performance. Intel dropped Xe1 SR-IOV graphics virtualization from the upstream i915 driver, but the OSS community has continued improving it in an LTS fork and is making steady progress: https://github.com/strongtz/i915-sriov-dkms/commits/master/ & https://github.com/Upinel/PVE-Intel-vGPU?tab=readme-ov-file.
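For anyone curious what that looks like in practice: once the i915-sriov-dkms module is loaded, the virtual functions are created through the standard SR-IOV sysfs knobs. A minimal sketch (the PCI address and VF count here are placeholders; check lspci on your own box first):

    #!/usr/bin/env python3
    # Minimal sketch: create SR-IOV virtual functions for an Intel iGPU once the
    # i915-sriov-dkms module is loaded. The PCI address and VF count are
    # placeholders; verify them on your own system before writing to sysfs.
    from pathlib import Path

    IGPU_PCI_ADDR = "0000:00:02.0"   # typical iGPU address, confirm with lspci
    NUM_VFS = 4                      # how many virtual functions to expose to VMs

    dev = Path(f"/sys/bus/pci/devices/{IGPU_PCI_ADDR}")

    total = int((dev / "sriov_totalvfs").read_text())   # max VFs the driver advertises
    print(f"device supports up to {total} VFs")

    # Writing sriov_numvfs creates the VFs; they then show up as extra PCI
    # devices that can be passed through to guests (e.g. in Proxmox).
    (dev / "sriov_numvfs").write_text(str(min(NUM_VFS, total)))

Writing 0 removes the VFs again. The missing piece upstream is the host driver support itself, which is roughly what the fork patches back in.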
I missed this. Wow this is disappointing.
I imagine SR-IOV would be useful for more advanced use cases.
Why would Intel give up that advantage by directing customers to software GPU virtualization that works on AMD and Nvidia GPUs?
There's lots of hardware competition for consumers, including upcoming Arm laptops from Mediatek and Nvidia. Intel can use feature-limited SKUs in both CPUs and GPUs to target specific markets with cheaper hardware and reduced functionality.
While on Lunar Lake the GPU and the video codec block are on the same tile, they are still in different locations on the compute tile.
In the new Arrow Lake S desktop CPU, to be announced tomorrow, the GPU is moved onto a separate tile, like in Meteor Lake. The other two blocks related to video output, i.e. the video codec block and the display controller block, are located on a tile that also contains the memory controller and part of the peripheral interfaces, and which is made on a lower-resolution TSMC process than the CPU and GPU tiles.
As far as I understand this is not true. It's a different engine within the graphics device, and it shares the execution units.
Most of the video you encode on a computer is actually done in software on the CPU, because the quality and compression efficiency are better.
I don't think that's true. I bought a ThinkPad laptop, installed Linux, and one of my issues was that watching YouTube video put the CPU at 60%+ load. The same video on a MacBook barely scratched the CPU at all. I finally managed to solve this issue by installing Arch. Once everything worked as it should, CPU load was around 10% for the same video. I didn't try Windows, but I'd expect things to work well there.
So most video, for the average user, is probably hardware decoded.
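On Linux you can at least check what the driver claims it can hardware-decode before blaming the browser. A rough sketch using vainfo from libva-utils (browsers still need VA-API explicitly enabled to actually use it, and the output format can vary a bit between drivers):

    #!/usr/bin/env python3
    # Rough check (Linux/VA-API): list the codec profiles the GPU driver says it
    # can hardware-decode. Assumes `vainfo` from libva-utils is installed.
    import subprocess

    out = subprocess.run(["vainfo"], capture_output=True, text=True).stdout

    decode_profiles = [
        line.strip()
        for line in out.splitlines()
        if "VAEntrypointVLD" in line   # VLD entrypoints are the decode paths
    ]

    print("hardware decode profiles reported by the driver:")
    for p in decode_profiles:
        print(" ", p)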
There is no reason to do decoding in software, when hardware decoding is available.
On the other hand, choosing between hardware encoding and software encoding depends on whether quality or speed is more important. For instance, for a video conference hardware encoding is fine, but for encoding a movie whose original quality must be preserved as much as possible, software encoding is the right choice.
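As a rough sketch of what that choice looks like with ffmpeg (assuming a build with Quick Sync and libx265 support; the file names, bitrate and CRF here are just placeholders):

    #!/usr/bin/env python3
    # Sketch of the hardware-vs-software encode trade-off using ffmpeg.
    # Assumes an ffmpeg build with Quick Sync (hevc_qsv) and libx265 support;
    # input/output names and quality settings are placeholders.
    import subprocess

    SRC = "input.mkv"

    # Hardware encode: fast and light on the CPU, fine for a video conference
    # or live streaming.
    subprocess.run([
        "ffmpeg", "-y", "-i", SRC,
        "-c:v", "hevc_qsv",      # Intel Quick Sync HEVC encoder
        "-b:v", "6M",            # target bitrate
        "hw_encode.mkv",
    ], check=True)

    # Software encode: much slower, but better quality per bit, the usual
    # choice when archiving a movie.
    subprocess.run([
        "ffmpeg", "-y", "-i", SRC,
        "-c:v", "libx265",       # software HEVC encoder
        "-preset", "slow",       # slower preset = better compression
        "-crf", "20",            # constant-quality mode
        "sw_encode.mkv",
    ], check=True)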
This is spot on. Video coding specs are like a huge bunch of tools, and encoders get to choose whatever subset of tools suits them. And then the hardware gets frozen for a generation.
It depends on what you care about more; you don't always need the best possible encoding, even when you're not trying to record/stream something in real time.
For comparison's sake, I played around with some software/hardware encoding options through Handbrake with a Ryzen 5 4500 and Intel Arc A580. I took a 2 GB MKV file of about 30 minutes of footage I have lying around and re-encoded it with a bunch of different codecs:
  codec  method    time     speed   file size  % of original
  H264   GPU      04:47   200 fps    1583 MB     77 %
  H264   CPU      13:43    80 fps    1237 MB     60 %
  H265   GPU      05:20   206 fps    1280 MB     62 %
  H265   CPU     ~30:00   ~35 fps    (would take too long)
  AV1    GPU      05:35   198 fps    1541 MB     75 %
  AV1    CPU     ~45:00   ~24 fps    (would take too long)
So for the average person who wants a reasonably fast encode and has an inexpensive build, many codecs will be too slow on the CPU, in some cases by close to an order of magnitude. If you encode on the GPU instead, you get much better speeds, the file sizes are still decent, and the quality of something like H265 or AV1 will in most cases look perceivably better than H264 at similar bitrates, regardless of whether the encode is done on the CPU or GPU.
So, if I had a few hundred GB of movies/anime locally that I wanted to re-encode to take up less space for long-term storage, I'd probably go with hardware H265 or AV1 and that'd be perfectly good for my needs (I actually did, and it went well).
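Back-of-the-envelope, assuming the 2 GB source is roughly 2048 MB, the numbers in the table above work out to something like this:

    # Quick arithmetic on the table above (2 GB source taken as ~2048 MB).
    ORIGINAL_MB = 2048

    runs = {
        # name: (encode time in seconds, output size in MB)
        "H264 GPU": (4 * 60 + 47, 1583),
        "H264 CPU": (13 * 60 + 43, 1237),
        "H265 GPU": (5 * 60 + 20, 1280),
        "AV1  GPU": (5 * 60 + 35, 1541),
    }

    for name, (seconds, size_mb) in runs.items():
        print(f"{name}: {seconds / 60:4.1f} min, {100 * size_mb / ORIGINAL_MB:4.1f}% of original")

    # GPU-vs-CPU speedups (the CPU H265/AV1 times are the rough estimates above):
    print("H264 speedup:", round((13 * 60 + 43) / (4 * 60 + 47), 1), "x")    # ~2.9x
    print("H265 speedup (est.):", round(30 * 60 / (5 * 60 + 20), 1), "x")    # ~5.6x
    print("AV1 speedup (est.):", round(45 * 60 / (5 * 60 + 35), 1), "x")     # ~8.1x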
Of course, that's a dedicated GPU, and Intel Arc is pretty niche in and of itself, but I have to say that their AV1 encoder for recording/streaming is also really pleasant, so I definitely think benchmarking this stuff is pretty interesting and useful!
For professional work, the concerns are probably quite different.
That was the case up to like 5 to 10 years ago.
These days it's all hardware encoded and hardware decoded, not the least because Joe Twitchtube Streamer can't and doesn't give a flying fuck about pulling 12 dozen levers to encode a bitstream thrice for the perfect encode that'll get shat on anyway by Joe Twitchtok Viewer who doesn't give a flying fuck about pulling 12 dozen levers and applying a dozen filters to get the perfect decode.
Certainly for some use cases speed and low CPU usage matter, but not for all.
[Edit: I think I initially misread you - but I agree, it's a huge differentiator]
In the homelab/home server space I always thought the OOB management provided by AMT/vPro is probably the biggest selling point. Manageability, especially OOB, is a huge deal for a lab/server. Anyone who used AMD's DASH knows why vPro is so far ahead here.
You would need an actual GPU, though, massively increasing cost, power usage etc. without providing any real value in return for many use cases. And AFAIK HW transcoding with Plex doesn't even work properly with AMD's iGPUs?
The N100 can transcode 4k streams at ~20w while costing barely more than a Raspberry Pi.
I feel like my Dad saying “turn off the damn lights” now that I gotta pay the ‘light bill’ on a machine that runs 24/7 with spinning disks.
GPU compute performance is both technically interesting, and matters to much more than simply gaming!
The last time I looked it was worth supporting because there was a 20 point gap in hardware support but that’s closed as each generation of hardware adds AV1 support.
If hardware-accelerated decoding works, you just feed the binary video blob to the driver and it returns decoded frames.
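A quick way to see the difference yourself is to have ffmpeg decode a file and discard the frames, once with the hardware path and once in software. A sketch assuming a Linux box with a working VA-API driver (the file name is a placeholder):

    #!/usr/bin/env python3
    # Sketch: time hardware-accelerated (VA-API) decode versus pure software
    # decode by decoding a file and throwing the frames away.
    import subprocess
    import time

    SRC = "sample_4k.mkv"   # placeholder input file

    def decode(extra_args):
        start = time.monotonic()
        subprocess.run(
            ["ffmpeg", "-v", "error", *extra_args, "-i", SRC, "-f", "null", "-"],
            check=True,
        )
        return time.monotonic() - start

    hw = decode(["-hwaccel", "vaapi"])   # the driver/GPU does the heavy lifting
    sw = decode([])                      # the CPU does everything
    print(f"hardware decode: {hw:.1f}s, software decode: {sw:.1f}s")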
They covered power quite a bit, but claimed the biggest power draw comes from memory access. I got the impression they were blaming AMD's higher memory bandwidth usage on its smaller cache, and hence treating it as a form of inefficiency. But higher frame rates are going to require more memory accesses, and the smaller cache should have less impact on the number of writes needed. IMHO just some top-line power consumption numbers are good, but trying to get into why one is higher than the other seems fruitless.
Extremely poorly. The worst of all deck-likes.
Their Tegra chips could do a lot in these laptop / handheld gaming devices.