91 points by geerlingguy a day ago | 6 comments
  • chuankl a day ago
    There is something wrong with some of those numbers.

    For example, take 7-Zip Compression 22.01. The CPU Power Consumption Monitor chart states:

    AmpereOne: Average 278.72W
    EPYC: Average 311.64W

    But the fine print under that same chart states:

    AmpereOne: 6968J per run
    EPYC: 14439J per run

    By the Joules-per-run numbers, AmpereOne is far more power efficient than the EPYC, requiring less than half the energy to complete a run.

    In that case, how could the average power of the EPYC be only 11.8% higher than that of the AmpereOne? For this benchmark the EPYC is 14.2% faster than the AmpereOne, and if the average power numbers are correct, the EPYC should have slightly lower Joules per run than the AmpereOne.

    That is not the only anomaly. For example, the CPU Power Consumption Monitor chart for John the Ripper 2023.03.14 also does not make sense.
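    A quick sanity check of the arithmetic (a hedged sketch; the watt, joule, and speed figures are the ones quoted from the charts above):

```python
# Energy per run = average power x time per run, so the EPYC/AmpereOne
# Joules ratio should equal (power ratio) / (speed ratio) if the
# published numbers are internally consistent.
ampere_w, epyc_w = 278.72, 311.64   # reported average power (W)
ampere_j, epyc_j = 6968, 14439      # reported energy per run (J)
epyc_speedup = 1.142                # EPYC finishes a run 14.2% faster

power_ratio = epyc_w / ampere_w
expected_j_ratio = power_ratio / epyc_speedup
reported_j_ratio = epyc_j / ampere_j

print(f"expected EPYC/AmpereOne Joules ratio: {expected_j_ratio:.2f}")
print(f"reported EPYC/AmpereOne Joules ratio: {reported_j_ratio:.2f}")
```

    The expected ratio comes out just under 1.0, while the reported ratio is above 2.0, so at least one of the published numbers has to be wrong.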

    • rowinofwin a day ago
      The averages are not the same as the median values; I think this is where some of the problem comes from. The plots have quartiles with the boundaries shown as lines. The line showing the median value for the Ampere system is near the middle of the plot, but the median value for the AMD plot is far over at the right end of that plot, suggesting that many of the results were in a narrow range just above that value. This would skew the total average energy consumption way up, so we would see the difference shown in average Joules per run. This is probably not a good type of plot for this kind of data; a scatter plot or line chart may be better.
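      A toy illustration of that skew (hypothetical watt figures, not the article's data):

```python
from statistics import mean, median

# A few high samples pull the mean well above the median, which is how
# a right-skewed power trace can show a modest median on the plot but a
# much larger average (and hence larger Joules per run).
runs_w = [280, 282, 285, 288, 290, 295, 300, 310, 520, 560]
print(f"median: {median(runs_w):.1f} W, mean: {mean(runs_w):.1f} W")
```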
    • telgareith a day ago
      Here's more: * the first chart has an n of 3 (Mythbusters' rocket car had n > 3) * the joules you reference have candle charts showing way too much variance to draw any conclusions.

      Never mind that these are all reduced to absurd levels, or biased.

      My favorite was some site crapping on an SSD that only managed 3GiB/s for 100GiB of data, then dropped to 500MB/s or something. But they didn't mention data transferred at all, just speed vs. time. Obviously pushing for the higher kickback on the SSD that costs 4x as much and uses 8x the power.

  • qwertox a day ago
    > The AmpereOne A192-32X bottomed out at 101 Watts during the idle periods while the EPYC 9965 went as low as 19 Watts

    Do these EPYCs usually go this low when idling? I ask because I'm considering getting one, but it would idle more than 50% of the time. Or would waiting for 5c make more sense?

    I find 19 Watts surprisingly low. I know that the mainboard and peripherals would consume more, but my system running a 5950X, which I'm planning to upgrade to an EPYC, idles at around 130 Watts.

    • chainingsolid 12 hours ago
      I own a 5950X; that 130 number sounds extremely wrong for idle. I'd recommend checking the settings in the BIOS. I'm getting around 7-20 depending on which number I read in Ryzen Master, so adding in the GPU idle of around 30-40, I'm at around half of 130 for the entire system! I'll also add that low idle power would make sense for a server owner, so I could see AMD prioritizing it.
    • c0balt 14 hours ago
      That idle is unusually high; you should check your CPU governor (if the host is Linux, use the amd_pstate driver with the balanced profile; otherwise refer to the Windows power settings). If you are talking about system idle (instead of CPU idle), then a dGPU can also have a significant impact.
    • Flemitplo 20 hours ago
      130?

      My systems normally idle at 30-60.

      There is a huge difference between mainboards, chipsets, etc.

      None of my systems idle at 130. Not even my 2x 4090 system for ML.

      • Tostino 16 hours ago
        Yeah, that is quite high. My 2x 3090 with a 7950X and 96GB of RAM idles around 70 watts.
  • gary_0 a day ago
    EPYC Turin Dense is TSMC 3nm and AmpereOne is TSMC 5nm, so that's to be expected.

    Given that most (all?) cutting-edge chips use TSMC nowadays, can you really have an apples-to-Apples comparison if the chips being compared aren't on the same process node?

    Unless you're comparing price/performance, since nowadays there's no guarantee that a process shrink will get you significantly cheaper transistors (RIP, Dr. Moore).

    • wmf a day ago
      It's a what you can buy today vs. what you can buy today comparison. Ampere chose to use N5 even though N3 was available and they are paying for that decision.
      • trhway a day ago
        Ampere MSRP is $5.5K vs $14K for the EPYC, with 1.6x worse performance at 1.2x better energy consumption. Looks like a reasonable option, and the more options the merrier.
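        Taking those figures at face value (MSRPs and the 1.6x gap as quoted in the comment, not independently verified), performance per dollar favors Ampere:

```python
# Performance per dollar, normalizing AmpereOne's performance to 1.0.
# Prices and the 1.6x performance gap are the figures quoted above.
ampere_price, epyc_price = 5_500, 14_000
epyc_perf = 1.6  # EPYC quoted as 1.6x faster

ampere_perf_per_k = 1.0 / (ampere_price / 1000)
epyc_perf_per_k = epyc_perf / (epyc_price / 1000)
print(f"AmpereOne: {ampere_perf_per_k:.3f} perf/$K")
print(f"EPYC:      {epyc_perf_per_k:.3f} perf/$K")
print(f"advantage: {ampere_perf_per_k / epyc_perf_per_k:.2f}x for Ampere")
```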
      • re-thc a day ago
        > Ampere chose to use N5 even though N3 was available

        Wasn't it just late? There were numerous delays.

        • wmf a day ago
          Yeah, that's their bigger problem; all their chips are years late. They probably should be shipping AmpereTwo on N3 by now.
    • qball a day ago
      >since nowadays there's no guarantee that a process shrink will get you significantly cheaper transistors

      That is because all cutting-edge chips use TSMC.

      No competition means price per transistor can stay consistent or even rise, which is one part of why most modern CPUs and GPUs have price/performance ratios that are the same or worse than their previous-generation counterparts.

      >can you really have an apples-to-Apples comparison if the chips being compared aren't on the same process node?

      Of course not, but that isn't going to stop people from doing it, nor is it going to stop people from going "x86 is dead" when comparing last-gen-node AMD processors to CPUs only Apple can use (conveniently forgetting that Qualcomm's products underperform at the same process node).

      • aurareturn a day ago
        M3 on the same N3B node is 2-3x more efficient than Lunar Lake. M3 is also straight up faster.

        Qualcomm’s X Elite matches or exceeds Intel Lunar Lake on an older N4P node in efficiency and speed.

        Sources: https://www.notebookcheck.net/Intel-Lunar-Lake-CPU-analysis-...

        https://youtu.be/ymoiWv9BF7Q

        • adrian_b 20 hours ago
          It is quite difficult to compare the true efficiency of Apple and non-Apple computers: only a few useful applications run on both kinds of computers, those who use non-Apple computers typically have no direct access to any Apple computer, and those who use Apple computers have usually never used a good non-Apple computer (I would not consider any of the old Intel-based Apple computers good).

          Of the very few benchmarks that can compare Apple with non-Apple, I have never seen any where an M3 was 2-3x more efficient than Lunar Lake, so a link would be appreciated.

          On the contrary, most if not all benchmarks measuring battery lifetime showed better values for Lunar Lake, implying better efficiency.

          Other than by battery lifetime, I cannot see how you can test the efficiency of an Apple computer, except by using a power and energy measurement instrument at the wall socket, because in none of the reviews of Apple computers have I seen any mention of accurate internal power sensors exposed to the user.

          An M3 is definitely much more efficient in single-threaded execution than Lunar Lake, which is due to having a higher IPC and a lower clock frequency.

          On the other hand, in multithreaded applications there is very little efficiency difference between different CPU microarchitectures that are implemented in the same TSMC process.

          • amelius 20 hours ago
            > only few useful applications can run on both kinds of computers

            GCC, Gimp, Firefox, ...

            • adrian_b 20 hours ago
              Obviously, there are a lot of open-source applications that can be run on Apple computers.

              However I have never seen published benchmarks for them.

              A benchmark that would be valid for comparing the efficiency of an Apple computer with a non-Apple computer would be to compile a big software project using gcc. A cross-compilation of the project would be more accurate, because with native compilation the targets differ between the two machines, so the compiled files might not be the same.
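              A sketch of such a timing harness (the workload below is a stand-in so the snippet runs anywhere; a real comparison would substitute a pinned cross-compile command targeting a fixed triple):

```python
import subprocess
import sys
import time
from statistics import median

def time_command(cmd, runs=3):
    """Median wall-clock time of running cmd several times."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        samples.append(time.perf_counter() - start)
    return median(samples)

# Stand-in workload; replace with a fixed-target compile of a pinned tree.
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"median wall time: {elapsed:.3f} s")
```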

              • amelius 19 hours ago
                True, although the efficiency of the instruction set should perhaps also be part of the benchmark since many applications nowadays are JIT-compiled.

                Also, there are benchmarks for browsers which you could run on both types of computer.

    • amelius a day ago
      Shouldn't all the credits go to TSMC anyway? I mean coming up with an architecture for a GPU is no small feat, but it's nothing compared to building a fab with the capabilities of TSMC's.
  • renewiltord a day ago
    Out of curiosity, why are the Ampere processors cloud-only? I can fit an Epyc based machine easily and have an integrator ship me something.

    But top of the line ARM machines are really hard to get a hold of. We need an OpenAI for ARM ;)

    • wmf a day ago
      Oracle buys the first year of production then they become available to the public later. AmpereOne should hit NewEgg around the time it becomes completely obsolete.
  • snvzz a day ago
    ARM sure isn't the future.

    RISC-V is.

    • yjftsjthsd-h a day ago
      I hope so, but it clearly isn't the present, unless you're aware of a RV processor in this league that I don't yet know about?
    • deadmutex a day ago
      The future can be 6 days from now or 6 centuries from now. This statement is useless without specific details.
      • readthenotes1 a day ago
        But by providing such details the statement goes from unknowable to unknown and potentially verifiable at some point.

        Avoiding falsifiable statements is a skill set that might be worth having in your communications toolkit.

        (I remember reading that some philosophy school had {True, false, unknown, unknowable} but, alas, cannot find any reference to that just now)

    • crest a day ago
      Sure buddy. Just one little thing: please tell me where you found an RV64GCV system with comparable throughput, as well as throughput per watt, instead of a ~100MHz in-order dual-issue toy core that doesn't exist outside FPGAs (and emulation).
      • netr0ute a day ago
        The Milk-V Pioneer has 64 out of order cores and supports 128GB of ECC memory!
        • SigmundA a day ago
          Its Sophon SG2042 SoC has about the same per-core performance as an A72, like in an RPi 4 or a Graviton 1 from 2018...
          • ramon156 21 hours ago
            I don't know why people expect RISC-V to already be on the level that ARM and x64 are. The fact that RISC-V even exists to begin with is amazing.

            My opinion is definitely biased, though. Only time will tell

            • seanw444 15 hours ago
              The fact that large corporations like Google and Facebook have an incentive to back a better alternative to x86 and ARM for the data center is very beneficial too, and can only speed development up.