Keep these things the hell away from the people who develop Chrome and desktop JS apps.
At this point we may need TSMC to make a specialized chip to run Electron.
If you disassemble some ARMv8 binaries that aren't dealing with JavaScript, you do still see FJCVTZS.
https://web.archive.org/web/20201119143547/https://twitter.c...
> Uncontended acquire-release atomic operations are basically free on Apple Silicon
While I don't doubt you, the poster, specifically, how is this possible? To be clear, my brain is x86-wired, not ARM-wired, so I may have some things wrong. Most of the expense of atomic inc/dec is "happens before", which essentially guarantees that before the current core reads that memory address, it sees the latest shared value. How can this be avoided? Or is it not avoided, but just much, much faster than x86? If the shared value was updated on a different core, some not-significant CPU cycles are required to update the L1 cache on the current core with the latest shared value.
> some not-significant CPU cycles
should say: > some not-insignificant CPU cycles
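To make "uncontended acquire-release" concrete, here's a minimal C++ sketch of the refcount-style inc/dec being discussed (the memory orders are the usual shared_ptr-style choices, nothing Apple-specific; whether the operation is cheap mostly comes down to whether the cache line already sits in the executing core's L1):

    #include <atomic>

    // Atomic inc/dec as used for reference counting.
    std::atomic<int> refcount{1};

    void retain() {
        // The increment needs no ordering of its own.
        refcount.fetch_add(1, std::memory_order_relaxed);
    }

    bool release() {
        // Acquire-release on the decrement so the last owner observes all
        // prior writes before tearing the object down.
        return refcount.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }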
I am looking forward to the HTML Frameworks explosion. You thought there were too many JS options? Imagine when anyone can fork HTML.
I do strongly agree that <canvas> elements should not be used to replace HTML/CSS! My personal web hierarchy is 1. HTML/CSS/images; 2. Add (accessibility-friendly) JS if some fancy interaction is useful; 3. More complex - try SVG/CSS; 4. use <canvas> only if nothing else meets the project requirements.
I’ve found some resources but when I look at them I also hear stories of blind people saying these guidelines only make things worse.
Regarding the vague criticism you mention, I'd need something more concrete to tell you if the rumors are truish...
Isn't that Flutter?
Of course, I could also imagine one that reads the C and provides the equivalent html/css/js. And others might scoff "why not just compile the whole C app into wasm", which would certainly be plenty performant in a lot of cases. So I guess I don't know why it isn't already being done, and that usually means I don't know enough about the problems to have any clue what it would actually take to make such things.
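As a toy illustration of the "compile the whole C app into wasm" route (file name and flags are just an example; the exact options depend on your Emscripten setup):

    // hello.cpp -- stand-in for "the whole C/C++ app". With Emscripten
    // installed, something along the lines of
    //     emcc hello.cpp -o hello.html
    // emits a .wasm module plus the HTML/JS glue to run it in a browser.
    #include <cstdio>

    int main() {
        std::printf("Hello from WebAssembly\n");
        return 0;
    }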
In any case, I'm also looking forward to a quantum leap in web app UI! I'm not quite as optimistic that it's ever going to happen, I guess, but I can see a lot of benefit, if it did.
For simple components, I much prefer them to firing up the React ecosystem.
It seems like vertical scaling has fallen out of fashion lately, and I’m curious if this might be the new high-water mark for “everything in one giant DB”.
It may only be a few mm to the LPDDR5 arrays inside the SoC, but there are all sorts of environmental/thermal/power and RFI considerations, especially on tiny (3-5nm) nodes! Switch on the numerical control machine on the other side of the wall from your office and hope your data doesn't change.
I found one here from Supermicro: https://www.supermicro.com/en/products/motherboard/X13OEI-CP...
Has anyone seen one of these in action? What was the primary use case? Monolithic database server?
Is it the usual Apple distortion effect where fanboys just can't help themselves?
It's definitely a sizeable amount of RAM though, and enough to run the majority of websites out there. But so would a budget Linux server costing maybe 100-200 bucks per month.
I would be interested as well in what an on-chip memory bank would do for an EPYC or similar system, since exotic high-performance systems are fun even if all I'll ever touch at this point is commodity stuff on AWS and GCP.
And that wasn’t even where it topped out, there were servers supporting 6TB of DDR3 in a single machine. DDR4 had at least 12TB in a single (quad-CPU) machine that I know of (not sure if there were any 96*256GB DDR4 configs). These days, if money’s no object, there exist boards supporting 24TB of DDR5. I think even some quad-CPU DDR2-era SKUs could do 1.5TB. 512GB is nothing.
(Not directly in response to you, just adding context.)
You misunderstood my post, and I don't appreciate the tone of your reply.
While I believe that you meant to write about the different performance profile of on-chip memory, that's not what your post said at the time I wrote my reply. What you actually wrote was how 512 GB of RAM might revolutionize e.g. database servers, which is what I addressed.
And if you hadn't written that, I wouldn't have written my comment either, because I'm not a database developer who could speculate on that kind of performance side-grade (less memory, but closer to the CPU).
Know that scene from one episode of Aqua Teen Hunger Force where George Lowe (RIP) is a police officer and has his feet amputated, so he drags himself while pursuing a suspect?
Yeah. It does that.
I can even use VS Code remote on it in a pinch, though that's pushing it...
We had some hefty rigs at the last studio I worked at.
The old xeon stations were power houses.
The joke being that Apple realized that so many apps are built in Electron and made a decision to provide a shit ton of RAM just to handle Electron. It seems very on point to the discussion.
At this point, it's more satirical than haha funny. Electron is so bloated that it requires way more RAM than, say, native apps. Poking fun at its inefficiencies isn't going to win Last Comic Standing, but it is valid criticism even if delivered in a humorous manner. Just because it's stuck in your craw doesn't mean the rest of us are in the same place as you, yet you are unwilling to accept that your view isn't the only view.
I actually almost totally agree with the perspective the “joke” comes from! I just don’t see it as a topic that warrants so frequently disrupting otherwise interesting discussion.
I really think a sizable chunk of people in the “omg my RAM!” camp are basing it on vibes, backed up by a misread of reported usage.
This reminds me of a long time ago when I was trying to figure out why the heck my Intel Mac was allocating all my RAM and most of my swap to Preview or Chess.
It’s true that reported memory allocation does not equal actual memory used, and that’s very clever of everyone who brings it up, but it does actually cause real annoyances.
I thought ublock was forced out of Chrome months ago... how are you people still using it? I switched back to Firefox a couple years ago already, even if it's occasionally painful.
But in my example I was thinking of a particular 2-month stretch where this kept biting me and I was using Chrome at that point. In terms of memory usage, Firefox is no better though (at one point it was, but not any more).
Now I'm afraid of saying "memory usage" lest someone pops out to comment "that's not how memory works" like whack-a-mole.
There are many specialized allocation patterns -- especially for larger system things like DBs, virtual machines / runtimes, etc. -- that will mmap large regions and then only actually use part of them. Many angry fingers get pointed, often without justification.
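For the curious, a minimal sketch of that reserve-big/touch-little pattern (sizes picked arbitrarily):

    #include <sys/mman.h>
    #include <cstdio>
    #include <cstring>

    // Reserve a large anonymous mapping, then touch only a sliver of it.
    // Naive "memory usage" readings count the whole reservation; the
    // resident set only grows for pages that actually get written.
    int main() {
        const size_t reserved = 64ull << 30;  // 64 GB of address space
        void* base = mmap(nullptr, reserved, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }

        std::memset(base, 0, 1ull << 20);     // actually use only 1 MB
        // Virtual size: ~64 GB. Resident set: ~1 MB plus bookkeeping.
        munmap(base, reserved);
        return 0;
    }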
And this attitude of "oh, memory usage problems are a misreading of top" promotes poor memory management hygiene - there's a strong argument that it's all fine in server applications / controlled environments, but for desktop environments this attitude causes all sorts of knock-on effects.
With the CPU also having more oomph, I ordered an M4 Max with 64 GB of RAM for my video/photo editing; I may be able to export H.265 content at 4K/high settings with greater-than-realtime performance...
I'm a little sad that the AI narratives have taken over all discussion of mid-tier workstation-ish builds now.
It really is cool to see. It’s nice that that kind of horsepower isn’t limited to the likes of proper “big iron” like it once was and can even reasonably be packaged into a laptop that is decent at being mobile and not an ungainly portable-on-a-technicality behemoth.
I’m sure whichever I end up with will be a pretty big upgrade over my basest of base model 32GB M1.
The thing I’m most curious about on the new machines, especially the Ultra, is the thermals. I only care about perf per watt if it becomes unfavorable enough that the fan spins up above idle during normal tasks. On my M1, the only way to get it to audibly spin up is to get the machine to near total load and hold it there for some time.
Disappointing for those of us who don't care about power consumption in a desktop.
>macOS Sequoia completes the new Mac Studio experience with a host of exciting features, including iPhone Mirroring, which allows users to wirelessly interact with their iPhone, its apps, and notifications directly from their Mac.
So that's their highlight for a pro workstation user.
If that sounds too negative, compare their current vision for their products with Steve Jobs' old vision of "a bicycle for the mind". iOS-type devices are very useful, but unleashing new potential, enabling generational software innovation, just isn't their thing.
(The Vision Pro is "just" another kiosk product for now, but it is hard to tell. The Mac support suggests they MIGHT get it. They should bifurcate:
1. A "Vision" can be the lower cost iOS type device, cool apps and movies product. Virtual Mac screen.
2. A future "Vision Pro that is a complete Mac replacement, the new high end Apple device, filled out spacial user interface for real work, etc. No development sandbox, Mx Ultra, top end resolution and field of view, raise the price, raise the price again, please. It could even do the reverse kind of support, power external screens that continued working like first class virtual screens, when you needed to share into the real world.
The Vision Pro should become a maximum powered post-Mac device. Not another Mac satellite. Its user interface possibilities go far beyond what Mac/physical screens will ever do. The new nuclear powered bicycle for the mind. But I greatly fear they want to box it in, "iPad" everything, even the Mac someday.)
$400 to go from 1TB to 2TB.
$307/TB to go from 1TB to 16TB.
That is 3 times the Amazon prices: https://diskprices.com/?locale=us&condition=new&capacity=4-&...
So far it's been working quite well with the exception that VSCode does not seem to understand how to update itself if you keep it in the external Applications folder: every time it tries to update itself it just deletes itself instead. Moved it back into the /Applications folder and it's been fine.
Mac OS has always been able to do this.
I don't get why they couldn't be arsed to stuff a few M.2 slots in there. They could keep the main NAND as their weird soldered-on BS with the firmware stuffed in a special partition if they want. Just give us more room!
"includes six internal PCIe 4.0 slots for expansion. It does not support discrete GPUs over PCIe." uhhh, so in case people want an AS chip with most stuff soldered on but also really need certain PCIe cards that aren't GPUs?
I don't mind them charging say $50 "Apple premium" for the fact it's a proprietary board and needs firmware loading onto the flash but the multiplicative pricing is bullshit price gouging and nothing more.
And most (me included) would still end up buying the device anyways, maybe just with less storage than they want. And then need to upgrade earlier.
From Apple’s perspective, they seem to have figured it out.
And maybe the upgraded configurations somewhat subsidize the lower end configurations?
It's the beauty of having a product with no real competition in the market.
(BTW, I use Linux as my home and work OS. But I'm a super geek and 20+ years full-stack dev... not their target market, as I can handle the quirks and thousand papercuts of Linux)
1: https://www.ifixit.com/Guide/How+to+Replace+the+SSD+in+your+...
I'd expect an upgrade route for the new Mac Studio will appear.
Here's one YouTube video showing an upgrade to 8TB of SSD storage: https://www.youtube.com/watch?v=HDFCurB3-0Q
The question is whether an LLM will run with usable performance at that scale. The point is that there are diminishing returns: even with the increased processing speed of the new M3 chip for AI, having enough unified RAM doesn't help much if the memory bandwidth stays the same.
Yes.
The reason: MoE. They are able to run at a good speed because they don't load all of the weights into the GPU cores.
For instance, DeepSeek R1 uses 404 GB in Q4 quantization[0], containing 256 experts of which 8 are routed to[1] (very roughly 13 GB per forward pass). With a memory bandwidth of 800 GB/s[2], the Mac Studio should be able to output roughly 800/13 ≈ 62 tokens per second (see the sketch after the references).
[0]: https://ollama.com/library/deepseek-r1
[1]: https://arxiv.org/pdf/2412.19437
[2]: https://www.apple.com/newsroom/2025/03/apple-unveils-new-mac...
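A trivial back-of-the-envelope version of that estimate, with the assumed numbers above (800 GB/s bandwidth, ~13 GB of active-expert weights per token) baked in as constants:

    #include <cstdio>

    // Upper bound on decode rate for a bandwidth-bound MoE model: each token
    // has to stream roughly the active experts' weights from memory, so
    // tokens/s <= bandwidth / bytes_read_per_token.
    int main() {
        const double bandwidth_gb_s = 800.0;  // M3 Ultra memory bandwidth
        const double active_gb      = 13.0;   // assumed active weights per token (Q4)
        std::printf("~%.0f tokens/s upper bound\n", bandwidth_gb_s / active_gb);
        return 0;
    }

In practice, overheads (KV cache reads, attention, activation traffic) push the real number below that ceiling.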
You don’t know which expert you’ll need for each layer, so you either keep them all loaded in memory or stream them from disk.
That said, it is possible to train a model in a quantization-aware way[2][3], which improves the quality a bit, although not higher than the raw model.
Also, a loss in quality may not be perceptible in a specific use-case. Famously LMArena.ai tested Llama 3.1 405B with bf16 and fp8, and the latter was only 2 Elo points below, well within measurement error.
[0]: https://github.com/ggml-org/llama.cpp/blob/master/examples/q...
[1]: https://github.com/ggml-org/llama.cpp/discussions/5063#discu...
But if you don't already know, the question you're asking is not at all something I could distill down into a sentence or two that would make sense to a layperson. Even then, I know I couldn't distill it at all, sorry.
Edit: I found this link I referenced above on quantized models by bartowski on huggingface https://huggingface.co/bartowski/Qwen2.5-Coder-14B-GGUF#whic...
For bigger models (in the 8B-70B range), Q4_K_M is very good; there is no degradation compared to full FP16 models.
It could track different hardware configurations and reasonably standardized benchmark performance per model. I know there are benchmarks buried in the GitHub Llama repository.
We need a SWE-bench for open-source LLMs, and for each model to have 3DMark-like benchmarks on various hardware setups.
I did find this which seems very helpful but is missing the latest models and hardware options. https://kamilstanuch.github.io/LLM-token-generation-simulato...
I get why he calls it a simulator, as it can simulate token output. That's an important aspect for evaluating a use case if you need a sense of how much token output matters beyond a simple tokens-per-second figure.
Not the size/amount, but the memory bandwidth usually is.
The Mac ecosystem is starting to feel like the PC world. Just give me 3 options: cheap, good, and expensive. Having to decide how many dedicated graphics cores a teenager's laptop needs is impossible.
For example, I got the M1 Max when it was new. A year later the M2 came out. Spec-wise, the M1 Max was still a bit better than the M2 Pro in many regards. To me, getting a Max buys you some future proofing if you or your company can afford it (and you need that kind of performance). I use the Max with a lot of video work, and it's been fantastic.
Imagine, my Apple TV doesn’t even have a power button! My MacBook yells at me if I accidentally press it when using Touch ID!
Worst of all, it always worked fine on my previous Hackintosh!
I'm never going to pay 10k for that though. Hopefully cheaper hardware options are coming soon.
Maybe there's not much market right now, but who knows if DeepSeek R3 or whatever will need something like this.
It would be awesome to be able to have a high-performance local-only coding assistant, for example (or any other LLM application for that matter).
I still see laptops selling with 8GB of memory, and IMO we should be well past this by now, with 32GB as the minimum. My work laptop still only has 16GB.
How often are you using the power button on your Mini? What is your use case?
Maybe Apple should remove power off from the UI menus if they're claiming it uses less energy to leave it on.
(I'm dubious of that claim people are repeating here, but what the hell do I know I'm just a physicist. Reality distortion isn't my thing.)
The Mini is probably less power-hungry than the MacBooks (fewer components). I have some Wi-Fi 5/ac routers that consume more power at idle (nothing connected to them) than Apple laptops.
Every single day, not by choice but because it's constantly waking up from sleep mode to do maintenance tasks, then overheating and shutting down again. Something about macOS and Bluetooth devices not playing nice.
If you never turn off your computer, it makes sense that you never use the power button. But some people do turn their computers off, and for us, it's really useful to be able to turn them on again.
Even if you can power it on using a wired keyboard though, I'm certain that you can imagine people who prefer wireless keyboards but also turn their computer off.
Put simply, more people like the aesthetic of no visible power button than like the aesthetic of daily rebooting their computer.
If I were you, and I really couldn’t let go of that, I’d put the Mac to sleep and have it scripted to restart at e.g. 6AM each day. You get the best of both worlds. Feel like you have a “fresh” Mac every morning. Let it do its updates and whatnot behind the scenes.
What are people using for LLMs on Macs? Is it ggml?
There's also Apple CoreML, which is sort of like ONNX in that it provides a limited set of primitives but if you can compile your model into its format, it does good low-power edge inference using custom hardware (Neural Engine).
Apple also provide PyTorch with MPS, as well as a bunch of research libraries for training / development (axlearn, which is built on JAX/XLA, for example).
They also have a custom framework, Accelerate, which provides the usual linear algebra primitives using a custom matrix ISA (AMX), and on top of that, MLX, which is like fancy accelerated numpy with both Metal and AMX backends (and slower CPU backends for NEON and AVX).
Overall, there's a lot you can do with AI on Apple Silicon. Apple are clearly investing heavily in the space and the tools are pretty good.
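As a small taste of the Accelerate path mentioned above, a minimal matrix multiply through its BLAS interface (build with -framework Accelerate; whether it gets routed to the AMX units is up to the library, not this code):

    #include <Accelerate/Accelerate.h>
    #include <cstdio>

    // C = A * B for two 2x2 single-precision matrices via Accelerate's BLAS.
    int main() {
        const int n = 2;
        float A[] = {1, 2,
                     3, 4};
        float B[] = {5, 6,
                     7, 8};
        float C[4] = {0};

        cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, 1.0f, A, n, B, n, 0.0f, C, n);

        std::printf("%g %g\n%g %g\n", C[0], C[1], C[2], C[3]);  // 19 22 / 43 50
        return 0;
    }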
(Maybe this is a feature that Apple has supported for a while, but I am unaware) Does this mean they will be officially supporting all PCIe devices like GPUs? Or do they only mean certain PCIe components like SSD expansions and network interfaces?
Edit: BlackMagic was what I was thinking of https://support.apple.com/en-gb/102103. 'Requires intel processor'
For me, the main value proposition from Macs are in their laptop offerings.
this new one comes with up to 512GB of unified RAM!
The M3 Ultra seems strictly better but is also significantly more expensive.
I'm looking forward to trying Nvidia's little set-top box if it actually ships; it should have higher memory bandwidth. But still, I'll probably set up a system where I email a query with attachments and just let DeepSeek email me back once it's finished reasoning at 10 T/s.
Snark aside, in case you're seriously asking: it's a PR thing. Generation to generation might not show much difference in direct comparisons that makes the crowd ooh and ahh. The M1 chip was the first Apple Silicon chip, so going back to their first one as the basis of comparison provides more oohs and ahhs. The charts look pretty this way too.
For people who are making a forced purchase, comparisons don’t matter.
For people who are content with what they’ve got, comparing against the oldest still-popular market segment offers a clear statement of improvement and helps long-term users calibrate when to make their next purchase.
My partner and I are both still on M1 (our personal machines) and don’t really see the need to upgrade.
I hate that this naming shit has gotten so bad
I suppose I commented here because I think people are letting their subjective distaste for those terms sway their opinion of a superior naming scheme.
Is an Intel 10700K faster than a 12400F? The generations are different but the chips have vastly different capabilities and features.
M4 is the generation. The modifier modifies the generation. M4 Pro is an M4 with some extra pizzaz. M4 Max is an M4 with lots of extra pizzaz.
wtf are you talking about? Intel, Nvidia, and AMD all absolutely have complete specs for their products readily available on their respective websites. Much, much more complete ones than Apple does as well.
But the nice thing is you search the model name, and Intel gives you all the specs upfront.
Remember when it was just a MacBook, or Air, or Pro, and it had a year?
It’s definitely a little odd to have M3 Ultra > M4 Max, but I feel like anyone complaining about this must have never bought any other manufacturers’ wares in their lives. Obtuse complication is kind of the norm in this industry.
(For reasons best known to themselves, Apple made two completely different 13" MBPs that year, both new, with the loathed butterfly keyboard, weighing a different amount, with different processors, and the same name.)