71 points by rbanffy 9 days ago | 9 comments
  • llm_nerd 9 days ago
    For those wondering where this fits relative to the platforms we're usually more accustomed to: they sell these to banks and financial services firms that have a long history of mainframes and are basically just upgrading in place with minimal change. With recent iterations they've added some AI processing on the silicon, taking baby steps toward imbuing solutions like fraud detection with on-chip neural nets.

    But to put it in context, the 24 TOPS they advertise -- the inference performance of the AI module on their Telum II -- doesn't even match an M4's neural engine (40 TOPS), and it's dwarfed by a dedicated device like an H100 SXM, which can hit 4000 TOPS (yes, 166x more). For both the M4 and the H100 I'm citing quantized numbers, but presumably the Telum II figure is quantized too, given how it boasts about its quantized support.

    Massive caches. Tonnes of memory support. Neat device. A "you won't get fired for leasing this" solution for a few of the Fortune 500.

    But you can almost certainly build a device that's orders of magnitude faster in just about every dimension using more traditional hardware stacks, and for a fraction of the price.

    • claudex 9 days ago
      It's 24 TOPS per CPU, so 192 TOPS for a full mainframe. And you can add Spyre accelerators, which have 32 AI units per card (so 32x24 TOPS) and up to 48 cards per mainframe (so 48x32x24 = 36,864 TOPS). But yeah, you could buy a lot of H100s for the price of a mainframe with such a configuration.
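      A quick back-of-envelope check of those aggregates (all figures are the ones quoted in this thread, not official IBM spec-sheet numbers):

          # TOPS figures as quoted above -- thread hearsay, not spec sheets
          telum2_tops = 24            # per CPU, on-chip AI accelerator
          cpus_per_system = 8         # full mainframe, per the parent comment
          spyre_units_per_card = 32   # AI units per Spyre card
          spyre_cards_max = 48        # max cards per mainframe

          on_chip = telum2_tops * cpus_per_system                       # 192
          spyre = spyre_cards_max * spyre_units_per_card * telum2_tops  # 36864
          print(on_chip, spyre)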
    • rswail 9 days ago
      > But you can almost certainly build a device that's orders of magnitude faster in just about every dimension using more traditional hardware stacks, and for a fraction of the price.

      Using the phrase "more traditional hardware stacks" when comparing to mainframes is sorta funny.

    • UltraSane 9 days ago
      IBM strictly forbids anyone from publishing performance benchmarks of their mainframes, which should tell you that they are pretty slow for the money.
      • hulitu 9 days ago
        > which should tell you that they are pretty slow for the money.

        It depends. Some years ago, the memory was working at half the bus speed.

        • UltraSane 8 days ago
          If IBM mainframes had amazing MIPS/$, IBM would make sure the world knew it.
          • rbanffy 8 days ago
            CPU MIPS is just a small part of the equation here. A mainframe has lots of specialized hardware dealing with IO that allows the CPU to be running application code close to 100% of the time. A pure CPU-intensive benchmark would be misleading, and the restriction on publishing benchmarks is not really aimed at users, but at the competition - who'd be more than happy to compare machines with different designs and different performance characteristics to IBM's designs.

            MIPS for MIPS, IBM itself offers solutions with higher processing power than mainframes in its POWER and IBM i lines of machines. And you can surely assemble an x86 or ARM monster with a lot of cores, but it won't match the performance of the Z for the workloads it is designed for - online transaction processing.

            • UltraSane 7 days ago
              I'm sure the absolute performance of the Telum II CPU is very good, since it runs at 5.5GHz and has 36MB of L2 cache per core. But I'm talking about performance per dollar, and this is where IBM mainframes fail miserably, because IBM charges through the nose for them.

              I would not be surprised if an IBM mainframe costs 10x what an x86 or ARM server of equal performance does.

              • rbanffy 6 days ago
                Performance is not a single number. Mainframes are terrible choices for HPC, AI training, data analytics, etc. Where they excel is that narrow niche of high-volume, real-time transaction processing. To get there, the whole design of the machine is carefully balanced - the right number of cores per socket, the cache sizes (and the dynamic allocation of L3 and L4 according to pressure, trying to minimize eviction), all the way up to CICS, z/OS, and the way they allocate memory and processing power.
                • UltraSane 6 days ago
                  The main selling points of z/OS and the hardware are the amazing uptime and 60 years of backwards compatibility. There isn't a single computational workload that an IBM mainframe can do cheaper than an x86 server/cluster.
  • CuriousRose 9 days ago
    Being ~50% faster (a guesstimate based on the article), what's the average time a mainframe stays in the fleet? Surely the significant energy and data-centre floor-space savings might even capture some customers from the previous generation looking to upgrade? Or is it a 5-10 year cycle?
    • trollbridge 9 days ago
      Mainframes are often leased, with leasing charges based on how many CPU cores are used (regardless of how many are physically present).
      • rbanffy 9 days ago
        One cool feature of the z16 was that all cores could be active during startup; then the cores that aren't licensed (paid for) shut down, and performance drops to the agreed limit. I do the same with my KVM machines on the server under my desk - all are configured with more cores than needed and, when boot finishes, most cores are removed from process scheduling, leaving just the number the VM is supposed to use during normal operation. The result is a significantly faster startup and significantly lower power consumption (and resource contention).
        • dijit 9 days ago
          Blog post worthy.

          I wasn’t even aware you could tell the kernel not to schedule on some cores.

          Yet, it's as easy as:

              # 0 = offline the core; echo 1 to bring it back
              echo 0 > /sys/devices/system/cpu/cpu{num}/online
          • UltraSane 9 days ago
            I recently learned you can do this on Windows as well. The kernel won't schedule anything to run on the reserved cores, but you can pin high-priority processes to them. Good for a polling loop.
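            A minimal sketch of the pinning side using the psutil package (the core number is made up for illustration; on Windows you'd pair this with a boot-time core reservation):

                import psutil

                p = psutil.Process()     # current process
                p.cpu_affinity([3])      # pin to (hypothetically reserved) core 3
                print(p.cpu_affinity())  # -> [3]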
        • anonfordays 9 days ago
          Your comment isn't clear as to why. If you assign two vCPUs to a KVM VM, why would limiting the VM's kernel to only scheduling on one vCPU increase the speed? The vCPUs are typically all on one physical CPU anyway, unlike a mainframe. You can also set affinity/processor pinning in KVM if you have more than one physical CPU/core, with the VM being none the wiser.
          • rbanffy 8 days ago
            This caps host CPU usage from the VM after the startup processes finish. During startup, a process in a VM can use all of the host cores the VM was configured with. Then, when startup is finished and the VM starts doing what it's supposed to do, its core usage is limited, preventing the noisy-neighbor problem. Individual VM performance will be lower, but the host will be more responsive.
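            A minimal guest-side sketch of that trick (assumes a Linux guest, root, and an illustrative target of 2 cores; cpu0 usually can't be offlined and has no 'online' file):

                import glob, os

                KEEP = 2  # cores to leave online after startup (illustrative)

                cpus = sorted(glob.glob("/sys/devices/system/cpu/cpu[0-9]*"),
                              key=lambda p: int(p.rsplit("cpu", 1)[1]))
                for path in cpus[KEEP:]:
                    online = os.path.join(path, "online")
                    if os.path.exists(online):  # cpu0 lacks this file
                        with open(online, "w") as f:
                            f.write("0")        # 0 = offline, 1 = back online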
    • vb-8448 9 days ago
      It depends on IBM's sales targets.

      Some customers may replace theirs as often as every 3-5 years, but most I saw were on a 5-7 year cycle.

      Energy and data centre floor space are not a real factor here; a mainframe's physical resource usage is nothing compared to other systems.

    • rbanffy 9 days ago
      I think most z17s sold are already-planned acquisitions, whether from decommissioning older machines, capacity expansion, or workload consolidation (say, a 4-node z13 parallel sysplex moving to a 2-node z17). I don't think anyone in their right mind would run a machine like this (for business-critical apps) beyond the extended warranty.
    • Spooky23 9 days ago
      Closer to 5 usually. You’re leasing MIPS when you buy these things, so there’s a lot of margin and room for finance games.

      It was a crazy business. You'd buy services and other crap in your mainframe deal, but the consultants are really sales guys who have their nose in your tent. Fully captured businesses, in insurance and government especially, would even use IBM's brand alignment for computers.

      I had to close out a deal with them at a past employer when my peer had a medical emergency. It was a very eye opening experience.

      • AnimalMuppet 9 days ago
        > would even use IBM's brand alignment for computers.

        For those of us not in that world, could you explain what that means?

        And, could you give some details on your last paragraph?

  • rbanffy 9 days ago
    Based on the Telum II chip shown at last year's Hot Chips conference.

    https://chipsandcheese.com/p/telum-ii-at-hot-chips-2024-main...

    • theandrewbailey 9 days ago
      > IBM’s solution is to reduce data duplication within its caches, and re-use the chip’s massive L2 capacity to create a virtual L3 cache.

      I was kind of fascinated with the original Telum CPUs because of this feature. Do we know if other CPU designers are planning a similar virtual cache feature? Or is this feature implemented as a workaround due to some other deficiency that others don't have?

      • UltraSane 9 days ago
        It is needed for the Telum chips because they have so much L2 cache per core (36MB) that there isn't room for any L3 cache.
        • rbanffy 8 days ago
          They could reduce the core count to make space for on-chip L3, and add a massive amount of cache to the drawer controller, like there was on the z15, but this is way more efficient. They also keep the in-drawer and cross-socket latencies very low, forming a virtual L4 with all the chips in the drawer. In the end, it's a more efficient way to allocate silicon.
          • UltraSane 7 days ago
            Would this architecture work on x86 or ARM CPUs?

            I'm surprised they didn't use HBM3 for CPU cache.

  • memset 8 days ago
    I interned at IBM writing mainframe software in 2008 or so. One thing I remember them saying - there used to be TV commercials - was that a single mainframe could replace a room's worth of commodity hardware.

    I would have assumed that someone would have started a cloud provider with Linux VMs running on mainframes instead of racks of pizza boxes. What was missing - are the economics of mainframes really that bad?

    • ASalazarMX 8 days ago
      Mainframes are the polar opposite of commodity hardware. Those pizza boxes are commodity because they're plentifully available and you can mix and match if needed; there's nothing cheaper for you to run your VMs on. Running them on a mainframe would put IBM as a middleman between you and your business.

      Also, mainframes/midranges are all about stability, security, and robustness, not performance. For example, IBM i (the OS, not the machine) has a hardware-dependent layer and a hardware-independent one. This allows for drastic hardware changes without affecting the high-level applications. A monolith would arguably be more efficient, but it matters more that the hardware-independent layer stays rock-solid.

      https://en.wikipedia.org/wiki/IBM_i#Architecture

    • pabs3 8 days ago
      IBM offers a cloud to open source folks already; not sure if they have a commercial one, though.

      https://developer.ibm.com/articles/get-started-with-ibm-linu... https://wiki.debian.org/Hardware/Wanted#Other

  • pabs3 8 days ago
    Where does IBM get their CPUs fabbed? I assume they don't have their own fabs these days?

    Edit: it's "Samsung's leading edge 5 nm process node", according to the Telum II article linked from one of the other comments here.

    https://chipsandcheese.com/p/telum-ii-at-hot-chips-2024-main...

    Surprising they aren't using TSMC like pretty much everyone else does.

  • sillywalk 8 days ago
    A link to the IBM Redbook Technical Introduction [PDF]:

    https://www.redbooks.ibm.com/redbooks/pdfs/sg248580.pdf

  • a012 9 days ago
    It's funny that they managed to slap AI buzz into the mainframe.
    • myself248 9 days ago
      It's a real thing -- one of the oft-cited usage examples is running fraud detection while processing credit-card transactions. Customers want to add this powerful capability to that bedrock business process.
      • xattt 9 days ago
        What are the heuristics that AI models would use for a given transaction? Is it essentially just a “vibe check”?
        • DougN7 9 days ago
          I've been learning a bit about this after coming across RRCF (Robust Random Cut Forests). The ISOTree library is really well documented if you want to play with it.
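          For a taste of the idea, here's a toy anomaly-detection sketch with scikit-learn's IsolationForest (a cousin of RRCF; the features and numbers are invented for illustration, and real systems use far richer features and streaming updates):

              import numpy as np
              from sklearn.ensemble import IsolationForest

              rng = np.random.default_rng(0)
              # toy features per transaction: [amount, hour_of_day, km_from_home]
              normal = rng.normal([50, 14, 5], [20, 4, 3], size=(1000, 3))
              model = IsolationForest(contamination=0.01, random_state=0)
              model.fit(normal)

              txn = np.array([[4000, 3, 900]])  # big amount, 3am, far away
              print(model.predict(txn))         # [-1] -> flagged as anomalous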
        • spicybbq 9 days ago
          They use machine learning, pattern matching on what "good" and "bad" requests look like, based on experience.
        • UltraSane 9 days ago
          Complex Bayesian statistical models updated in real time.
      • skyyler 9 days ago
        AI feels really fuzzy compared to the types of work I normally associate with big iron.
      • gosub100 9 days ago
        You can detect fraud using non-AI solutions, but this way if it does something racist or shuts off an innocent person's access to money, they can shrug and blame it on the AI, then announce immediate "comprehensive measures" to prevent it in the future.
    • UltraSane 9 days ago
      They started this back in 2020 at least, so it predates the LLM hype. They intended it to be used to apply compute-expensive fraud detection algorithms to every transaction.
  • NitpickLawyer 9 days ago
    IBM - mainframe - Zxx CPUs, what decade is this?!
    • cess11 9 days ago
      Most organisations never come across a problem that requires neatly clustered high-end hardware.

      "[T]he IBM Telum II processors are eight cores running at 5.5GHz and with a 360MB virtual L3 cache and 2.88GB virtual L4 cache. Telum II brings a new data processing unit, I/O advancements, and other features to help accelerate AI workloads."

      But if you do, if you actually have such massive data streams and so low a tolerance for latency that sharding them over many machines costs a lot in overhead and throughput slows intolerably at load peaks, then these machines are most likely a bargain. They allow you to do things very few others can do, resulting in a moat around your business - and locking you in with IBM.

      Or, if you've been around since the seventies and your software largely consists of COBOL, Fortran, and assembly for this line of architectures, and the great rewrite would cost you two decades while all your developers do very little else, then it's also a bargain to be able to stay in business.

      • rbanffy 8 days ago
        The reasoning is like a choice between buying a single box for 10 million, or buying a hundred-box cluster for 2 million and then spending 10 million carefully tuning all your software to run your workload on that 2 million cluster.
    • rubyfan 9 days ago
      One where legacy businesses still use the COBOL programs developed in prior decades.

      I recently had to wait two quarters to launch a product because the only person who knew what some COBOL accounting program did was out on leave.

      This is one of the many reasons big corporations fail to innovate. It is very hard (near impossible) to implement new systems in an environment dominated by old ones (and I'm not talking only about software and hardware here, but also organizational dynamics).

      • spratzt 9 days ago
        You’re absolutely right about the organizational dynamics.

        Many managers in large companies derive their status and power from their knowledge of existing business processes and procedures. Any substantive change to those procedures obviously represents an existential threat to that position, and they generally resist it, often very vigorously.

      • Koshkin 9 days ago
        I don’t think you necessarily need a mainframe in order to run a program written in COBOL. (There’s also emulation available, to accommodate worst case scenarios.)
        • jabl 9 days ago
          There's even a COBOL frontend in GCC these days!

          That being said, the problem isn't so much the COBOL language itself but rather that all the software written in it is connected to all kinds of database systems, messaging systems(?), and whatnot, making it very hard to move to some non-mainframe platform even if a customer so chooses. Or, to put it another way, it's cheaper to pay even the very high prices IBM charges to stay on the mainframe track than to migrate.

    • khaledh 9 days ago
      One where banks, credit card networks, insurance companies, governments, etc. need to process millions of transactions per second on a single mission-critical box with unmatched redundancy, resilience, and security, and with a support army backing it.
    • hagbard_c 9 days ago
      The one where the chips used in those mainframes have the edge over those used in servers in some important ways - more cache, higher cache bandwidth, lower cache latency. Those developments will eventually make their way into server chips, and with a bit of luck IBM (et al.) will have developed something else by that time, which will eventually enter the mainstream.
    • rbanffy 9 days ago
      One where no other company has pushed out a processor that can borrow cache from neighbors.
    • speed_spread 9 days ago
      It's the decade where it all comes crashing down and you're going back to COBOL.
      • Koshkin 9 days ago
        COBOL? If such a time comes, we'd be lucky to even need the abacus.
        • speed_spread 9 days ago
          If we're leaving civilisation altogether, an abacus is just an ammo clip for a slingshot.
        • Woodi 9 days ago
          Nope.

          How many times have companies been forced to pay for MS Windows, OEM or not?

          The COBOL apps and OSes used there are all already paid-for, production-quality software dating back to the 1950s, and all that's needed, sometimes, is to put in a new piece of hardware. That's a totally different civilisation.

          OK, I may be embellishing, but that's the general idea.

          Don't mention 100 years of "software modifications", because the Windows ecosystem can't win that contest anyway.
