2 points by mdp2021 4 hours ago 2 comments
  • mdp2021 3 hours ago
    > Over the past few years, inference-specific chip start-ups were experiencing a sort of Cambrian explosion, with different companies exploring distinct approaches to speed up the task. The start-ups include D-matrix with digital in-memory compute, Etched with an ASIC for transformer inference, RainAI with neuromorphic chips, EnCharge with analog in-memory compute, Tensordyne with logarithmic math to make AI computations more efficient, FuriosaAI with hardware optimized for tensor operation rather than vector-matrix multiplication, and others

    Let us add Taalas, which implemented Llama 3 8B in hardware to achieve 17,000 tokens/s at a small fraction of the usual power consumption. You can test the output quality at https://chatjimmy.ai/

  • mdp2021 3 hours ago
    (General question: which Unicode characters can we use? To prepend a category icon to the article date, marking it as news but not the freshest, I tried "Calendar" from the "Miscellaneous" set and it was not accepted; I then tried "Hourglass" from "Miscellaneous Technical" and it seems to work.)