2 points by zokrezyl 4 hours ago | 2 comments
  • camel-cdr an hour ago
    The problem should be equivalent to: https://www.reddit.com/r/simd/comments/1hmwukl/mask_calculat...

    Falvyu's and bremac's solution seems to be the best.

  • pestatije 3 hours ago
    Where's the code? Have a look at codereview[5]; the whole site is geared toward this kind of challenge.

    [5] codereview.stackexchange.com

    • zokrezyl 2 hours ago
      I do not have one "implementation", but I have been trying different approaches that all delivered under 50% of memory bandwidth... I guess if anyone can propose a solution, it should be from scratch... The problem is that all the approaches I tried end up generating unpredictable branches that keep the CPU from loading text from memory at an optimal rate.
    • zokrezyl 2 hours ago