30 points by tosh 6 days ago | 5 comments
  • aktau a day ago
    > Rayforce is a pure C17 zero-dependency embeddable engine where columnar analytics and graph traversals share a single operation DAG, pass through a multi-pass optimizer, and execute as fused morsel-driven bytecode. No malloc.

    Sounds great. I hadn't seen an (explicitly) C17 project before. I wonder which features of it they use. I can only find scant references to it in the repo (e.g. https://github.com/RayforceDB/rayforce/blob/6c4b1eddad0ea728...).

    Anyone know?

  • arikrahman 3 days ago
    Any relation to the smash hit SHMUP of the same name?
  • cma256 3 days ago
    > Rayforce is a library you link, not a server you deploy. The C API is small enough to wrap from any language with an FFI.

    I'm familiar with large-scale, commercial, client-server use cases for columnar analytics and graph traversal, but what is the use case for an embedded engine like this?

    • 392 3 days ago
      Perhaps the fact that 99+% of today's workloads could be running on the client if it were as easy as shipping Rayforce and the data directly to the client.

      Besides that, pure C that you can embed into your app is much easier to deploy for some (and likely 100x more performant) than stuff that comes via Helm chart [cries in JVM 'big'-data solutions]

    • hetoku 3 days ago
      [dead]
  • noitpmeder 2 days ago
    Is this competing with e.g. an embedded duckdb?
  • noelwelsh 3 days ago
    I thought "morsel-driven" was AI slop, but it turns out to be in common usage in the HPC world. So I learned something from this post!
    • tosh 3 days ago
      afaiu morsel-driven means the workload gets turned into 'smallish' chunks (morsels)

      instead of having to pre-allocate upfront (e.g. 4 nodes get 1/4 each) it is more granular and dynamic

      a worker that's "done" can request another morsel

      pragmatic approach because nodes might not all be equally fast (cache, cpu frequency, throttling, …) and also some morsel workloads take longer than others depending on the values they contain and what kind of work needs to get done

      so this approach tends to balance out nicely

      I'm sure someone else can explain it better / correct me (please do!)
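
      rough sketch of the idea in Python, to make the "request another morsel" part concrete. this is not how Rayforce (or any particular engine) implements it: the function name, the morsel size and the worker count are all made up for illustration, and a real engine would hand out ranges of column data rather than slices of a Python list

```python
import threading


def morsel_driven_sum(values, morsel_size=4, num_workers=3):
    """Sum `values` by letting workers claim small chunks (morsels) on demand.

    Hypothetical helper for illustration only; not taken from any real engine.
    Returns (total, morsels_claimed_per_worker).
    """
    cursor = 0                       # index of the next unclaimed element
    cursor_lock = threading.Lock()   # guards `cursor`
    partials = [0] * num_workers     # one partial sum per worker
    claimed = [0] * num_workers      # how many morsels each worker processed

    def next_morsel():
        """Hand out the next [start, end) range, or None when input is used up."""
        nonlocal cursor
        with cursor_lock:
            if cursor >= len(values):
                return None
            start = cursor
            cursor = min(cursor + morsel_size, len(values))
            return start, cursor

    def worker(worker_id):
        # A worker that finishes a morsel simply asks for another one,
        # so faster workers end up processing more morsels.
        while True:
            morsel = next_morsel()
            if morsel is None:
                return
            start, end = morsel
            partials[worker_id] += sum(values[start:end])
            claimed[worker_id] += 1

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials), claimed


if __name__ == "__main__":
    total, claimed = morsel_driven_sum(list(range(100)), morsel_size=8, num_workers=4)
    print("total:", total)
    print("morsels per worker:", claimed)
```

      the per-worker counts will differ from run to run, which is the point: nothing is split upfront, the split just falls out of who asks for work when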

      • noelwelsh 3 days ago
        When I read up, it sounded like the same idea as work-stealing to me. Not surprising that different fields come up with the same idea under different terminology.
      • adsharma 3 days ago
        DuckDB and LadybugDB use the same terminology to describe internals.
      • hetoku 3 days ago
        Exactly!
    • hetoku 3 days ago
      [flagged]