46 points by mousomashakel 7 days ago | 6 comments
  • uwemaurer 7 days ago
    Always great to see efforts to make working with data frames easier. Here are some similar data frame libraries for Java:

    https://github.com/jtablesaw/tablesaw

    https://github.com/dflib/dflib

    My preferred way is to just use the DuckDB Java API. I haven't seen anything better in performance or efficiency. Also, a SQL query is often easier to write.

    • mousomashakel 6 days ago
      Thanks! I'm aware of those great projects. Fahmatrix aims to offer a lightweight, dependency-free alternative that’s easy to embed in any Java app. DuckDB is super impressive, especially for SQL-heavy tasks — but my goal is more about a native, fluent API for those who prefer direct Java code over SQL.
    • theanonymousone 7 days ago
      Yes, this has bothered me for a long time too. Maybe the best mix is a data frame library with basic operations (column selection, non-null filtering, etc.) that also allows SQL for more complex stuff?
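To make the "basic operations" idea concrete, here is a minimal sketch in plain Java of a column-oriented frame with just column selection and a non-null filter. `MiniFrame` and all of its methods (`addColumn`, `select`, `dropNulls`, and so on) are hypothetical names invented for this illustration; they are not the API of Fahmatrix, Tablesaw, DFLib, or any other library mentioned in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy data frame for illustration only; not a real library's API.
final class MiniFrame {
    // Column name -> values, in insertion order. All columns share one length.
    private final Map<String, List<Object>> columns = new LinkedHashMap<>();

    // Adds a column and returns this frame so calls can be chained.
    MiniFrame addColumn(String name, Object... values) {
        List<Object> copy = new ArrayList<>(Arrays.asList(values));
        if (!columns.isEmpty() && copy.size() != rowCount()) {
            throw new IllegalArgumentException("column length mismatch: " + name);
        }
        columns.put(name, copy);
        return this;
    }

    int rowCount() {
        return columns.isEmpty() ? 0 : columns.values().iterator().next().size();
    }

    List<String> columnNames() {
        return new ArrayList<>(columns.keySet());
    }

    List<Object> column(String name) {
        List<Object> values = columns.get(name);
        if (values == null) {
            throw new IllegalArgumentException("no such column: " + name);
        }
        return values;
    }

    // Column select: a new frame with only the named columns, in the order given.
    MiniFrame select(String... names) {
        MiniFrame out = new MiniFrame();
        for (String name : names) {
            out.columns.put(name, new ArrayList<>(column(name)));
        }
        return out;
    }

    // Non-null filter: a new frame keeping rows whose value in `name` is not null.
    MiniFrame dropNulls(String name) {
        List<Object> key = column(name);
        MiniFrame out = new MiniFrame();
        for (Map.Entry<String, List<Object>> entry : columns.entrySet()) {
            List<Object> kept = new ArrayList<>();
            for (int i = 0; i < key.size(); i++) {
                if (key.get(i) != null) {
                    kept.add(entry.getValue().get(i));
                }
            }
            out.columns.put(entry.getKey(), kept);
        }
        return out;
    }

    public static void main(String[] args) {
        MiniFrame people = new MiniFrame()
            .addColumn("name", "Ada", "Bob", "Cy")
            .addColumn("age", 36, null, 41)
            .addColumn("city", "London", "Oslo", null);

        // Drop the row with a missing age, then keep two columns.
        MiniFrame result = people.dropNulls("age").select("name", "age");
        System.out.println(result.columnNames() + " "
            + result.column("name") + " " + result.column("age"));
        // prints: [name, age] [Ada, Cy] [36, 41]
    }
}
```

Anything beyond this (joins, grouping, window functions) is where handing the data to a SQL engine would probably pay off, which is the split the comment above suggests.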
      • mousomashakel 6 days ago
        Totally agree that SQL can be the best tool for many jobs. My goal with Fahmatrix is to serve the opposite niche: where devs want something that's Java-native, procedural, and simple without reaching for an external engine. SQL support or DSL might come later though — I see the appeal.
      • radus 6 days ago
        Polars and DuckDB interoperate nicely and can enable this flexibility.
  • rickette 7 days ago
    Congrats on putting this out there. As you said, there isn't a de facto pandas-like library for Java. But for Kotlin there is: https://github.com/Kotlin/dataframe
    • mousomashakel 6 days ago
      Thanks so much! Yep, I’ve seen the Kotlin DataFrame lib — very elegant. Fahmatrix is meant for plain Java users who want similar capabilities without switching ecosystems. Appreciate the support!
  • skanga 7 days ago
    What about Tablesaw, Apache Arrow? How does this compare ...
    • mousomashakel 6 days ago
      Good question. I’ll publish benchmarks soon, but the core difference is that Fahmatrix is fully Java, no JNI, and minimalistic — ideal for small projects or environments like Android. Tablesaw and Arrow are more powerful, but heavier. Fahmatrix aims to be the “just enough” middle ground.
  • owlstuffing 6 days ago
    Nice!

    I’m currently using manifold-sql with duckdb for this.

    • mousomashakel 6 days ago
      Thanks! That’s a great combo — manifold-sql + duckdb gives you strong typing with powerful SQL under the hood. Fahmatrix is aiming to complement that approach for cases where you want quick, native Java code without SQL — e.g., when building data flows or custom logic inline. Would love to hear if you’ve hit any pain points that a Java-native approach could help with.