https://github.com/jtablesaw/tablesaw
https://github.com/dflib/dflib
My preferred way is to just use the DuckDB Java API. I haven't seen anything better in performance or efficiency. Also, a SQL query is often easier to write.
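For anyone who hasn't tried it, here is a rough sketch of what that can look like through plain JDBC. It assumes the DuckDB JDBC driver is on the classpath. The `readings` table, its columns, and the `groupMeanSql` helper are made up for illustration, and the helper does no identifier escaping.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class DuckDbSketch {
    // Hypothetical helper: builds a group-by mean query.
    // Identifiers are concatenated as-is, so only pass trusted names.
    static String groupMeanSql(String table, String keyCol, String valueCol) {
        return "SELECT " + keyCol + ", avg(" + valueCol + ") AS mean_value"
             + " FROM " + table + " GROUP BY " + keyCol + " ORDER BY " + keyCol;
    }

    public static void main(String[] args) throws SQLException {
        // "jdbc:duckdb:" with no path opens an in-memory database.
        try (Connection conn = DriverManager.getConnection("jdbc:duckdb:");
             Statement stmt = conn.createStatement()) {
            // Made-up sample data standing in for a real dataset.
            stmt.execute("CREATE TABLE readings (sensor VARCHAR, temp DOUBLE)");
            stmt.execute("INSERT INTO readings VALUES ('a', 20.0), ('a', 22.0), ('b', 18.5)");

            // The dataframe-style "group by and aggregate" step is a single query.
            try (ResultSet rs = stmt.executeQuery(groupMeanSql("readings", "sensor", "temp"))) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + " " + rs.getDouble(2));
                }
            }
        }
    }
}
```

The point is mostly that the aggregation lives in the SQL string, so the Java side stays a thin loop over a `ResultSet`.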
Next step would likely be compatibility with popular libraries such as Apache Commons Math: https://commons.apache.org/proper/commons-math/userguide/sta...
I’m currently using manifold-sql with duckdb for this.