pierridotite a month ago
A quick note on the implementation details for those interested in compilers:
The hardest part wasn't the AD itself, but managing memory safety during the "growth" phase. Since NOMA compiles to native code (LLVM), I had to ensure that when a weight buffer gets realloc'd (moved in memory):
The gradient tape updates its pointers.
The optimizer state (Adam moments) is correctly mapped to the new indices.
The benchmark I linked shows the result: "Preserving" this state allows the model to continue converging immediately after resizing, whereas "Resetting" it causes a massive performance regression.
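To make the "preserve vs. reset" distinction concrete, here is a minimal Python sketch. It is not NOMA's actual implementation: `GrowableParam`, `adam_step`, and `grow` are hypothetical names made up for illustration. It assumes the tape stores a handle plus an index rather than a raw pointer, so entries stay valid after the buffer moves, and it simplifies Adam by sharing one step counter across old and new entries.

```python
import math


class GrowableParam:
    """Hypothetical weight buffer that carries its Adam moments with it."""

    def __init__(self, values):
        self.w = list(values)          # weights
        self.m = [0.0] * len(self.w)   # Adam first moment
        self.v = [0.0] * len(self.w)   # Adam second moment
        self.t = 0                     # Adam step counter

    def adam_step(self, grads, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
        self.t += 1
        for i, g in enumerate(grads):
            self.m[i] = b1 * self.m[i] + (1 - b1) * g
            self.v[i] = b2 * self.v[i] + (1 - b2) * g * g
            m_hat = self.m[i] / (1 - b1 ** self.t)
            v_hat = self.v[i] / (1 - b2 ** self.t)
            self.w[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)

    def grow(self, extra, preserve=True):
        n_old = len(self.w)
        # Building a new list stands in for realloc moving the buffer.
        self.w = self.w + [0.0] * extra
        if preserve:
            # Old moments keep their indices; new slots start at zero.
            self.m = self.m + [0.0] * extra
            self.v = self.v + [0.0] * extra
        else:
            # "Resetting": all optimizer history is thrown away.
            self.m = [0.0] * (n_old + extra)
            self.v = [0.0] * (n_old + extra)
            self.t = 0


p = GrowableParam([1.0, -2.0])
tape_entry = (p, 1)            # handle + index, not a raw pointer
old_buffer = p.w
p.adam_step([0.5, -0.5])
p.grow(2, preserve=True)

owner, idx = tape_entry
assert owner.w is not old_buffer   # the buffer really moved
value = owner.w[idx]               # the tape entry still resolves correctly
```

In native code the same idea would need either an extra indirection like this or an explicit pass that patches every tape pointer after the realloc.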
I'm specifically curious if anyone here has experience with handling SSA Phi-nodes during reverse-mode AD on the Control Flow Graph? That's my next big hurdle for supporting complex control flow.
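For anyone unfamiliar with the problem, here is a toy Python sketch of one commonly described way to treat a Phi-node in the reverse pass: record which predecessor edge was taken during the forward pass, then route the Phi's adjoint only to the incoming value from that edge. This is an illustrative assumption, not what NOMA does; `forward`, `backward`, and the `trace` dict are hypothetical names, and loops would need a stack of recorded predecessors rather than a single entry.

```python
def forward(x):
    """Forward pass of: y = phi(a, b); z = 2 * y, with a = x*x, b = 3*x."""
    trace = {"x": x}
    if x > 0.0:
        a = x * x                 # then-block
        trace["pred"] = "then"    # remember which edge fed the phi
        y = a
    else:
        b = 3.0 * x               # else-block
        trace["pred"] = "else"
        y = b
    z = 2.0 * y                   # merge block
    return z, trace


def backward(trace, dz=1.0):
    """Reverse pass: walks the blocks in reverse using the recorded edge."""
    dy = 2.0 * dz
    # Adjoint of the phi: only the taken incoming value receives dy.
    da = dy if trace["pred"] == "then" else 0.0
    db = dy if trace["pred"] == "else" else 0.0
    # Adjoints of a = x*x and b = 3*x, accumulated into dx.
    dx = da * 2.0 * trace["x"] + db * 3.0
    return dx


z_pos, trace_pos = forward(3.0)
z_neg, trace_neg = forward(-1.0)
grad_pos = backward(trace_pos)
grad_neg = backward(trace_neg)
```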
cylicium a month ago
Hey! I saw a post you made on Reddit. How can I help if I want to contribute?
- pierridotite a month ago
  Join our Discord :) We can help you find an issue, or some implementation work that could be useful as a demo or something similar x)