For example, one exercise asks you to implement linear regression, but the example solution uses a random number generator without a fixed seed. That’s fine, since reproducibility isn’t the point, but LeetCode problems are more structured.
On LeetCode they usually don’t tell you exactly which data structure you must use, only that your solution must pass certain test cases. By analogy, this repo might not tell you which architecture to use, but it could require that your solution pass certain eval metrics.
What would take this repo to the next level is a reproducible data generation function for each exercise, plus a reasonable metric threshold that a solution must pass. I don’t see anything that requires my classification AUC to be over 0.5, which would be a basic criterion for bug-free code.
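Here is a rough sketch of what I mean, using only the standard library. Every name here (`make_classification_data`, `roc_auc`, `check_submission`) is made up for illustration and is not from the repo, and the toy data (one Gaussian feature per example) is just an assumption to keep it short:

```python
import random


def make_classification_data(n=200, seed=0):
    """Hypothetical generator: one noisy feature whose mean depends on the label."""
    # A local, seeded RNG: the same seed always yields the same dataset.
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n)]
    features = [rng.gauss(1.0 if y == 1 else -1.0, 1.0) for y in labels]
    return features, labels


def roc_auc(scores, labels):
    """AUC as the chance a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def check_submission(predict, min_auc=0.5, seed=0):
    """Score a user-supplied predict(feature) -> score function against the threshold."""
    features, labels = make_classification_data(seed=seed)
    scores = [predict(x) for x in features]
    auc = roc_auc(scores, labels)
    return auc > min_auc, auc


if __name__ == "__main__":
    passed, auc = check_submission(lambda x: x)  # the feature itself as the score
    print(passed, round(auc, 3))
```

The point is that the grader never looks at how `predict` works, only at whether the metric clears the bar on fixed data. A predictor that returns a constant gets an AUC of exactly 0.5 under this definition and fails the check, which is the kind of sanity floor I had in mind.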
I was reverse engineering the ML interview pipeline for myself and that's how I stumbled upon all this.
I think the data aspect does make sense tho. I might add that as the next thing to do.
I mean...this entire project appears to be mostly GPT-generated?