Only nitpick I have is that it's a pity you use only 1s and 2s in the carb example. Because of the symmetry, it's harder to see which column/row matches which part of the vector/matrix: with nothing but 1s and 2s, the numbers fit both horizontally and vertically...
I agree with the order (the Gaussian should come later). I almost closed the article - glad I kept scrolling out of curiosity.
Also I felt like I had been primed to think about nickels and pennies as variables rather than coefficients due to the color scheme, so when I got to the food section I naturally expected to see the column picture first.
When I encountered the carb/protein matrix instead, I perceived it in the form:
[A][x], where the x is [milk bread].T
so I naturally perceived the matrix as a transformation and saw the food items as variables about to be "passed through" the matrix.
But another part of my brain immediately recognized the matrix as a dataset of feature vectors, [[milk].T [bread].T], yearning for y = f(W @ x).
I was never able to resolve this tension in my mind...
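For what it's worth, the two readings compute the same thing. A small numpy sketch of the tension (the macro numbers here are assumed for illustration, not taken from the article):

```python
import numpy as np

# Assumed numbers: rows = macros (carbs, protein), columns = foods (milk, bread).
A = np.array([[1, 2],
              [2, 1]])
x = np.array([3, 1])  # servings: [milk, bread]

# View 1: A as a transformation that the serving vector is "passed through".
target = A @ x  # -> array([5, 7])

# View 2: A as a dataset of per-food feature (column) vectors.
milk, bread = A[:, 0], A[:, 1]
same_target = 3 * milk + 1 * bread  # the "column picture"

print(target, same_target)  # -> [5 7] [5 7]
```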
The (an) answer is that since the LHS and RHS are equal, you can choose to add or subtract them to another equation and preserve equality.
If I remember correctly, substitution (isolating x or y) was introduced before this technique.
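Concretely, the "add or subtract equal things from equal things" move looks like this; the second equation below (7 coins in total) is a made-up constraint just to complete the example:

```python
# Since LHS == RHS for each equation, subtracting one equation from another
# preserves equality. Hypothetical system (x = nickels, y = pennies):
#   5x + 1y = 23   (23 cents total)
#   1x + 1y = 7    (assumed: 7 coins total)
a1, b1, c1 = 5, 1, 23
a2, b2, c2 = 1, 1, 7

# Subtract the second equation from the first to eliminate y:
a3, b3, c3 = a1 - a2, b1 - b2, c1 - c2  # (4, 0, 16), i.e. 4x = 16
x = c3 / a3              # 4.0 nickels
y = (c2 - a2 * x) / b2   # 3.0 pennies
print(x, y)  # -> 4.0 3.0
```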
>> The trouble starts when you have two variables, and you need to combine them in different ways to hit two different numbers. That’s when Gaussian elimination comes in.
>> In the last one we were trying to make 23 cents with nickels and pennies. Here we have two foods. One is milk, the other is bread. They both have some macros in terms of carbs and protein:
>> and now we want to figure out how many of each we need to eat to hit this target of 5 carbs and 7 protein.
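Assuming the matrix really is the symmetric 1s-and-2s one mentioned elsewhere in this thread, the whole food problem is one library call:

```python
import numpy as np

# Assumed macro matrix: rows = [carbs, protein], columns = [milk, bread].
A = np.array([[1, 2],
              [2, 1]])
target = np.array([5, 7])  # 5 g carbs, 7 g protein

servings = np.linalg.solve(A, target)
print(servings)  # -> [3. 1.]: three milks, one bread
```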
+ there are many textbooks on LA. Not a lot of them introduce stuff in the same order or in the same manner. I think that's part of why LA is difficult to teach, and difficult to comprehend, and maybe there is no unique way to do it, so we kinda need all the perspectives we can get.
As an aside, Avro.im looks awesome!
For those unfamiliar with vectors, it might be helpful to briefly explain how the two vectors (their magnitude and direction) represent the one bread and one milk and how vectors can be moved around and added to each other.
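A quick sketch of that geometric reading, with assumed macro numbers: each food is an arrow in (carbs, protein) space, with a length and a direction, and arrows add tip-to-tail.

```python
import numpy as np

# Assumed "column picture": each food is an arrow in (carbs, protein) space.
milk = np.array([1, 2])
bread = np.array([2, 1])

# Magnitude and direction of the milk vector.
length = np.linalg.norm(milk)                      # sqrt(1^2 + 2^2)
angle = np.degrees(np.arctan2(milk[1], milk[0]))   # ~63.4 deg above the carb axis

# Vectors add tip-to-tail: three milks then one bread lands on the target.
combo = 3 * milk + 1 * bread
print(length, angle, combo)  # -> 2.23606... 63.43... [5 7]
```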
Introduction to Applied Linear Algebra – Vectors, Matrices, and Least Squares
I've done math on Khan Academy up to linear algebra, with other resources / textbooks / et al. depending on the topic.
People will recommend 3B1B and Strang (MIT OCW Lin Alg lessons). For me, 3B1B is too "intuitionist" for a first serious pass, and Strang can be wonderful but then go off on a tangent during a lecture that I can't follow; still, it's a staple resource that I use alongside others.
LADR4e is also nice, but I can't follow the proofs there sadly (yet). There is also 'Linear Algebra Done Wrong', as well as the Hefferon book, which all end up being proof-y quite quickly. They seem like they'll be good for a second / third pass at linear algebra.
Side note - for a second or third pass at LA, it seems there is such a thing as 'abstract linear algebra' as a subject, and the textbooks there don't seem that much harder to follow than the "basic" linear algebra ones designated for a second pass.
I've gotten off to the best start with the ROB101 textbook (https://github.com/michiganrobotics/rob101/blob/main/Fall%20...), up until linear dependence / independence, alongside the MIT Strang lectures. ROB101 is nice as it deals with the coding aspect of it all, and I can follow in my head as I am used to the coding aspect of ML / AI.
I also have a couple obscure eastern european math textbook(s) for practice assignments.
Most lately I have been reviewing this course / book - https://www.math.ucdavis.edu/~linear/ (which has cool notes at https://www.math.ucdavis.edu/~linear/old), and getting a lot of mileage from https://math.berkeley.edu/~arash/54/notes/.
I love the 3B1B videos, but I've noticed my attention tends to drift when watching videos. I've learned that I absorb information best through text. For me, videos work well as a supplement, but not as the main way to learn.
Thanks again.
https://news.ycombinator.com/item?id=45110857
https://news.ycombinator.com/item?id=45088830
The OP's article, though simple, still does not really explain things intuitively. The key is to understand the concept of a vector from multiple perspectives/coordinate systems and map the operations on vectors to movements/calculations in the coordinate space (i.e. 2D/3D/n-space). Only then will vector spaces/matrices/etc. become intelligible, and we can begin to look at physical problems naturally in terms of vectors/vector calculus.
The following are helpful here;
1) About Vectors by Banesh Hoffmann.
2) A History of Vector Analysis: The Evolution of the Idea of a Vectorial System by Michael Crowe.
Apply directly... to what? IMO it is weird to learn theory (like linear algebra) expressly for practical reasons: surely one could just pick up a book on those practical applications and learn the theory along the way? And if in this process, you end up really needing the theory then certainly there is no substitute for learning the theory no matter how dense it is.
For example, linear algebra is very important to learning quantum mechanics. But if someone wanted to learn linear algebra for this reason they should read quantum mechanics textbooks, not linear algebra textbooks.
So I'm looking for resources that bridge the gap, not purely computational "cookbook" type resources but also not proof-heavy textbooks. Ideally something that builds intuition for the structures and operations that show up all over ML.
https://math.mit.edu/~gs/learningfromdata/
Although if your goal is to learn ML you should probably focus on that first and foremost, then after a while you will see which concepts from linear algebra keep appearing (for example, singular value decomposition, positive definite matrices, etc) and work your way back from there
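As a small illustration of that "work your way back" advice (random data, nothing from the article): SVD factors any matrix, and it ties directly into positive definiteness, the other concept mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))  # e.g. 6 samples, 4 features

# Singular value decomposition: X = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vt, X)

# X.T @ X is positive (semi-)definite: all eigenvalues >= 0,
# and they are exactly the squared singular values of X.
eigvals = np.linalg.eigvalsh(X.T @ X)
assert np.allclose(np.sort(eigvals), np.sort(s ** 2))
```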
I hadn't known about Learning from Data. Thank you for the link!
Less popular techniques like normalizing flows do need that, but instead of SVD they directly design transformations that are easier to invert.
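One classic example of a transformation "designed to be easy to invert" is a lower-triangular linear map; this toy sketch (made-up numbers, not any specific flow architecture) shows why:

```python
import numpy as np

# A lower-triangular map can be inverted by plain forward-substitution,
# no generic (SVD-grade) machinery required.
L = np.array([[2.0, 0.0],
              [1.0, 3.0]])
x_true = np.array([1.0, 2.0])
y = L @ x_true  # forward pass: [2., 7.]

# Invert by forward-substitution (possible because L is lower-triangular).
x = np.zeros_like(y)
for i in range(len(y)):
    x[i] = (y[i] - L[i, :i] @ x[:i]) / L[i, i]
print(x)  # -> [1. 2.]

# Bonus for flows: log|det L| is just the sum of log-diagonals.
log_det = np.log(np.diag(L)).sum()  # log 2 + log 3
```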
QPs are solved by finding the roots (aka zeroes) of the KKT conditions, basically finding points where the derivative is zero. This boils down to solving a linear system of equations Ax=b. Warm-started QP solvers factorize the matrices in the QP formulation through LU decomposition or some other method and reuse that factorization. This works well if you have a linear model, but not if the model changes, because your factorization becomes obsolete.
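A minimal sketch of the idea, with made-up numbers: for an equality-constrained QP (minimize 0.5 x'Qx + c'x subject to Ax = b), setting the Lagrangian's gradient to zero yields exactly one linear system in x and the multipliers.

```python
import numpy as np

# Equality-constrained QP: minimize 0.5*x'Qx + c'x  s.t.  Ax = b.
# The KKT conditions form one linear system:
#   [Q  A'] [x  ]   [-c]
#   [A  0 ] [lam] = [ b]
Q = np.array([[2.0, 0.0],
              [0.0, 2.0]])
c = np.array([-2.0, -4.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

n, m = Q.shape[0], A.shape[0]
KKT = np.block([[Q, A.T],
                [A, np.zeros((m, m))]])
rhs = np.concatenate([-c, b])
sol = np.linalg.solve(KKT, rhs)
x, lam = sol[:n], sol[n:]
print(x, lam)  # -> [0. 1.] [2.]
```

In practice a solver factorizes the KKT matrix once and reuses the factorization across warm starts, which is exactly what breaks when the model (and hence the matrix) changes.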
Same, and I think ML is a perfect use case for this. I also have a series for that coming.
> How would a point sit on both lines? Well, it would be where the lines cross. Since these are straight lines, the lines cross only once, which makes sense because there’s only a single milk and bread combo that would get you to exactly five grams of carbs and seven grams of protein.
Geez. It's obvious that two straight lines can only cross once. It's not obvious that there's only one combination of discrete servings of bread and milk that can hit a particular target.
(It's so non-obvious that, in the general case, it isn't even true. Elimination might give you a row with all zeros.)
The fact that the solution is unique makes sense if you realize it must sit on these two lines. It makes far less sense to explain the fact that the two lines only cross once by channeling the external knowledge that the solution is unique. How did we learn that?
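The degenerate case is easy to demonstrate with made-up numbers: if one "recipe" is a multiple of the other, elimination produces that all-zero row and the lines never pin down a single point.

```python
import numpy as np

# Two parallel "recipes": the second row is double the first,
# so elimination wipes the second row out entirely.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
b = np.array([5.0, 10.0])

# Eliminate: R2 <- R2 - 2*R1 leaves a row of all zeros.
A2, b2 = A.copy(), b.copy()
A2[1] -= 2 * A2[0]
b2[1] -= 2 * b2[0]
print(A2[1], b2[1])  # -> [0. 0.] 0.0  (infinitely many solutions)

print(np.linalg.matrix_rank(A))  # -> 1, not 2: no unique intersection point
```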
Try actual problems that require you to use these tools and the inter-relationships between them, where it becomes blindingly obvious why they exist. Calculus is a prime example and it’s comical most students find Calculus hard because their LA is weak. But Calculus has extensive uses, just not for doing basic carb counting.
I don’t have an axe to grind against the site (I think it’s fine), but if someone wants to learn LA, a college-level course followed by an intense grind of word problems, having to work backwards and forwards and find flaws in answers, might be a better way to develop the noggin for it. Just my 2c.
Below a certain level of complexity, the human brain is much faster and more efficient operating on abstract symbols, like 'x' and 'y'. You can solve equations and figure things out in a fraction of the time it takes you to visualize bananas, goats, coins, bread, milk, etc.
Visualizations have a role in developing intuitions about complex structures, such as what a matrix does to a vector or what cosine similarity means, and so on.
But in recent years, everyone and their neighbor has suddenly assumed that visualizing the number 1 or 2 in terms of everyday objects somehow helps learning. It doesn't.
> But in recent years
Just to expand on this a bit: I have been teaching this way since at least 2016, when I published a book on algorithms called Grokking Algorithms. It is an illustrated guide to algorithms. If you didn't like this post, I imagine you won't like the book either :)
Here is an interview I did with Corey Quinn where I talk more about my teaching philosophy: https://www.youtube.com/watch?v=lZFvTTgR-V4
https://youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFit...
Highly recommended!
Also, HT to your user name! Egon Schiele is one of my favorite artists! Loved seeing his works at the Neue in NYC.
B: I miss scroll bars. I really, really miss scroll bars.
This is where the math nerds just can't help themselves, and I'm here for it. At the same time, though, these things drive me crazy. You cannot have -4 nickels. In pure math with only x and y, sure, those values can be negative. But when using real-world examples with physical objects, no, you cannot have a negative nickel. Maybe you owe your mate the value of 4 nickels, but that's outside the scope of this lesson. Your negative nickels are not in another dimension (even if the math works that way). You want to help people understand math with real-world concepts but then go and confuse things with pure-math concepts. And negative nickels still don't even get into imaginary-nickel territory, like having the square root of -4 nickels.