Most modern math is certainly not "all on computers" and in general not even "mostly on computers". There are definitely proofs that rely on computers to exhaustively test large spaces (see the https://en.wikipedia.org/wiki/Four_color_theorem), and computers are definitely used for things like visualization (probably one of their oldest uses in math), but usually the real work is done the way math has always been done: identifying patterns and exploiting symmetries.
For this result specifically: if you read through the paper, you'll find the statement that the main theorem "does not depend on any computer calculations. However, we have made available files with explicit coordinates for our kissing configurations".
I had a visceral reaction to this. In what sense can a sphere be considered pointy? Almost by definition, it is the volume that minimizes surface area, in any number of dimensions.
I can see how in higher dimensions e.g. a hypersphere has much lower volume than a hypercube. But that's not because the hypersphere became pointy, it's because the corners of the hypercube are increasingly more voluminous relative to the volume of the hypersphere, right?
First, the volume of spheres (or balls rather) in higher dimensions goes to zero as the dimension grows. Said another way, to keep unit volume on a ball you need to grow the radius more and more (which I interpret as spiky).
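If you want to see that first point numerically, here's a quick Python check (my own sketch, not from the thread) using the standard formula for the volume of the unit d-ball:

  # Volume of the unit ball: V_d = pi^(d/2) / Gamma(d/2 + 1). It tends to
  # zero, so the radius needed to hold unit volume grows with the dimension.
  import math

  for d in [1, 2, 3, 5, 10, 20, 50]:
      v = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
      r = v ** (-1 / d)  # radius of the d-ball whose volume is exactly 1
      print(f"d={d:3d}  vol(unit ball)={v:.3e}  radius for volume 1={r:.3f}")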
Second, the fraction of a ball's volume in a spherical cap shrinks like ~exp(-d h^2 / 2); in particular, the caps lose volume fast in higher dimensions. To interpret this as "spikiness" I like to visualize two balls intersecting (the intersection is just 2x a cap volume). If they have the same radius but their centers are just slightly offset, their intersection volume goes to zero quickly!
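A Monte Carlo sketch of the cap claim (a toy check of my own, using the usual trick of sampling uniformly from the ball via a Gaussian direction and a U^(1/d) radius):

  # Fraction of the unit ball lying in the cap x_1 > h, versus the rough
  # exp(-d h^2 / 2) rate quoted above. Prefactors differ, but the quoted
  # rate tracks the exponential collapse with d.
  import numpy as np

  rng = np.random.default_rng(0)
  h, n = 0.3, 200_000
  for d in [2, 5, 10, 20, 40]:
      g = rng.standard_normal((n, d))
      pts = g / np.linalg.norm(g, axis=1, keepdims=True)  # uniform on the sphere
      pts *= rng.uniform(0, 1, (n, 1)) ** (1 / d)         # uniform in the ball
      frac = (pts[:, 0] > h).mean()
      print(f"d={d:2d}  cap fraction={frac:.4f}  exp(-d h^2/2)={np.exp(-d*h*h/2):.4f}")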
There are other ways in which a hypersphere can be considered "pointy", though; for example, consider a point lying on the surface being moved some epsilon distance to a random direction. As the dimension increases, the probability that the point ends up inside the sphere approaches zero – the sphere spans a smaller and smaller fraction of the "sky".
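You can check this "sky" claim with a small simulation (a sketch, nothing rigorous): put a point at the north pole of the unit sphere, nudge it a distance eps in a uniformly random direction, and count how often it lands inside.

  # P(a point on the sphere ends up inside after a random step of length eps).
  # The step lands inside only if its component toward the center is below
  # -eps/2, and that component concentrates near 0 as d grows.
  import numpy as np

  rng = np.random.default_rng(1)
  eps, n = 0.5, 100_000
  for d in [2, 3, 10, 50, 200]:
      u = rng.standard_normal((n, d))
      u /= np.linalg.norm(u, axis=1, keepdims=True)  # uniform random directions
      p = np.zeros(d)
      p[-1] = 1.0                                    # a point on the unit sphere
      inside = (np.linalg.norm(p + eps * u, axis=1) < 1).mean()
      print(f"d={d:3d}  P(inside)={inside:.3f}")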
I don't see how it could possibly be zero, even for reals, unless you're relying on the idea that the probability of any given real emerging from a uniform RNG is zero. That would seem to apply in 2D as well.
Random walks can be defined on continuous space and time as a probability distribution on functions R -> R^n (Brownian motion in n dimensions).
We can then ask whether Brownian motion beginning at the origin will ever revisit it, i.e.:
Given 2D Brownian motion X such that X(0)=(0,0), the probability that X returns to every neighborhood of the origin (for every eps>0 there exists t>1 with |X(t)|<eps) is 1. (Strictly speaking, planar Brownian motion hits any single exact point with probability 0, so "revisit" should be read as "return arbitrarily close"; for the simple random walk on the 2D lattice, Pólya's theorem says the walk returns to the origin itself with probability 1.)
Given 3D Brownian motion X such that X(0)=(0,0,0), the probability that there exists t>0 such that X(t)=(0,0,0) is 0. (This is clearer when the motion doesn't begin at the origin: it is almost surely not at the origin at t=1, and you can divide the half-open interval (0,1] into countably many intervals, each of which has probability 0 of passing through the origin.)
Random walk paths in 2D are dense in the plane (not literally space-filling, but they visit every neighborhood); random walk paths in 3D are not: they escape to infinity.
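Here's the discrete version (Pólya's theorem) as a quick simulation sketch; even at a finite horizon you can see 2D creeping toward certain return while 3D stalls:

  # Estimate the chance that a simple random walk on the integer lattice
  # returns to the origin within `steps` steps, in 2D vs 3D. In 2D the true
  # probability tends to 1 (very slowly); in 3D it converges to about 0.34.
  import numpy as np

  rng = np.random.default_rng(2)

  def return_frequency(d, walks=2_000, steps=20_000):
      returned = 0
      for _ in range(walks):
          moves = np.zeros((steps, d), dtype=np.int64)
          moves[np.arange(steps), rng.integers(0, d, steps)] = rng.choice([-1, 1], steps)
          pos = moves.cumsum(axis=0)
          returned += bool((np.abs(pos).sum(axis=1) == 0).any())
      return returned / walks

  for d in (2, 3):
      print(f"{d}D: returned within 20k steps in {return_frequency(d):.0%} of walks")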
I also never used a computer for anything other than LaTeX.
[1] https://terrytao.wordpress.com/career-advice/theres-more-to-...
In higher dimensions, are the spheres just a visual metaphor based on the 3-dimensional problem, or are mathematicians really visualising spheres with physical space between them?
Is that even a valid question, or does it just betray my inability to perceive higher dimensions?
This is fascinating and I'm in awe of the people that do this work.
It's not really a metaphor.
An n-sphere is the set of all points that are the same distance away from the same centre, in (n+1)-dimensional space. That generalises perfectly well to any number of dimensions.
In 1 dimension you get 2 points (0-sphere), in 2 dimensions you get a circle (1-sphere), in 3 dimensions you get a sphere (2-sphere), etc.
EDIT: Also, if you slice a plane through a sphere, you get a circle. If you slice a line through a circle, you get 2 points. If you slice a 3d space through a hypersphere in 4d space, do you get a normal sphere? Probably.
If the boundary is included, it's a closed ball, otherwise it's an open ball.
So the sphere is the "skin", the ball is the whole thing.
A bit different from common usage.
That's handwaving the answer just as you were getting to the crux of the matter. "Are mathematicians really visualising spheres with physical space between them" in higher dimensions than 3 (or maybe 4)?
From the experience of some of the bigger minds in mathematics I met during my PhD: they don't actually visualize a practical representation of the sphere in this case, since that would be untenable, especially in much higher dimensions, like 24 (!). They all "visualized" the equations, but in ways that gave them much more insight than you or I might imagine just by looking at the text.
This wasn't atypical of her. She would also say that if your house is on fire then you call the firefighters, but if it is not on fire then you set it on fire, thereby reducing the problem to something that you have already solved.
He did. You can see / hear that line in this video from his old Coursera course.
https://youtu.be/TNhgCkYDc8M?list=PLLssT5z_DsK_gyrQ_biidwvPY...
Exactly how seriously he intended this to be taken is a matter of debate, but he definitely said it.
I've come to understand that the key thing that determines success in math is ability to compress concepts.
When young children learn arithmetic, some are able to compress addition such that it takes almost zero effort, and then they can play around with the concept in their minds. For them, taking the next step to multiplication is almost trivial.
When a college math student learns the triangle inequality, >99.99% understand it on a superficial level. But <0.01% compress it and play around with it in their minds, and can subsequently wield it like an elegant tool in surprising contexts. These are the people with "math minds".
I have been posting "I have dyscalculia" on hackernews for years in hopes of a comment like this, basically praying someone like you would reply with the right "thinking framework" for me. THANK YOU! This is the first time I've heard this or thought about it, and I sort of understand what you mean. If you're able to expand on that concept in any way, maybe I can figure out how I do the same thing in other areas and map it across? I also have dyslexia and have not found a good strategy for phonics yet, and I'm now 40, so I'm not sure I ever will hehe :))
I even struggle with times tables because the lifting is really hard for me for some reason, it always amazes me people can do 8x12 in their heads.
In algebra, you learn that (a - b)(a + b) = a^2 - b^2. With a little practice it's not too hard to spot this when it's all variables, but it's easy to overlook that you can apply it to arithmetic too, anywhere you can rewrite a problem as (a - b)(a + b). That works whenever the difference between the two numbers you're trying to multiply is even.
For a, take the halfway point between the two numbers, and for b, take half the difference between the numbers. So a = (8 + 12) / 2 = 10. b = (12 - 8) / 2 = 2.
Here, 8 = 10 - 2 and 12 = 10 + 2. So you can do something like (10 - 2)(10 + 2) = 10^2 - 2^2 = 100 - 4 = 96.
It's kind of a tossup whether it's more useful on these smaller problems, but it can be pretty fun to apply it to something like 17 x 23, which looks daunting on its own: 17 x 23 = (20 - 3)(20 + 3) = 20^2 - 3^2 = 400 - 9 = 391.
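The trick is mechanical enough to write down as a tiny function (just an illustration):

  def multiply_via_squares(x, y):
      # (a - b)(a + b) = a^2 - b^2, with a the midpoint and b half the gap.
      assert (x + y) % 2 == 0, "needs an even difference (hence an even sum)"
      a = (x + y) // 2
      b = abs(x - y) // 2
      return a * a - b * b

  print(multiply_via_squares(8, 12))   # 10^2 - 2^2 = 96
  print(multiply_via_squares(17, 23))  # 20^2 - 3^2 = 391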
The foundations for these concepts were laid by Piaget and Brissiaud, but most of their work is in French. In English, "Young Children Reinvent Arithmetic" by Kamii is an excellent and practically oriented book based on Piaget's theories that you may find useful, although it is 250 pages.
This approach has become mainstream in maths teaching today, but is unfortunately often misunderstood by teachers. The point of using different strategies to arrive at the same answer in arithmetic is NOT that children should memorize different strategies, but that they should be given as many tools as possible, to increase the chance that they are able to play around with and compress the concept being learned.
The clearest expression of the concept of compression is maybe in this paper; I don't know if it helps or if it's too academic.
Once again I wanted to thank you for slowing down and taking the time to leave this thoughtful comment. If everyone took 5 minutes to try to understand what the other person is saying, to see if they can help, the world would be a considerably better place. Thank you.
I'll show it to you, but first: are you able to add 80 + 16 in your head? (There's another trick to learn for that.)
12 is made up of a 10 and a 2.
What's 8 x 10? 80.
What's 8 x 2? 16.
Add 'em up? 96, baby!
They teach you to do math on paper from right to left (ones column -> tens column, etc.); I find chunking works best if you approach it from left to right. Like, multiply the hundreds, then the tens (and add the carry to the hundreds total you already derived), then the ones place (ditto).
It's limited by your short-term memory. I can do a single-digit times anything up to maybe five digits. Two-digits by two digits, mostly. Three-digits times three digits I don't have the working memory for.
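For what it's worth, here's that left-to-right chunking written out as code (a sketch of the mental procedure, not a serious algorithm):

  def chunked_multiply(a, b):
      # Split b into place-value chunks and accumulate from the left:
      # 8 * 123 -> 8*100 + 8*20 + 8*3 = 800 + 160 + 24 = 984.
      digits = str(b)
      total = 0
      for i, ch in enumerate(digits):
          place = 10 ** (len(digits) - 1 - i)
          total += a * int(ch) * place
      return total

  print(chunked_multiply(8, 12))   # 80 + 16 = 96
  print(chunked_multiply(8, 123))  # 984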
Thank you kindly for taking the time to teach me this! This thread has been one of the most useful things in a long ass time that's for sure. If I can ever be helpful to you, email is in the bio. :)
A few years ago I had an in-depth conversation with a (then) sixth-grader of my acquaintance, and came away impressed with the "Common Core" way of teaching maths. His parents were frustrated with it, because it didn't match the paper-based methods of calculation they (and you and I) had been taught, but he'd learned a bunch of these sorts of tricks, and from them had derived a good (probably, if I'm honest, better than mine) intuition for arithmetic relationships.
What stuck with me (written from memory, so might differ somewhat from the text):
In the introductory chapter, he describes mathematics as the science of patterns. E.g. number theory deals with patterns of numbers, calculus with patterns of change, statistics with patterns of uncertainty, and geometry with patterns of shapes and spaces.
Mathematical thinking involves abstraction: you identify the salient structures & quantities and describe their relationships, discarding irrelevant details. This is kind of like how, when playing chess, you can play with physical pieces or with a board on a computer screen - the pieces themselves don't matter, it's what each piece represents and the rules of the game that matters.
Now, these relationships and quantities need to be represented somehow: this could be a diagram or formulas using some notation. There are usually different options here. Different notations can highlight or obscure structures and relationships by emphasizing certain properties and de-emphasizing others. With a good notation, certain proofs that would otherwise be cumbersome might be very short. (Note also that notations typically have rules associated with them that govern how expressions can be manipulated - these rules typically correspond in some way to the things being represented and their properties.)
Now, roughly speaking, mathematicians may study various abstract structures and relationships without caring about how these correspond to the real world. They develop frameworks, notations and tools useful in dealing with these kinds of patterns. Physicists care about which patterns describe the world we live in, using the above mathematical tools to express theories that can make predictions that correspond to things we observe in the real world. As an engineer, I take a real-world problem and identify the salient features and physical theories that apply. I then convert the problem into an abstract representation, apply the mathematical tools (informed by the relevant physical theories), and develop a solution. I then translate the mathematical solution back into real-world terms.
One example of the above in action is how Riemannian geometry, the geometry of curved spaces, grew out of developing geometries in which "parallel" lines can cross. Later, this geometry became integral to expressing the ideas of relativity.
This maps back to the idea of "making the invisible visible": using the language of mathematics, we can describe the invisible aerodynamic forces that keep a 400-ton aircraft suspended in the air. For the latter, we can "run the numbers" on computers to visualize airflow and the resulting forces acting on the airframe. At various stages of design, the level of abstraction might be very coarse (napkin calculations, discarding a lot of detail) or very fine (taking into account many different effects).
Lastly, regarding your post of 'When I found out they're not visualizing the stuff but instead "visualized the equations together and imaging them into new ones"':
Sometimes when studying relationships between physical things you notice that there are recurring patterns in the relationships themselves. For example, the same equations crop up in certain mechanical systems as in certain electrical ones. (In the past there were mechanical computers; they have since been replaced by the familiar electronic ones.) With these higher-order patterns, you don't necessarily care about physical things in the real world anymore. You apply the abstraction recursively: what are the salient parts of the relationships, and how do they relate? This is roughly how you can generalize things from 2 dimensions to 3 and eventually to n. Like learning a language, you begin to "see" the patterns as you immerse yourself in them.
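The classic concrete instance (my example, not the parent's): a mass-spring-damper and a series RLC circuit obey the same second-order linear ODE, which is why one could once compute with the other:

  m x'' + c x' + k x = F(t)      (mass-spring-damper)
  L q'' + R q' + q/C = V(t)      (series RLC circuit)

with mass <-> inductance, damping <-> resistance, stiffness <-> 1/capacitance, and force <-> voltage.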
Thank you again.
Yep — and this will generally be the case, as the equation looks like: x1^2 + x2^2 + … + xn^2 = r^2. If you fix one dimension, you have a hyperplane perpendicular to that axis — and a sphere of one dimension lower in that hyperplane.
For four dimensions, you can sort of visualize that as x^2 + y^2 + z^2 + t^2 = r^2, where xyz are your normal 3D and t is time. From t=-r to t=r, you have it start as a point then spheres of growing size until you hit t=0, then the spheres shrink back to a point.
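If it helps, you can compute those cross-sections directly (a trivial sketch):

  # Slicing x^2 + y^2 + z^2 + t^2 = R^2 at a fixed t gives an ordinary
  # sphere of radius sqrt(R^2 - t^2): a point at t = -R and t = R, fattest at t = 0.
  import math

  R = 1.0
  for t in [-1.0, -0.75, -0.5, 0.0, 0.5, 0.75, 1.0]:
      r = math.sqrt(max(R * R - t * t, 0.0))
      print(f"t={t:+.2f}  cross-section radius={r:.3f}")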
For such discrete geometry problems, high-dimensional spaces often behave "weirdly" - your geometric intuition from R^3 will often barely help you.
You thus typically rely instead on ideas such as symmetry, or calculations of whether "there is still space in between that you can fill", or sometimes stochastic/averaging arguments to show the existence of some configuration.
You could do kissing starfish, but no one cares, as there is no lore. A bit like how a 125m world record doesn't matter; the 100m is the thing.
This is not a knock ... it is interesting how social / tradition-based maths is.
Another example is Fermat's Last Theorem. It had legendary status.
Our mental models don't extend well beyond 3, possibly 4, dimensions, hence _all_ of our intuition starts to be doubtful after 3 dimensions.
For example, the 24-dimensional packing corresponds to the Leech lattice, which itself corresponds to the Golay code.
(I'm assuming you've already searched for math bloggers, and similar "labor of love" coverage of the topic.)
Surely it's not too difficult to repeatedly place spheres around a central sphere in 17 dimensions, maximizing how many kiss for each new sphere added, until you get a number for how many fit? And add some randomness to the choices to get a range of answers Monte Carlo-style, to then get some idea of the lower bound? [Edit: I meant upper bound, whoops.]
Obviously ideally you want to discover a mathematically regular approach if you can. But surely computation must also play a role here in narrowing down reasonable bounds for the problem?
And computation will of course be essential if the answer turns out to be chaotic for certain numbers of dimensions, if the optimal solution is just a jumble without any kind of symmetry at all.
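For what it's worth, the naive greedy scheme is easy to try (a toy sketch; it only produces weak lower bounds, nowhere near the lattice constructions that hold the records):

  # Centers of unit spheres kissing a central unit sphere lie on the sphere
  # of radius 2 and must be pairwise >= 2 apart. Greedily accept random
  # proposals that fit. Even in d=2,3,4 this usually falls short of the
  # known kissing numbers (6, 12, 24), which hints at why dimension 17 is hard.
  import numpy as np

  rng = np.random.default_rng(3)

  def greedy_kissing(d, tries=50_000):
      centers = []
      for _ in range(tries):
          c = rng.standard_normal(d)
          c *= 2.0 / np.linalg.norm(c)
          if all(np.linalg.norm(c - o) >= 2.0 for o in centers):
              centers.append(c)
      return len(centers)

  for d in (2, 3, 4):
      print(f"d={d}: greedy found {greedy_kissing(d)} kissing spheres")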
https://en.wikipedia.org/wiki/Kissing_number#Some_known_boun...
Even in dimension 5, the kissing number is apparently known only as "42 plus or minus 2."
> Had she been one of his graduate students, he would have tried harder to convince her to work on something else. “If they work on something hopeless, it’ll be bad for their career,” he said.
A few years ago someone found a counterexample. He was quite depressed for a few weeks at the thought of how much of his strongest research years had been devoted to something impossible.
Choosing a "good first problem" in math is quite difficult. It needs to be "novel," somewhat accessible, and possible to solve (which is an unknown when you're starting out)!
To me such a career is useful for (a) the greater good: you can't make discoveries without dead ends, and (b) the maths created along the way! Or, if not shared, then the skills developed.
Is there an intuitive reason why 6 fits so perfectly? Like, there could have been a small gap somewhere, as in 3D where it's 12, but there isn't. Something to do with tessellation and hexagons, perhaps?
> They look for ways to arrange spheres as symmetrically as possible. But there’s still a possibility that the best arrangements might look a lot weirder.
Like square packing for 11 looks just crazy (not same problem, but similar): https://en.wikipedia.org/wiki/Square_packing
Six of those equilateral triangles add up to exactly 360 degrees. Intuitive enough? (I'm being a little hand-wavy by skipping over the part where each penny triangle shares two pennies with a neighbor — why the answer is not 18, for example.)
For my mind, though, the intuitiveness ends in dimension 2. ;-)
She taught Lineare Algebra II when I took it! It was one of the toughest lectures I took during university. I remember looking to the person next to me and one of us asked "do you understand anything?" and the other said "no! I haven't understood anything for like 20 minutes" and we burst out laughing and couldn't get it together until we were asked to quiet down. Wadim if you hang out here, send me a mail or something!
The very first day, he started out by talking about kissing spheres and concluded the lecture with "and that's why kissing spheres are easy in 7 dimensions" (or something like that).
Every lecture of his was like being placed in front of a window looking upon a wonderful new world, incomprehensible at first, but slowly becoming more and more clear as he explained. Sometimes I wish I could play in the garden of math.
I went to math prep school for 2 years, attending 12 hours of math class in algebra and analysis per week, which I think proves I've done more math than most of the general population, and this makes no sense to me. Either it lacks the introduction required to understand the analogy, or I've become really dumb. I want to understand this based on what the article says, but I can't. I can't represent error-filled messages as high-dimensional points. It's easier for me to imagine what the intersection between 4D spheres would look like in geometry.
I found this for anyone interested in understanding 4D spheres without knowing too much math: https://baileysnyder.com/interactive-4d/4d-spheres/
This can be extended to 3D or higher-dimensional spaces.
[1] https://en.wikipedia.org/wiki/Quadrature_amplitude_modulatio...
Well, start with an analogy. Let's say you and I want to communicate a message, which comes from a set of let's say 4 possible messages: "YES", "NO", "GOOD", and "BYE". Let's further suppose that the medium for this message (the "data channel") is going to be a single point selected from a 2D square. We'll agree beforehand on four points in the square that will represent our four possible messages. Then, you're going to position a dot at one of those points, and I'm going to observe that dot and infer your message from its position.
If the "data channel" is "error-free" (a.k.a. "lossless"), then it really doesn't matter which points we agree on: you could say that the exact center of the square is "YES", the point one millimeter to the left is "NO", the point two millimeters to the left is "GOOD", and so on. But if the data channel is "lossy," then the dot might get shaken around before I observe it. Or equivalently, I might observe its position slightly incorrectly. So we should choose our "code" so as to minimize the effect of this "error."
The best way to do that, on a square, is to place our four "code points" all the way at the four corners of the square, as far away from each other as possible. By "as far away from each other as possible," I mean in the sense of https://en.wikipedia.org/wiki/Pole_of_inaccessibility — I mean we want to maximize the minimum distance between any two points. A mathematician would notice that this is the same thing as maximizing the radius R such that we can draw a circle of radius R around each of our code points without any of the circles intersecting. (R in this case is half of the square's side length.)
If we add a fifth code point, this same reasoning would lead us to place that fifth point right smack in the center of the square. And the sixth point... well, I feel like that gets tricky.
BUT! In actual communications, we don't send messages encoded as real points in 2D squares. We send messages as discrete bit-strings, i.e., strings of zeros and ones of length N, which you can see as discrete points at the corners of an N-dimensional hypercube. Then, if we want to send K different messages robust against errors(+), we should pick as our code points some K corners of the hypercube so as to maximize the minimum Manhattan distance along the hypercube's edges (i.e., the Hamming distance) between any two code points. This is the basic idea behind error-correcting codes.
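To make that concrete, here's a toy greedy construction (a sketch, not how real codes are designed):

  # Greedily scan the corners of the N-cube and keep every word at Hamming
  # distance >= min_dist from all kept words (the "lexicode" construction).
  from itertools import product

  def hamming(a, b):
      return sum(x != y for x, y in zip(a, b))

  def greedy_code(n_bits, min_dist):
      code = []
      for word in product((0, 1), repeat=n_bits):
          if all(hamming(word, c) >= min_dist for c in code):
              code.append(word)
      return code

  # With 7 bits and minimum distance 3, this greedy scan recovers 16
  # codewords -- the size of the classic Hamming(7,4) code.
  print(len(greedy_code(7, 3)))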
A digital error-correcting code is "K code points in a bounded region of N-dimensional hyperspace (namely the discrete set of corners of a unit hypercube), selected so as to maximize the minimum distance between any two of them." The kissing-hyperspheres problem is "K sphere-centers in a bounded region of N-dimensional hyperspace (namely the continuous set of points at unit distance from the origin), selected so as to maximize the minimum distance between any two of them (and then, if that minimum distance is still >=1, increase K and try again)."
If all you meant is "Those two problem statements don't seem 100% equivalent," I think I agree with you. But if you meant you didn't see the similarity at all... well, I hope this helped.
https://en.wikipedia.org/wiki/Pole_of_inaccessibility
https://en.wikipedia.org/wiki/Error_correction_code
(+) — edited to add: Robust against the traditional model of error, i.e., our "threat model" is that any given bit has a constant small probability of getting flipped, so that our observed point may be some random Manhattan walk away from the code point you actually sent. You could instead use a different threat model — e.g. supposing that the bits sent in the actual digital message's "low-order" bits would flip more often than the high-order bits — in which case the optimal selection of code points wouldn't be as simple as "just maximize Manhattan distance."
and yet Cohn is first on the author list :(
http://www.ams.org/learning-careers/leaders/CultureStatement...