I haven't read Loring Tu's books before, but let me look at them, since I have been wanting to revisit the topic with a clearer and more relaxed approach.
It’s actually much better written than the majority of articles we usually come across.
For instance, consider the only concrete example in the article: the space of all possible configurations of a double pendulum is a manifold. The author claims it's useful to see it as a manifold, but why? More precisely, why as a manifold rather than as a square [0,2π[²?
I also expected more talk about atlases. In simple cases, it's easy to think of a surface as a deformation of a flat shape, so a natural idea is to think of having a map from the plane to the surface. But most surfaces, even a simple sphere, can't be mapped to a single flat part of the plane, so you need several maps. But how do you handle the parts where the maps overlap? What Riemann did was define properties on this relationship between manifold points and maps (which can be countless).
BTW, I know just enough about relativity to deny that "space-time [is] a four-dimensional manifold", at least a Riemannian manifold. IIRC, the usual term is Minkowski spacetime.
In a double pendulum, each arm can freely rotate (there is no stopping point). This means 0 degrees and 360 degrees are the same point, so the edges of the square are actually joined. If you join the left and right edges to each other, then join the top and bottom edges to each other, you end up with a torus.
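To make the "edges are joined" point concrete, here is a minimal Python sketch. It assumes each configuration is a pair of angles in radians and uses a flat metric on the torus; the function names and the two sample configurations are made up for illustration.

```python
import math

TWO_PI = 2 * math.pi

def wrap(angle):
    """Map any angle to its representative in [0, 2*pi)."""
    return angle % TWO_PI

def circle_distance(a, b):
    """Shortest angular separation between two angles on a circle."""
    d = abs(wrap(a) - wrap(b))
    return min(d, TWO_PI - d)

def torus_distance(p, q):
    """Distance between two configurations (theta1, theta2), treating each
    angle as a point on a circle (flat torus metric, an assumption here)."""
    return math.hypot(circle_distance(p[0], q[0]), circle_distance(p[1], q[1]))

def square_distance(p, q):
    """Naive distance that treats [0, 2*pi)^2 as a plain square with edges."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Two configurations that are physically almost identical: the first arm sits
# just either side of the 0 / 2*pi seam.
p = (0.01, 1.0)
q = (TWO_PI - 0.01, 1.0)

print(square_distance(p, q))  # large: the square thinks they are far apart
print(torus_distance(p, q))   # small: on the torus they are neighbours
```

The square puts an artificial boundary where the physical system has none, and the torus does not.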
Unfortunately this is one of those things where that knowledge is not enough.
The GR model of spacetime is that it is locally Minkowski but globally a manifold of Minkowski patches.
Because, as the article explains, it's a torus (loop crossed with a loop), not a square (segment crossed with a segment).
Besides not treating readers like idiots, they take themselves seriously, hire smart people, tell good stories but aren't afraid to stay technical, and simply skip all the clickbait garbage. Right now from the Scientific American front page: "Type 1 Diabetes science is having a moment". Or from Nature: "'Biotech Barbie' says ..". Granted, I cherry-picked these offensive headlines pandering to facebook/twitter from many other options that might be legitimately interesting reads, but on Quanta there are also no paywalls, no cookie pop-ups, and no thinly-veiled political rage-baiting either.
For other publications they are beholden to people who haven't figured out ad-block, and your bar needs to be pretty low to capture that revenue.
I like this honestly because this shows that I learned something intelligent. On the other hand, if I don't feel exhausted after reading, it is a strong sign that the article was below my intellectual capacity, i.e. I would have loved it if I could have learned more.
Of course, most common or not, each case is different.
So write a text of at least 500 pages to explain the complexity. :-)
Looking at things from an abstract view does allow us not to worry about how we visualize the geometry, which is actually hard and sometimes counterintuitive.
In special relativity, for example, a huge amount of attention is typically given to the Lorentz transformations required when coordinates change. However, the (Minkowski) space that is the setting for special relativity is well defined without reference to any particular coordinate system, as an affine space with a particular (pseudo-)metric. It's not conceptually very complicated, and I never properly understood special relativity until I saw it explained in those terms in the amazing book Special Relativity in General Frames by Eric Gourgoulhon.
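As a small illustration of "the pseudo-metric is the real object, the coordinates are incidental", here is a Python/NumPy sketch. It assumes units with c = 1 and the (-, +, +, +) signature; the displacement components and the boost velocity are arbitrary illustrative values.

```python
import numpy as np

# Minkowski metric with signature (-, +, +, +), in units where c = 1.
ETA = np.diag([-1.0, 1.0, 1.0, 1.0])

def boost_x(v):
    """Lorentz boost along x with velocity v (|v| < 1, in units of c)."""
    gamma = 1.0 / np.sqrt(1.0 - v * v)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * v
    return L

def minkowski_product(x, y):
    """Minkowski pseudo-inner product of two displacement 4-vectors."""
    return x @ ETA @ y

dx = np.array([2.0, 1.0, 0.5, -0.3])   # (t, x, y, z) components in one frame
L = boost_x(0.6)
dx_boosted = L @ dx                    # components of the same displacement in another frame

# The components change, the interval does not.
print(dx, dx_boosted)
print(minkowski_product(dx, dx), minkowski_product(dx_boosted, dx_boosted))
```

The Lorentz transformations are exactly the linear maps that leave `ETA` unchanged, which is why the interval comes out the same in both frames.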
For tensors, the basis-independent notion is a multilinear map from a selection of vectors in a vector space and forms (covectors) in its dual space to a real number. The transformation properties drop out of that, and I find it much more comfortable mentally to have that basis-independent idea there, rather than just coordinate representations and transformations between them.
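Here is a rough NumPy sketch of how the transformation rules "drop out" of the multilinear-map definition. It takes a map T that eats one covector and one vector, and an arbitrary invertible change of basis P; all the numbers are made up for illustration.

```python
import numpy as np

# Components, in the old basis, of a map T(w, v) = w_i T^i_j v^j that takes
# one covector w and one vector v and returns a real number.
T = np.array([[1.0, 2.0, 0.0],
              [0.5, -1.0, 3.0],
              [2.0, 0.0, 1.0]])
v = np.array([1.0, -2.0, 0.5])    # vector components
w = np.array([0.3, 1.0, -1.5])    # covector components

# Arbitrary invertible change of basis: new basis vectors are the columns of P.
P = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
P_inv = np.linalg.inv(P)

v_new = P_inv @ v        # vector components transform with P^-1
w_new = w @ P            # covector components transform with P
T_new = P_inv @ T @ P    # forced on us if T(w, v) is to stay the same number

value_old = w @ T @ v
value_new = w_new @ T_new @ v_new
print(value_old, value_new)   # same number, different component arrays
```

The rule for `T_new` is not an extra axiom. It is the only choice that keeps the number T(w, v) independent of the basis.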
The issue is the level of mathematical sophistication one has when a certain concept is introduced. That often defines or at least heavily influences how one thinks about it forever.
The basics of special relativity came up in my first year of university, and the rest didn't really get focused on until my second year.
The first time around I was still encountering linear algebra and vector spaces, while for the second I was a lot more comfortable deriving things myself just given something like the Minkowski "inner product".
(As an aside: I really love abstract index notation for dealing with tensors)
That was one of the most interesting things of my EE/CS dual-degree and the exact concept you're describing has stuck with me for a very long time... and very much influences how I teach things when I'm in that role.
EE taught basic linear algebra in 1st year as a necessity. We didn't understand how or why anything worked, we were just taught how to turn the crank and get answers out. Eigenvectors, determinants, Gauss-Jordan elimination, Cramer's rule, etc. weren't taught with any kind of theoretical underpinnings. My CS degree required me to take an upper years linear algebra course from the math department; after taking that, my EE skills improved dramatically.
CS taught algorithms early and often. EE didn't really touch on them at all, except when a specific one was needed to solve a specific problem. I remember sitting in a 4th year Digital Communications course where we were learning about Viterbi decoders. The professor was having a hard time explaining it by drawing a lattice and showing how you do the computations, the students were completely lost. My friend and I were looking at what was going on and both had this lightbulb moment at the same time. "Oh, this is just a dynamic programming problem."
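For anyone who wants to see the "it's just dynamic programming" point, here is a minimal Python sketch of Viterbi. It is the generic hidden-Markov-model form rather than the convolutional-code decoder from that course, and the toy model (state names and probabilities) is purely illustrative.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a hidden Markov model.

    best[t][s] is the probability of the best path ending in state s after
    observation t. Each column depends only on the previous column, which is
    what makes this a dynamic programming problem.
    """
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, prob = max(
                ((r, best[t - 1][r] * trans_p[r][s]) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = prob * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace the stored back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]

# Toy model with made-up numbers.
states = ("Healthy", "Fever")
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
          "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}

path, prob = viterbi(("normal", "cold", "dizzy"), states, start_p, trans_p, emit_p)
print(path, prob)
```

The trellis drawn on the board is the `best` table, one column per time step.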
EE taught us way more calculus than CS did. In a CS systems modelling course we were learning about continuous-time and discrete-time state-space models. Most of the students were having a super hard time with dx/dt = A*x (x as a real vector, A as a matrix)... which makes sense since they'd only ever done single-variable calculus. The prof taught some specific technique that applied to a specific form of the problem and that was enough for students to be able to turn the crank, but no one understood why it worked.
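The "why it works" for dx/dt = A*x can be sketched in a few lines of NumPy: in the eigenvector basis the system decouples into independent scalar equations. This sketch assumes A is diagonalizable, and the particular A and x0 are illustrative choices, not anything from that course.

```python
import numpy as np

def solve_linear_ode(A, x0, t):
    """Return x(t) for dx/dt = A x with x(0) = x0.

    Assumes A is diagonalizable, A = V diag(lam) V^-1, so that
    x(t) = V diag(exp(lam * t)) V^-1 x0.
    """
    lam, V = np.linalg.eig(A)
    coeffs = np.linalg.solve(V, x0)   # x0 expressed in the eigenvector basis
    return (V @ (np.exp(lam * t) * coeffs)).real

# Illustrative system: x'' + 3x' + 2x = 0 written as a first-order system
# in the state (x, x'). The eigenvalues of A are -1 and -2.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
x0 = np.array([1.0, 0.0])

x_at_1 = solve_linear_ode(A, x0, 1.0)

# Closed-form solution of the same initial value problem, for comparison.
exact = np.array([2 * np.exp(-1.0) - np.exp(-2.0),
                  -2 * np.exp(-1.0) + 2 * np.exp(-2.0)])
print(x_at_1, exact)
```

Each eigen-direction obeys a single-variable equation dy/dt = lam*y, which is the part that only needs single-variable calculus.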
Having studied physics, I would disagree rather strongly. I only really started understanding Special Relativity once I had a clear understanding of the math. (And then it becomes almost trivial.) Those of my fellow classmates, however, who didn't take the time to take those additional (completely optional) math classes, ended up not really understanding it at all. They still got confused by what it all meant, by the different paradoxes, etc.
I saw the same effect when, later, I was a teaching assistant for a General Relativity class.
Eric Gourgoulhon is a product of the French education system, and I often think I would have done better studying there than in the UK.
I had started in a theoretical physics degree which was jointly taught by the maths and physics department. By my final year I had changed into an ostensibly pure maths degree, although I did it mainly to take more advanced theoretical/mathematical physics courses (which were taught by the maths department), and avoid having to do any lab work—a torsion pendulum experiment was my final straw on that one, I don't know what caused it to fuck up, but fuck that.
In the end I took on more TP courses than the TP students, nearly burnt out by the end of the year, and... didn't exactly come out with the best exam results.
Kip Thorne was also heavily influenced by this geometric approach. Modern Classical Physics by Thorne & Blandford uses a frame invariant, geometric approach throughout, which (imo) makes for much simpler and more intuitive representations. It allows you to separate out the internal physics from the effect of choosing a particular coordinate system.
(The comment I replied to mentioned both.)
Um, yes it is. "A foo is an object that transforms as a foo" is a circular definition because it refers to the thing being defined in the definition. That is what "circular definition" means.
When people say "a tensor is a thing that transforms like a tensor" they're using a convenient shorthand for the bit that I put in angle brackets above.
My favourite explanation is that "Tensors are the facts of the universe" which comes from Lillian Lieber, and is a reference to the idea that the reality of the tensor (eg the stress in a steel beam or something) is independent of the coordinate system chosen by the observer. The transformation characteristic means that no matter how you choose your coordinates, the bases of the tensor will transform such that it "means" the same thing in your new coordinates as it did in the old ones, which is pretty nifty.
https://www.youtube.com/watch?v=f5liqUk0ZTw&pp=ygURdGVuc29yc...
Yes, but the "convenient shorthand" only makes sense if you already know what a tensor is. That renders the "definition" useless as an explanation or as pedagogy. It's only useful as a social signal to let others know that you understand what a tensor is (or at least you think you do).
> My favourite explanation is that "Tensors are the facts of the universe"
That's not much better. "The earth revolves around the sun" is a fact of the universe, but that doesn't help me understand what a tensor is.
What matters about tensors are the properties that distinguish them from other mathematical objects, and in particular, what distinguishes them from closely related mathematical objects like vectors and arrays. Finding a cogent description of that on the internet is nearly impossible.
> the reality of the tensor ... is independent of the coordinate system chosen by the observer
Now you're getting closer, but this still misses the mark. What is "the reality of a tensor"? Tensors are mathematical objects. They don't have "reality" any more than numbers do.
> no matter how you choose your coordinates, the bases of the tensor will transform such that it "means" the same thing in your new coordinates as it did in the old ones
That is closer still. But I would go with something more like: tensors are a way to represent vectors so that the representation of a given vector is the same no matter what basis (or coordinate system) you choose for your vector space.
That's just incorrect though for a couple of reasons. Firstly, a vector in the sense in which it is used in physics is a rank 1 tensor so it has this transformation behaviour just like other higher order tensors. Secondly the representation is the thing that changes, but the meaning of that representation in the old basis and the new basis is the same. For example, if I take the displacement from me to the top of the Eiffel tower, I can represent that in xyz Cartesian coordinates or in spherical or cylindrical coordinates, or I can measure it relative to an origin that starts with me or at sea level at 0 latlong. The representation will be very different in each case, but the actual displacement from me to the top of the Eiffel tower doesn't change. What has happened is the basis vectors transform in exactly such a way as to make that happen. It's a rank 1 tensor in 3 dimensions because there is a magnitude and one direction (one set of 3 basis vectors) in whatever case.
Now if I want an example of a rank 2 tensor, think about a stress tensor. I have a steel beam which is clamped at both ends and a weight is on top of it. This is a tensor field. For every point in the beam there are different forces acting in each direction. So you could imagine the beam as made up of a grid of little Rubik's cubes. On each face of each cube you have different net forces. (E.g. at the middle of the beam the forces are mainly downwards due to gravity; at the ends of the beam, the fact that the middle of the beam is bowing downwards will lead to the "faces" that point to the middle of the beam being pulled towards the middle (transverse to the beam and slightly downwards), whereas the opposite face is pulled in the opposite direction because the ends of the beam are clamped.) So I need two sets of basis vectors. One set indicates the "face" experiencing the force, one set indicates the direction of the force. Now just like the vector/rank one tensor case I can represent those in whatever coordinate system I want, and my representation will be different in each case, but will mean the same sets of forces in the same directions and applied to the same faces because both sets of basis vectors will transform to make that true. I would call that a rank 2 tensor field because I would express it as a function from a set of spatial coordinates to a thing which has a magnitude and 2 directions (that's what I think of as the tensor). However I understand physicists and civil engineers and stuff just call the whole thing the stress tensor (not the stress tensor field). I could be wrong.
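A small NumPy sketch of the "two sets of directions" idea at a single point of the beam. The stress components are hypothetical numbers in arbitrary units, and the change of coordinates is just a rotation of the axes about z.

```python
import numpy as np

def rotation_z(angle):
    """Matrix taking old Cartesian components to components in rotated axes."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

# Hypothetical stress components at one point (symmetric, arbitrary units).
sigma = np.array([[10.0, 2.0, 0.0],
                  [2.0, -5.0, 1.0],
                  [0.0, 1.0, 3.0]])

n = np.array([1.0, 0.0, 0.0])   # unit normal picking out a "face"
traction = sigma @ n            # force per unit area acting on that face

R = rotation_z(np.deg2rad(30.0))
n_new = R @ n                   # one direction, one factor of R
sigma_new = R @ sigma @ R.T     # two directions (face and force), two factors of R
traction_new = sigma_new @ n_new

print(traction, traction_new)                   # different components
print(np.allclose(traction_new, R @ traction))  # same physical force vector
```

The array `sigma` on its own is just nine numbers. It represents a tensor only together with the rule for how those numbers change when the axes change.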
So what I mean when I talk about the reality of the tensor I mean whatever it is the tensor is expressing in the physical universe (eg the displacement from me to the tower or the stress in the beam). From a mathematical point of view I agree of course, mathematical objects themselves are purely arbitrary and abstract. But if you have a bridge and you want to make sure it doesn't buckle and fall down, the stress tensor in the bridge is a real and important fact of the universe that you need to have a decent understanding of.
Quite possible. But that's in no small measure because I have yet to find an actual cogent definition of "tensor" that distinguishes a tensor from an array. (I have a similar problem with monads.)
> So what I mean when I talk about the reality of the tensor I mean whatever it is the tensor is expressing in the physical universe
OK, but then "the reality of a tensor" not depending on the coordinate system has nothing to do with tensors, and becomes a vacuous observation. It is simply a fact that actual physical quantities don't depend on how you write them down, and hence don't change when you write them down in different ways.
https://en.wikipedia.org/wiki/Calabi%E2%80%93Yau_manifold
I'm not going to pretend to understand it all but they do make pretty pictures!
What makes Calabi-Yau manifolds special is that their curvature balances out perfectly so the space does not stretch or shrink overall.
In physics, especially in string theory, Calabi-Yau manifolds are used to describe extra hidden dimensions of the universe beyond the three we can see. The shape of a Calabi-Yau manifold affects how particles and forces behave, which is why both mathematicians and physicists study them.
Could you elaborate a bit on this? I find it fascinating. Thanks.
>The shape of a Calabi-Yau manifold affects how particles and forces behave [...]
Do you know if there's any experimental evidence of this?
Please correct me if I am wrong, I have not touched this subject in a long time and only have some intuition. Here is how I understand it:
A manifold is a kind of space that looks flat when you zoom in close enough. The surface of a sphere or a doughnut is a 2D manifold, and the space we live in is a 3D manifold. A Calabi-Yau manifold is one of these spaces but with more dimensions and extra symmetry that makes it very special.
In geometry there are several ways to describe curvature. The most complete one is the Riemann curvature tensor, which contains all the information about how space bends. If you take a specific kind of average of that, you get the Ricci curvature tensor. Ricci curvature tells you how the size of small regions in space changes compared to what would happen in flat space.
Imagine a tiny ball floating in this curved space. If the Ricci curvature is positive, nearby paths tend to come together and the ball’s volume becomes smaller than it would in flat space. If the Ricci curvature is negative, nearby paths move apart and the ball’s volume grows larger. If the Ricci curvature is zero, the ball keeps the same volume overall. So when I said “the space does not stretch or shrink overall” I was describing this situation: the Ricci curvature is zero, which means the space does not expand or contract on average compared to flat space.
The space can still have complicated twists and bends. Ricci curvature only measures a certain type of curvature related to volume change. Even if the Ricci tensor is zero, there can still be other kinds of curvature present. "The curvature balances out" is just an intuitive way to express that the volume effects cancel when you take the average that defines Ricci curvature. It does not mean the space has matching regions of positive and negative curvature in a literal sense, but rather that the mathematical combination producing Ricci curvature sums to zero.
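For reference, the volume statement above can be written out to leading order. These are the standard small-scale expansions as I remember them, in geodesic normal coordinates x centred at a point p of an n-dimensional Riemannian manifold, with R_{jk} the Ricci tensor, S its trace (the scalar curvature) and ω_n the volume of the unit ball in flat R^n:

```latex
% Volume element near p, in geodesic normal coordinates x
d\mu_g = \Bigl(1 - \tfrac{1}{6}\, R_{jk}(p)\, x^j x^k + O(|x|^3)\Bigr)\, dx^1 \cdots dx^n

% Volume of a small geodesic ball of radius r around p
\operatorname{Vol}\bigl(B_r(p)\bigr)
  = \omega_n r^n \Bigl(1 - \frac{S(p)}{6(n+2)}\, r^2 + O(r^4)\Bigr),
\qquad S = g^{jk} R_{jk}
```

When R_{jk} = 0, both correction terms vanish at this order, which is the precise version of "no net stretching or shrinking". Higher-order terms involve the full Riemann tensor, so the space need not be flat.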
Now back to the definition: A Calabi-Yau manifold is defined as compact (finite in size), complex, and Kähler (it has a compatible geometric and complex structure), with a first Chern class equal to zero. Yau’s theorem proves that such a space always has a way to measure distances so that its Ricci curvature is exactly zero. So when I said “the curvature balances out perfectly so the space does not stretch or shrink overall” I meant it as an intuitive description of this Ricci flat property. The space is not flat like a sheet of paper, but its internal geometry is perfectly balanced in the sense that there is no net expansion or contraction of space.
To my knowledge, there is no direct evidence that Calabi-Yau manifolds describe real extra dimensions. In string theory, these shapes are used because they fit the math and preserve symmetries like supersymmetry. Experiments have not found signs of extra dimensions or supersymmetric particles, so Calabi-Yau manifolds remain a beautiful theoretical idea, not something confirmed by observation.
Initially I recoiled at the thought of the stiffness of the CD, but of course you're absolutely right, at least for 2d manifolds.
You're thinking of open sets.
If it is the formal definition being used, then why? Do people actually reason about data manifolds using "atlases" and "charts" of locally euclidean parts of the manifold?
However the embedding space of a typical neural network that is representing the data is not a manifold. If you use ReLU activations the kinks that the ReLU function creates break the smoothness. (Though if you exclusively used a smooth activation function like the swish function you could maintain a manifold structure.)
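A quick numerical way to see the kink, as a plain-Python sketch. It compares one-sided finite-difference slopes at 0 for ReLU and for swish, taken here as x * sigmoid(x); the step size h is an arbitrary small value.

```python
import math

def relu(x):
    return max(0.0, x)

def swish(x):
    """Swish / SiLU activation, x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def one_sided_slopes(f, x0=0.0, h=1e-6):
    """Finite-difference slopes of f just to the left and right of x0."""
    left = (f(x0) - f(x0 - h)) / h
    right = (f(x0 + h) - f(x0)) / h
    return left, right

relu_left, relu_right = one_sided_slopes(relu)
swish_left, swish_right = one_sided_slopes(swish)

print(relu_left, relu_right)    # slopes disagree: a kink at 0
print(swish_left, swish_right)  # slopes agree (about 0.5): smooth at 0
```

A layer built from ReLUs is still continuous, just not differentiable along the surfaces where some pre-activation crosses zero.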
- For language, individual words might be discrete, but concepts being communicated have more nuance and fill in the gaps.
- For language, even to the extent that discreteness applies, you can treat the data as being sampled from a coarser manifold and still extract a lot of meaningful structure.
- Images of cars are more continuous than you might imagine because of hue differences induced by time of day, camera lens, shadows, etc.
- Images of cars are potentially smooth even when considering shape and color discontinuities. Manifolds don't have to be globally connected. Local differentiability is usually the thing people are looking for in practical applications.
Information Geometry of Evolution of Neural Network Parameters While Training
But I am skeptical whether this definition can be useful in the real world of algorithms. For example you can define things like topological data analysis, but the applications are limited, mainly due to the curse of dimensionality.
A continuous manifold will have a line element that allows you to compute distances between its points using its parameters. The simplest line element was first written down by Pythagoras I think; it allows you to compute the distance between two points in a flat manifold. In physics we do away with gravitational forces by realizing that masses move along geodesics (shortest paths) of a manifold, hence the saying, "matter tells spacetime how to curve and spacetime tells matter how to move". We stitch together large curvy manifolds like a patchwork quilt from the locally Euclidean tangent spaces that we erect at any point.
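To illustrate what a line element buys you, here is a small Python sketch that measures curve lengths on a sphere by summing ds along the curve. In the flat plane the line element is the Pythagorean ds² = dx² + dy²; on a sphere of radius R, in colatitude θ and longitude φ, it is ds² = R²(dθ² + sin²θ dφ²). The midpoint summation scheme and the two sample curves are my own illustrative choices.

```python
import math

def curve_length(theta, phi, radius=1.0, steps=10_000):
    """Approximate length of the curve t -> (theta(t), phi(t)), t in [0, 1],
    on a sphere, using ds^2 = R^2 (dtheta^2 + sin^2(theta) dphi^2)."""
    total = 0.0
    for i in range(steps):
        t0, t1 = i / steps, (i + 1) / steps
        t_mid = 0.5 * (t0 + t1)
        dtheta = theta(t1) - theta(t0)
        dphi = phi(t1) - phi(t0)
        total += radius * math.hypot(dtheta, math.sin(theta(t_mid)) * dphi)
    return total

# A full circle of constant colatitude pi/3 (a "parallel").
parallel_length = curve_length(lambda t: math.pi / 3, lambda t: 2 * math.pi * t)

# A quarter of a great circle, from the pole down to the equator (a "meridian").
meridian_length = curve_length(lambda t: 0.5 * math.pi * t, lambda t: 0.0)

print(parallel_length, 2 * math.pi * math.sin(math.pi / 3))
print(meridian_length, math.pi / 2)
```

Nothing here refers to the 3D space the sphere sits in. The two coordinates and the line element are enough.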
> The term “manifold” comes from Riemann’s Mannigfaltigkeit, which is German for “variety” or “multiplicity.”
Another is that when working with manifolds, you usually don't get a set of global coordinates. Manifolds are defined by various local coordinate charts. A smooth manifold just means that you can change coordinates in a smooth (differentiable) way, but that doesn't mean two people on opposite sides of the manifold will agree on their coordinate system. On a sphere or circle, you can get an "almost global" coordinate system by removing the line or point where the coordinates would be ambiguous.
I'm not very well versed in the history, but the study of cartography certainly predates the modern idea of an abstract manifold. In fact, the modern view was born in an effort to unify a lot of classical ideas from the study of calculus on spheres etc.
> On a sphere or circle, you can get an "almost global" coordinate system by removing the line or point where the coordinates would be ambiguous.
Applying cartography to manifolds: Meridians and parallels form a non-ambiguous global coordinate system on a sphere. It's an irregular system because distance between meridians varies with distance from the poles (i.e., the distance is much greater at the equator than the poles), but there is a unique coordinate for every point on the sphere.
Colloquially, this means a manifold is just "a bunch of patches of n-dimensional Euclidean space, smoothly sewn together."
A sphere requires at least two charts for an admissible atlas (say two hemispheres overlapping slightly at the equator, or six hemispheres with no overlaps), otherwise you get discontinuities.
> this global coordinate system isn't a continuous mapping (see the discontinuity of both angular coordinates between 2*pi and 0).
I'm guessing that the issue is that I don't know your definition of 'continuous'.
I believe every point on the planet (sphere, for simplification) has unique corresponding coordinates on the map projection (chart). The only exceptions I can see are, A) surfaces perpendicular to the aspect (perspective) of the projection, which is usually straight down and causes points on exactly vertical surfaces to share coordinates; B) if somehow coordinates are limited in precision or to rational numbers; C) some unusual projection that does it.
> A sphere requires at least two charts for an admissible atlas (say two hemispheres overlapping slightly at the equator, or six hemispheres with no overlaps), otherwise you get discontinuities.
There are cartographic projections that use two charts. Regarding those with one, where is the discontinuity in a Mercator projection? I think when I understand your meaning, it will be clear ...
The Mercator projection is obtained by removing two points from the sphere (both poles) and stretching the hole at each pole until the punctured sphere forms a cylinder, then cutting the cylinder along a line of longitude. So you can see that the 3 discontinuities in the Mercator projection correspond to the top and bottom edges (where we poked a hole at each pole) and the left/right edges (where we cut the cylinder). (Note that stretching the sphere at the poles changes the curvature, but cutting the cylinder does not. The projection would have the same properties on a cylinder.)
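A short Python sketch of where those discontinuities show up numerically. It uses the usual spherical Mercator formula on a unit sphere, y = ln(tan(π/4 + lat/2)), and the sample latitudes and longitudes are arbitrary.

```python
import math

def mercator(lat_deg, lon_deg):
    """Spherical Mercator on a unit sphere: returns map coordinates (x, y).
    Not defined at the poles (lat = +/-90), where y is unbounded."""
    lat = math.radians(lat_deg)
    x = math.radians(lon_deg)
    y = math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y

# Top and bottom edges: y grows without bound as we approach a pole.
for lat in (80.0, 89.0, 89.9, 89.99):
    print(lat, mercator(lat, 0.0)[1])

# Left/right edge: two points 0.2 degrees apart on the sphere land on
# opposite sides of the map.
x_east = mercator(0.0, 179.9)[0]
x_west = mercator(0.0, -179.9)[0]
print(x_east - x_west)   # nearly 2*pi, although the points are neighbours
```

So the map assigns unique coordinates to every point it covers, but neighbouring points across the cut get far-apart coordinates, and the poles get none.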
It is possible to continuously map the sphere to the entire (infinite) plane if you just remove a single point (the north pole): place the sphere so the south pole is touching the origin of the plane and for any point on the sphere, draw a line from the north pole through that point. Where that line intersects the plane is that point’s image under this mapping (called stereographic projection).
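Here is a minimal Python sketch of that construction, plus the mirror-image chart from the south pole, so the two together cover the whole sphere. It assumes a unit sphere centred at the origin with the plane at z = -1 (touching the south pole); the function names are made up.

```python
def chart_north(p):
    """Project from the north pole (0, 0, 1) onto the plane z = -1.
    Defined everywhere on the unit sphere except the north pole."""
    x, y, z = p
    return (2 * x / (1 - z), 2 * y / (1 - z))

def chart_north_inverse(uv):
    """Point on the unit sphere whose chart_north image is uv."""
    u, v = uv
    s = u * u + v * v
    return (4 * u / (s + 4), 4 * v / (s + 4), (s - 4) / (s + 4))

def chart_south(p):
    """Same construction from the south pole onto the plane z = +1.
    Defined everywhere except the south pole."""
    x, y, z = p
    return (2 * x / (1 + z), 2 * y / (1 + z))

def transition(uv):
    """Change of coordinates chart_north -> chart_south, valid where both
    charts are defined (uv != (0, 0))."""
    u, v = uv
    s = u * u + v * v
    return (4 * u / s, 4 * v / s)

p = (0.6, 0.0, 0.8)                  # a point on the unit sphere
print(chart_north(p))                # coordinates in the first chart
print(chart_south(p))                # coordinates in the second chart
print(transition(chart_north(p)))    # agrees with chart_south(p)
```

The transition map is smooth wherever both charts apply, which is the compatibility condition an atlas asks for.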
So manifolds are complicated shapes that are at large enough a scale that an ant (which species?) will think they're flat....ok
It becomes a bigger problem when the etymology is actually a chain of almost arbitrary naming decisions - how far back do I go?!
Names are important.
The naked term "manifold", in its modern usage, refers to a topological manifold, loosely a locally Euclidean Hausdorff topological space, which has no geometry intrinsic to it at all. The hyperbolic plane and the Euclidean plane are different geometries you can put on the same topological manifold, and this does not even depend on the smooth structure. In order to add a geometry to such a thing, you must actually add a geometry to it, and there are many inequivalent ways to do this systematically, none of which work for all topological manifolds.
This article breaks that loop, and it's refreshing to see a large topic not explained as an amalgamation of arcane jargon.
I don't think I can improve this silence.
Have a good day.
I can share my two take-aways.
- in the geometric sense, manifolds are spaces analogous to curved 2d surfaces in 3d that extend to an arbitrary number of dimensions
- manifolds are locally Euclidean
If I were to extrapolate from the above, I'd say that:
- we can map a Euclidean space to every point on a manifold and figure out the general transformation rules that can take us from one point's Euclidean space to another point's.
- manifolds enable us to discuss curved spaces without looking at their higher-dimension parent spaces (e.g. in the case of a sphere surface we can be content with just two dimensions without working in 3d).
Naturally, I may be totally wrong about all this since I have no knowledge on the subject...
Something's gone badly wrong here. "Without learning Cyrillic" is the normal way to learn Russian. Pick a slightly less prominent language and 100% of learners will do it without learning anything about the writing system.
Stand at one of the poles. Walk to the equator, turn 90 degrees. Walk halfway around the equator, turn 90 degrees again. Walk back to the pole. Now the triangle's angles sum to 360 degrees!
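A quick Python check of that walk, as a sketch. It computes the angle at each corner from the tangent directions of the great-circle arcs on a unit sphere. The exact halfway case is degenerate for this method (the two equator points are antipodal, so the arc between them is not unique), so the sketch uses 90 and 179 degrees of longitude and lets you see the trend towards 360.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def tangent_towards(a, b):
    """Unit tangent at a pointing along the great circle towards b.
    a and b are unit vectors that are neither equal nor antipodal."""
    d = dot(a, b)
    t = [bi - d * ai for ai, bi in zip(a, b)]
    norm = math.sqrt(dot(t, t))
    return [ti / norm for ti in t]

def vertex_angle(a, b, c):
    """Angle (radians) at vertex a between the arcs a-b and a-c."""
    cos_angle = dot(tangent_towards(a, b), tangent_towards(a, c))
    return math.acos(max(-1.0, min(1.0, cos_angle)))

def angle_sum_degrees(a, b, c):
    total = vertex_angle(a, b, c) + vertex_angle(b, c, a) + vertex_angle(c, a, b)
    return math.degrees(total)

def pole_equator_triangle(delta_lon_deg):
    """North pole plus two equator points delta_lon_deg apart in longitude."""
    lon = math.radians(delta_lon_deg)
    return (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (math.cos(lon), math.sin(lon), 0.0)

quarter_sum = angle_sum_degrees(*pole_equator_triangle(90.0))
near_half_sum = angle_sum_degrees(*pole_equator_triangle(179.0))
print(quarter_sum, near_half_sum)
```

The two corners on the equator are always 90 degrees, and the corner at the pole equals the longitude difference, so the sum is 180 degrees plus however far you walked around the equator.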