Okay, you can now create passable rich text documents for a limited (though common) range of purposes with that 8/24-bit breakdown that was suggested. But you may have noticed the author mentioned subscripts, which weren't in my list. Well, it turns out that subscripts and superscripts have a terribly limited range of applications if you are specifying them per character: x^2^2 would be visually identical to x^22, and x^a_b would look different from x_b^a (with both presentations being nonsensical). The use of subscripts and superscripts in any technical application would be severely limited. You need a much richer markup language to be truly expressive. So there really isn't much of a point in offering subscripts. Superscripts, sure, because they have a few non-technical uses.
Yet the reality is that people want a much richer set of formatting options. At a minimum, they want to select fonts and font sizes. Some of the formatting options have semantics. I know I crammed four levels of headings in those eight bits, but that only makes sense in headings. It doesn't make sense to specify it per character. Then there are other common document elements, like tables. You can create decent tables using monospaced fonts, but that is limiting and would produce undesirable results in some cases (try displaying April 5^th sensibly, using a monospace font so that it won't affect the width of the columns). On top of that, you are ditching the concept of styles because that implies some sort of markup.
Note that is 256 combinations. If you want both bold and italics, either it's one of the 256 combinations, separate from the bold-only combination and from the italics-only combination, or you need another 8 bits for each option.
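To make the arithmetic concrete, here is a minimal sketch of the 8/24-bit split, assuming the 8 style bits sit above a 24-bit code point and that each bit is an independent on/off flag. The flag names and the `pack`/`unpack` helpers are made up for illustration; nothing here comes from the article.

```python
# Hypothetical flag assignments: one bit per independent style toggle.
BOLD = 0b0000_0001
ITALIC = 0b0000_0010
UNDERLINE = 0b0000_0100

STYLE_BITS = 8        # assumed width of the style field
CODE_POINT_BITS = 24  # assumed width of the character field


def pack(code_point: int, style: int) -> int:
    """Combine a code point and a style byte into one 32-bit cell."""
    return (style << CODE_POINT_BITS) | code_point


def unpack(cell: int) -> tuple[int, int]:
    """Split a 32-bit cell back into (code_point, style)."""
    return cell & ((1 << CODE_POINT_BITS) - 1), cell >> CODE_POINT_BITS


# Bold and italic together is just one of the 2**8 = 256 possible style bytes.
cell = pack(ord("x"), BOLD | ITALIC)
code_point, style = unpack(cell)
print(chr(code_point), bool(style & BOLD), bool(style & ITALIC), bool(style & UNDERLINE))
```

With flags, eight bits buy eight independent toggles. If the byte is instead an enumerated list of styles, every combination such as bold plus italic has to be given its own entry out of the same 256.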
I think HN made a very aesthetically pleasing decision to exclude bold and underline. Imagine the appearance of comment pages if those were options.
I actually do think the author has a point, in that most solutions today are inelegant, but I also don't think this is a problem that has a truly elegant solution. Where to draw the line? Why not encode fonts into the standard too, if we're doing bold? Etc.
I'm still mostly in favour of keeping everything markdown (in my own writing), however much it pollutes the "purity" of text.
The name still maintains the confusion, as it tries to be an alternative to markup systems such as HTML, whose purpose was to introduce semantic clues for computers.
We all know how it went: the semantic part was entirely thrown away and markup was thoroughly abused for layout (HTML tables before CSS - CSS which also has little to do with "style" and more to do with typesetting and layout), to the point that no browser today can just show a table of contents based on the HTML heading tags.
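For what it's worth, the information needed for such a table of contents is easy to recover when authors do use the heading tags (`h1` to `h6`). A rough sketch using Python's standard `html.parser`; the class and function names are my own, and it assumes headings are not nested inside one another:

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collects (level, text) pairs for every h1..h6 element."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open heading.
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def table_of_contents(html: str):
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


page = "<h1>Intro</h1><p>Some text.</p><h2>Details <em>here</em></h2>"
for level, title in table_of_contents(page):
    print("  " * (level - 1) + title)
```

Of course this only works to the extent that the headings in the page reflect the document's actual structure, which is exactly the part that got thrown away.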
(Cf. how the cross-reference stream in PDF files makes it painful to edit objects in them, even when the files are nominally encoded in plaintext.)
He then goes into how a separate styling layer can assist with transcluding text from other people's work while modifying the style. But style variations are hardly the only legitimate changes typically made to direct quotations: people often want to modify capitalization or punctuation, elide portions, or insert bracketed notes. And at that point, you're modifying the content as well as the styling, so style-only modifications would be very limiting for that use case.
As for the structure layer, this would have the same issues as every other attempt in the last three decades to create a semantic web or whatever. Authors don't want to spend their time carefully curating metadata that 99.9% of readers won't care about, while bad actors want to game their relevancy metrics through any mechanism available.
HTML was never envisioned as a cross-platform rich-text format, and Markdown lacks almost half of all formatting features. Specialized JSON is even more evil because the content becomes unrenderable when the parent app goes out of existence.
OP's suggestion (accommodating formatting as Unicode bytes) might not be optimal, but I'm happy that at least somebody thought of this as a problem to solve.
𝐇𝐞𝐥𝐥𝐨, 𝐖𝐨𝐫𝐥𝐝!
𝐻𝑒𝑙𝑙𝑜, 𝑊𝑜𝑟𝑙𝑑!
ᴴᵉˡˡᵒ, ᵂᵒʳˡᵈ!
So, there's almost no bold/italic punctuation. And non-ASCII Unicode letters aren't "supported" this way either. But you can get quite far with "formatted" ASCII letters in Unicode, if you're so inclined.
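If you want to play with this, here is a quick sketch that maps ASCII letters onto the Mathematical Alphanumeric Symbols block. The function names are mine; the offsets are the block's bold and italic letter ranges, and everything that isn't an ASCII letter (punctuation, spaces, accented letters) is passed through unchanged, which is the limitation described above.

```python
# Start of each 26-letter run in the Mathematical Alphanumeric Symbols block.
BOLD_UPPER, BOLD_LOWER = 0x1D400, 0x1D41A
ITALIC_UPPER, ITALIC_LOWER = 0x1D434, 0x1D44E


def stylize(text: str, upper_base: int, lower_base: int) -> str:
    """Replace ASCII letters with their styled counterparts; keep the rest."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(upper_base + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(lower_base + ord(ch) - ord("a")))
        else:
            out.append(ch)  # punctuation, digits, non-ASCII letters: unchanged
    return "".join(out)


def bold(text: str) -> str:
    return stylize(text, BOLD_UPPER, BOLD_LOWER)


def italic(text: str) -> str:
    # The italic small h slot (U+1D455) is reserved; the character lives
    # at U+210E (PLANCK CONSTANT) instead.
    return stylize(text, ITALIC_UPPER, ITALIC_LOWER).replace("\U0001D455", "\u210E")


print(bold("Hello, World!"))
print(italic("Hello, World!"))
```

Keep in mind these are mathematical symbols, not styled text: search, screen readers, and spell checkers will generally not treat the output as the word "Hello".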
td { font-family:Verdana, Geneva, sans-serif; font-size:10pt; color:#828282; }
The author believes that plain text should encode bold, italic, etc., because that's all they had exposure to. Were the text written today, they would claim emojis belong in Unicode as well.
Most social media don't support it, but on Tumblr, for example, you can specify the color of the text and even choose a different font. I think there was some other social media that allowed you to have animated effects on the text as well, but I forgot the name.
Not sure what you mean; Unicode does contain emojis. That's what most platforms use for emojis now.
Markdown emerged to fulfill that "simple" hypertext document role. If you're writing READMEs and blog posts, you probably don't need more than that. And I think it's more accessible (certainly less error-prone) than HTML for most people.
If you need richer semantics, HTML5 is available. And if semantics are important to you, you're probably still using HTML5 as a rendering layer and your actual semantics are processed, stored and delivered in layers much more purpose-built for that.
Do you have tools that do this or an example?
I'm pretty happy with Markdown and mkdocs (on Linux) to manage and format my notes. VS Code does a pretty good job with this providing both a preview and facilitating linking between documents (both file and heading links.) I'm always open to something better.