His peers thought it was magic because they were unfamiliar with the concept of writing, not because his writing system was so efficient. He was put on trial for witchcraft because people thought he was communicating via magic. https://education.nationalgeographic.org/resource/sequoyah-a....
Slightly different from what I’d normally assume had happened from just reading the above comment.
Really impressive on his part, basically saw it was possible and looked at some examples of what others had done, then got to work.
I mean, that feels like it's bound to happen when an alphabet is built to represent current language or pronunciation. English is notoriously awful for not doing that.
Also, I don't know how you can claim Hebrew is phonetically represented by its alphabet rather than the other way around. As a revived language, its pronunciations are largely a matter of convention based on Yiddish. It would be more accurate to say that modern Hebrew uses an ancient writing system, which happens to be closely related to the ancestor of modern European alphabets.
See https://en.wikipedia.org/wiki/Revival_of_the_Hebrew_language
Also, proto-Sinaitic is not an alphabet. That's why Persian writing became harder to read when they switched from the nearly alphabetic Old Persian cuneiform to Aramaic abjad descended from proto-Sinaitic.
Proto-Sinaitic/Phoenician can be described as the “first alphabetic system,” Greek the “first true alphabet.”
Fun fact: Greek is the world’s oldest recorded living language.
The Greek alphabet has been in use for approximately 2,800 years; previously, Greek was recorded in writing systems such as Linear B and the Cypriot syllabary.
Phonetic alphabets were introduced to most of Asia by various Brahmic scripts; the most widely-used (albeit briefly-used) one being the Mongolian Phags-pa script [2], derived from Tibetan, derived from various Brahmic scripts, derived from Aramaic, derived from Phoenician, derived from — sure enough — proto-Sinaitic. Thai and Khmer are derived from Pallava [3], which is derived from Tamil-Brahmi, derived from other Brahmic scripts, again derived from Aramaic and thus eventually from proto-Sinaitic; etc etc.
1: https://en.wikipedia.org/wiki/History_of_the_alphabet
https://www.amazon.com/A-to-Z-Season-1/dp/B0CWCHTM3B
Episode 2 then covers the printing press.
[1] https://easypronunciation.com/en/english-phonetic-transcript...
A writing system that used strict phonetic transcription for everything would be unusably bad. Everyone pronounces words differently than the writing system prescribes, in every language. Words are shortened and blended together constantly in connected speech.
This is, for better or worse, what is being done to incorporate aboriginal names into things like streets and bridges in places like Vancouver.
- [stal̕əw̓asəm Bridge](https://en.wikipedia.org/wiki/Stal%CC%95%C9%99w%CC%93as%C9%9...) - [šxʷməθkʷəy̓əmasəm Street](https://vancouver.ca/news-calendar/musqueamview-street-signs...)
I see the practicalities of adopting this IPA-lite form, but it's a struggle to use, even though I've previously been trained in IPA.
What's happening with your example is just that the symbols chosen for the phonemic transcription are non-Latin so they're unfamiliar to read aloud and harder to type for non-speakers. What I meant was if we all wrote with all of our individual idiosyncrasies of speech without converging on a prescribed standard (a writing system separate from speech transcription).
"Amnu ge sum'm frum upsterz, gimmi u sek" but even more so, with IPA characters for all the 40-odd individual sounds of my dialect of English - then you write your response in the same level of phonetic detail. Exactly what a writing system shouldn't do.
*(or 7 or whatever number makes you feel best)
In Unicode, that's ſ and þ. Both historical English letters that are no longer used.
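For anyone who wants to check those two characters, here is a small Python sketch using the standard `unicodedata` module. The `describe` helper is just an illustrative name, not anything from the thread.

```python
import unicodedata

def describe(ch):
    # Format a single character as "U+XXXX OFFICIAL UNICODE NAME".
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

long_s = describe("ſ")  # the long s
thorn = describe("þ")   # the thorn

print(long_s)
print(thorn)
```

Both are ordinary code points in the Latin blocks, so they can be pasted or escaped like any other character even though no standard English keyboard layout has a key for them.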
The "ye" in "Ye Olde" is not the same word as the one in "Hear ye, hear ye!". That "ye" is a plural "you", basically the same word as "y'all", and it never had a thorn.
And while not encoded on a keyboard, it still blows my mind that English has a crazy number of past tenses - and such a bad hack of a future tense that it’s hard to classify as such.
Linguistics is fun. The accents are alright.
This was caused by the printing press and the typewriter (keyboard), both of which forced simplifications in written English.
He gives some rather cute examples, like the language of Finnegans Wake by Joyce being very low redundancy (high efficiency in your words). He also states that crossword puzzles don't work in a perfectly efficient language, that 50% redundancy is pretty good for 2-d puzzles, and 33% redundancy good for 3-d puzzles. This has always been one of my favorite and in my mind most random corollaries in a paper.
https://people.math.harvard.edu/~ctm/home/text/others/shanno...
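For a rough feel of what "redundancy" means here: Shannon defines it as one minus the relative entropy, i.e. the ratio of the source's actual entropy to the maximum possible with the same symbols. The sketch below is only a first-order estimate from single-letter frequencies over an assumed 27-symbol alphabet (a-z plus space). The function name and the alphabet choice are mine, and this crude estimate comes out far below the roughly 50% figure, which depends on longer-range structure in the text.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # assumed: 26 letters plus space

def first_order_redundancy(text):
    # Keep only symbols in the assumed alphabet; everything else is ignored.
    symbols = [c for c in text.lower() if c in ALPHABET]
    if not symbols:
        raise ValueError("text contains no symbols from the alphabet")
    total = len(symbols)
    counts = Counter(symbols)
    # Entropy in bits per symbol, from single-symbol frequencies only.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    max_entropy = math.log2(len(ALPHABET))
    # Redundancy = 1 - relative entropy.
    return 1 - entropy / max_entropy
```

A text that uses every symbol equally often scores 0 (no redundancy), and a text that repeats a single letter scores 1. Real English lands in between, and closer to the higher figures only once you account for context beyond single letters.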
For example, a language with a larger alphabet will be able to express more in fewer characters. Is that more efficient?
Similarly, you could think of each word as a sort of lookup table for information in the mind of the reader. We don't define words as we're writing, we expect the speaker to know them already. If a language has more words, each word is more precise, and fewer words can be used to express an idea—but is that efficiency? You're just relying on the reader having more preexisting knowledge.
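The alphabet-size point can be made concrete with a bit of arithmetic. Under the idealized assumption that every symbol is equally likely and independent (which no real language satisfies), a message carrying a given number of bits needs the smallest k such that alphabet_size^k >= 2^bits. The function below is a hypothetical helper for illustration only.

```python
def symbols_needed(bits, alphabet_size):
    # Smallest k with alphabet_size**k >= 2**bits, using exact integer math.
    if alphabet_size < 2:
        raise ValueError("alphabet must have at least 2 symbols")
    target = 2 ** bits
    k = 0
    capacity = 1
    while capacity < target:
        capacity *= alphabet_size
        k += 1
    return k

for size in (2, 26, 256):
    print(size, symbols_needed(100, size))
```

The symbol count drops as the alphabet grows, but the information carried is the same. The work has moved into the size of the symbol table the reader has to know, which is the lookup-table point made above.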
Now, if you want to say that they wrote in the same annoyingly pretentious way that AIs often do, I could agree with that...
https://www.smithsonianmag.com/arts-culture/review-of-the-pr...
Where do you think LLMs learned these things from? They are widely used in literary writing. Like magazines and books.
If anything, the length of that article shows how rarely em-dashes were used by most writers. They're like exclamatory versions of semicolons, a contrived sudden interruption, a sort of inversion of the three dot "…" ellipsis. Maybe the em-dash cracked and fell on the floor.
The reason LLMs use a lot of em-dashes is because that's a format they've chosen for output. Thinking that LLMs have a lot of em-dashes because works in the wild have a lot of em-dashes is like thinking that LLM output has a lot of emoticons because a lot of essayists use emoticons to mark subject divisions in the text.