129 points by simedw 4 hours ago | 26 comments
  • bunderbunder14 minutes ago
    This is very cool, but from one Mandarin learner to another I’d caution against relying too heavily on any external feedback mechanism for improving your pronunciation.

    If you can’t easily hear your pronunciation mistakes so clearly it hurts, consider putting more energy into training your ear. Adult language learners usually have brains that have become resistant to, but not incapable of, changing the parts of the brain responsible for phoneme recognition. The neuroplasticity is still there but it needs some nudging with focused exercises that make it clear to your brain exactly what the problem is. Minimal pair recognition drills, for example, are a great place to start.

    It’s not the most fun task, but it’s worth it. You will tighten the pronunciation practice feedback loop much more than is possible with external feedback, so a better accent is the most obvious benefit. But beyond that, it will make a night and day difference for your listening comprehension. And that will get you access to more interesting learning materials sooner. Which hopefully increases your enjoyment and hence your time on task. Plus, more accurate and automatic phoneme recognition leaves more neurological resources free for processing other aspects of your input materials. So it may even help speed things like vocabulary and grammar acquisition.

  • dapangzi3 hours ago
    Longtime lurker, made an account specifically to give feedback here as an intermediate speaker. :)

    This is a great initiative and I hope to see more come out of this; I am not criticizing, but just want to provide my user experience here so you have data points.

    In short, my experience lines up with your native speakers.

    I found that it loses track of the phonemes when speaking quickly, and tones don't seem to line up when speaking at normal conversational speed.

    For example, if I say 他是我的朋友 at normal conversational speed, it will assign `de` to 我, and sometimes it decides that I didn't produce the retroflex in `shi` and renders it `si`. I listened back to make sure I said everything; the phonemes are there in the recording, but the UI displays the wrong phonemes and tones.

    By contrast, if I speak slowly and really push each tone, the phonemes and tones all register correctly.

    Also, is this taking into account tone transformation (tone sandhi)? For example, third tones (bottom out tone) tend to smoosh into a second tone (rising) when multiple third tones are spoken in a row. Sometimes the first tone influences the next tone slightly, etc.

    Again, great initiative, but I think it needs a way to deal with speech that is conversationally spoken and maybe even slurred a bit due to the nature of conversational level speech.
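The third-tone rule mentioned above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not how the tool works: syllables are written as pinyin with a trailing tone digit (5 for neutral), the function name `apply_third_tone_sandhi` is made up for this example, and every third tone directly followed by another third tone becomes a second tone. Real speech groups longer runs of third tones by phrasing, which this ignores.

```python
def apply_third_tone_sandhi(syllables):
    """Return a copy of `syllables` with the basic third-tone sandhi applied.

    Each syllable is numbered pinyin such as "ni3" or "hao3".
    Simplification: a third tone becomes a second tone whenever the
    next syllable has an underlying third tone; phrase grouping is ignored.
    """
    result = []
    for i, syl in enumerate(syllables):
        # Look ahead at the underlying (unmodified) tone of the next syllable.
        next_is_third = i + 1 < len(syllables) and syllables[i + 1].endswith("3")
        if syl.endswith("3") and next_is_third:
            result.append(syl[:-1] + "2")
        else:
            result.append(syl)
    return result


if __name__ == "__main__":
    print(apply_third_tone_sandhi(["ni3", "hao3"]))
    print(apply_third_tone_sandhi(["ta1", "shi4", "wo3", "de5", "peng2", "you5"]))
```

A scorer could compare the speaker's tones against the output of a rule like this instead of the dictionary tones, so that a rising 你 in 你好 is not flagged as a mistake.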

    • mercanlIl24 minutes ago
      The tool definitely needs to address tone transformations; they're a big part of how the language is spoken. Otherwise it's mostly useful for a first-year student speaking in isolation.

      Hoping to see improvements in this area

    • sqsan hour ago
      I don't think it takes care of tone transformation (eg 你是 ni3shi4 -> ni2shi4). Or if it does, my tones are just off. But it's a really cool idea!
    • tifanan hour ago
      I had the same issue! Perhaps being another dapangzi is the problem here lol
  • ecshafer3 hours ago
    For anyone who is a native speaker of a European language and hasn't tried to learn Chinese or some other tonal language, it's really hard to understand how hard it is. The tones can be very subtle, and your ear is not fine-tuned to them. So you think you are saying it right, but native speakers have no idea what you are saying.
    • vjvjvjvjghvan hour ago
      Agree. It’s really hard. It also explains why a lot of people born in China tend to make serious pronunciation errors when speaking English or German. They are used to focusing on different things than us westerners.

      It took me a very long time to really understand how important tone is in Chinese.

    • danparsonsonan hour ago
      Wholeheartedly (or maybe downheartedly?) agree with this - sometimes I try to say the simplest things and people just stare at me like I'm speaking Martian. Which I suppose I might as well be! One of my big problems is implicit use of tones for things like expressing uncertainty; that's a very difficult habit to get out of.
    • laurieg2 hours ago
      For someone who hasn't grown up speaking a language with tones or pitches, the process of learning them can be maddening. I applaud anyone who makes tools like this to try to make the process easier.

      My experience in learning Japanese pitch accent was eye-opening. At the start, I couldn't hear any difference. On quizzes I essentially scored the same as random guessing.

      The first thing that helped me a lot was noticing how there were things in my native language (English) that used pitch information. For example, "uh-oh" has a high-low pitch. If you say it wrong it sounds very strange. "Uh-huh" to show understanding goes low-high. Again, if you reverse it it sounds unusual.

      The next part was just doing lots of practice with minimal pairs. Each time I would listen and try my best to work out where the pitch changed. This took quite a lot of time. I feel like massed practice (many hours in a day) helped me more than trying to do 10 minutes regularly. Try to hear them correctly, but don't try too hard. I didn't have any luck with trying harder to 'understand' what was going on. I liken it to trying to learn to see a new color. There isn't much conscious thought.

      The final piece of the puzzle was learning phrases, not individual words, that had pitch changes. For example: "yudetamago" could be boiled egg or boiled grandchildren. Somehow my brain just had a much easier time latching on to multi-word phrases instead of single words. Listening to kaki (persimmon) vs kaki (oyster) again and again seemed much harder.

      Of course, your mileage may vary with these techniques. I already spoke decent Japanese when I started doing this.

    • cyberax2 hours ago
      I'm a native Russian speaker, and I decided to learn Mandarin, because it's linguistically almost the opposite of Russian.

      I had no problems with tone pronunciation, but tone recognition was indeed much trickier. I still often get lost when listening to fast speech although I can follow formal speech (news) usually without problems.

    • dionian2 hours ago
      It's critical, because without proper tonal enunciation the words can be ambiguous.
  • memalign40 minutes ago
    I wish this had a pinyin mode…! I am learning to speak Mandarin but I am not learning to read/write.

    ( I’m learning using a flashcards web app I made and continue to update with vocab I encounter or need: https://memalign.github.io/m/mandarin/cards/index.html )

    • data_ders39 minutes ago
      same! but if you get it inevitably wrong the first time it gives you the pinyin. but i struggled to get it to transcribe the consonants I was making let alone the tones. i'm pretty sure i'm not as bad as that!
  • tifanan hour ago
    Well, it works only when I speak word by word, not in full sentences or at normal speed for daily conversation. The model thinks I am making mistakes when I speak casually (I am a native Chinese speaker with a Mandarin 2A certification, which is required for teachers and other occupations that require a very high degree of Mandarin accuracy). You wouldn’t really notice it, but pronunciation is very different between casual and formal speech…
  • vunderba3 hours ago
    When I was living in Taiwan, one of the ways I forced myself to remember to pronounce the tones distinctly was by waving my hand in front of me, tracing the arc of each character’s tone.

    It helped a lot even if I did look like an insane expat conducting an invisible orchestra.

    One more thing: there's quite a bit of variation in how regional accents in the mainland can affect tonal pronunciation. It might be worth reaching out to some native speakers to give you some baseline figures.

    • zdragnar3 hours ago
      In a university Mandarin class, one of the adult students (i.e. probably 40 or so) WAY over exaggerated his tones, to the point that the little old lady teaching us laughed out loud after one of his answers.

      A few years later, he had the most clean and consistent pronunciation out of anyone I'd been in a class with, and easily switched between the Beijing and other accents depending on which teacher we had on any given day.

      I rather regret not emulating him, even though I haven't really used it for nearly 20 years and have forgotten most of it.

      • ecshafer3 hours ago
        From a language learning standpoint that does make sense. Over-exaggerating while you are learning helps cement the idea, and then when you are speaking more naturally you will fall back into a regular kind of tone.
      • luckydata2 hours ago
        That's EXACTLY how I taught myself to speak with a Spanish accent from Madrid. I imitated the way TV celebrities spoke and the way the speakers on the metro announced the stations, and it gave me a base for how to use my mouth and throat appropriately. After a while I was able to tone it down, and my accent got so good that locals couldn't tell I wasn't Spanish - I had this cool party trick of pulling out my ID and showing them I was truly a foreigner!
    • sowbug14 minutes ago
    • simedw3 hours ago
      For accents, I’ve mostly tested with a few friends so far. I’m wondering whether region should be a parameter, because training on all dialects might make the system too lax.
    • devin2 hours ago
      This sounds like how solfège training works. You use a hand signal to indicate a specific tone: do re mi fa so la ti
    • cyberax2 hours ago
      Hand motions help! Especially when you want to memorize new words, because initially you need to treat tone as something additional to remember.

      I used simple index finger motions to mark tones.

  • rablackburnan hour ago
    > And if there’s one thing we’ve learned over the last decade, it’s the bitter lesson: when you have enough data and compute, learned representations usually beat carefully hand-tuned systems.

    There are still holdouts!

    Come back to me in a couple of decades when the trove of humanity's data has been pored over and drifted further out of sync with (verifiable) reality.

    Hand-tuning is the only way to make progress when you've hit a domain's limits. Go deep and have fun.

  • rahimnathwani3 hours ago
    This is incredible. When I was first learning Chinese (casually, ~20 years ago), my teacher used some Windows software that drew a diagram of the shape of my pronunciation, so she could illustrate what I was getting wrong in some objective way.

    The thing you've built is so good, and I would have loved to have it when I was learning Mandarin.

    I tried it with a couple of sentences and it did a good job of identifying which tones were off.

  • stuxnet79an hour ago
    How difficult would it be to adapt this to Cantonese? It is a surprisingly difficult language to learn. It has more tones than Mandarin plus comparatively less access to learning resources (in my experience)
  • ChadNauseam2 hours ago
    This is amazing. I'm also working on free language learning tech. I have some SOTA NLP models on huggingface and a free app. My most recent research is a list of every phrase [0].

    Pronunciation correction is an insanely underdeveloped field. Hit me up via email/twitter/discord (my bio) if you're interested in collabing.

    [0]: https://gist.github.com/anchpop/acbfb6599ce8c273cc89c7d1bb36...

  • affogarty3 hours ago
    This is extremely cool, although I asked my wife (who is Chinese) to try it out and it said she made some mistakes.
  • SequoiaHope2 hours ago
    Amazingly I just did the same thing! Only with AISHELL. It needs work. I used the encoder from the Meta MMS model.

    https://github.com/sequoia-hope/mandarin-practice

  • babyan hour ago
    For people trying to say the "j" sound correctly, as in "jiu" (old), just say "dz", so in that example "dziu"
  • byb2 hours ago
    Neat. A personal tone trainer. Seriously, shut up and take my money now. Of course, it needs a vocabulary trainer, and zhuyin/traditional character support.
  • jrockway2 hours ago
    Interesting application! A friend of mine built a model like this to help her make her voice more feminine, and it is neat to see a similar use case here.
  • bytesandbits2 hours ago
    great work! I am going to try it out. Currently about to learn some Mandarin to be able to talk with hawker stand owners for a trip I am doing soon. I am trilingual and can speak a few languages on top of that, but none of them tonal. I am new to tonal languages and I find myself struggling with this... a lot!
    • anonzzzies2 hours ago
      Good luck! I speak 6 languages fluently, but none of them tonal, and I find Mandarin very challenging. It does not help that people in places where you might need it are not very forgiving; asking for a green fork in a tea shop leaves people very bewildered.
  • nirvanatikku2 hours ago
    talk about 30 seconds to wow. great app, UX and demo. would love to use this. kudos.
  • jellojello3 hours ago
    This is amazing, if you feel like opening an entire language to being learned more easily.. Farsi is a VERY overlooked language, my wife/her family speak it but it's so difficult finding great language lessons (it's also called Persian/Dari)
    • simedw3 hours ago
      Thank you.

      I had a quick look at Farsi datasets, and there seem to be a few options. That said, written Farsi doesn’t include short vowels… so can you derive pronunciation from the text using rules?

      • kranner3 hours ago
        > written Farsi doesn’t include short vowels… so can you derive pronunciation from the text using rules?

        You can't, but Farsi dictionaries list the missing short vowels/diacritics/"eraab" for every word.

        For instance, see this entry: https://vajehyab.com/dehkhoda/%D8%AD%D8%B3%D8%A7%D8%A8?q=%D8...

        With the short vowel on the first letter it would be written حِساب (normally written as just حساب)

        The dictionary entry linked shows that there is a ِ on the first letter ح

        But you would have to disambiguate between homographs that differ only in the eraab.
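The dictionary lookup described above could look roughly like this. It is a sketch under assumptions: the vocalized spellings are taken from some dictionary source (not shown here), the short-vowel marks are Unicode combining marks, and the helper names `strip_short_vowels`, `build_index`, and `candidates` are invented for this example.

```python
import unicodedata
from collections import defaultdict


def strip_short_vowels(word):
    """Drop combining marks (Unicode category Mn), which covers the
    short-vowel diacritics, leaving the spelling as normally written."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


def build_index(vocalized_entries):
    """Map each unvocalized spelling to every vocalized form that shares it."""
    index = defaultdict(list)
    for entry in vocalized_entries:
        index[strip_short_vowels(entry)].append(entry)
    return index


def candidates(index, written_word):
    """Return the possible vocalized forms for a word as written in text.

    More than one result means the word is a homograph and needs
    context to disambiguate; an empty list means it is not in the index.
    """
    return list(index.get(written_word, []))


if __name__ == "__main__":
    # The example word from the comment above, spelled with a kasra (U+0650)
    # on the first letter; escapes are used to avoid bidi rendering issues.
    hesab = "\u062D\u0650\u0633\u0627\u0628"
    index = build_index([hesab])
    plain = strip_short_vowels(hesab)
    print(len(hesab), len(plain))
    print(candidates(index, plain) == [hesab])
```

Whenever `candidates` returns more than one form, the pronunciation cannot be recovered from the spelling alone, which is the homograph problem mentioned above.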

  • cmuguythrow2 hours ago
    Awesome idea!
  • iamanllman hour ago
    holy crap, I was literally imagining how I wanted something exactly like this yesterday! you are a hero!
  • btrlsnqtn3 hours ago
    The article mentions the bitter lesson. I'm confused about the status of Sutton's opinion of the bitter lesson. On the one hand, he invented the concept. On the other hand, he appears to be saying that LLMs are not the correct approach to artificial intelligence, which to a naive outsider looks like a contradiction. What gives?
  • drekipus3 hours ago
    instantly awesome.

    I suck at Chinese but I want to get better and I'm too embarrassed to try and talk with real people and practise.

    This is a great compromise. even just practising for a few minutes I already feel way more confident based on its feedback, and I feel like I know more about the details of pronunciation.

    I'm worried this might get too big and start sucking like everything else.

  • dionian2 hours ago
    it heard wu2 but i heard wo2 from you fine. and it should sound like wo2 not wo3 if spoken quickly. not a native speaker though so i could be wrong
  • funkyfiddler3693 hours ago
    [flagged]