The problem is that in life, we are accustomed to things becoming easier as we get better at them. So you start drawing faces and it starts out feeling very difficult, but then as you practice more and more, it feels easier and easier. Of course, by the time it's feeling easy, it means that you're no longer actually getting effective practice. But nevertheless, it's the feeling that we are accustomed to. It's how we know we're getting better.
Because spaced repetition is so good at always giving you things that you will find difficult, it doesn't actually feel like you're getting better overall even though you are. The things that you are good at are hidden from you and the things that you are bad at are shown to you. The result is a constant level of difficulty, rather than the traditional decreasing level of difficulty.
I've encountered this problem myself. I built a language learning app for fun, and some of my users feel like they're not learning very much compared to alternatives that don't use spaced repetition. In fact, it's the exact opposite. They learn much more quickly with mine, but they don't have that satisfying feeling of the lessons becoming easy. (Because if I gave them easy challenges, it wouldn't be as productive!)
I'm not sure what the best way to solve this problem is. I would much appreciate any advice.
People using the language learning app could try scheduling monthly video chats with native speakers (swapping roles halfway through so it's mutually beneficial) and noticing their proficiency improve.
It's hard to get around without marrying the SRS to something like a hierarchical skill tree whose traversal you can be made aware of, or some other visible progress metric (e.g., climbing the Elo of encountered puzzles in a chess training engine).
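To sketch what such a metric could look like: even without a skill tree, you can surface deck maturity from data the SRS already has. A minimal sketch in Python, where the Card shape is hypothetical and the 21-day "mature" threshold is borrowed from Anki's convention:

```python
from dataclasses import dataclass

@dataclass
class Card:
    interval_days: float  # the card's current scheduling interval

def progress_score(cards: list[Card], mature_at: float = 21.0) -> dict:
    # A card whose interval has grown past the threshold counts as
    # "mature" -- knowledge the scheduler now trusts you with long-term.
    mature = sum(1 for c in cards if c.interval_days >= mature_at)
    return {
        "total": len(cards),
        "mature": mature,
        "mature_pct": round(100.0 * mature / len(cards), 1) if cards else 0.0,
    }

deck = [Card(1.0), Card(35.0), Card(90.0), Card(4.0)]
print(progress_score(deck))  # {'total': 4, 'mature': 2, 'mature_pct': 50.0}
```

Plotting the mature count over time gives users the "number goes up" feeling that the reviews themselves deny them.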
Still: users have to get comfortable with being uncomfortable if they want to profit from these sorts of systems.
A different issue with SRS's laser accuracy is the Pareto tradeoff between efficiency and robustness.
I’ve always had difficulty remembering vocabulary. I remember cramming German in school 30 years back. We had 20 words to learn per week, and I could sit a whole night repeating and repeating them just because they wouldn’t stick. And then in the morning they were all gone anyway. So I gather I am a bad language learner.
In your algorithm, do you assume everyone’s recall is the same, or do you optimize for a recall rate which makes everyone fail a certain percentage of the words? If so, knowing that I am supposed to not remember 70% would be a good reminder in the app to not feel bad.
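For what it's worth, most modern schedulers do the latter: they target a fixed retention rate (often around 90%) for everyone, so a steady trickle of failures is by design, not a sign of being a bad learner. A minimal sketch of the idea, assuming a textbook exponential forgetting curve R(t) = exp(-t/S) rather than any particular app's model:

```python
import math

def next_interval(stability_days: float, target_retention: float = 0.9) -> float:
    # Schedule the next review for the moment predicted recall decays
    # to the target. Solving exp(-t/S) = r for t gives t = -S * ln(r).
    return -stability_days * math.log(target_retention)

# Same memory strength, different targets: lower retention = longer gaps.
print(round(next_interval(10.0, 0.9), 2))  # 1.05 days
print(round(next_interval(10.0, 0.7), 2))  # 3.57 days
```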
You want your users to be like weight lifters. No lifter comes out of the gym saying, “Man, that was the best workout, felt so easy.” To the contrary, lifters use progressive overload to induce difficulty, because that difficulty connects to the results they want.
For your users, you need some way to measure the outcome, so that you can show them, “hey, look, that mild discomfort led to more progress on what you care about,” and then you need to consistently message that some difficulty is good.
Mindset change takes consistency and time; it won’t happen overnight. You’ll know you succeeded when students become aware that “hey, I’m not learning as well if it doesn’t feel difficult,” and then react by increasing the challenge.
Serious answer: All the dark patterns.
Loot boxes, even if they only give users a digital hat; a small animated bird (like the green one, but not) doing a silly dance when users get enough correct answers; some weird phrases sprinkled amongst the lessons that make the users laugh.
Just, please let them have an off switch for people like us :)
Failing to do that, one might consider focusing not on how hard each card feels, but rather on the size of the corpus that they have "under their belt". This works if you're constantly adding new cards: if you are, and your cards/day is stable, then you have an ever-increasing mound of memorized knowledge.
If you aren't adding new cards, then the cards/day will inevitably go down, barring some actual cognitive issue.
It's a matter of what you focus on as your measure of success.
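Back-of-the-envelope, that mound grows fast. A toy calculation, where the flat long-run retention figure is an assumption for illustration:

```python
def corpus_size(days: int, new_per_day: int, long_run_retention: float = 0.9) -> int:
    # Steady habit, steady trickle of new cards; assume a fixed share of
    # everything studied stays retrievable (ignores lapses and relearning).
    return int(days * new_per_day * long_run_retention)

print(corpus_size(365, 10))  # 3285 items retained after a year at 10 new/day
```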
I want to find a way to show people how much they're learning. And maybe make it more fun. But what I really want to do is change their mindset from "wow, that was easy, I'm learning so much" to "wow, that was hard, I'm learning so much".
You fail miserably at the test at the start of each chapter, and crush it at the end.
The difficult part is deciding how the tests are spread out.
And for what it’s worth, I’ve been able to communicate enough basics in Mandarin with my wife’s family that they’re thrilled with me. So the learning in P is working somewhat.
IMO the way around users feeling like spaced repetition isn't progression is to redefine progression away from memorizing vocabulary and toward becoming proficient in conversation, both listening and speaking. If spaced-repetition vocab is just one feature of a holistic experience, users will judge their progression holistically.
I'm really waiting for that one app that finally connects ChatGPT Advanced Voice Mode or Gemini Live to a context-aware and well trained language tutor. I can already have impromptu practice sessions with both in Mandarin and English but they quickly lose the plot regarding their role as a tutor. I'd love to have them available as part of a learning journey. I can study vocab and flash cards all day but the second that voice starts speaking sentences and I need to understand in real time, I freeze up. The real progress is conversing!
Here’s what I typically do:
- Create a custom GPT (mine is called Polly the Glot) with a system prompt instructing it to act as a language partner that responds only in Chinese or your target language of choice. Further specify that the user will paste a story or topic before beginning practice, and that this should guide the discussion.
- Start a new chat.
- Paste in an article from AP/Reuters.
- Turn on Voice Chat.
At that point, I’ll head out to walk my dog and can usually get about 30 minutes to an hour of solid language practice in.
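For the curious, the system prompt doesn't need to be elaborate. Something along these lines works (illustrative wording, not the exact prompt behind Polly the Glot):

```
You are a patient conversation partner for a language learner.
Respond ONLY in Mandarin Chinese, never in English, unless the user
explicitly asks you to clarify a word. Before practice begins, the
user will paste a story or article; treat it as the topic that guides
the discussion. Ask one question at a time, keep replies short, and
briefly restate what the user said with corrected grammar before
moving on.
```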
Fair warning: you'll likely need to be at least an intermediate student by this point; otherwise it'll probably be over your head.
Caveat: you could include a markdown file of your known vocabulary as a knowledge attachment in the custom GPT, but I've no idea how well that would work in practice.
What helped me a lot was doing a lot of listening exercises. Start by concentrating on what you can recognize, not on what you can't. Then listen again and again and again, trying to recognize more and more.
But there's something about the "conversation" between a real human or an AI voice mode where you're not on the rails. It's real time and you have to lock in and understand. That's where the magic happens!
It seems to use a two-decade-old modification of a now four-decade-old algorithm, which will be worse and waste more of the user's time than using Anki with FSRS or SuperMemo with SM-18.
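For reference, the four-decade-old algorithm being alluded to is SM-2 (Wozniak, 1987), which most of its modern descendants still resemble. A minimal rendering of the classic version (variants differ in details, e.g. whether ease also drops on a failed card):

```python
def sm2_review(quality: int, reps: int, ease: float, interval: float):
    # quality: self-graded recall, 0 (total blackout) .. 5 (perfect)
    if quality < 3:
        return 0, ease, 1.0          # failed: start relearning tomorrow
    if reps == 0:
        interval = 1.0               # first success: see it again in 1 day
    elif reps == 1:
        interval = 6.0               # second success: 6 days
    else:
        interval = interval * ease   # after that, intervals grow geometrically
    # The ease factor drifts with answer quality, floored at 1.3.
    ease = max(1.3, ease + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return reps + 1, ease, interval
```

FSRS replaces this fixed ease arithmetic with a memory model fitted to your actual review history, which is where the efficiency gains come from.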
My use case is drilling English --> target-language sentences, as well as law-related knowledge, miscellaneous facts, etc. Still mulling over what to do about other skills-based practices, à la Andy Matuschak's concept of "spaced everything".
- Hashcards: https://github.com/eudoxia0/hashcards
- HN discussion about Hashcards: https://hn.algolia.com/?q=hashcards
- Matuschak's "spaced everything": https://notes.andymatuschak.org/Spaced_everything
Sentence practice is really the best way to do things imo. Studying vocabulary in isolation is so limited by comparison. So nice moves there.
[1]: Actually, my thing does target-language --> English drills, not the other way around.
Very similar to Anki but with a sane UI.
We keep forgetting stuff, but we can remember it better by actively recalling it. And there is evidence that recalling at intervals that grow is close to optimal. That’s it, really. Everything else is tooling on top of that simple fact.
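The whole idea fits in a few lines. A sketch where the doubling factor is purely illustrative (real schedulers tune the growth per card):

```python
interval, day = 1.0, 0.0
for review in range(1, 7):
    day += interval
    print(f"review {review} on day {day:g}")
    interval *= 2  # each successful recall roughly doubles the gap
# -> days 1, 3, 7, 15, 31, 63: six reviews cover two months of retention
```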
You fail the whole day.
Don't have the feeling anything sticks.
Then, the next day it works right from the start.
No new insights, nothing, it just works.
Not sure if it's factually correct, but it seems about right: sleep seems to be the magic sauce, the time when all memories are written from RAM to disk.
It seems to be a thing with practicing motion sequences.
Sure, you can sort of use SRS here, but it's suboptimal: it will probably leave too many cards in the top-priority "learning" pile, causing too much load, or you'll train incorrectly.
Still, I agree that this is MUCH better than NOT doing SRS if you don't have an alternate tool with a better algorithm.
1. It doesn't train real task performance. There is a spectrum of problems that people solve. On one end is the recall of randomized facts in a flashcard prompt->answer way. On the other end is task performance, which can be thought of more formally as finding a path through a state space to reach some goal. The prompt->answer end is what SR systems relentlessly drill you on.
2. SR is pretty costly, and prompt->answer problems are also low value. If you think about real-world scenarios, it's unlikely that you will come across a specific prompt->answer question, and if you do, the cost of looking it up is usually low.
3. The structure of the knowledge stored is very different (and worse). If you think about high performance on a real-world task like programming or theorem proving, you don't solve it by recalling lists of facts. It involves a lot of state-space exploration, utilising principles of the game, leveraging known theorems, and so on.
This is a more descriptive version of the "rote memorization" argument. There are two common counters to it:
1. Learning is memorization. This is strictly true, but the prompt->answer way of learning is a specific kind of memorization. There's a correlation-causation fallacy here: high performers trained in other ways can answer prompts really well, but it doesn't follow that answering prompts really well will make you a high performer.
2. Memorization is a part of high performance, and SR is the optimal way to do it. This is generally true, but the memorization component is often quite small.
These ideas more accurately predict how SR is only significantly better in specific cases where the value of prompt->answer recall is really high. This is a function of both the cost of failing to remember and the structure of the knowledge. So in medical exams, where you can't look things up and much of the testing takes the prompt->answer form, SR finds a lot of use.
My own guess for what the next generation of learning systems, the ones an order of magnitude more powerful, will look like is this:
1. Domain specific. You won't have a general system you chuck everything into. Instead you will have systems built differently for each task, but on similar principles (explained below).
2. Computation instead of recall. The fundamental unit of "work" will shift from recalling the answer to a prompt to making a move in some state space: taking a step in a proof, making a move in chess, writing a function, etc. (a toy sketch follows this list).
3. Optimise for first-principles understanding of the state space. A state space is a massive, often exponential tree. Human minds could not realistically solve anything in it if not for our ability to find principles and generalise them to huge swaths of the state space. This is closely related to meta-cognition: you want to be thinking about how you solve as much as solving specific instances of a task.
4. Engineered for state-space exploration. A huge and underdeveloped ability of machines is to help humans track massive state-space explorations, and to evaluate results and feed them back to the user. The most commonly used form of this today is git + testing suites. A future learning system could have a git-like system to keep track of multiple branches that are possible solutions, and the UX features to evaluate the results of each branch.
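To make point 2 concrete, here's a toy sketch of what a move-based drill might look like; every name in it is hypothetical, and the grading set would in practice come from an engine, proof checker, or test suite:

```python
from dataclasses import dataclass

@dataclass
class MoveDrill:
    # The unit of work is a move in a state space, not a recalled answer.
    state: str            # e.g. a chess position, a proof goal, a failing test
    good_moves: set[str]  # moves an engine or checker accepts from this state

    def grade(self, attempted_move: str) -> bool:
        return attempted_move in self.good_moves

drill = MoveDrill(state="tactics puzzle #1042", good_moves={"Nxe5", "Bxf7+"})
print(drill.grade("Nxe5"))  # True -> schedule this position further out
```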
Whether memorization is useful on its own largely depends on the task.