C++20 coroutines have always seemed very complex to me, and I haven't sat down to try to understand them. I guess tiny_coro could help with that. I do remember that they are stackless.
I'm not that keen on coroutine and async approaches to concurrency these days anyway. I'd rather use something with actual multitasking, either POSIX threads or lightweight processes like Erlang's or Go's.
Also, C++ itself is apparently in the process of sinking into a bog. Rust is taking over, which is not entirely good news. I had wanted to spend more time on Ada.
Erlang's processes and Go's goroutines are stackful, unlike C++ coroutines. Erlang also forbids observable data sharing between processes, which avoids a lot of pitfalls. I don't think that can be enforced in C++ or Go.
GHC lightweight threads and its STM library (software transactional memory) could be another thing to look at. I wonder if a useful STM feature is feasible for tiny_coro.
Since old Fortran does not have function pointers, I used to wonder how one would write a gradient descent routine for a user-defined function in Fortran.
In other languages, one typically passes the user-defined function and its gradient to the routine as callbacks, so that the routine can call them at will.
It was a moment of great amusement and joy (grinning like a silly kid) when I learned how Fortran did it. It was through a pared down form of a coroutine. Inversion of control.
The gradient descent routine would effectively suspend and 'return' with a flag indicating that it needs the user defined function computed, or the gradient function computed.
These are computed outside and then the routine is called again with these recently computed values. At this point it resumes from where it had yielded. Pretty neat.
The routine's return value indicated whether it was done, or whether it was suspended and expected to be called again to resume.
Now I don't remember if the suspended state was stored transparently by the runtime or explicitly in the arguments of the routine (pass by reference, I think), or whether the local, routine-specific 'program counter' was managed explicitly or transparently by the compiler and runtime.
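For anyone who wants to see the shape of it, here is a rough C++ sketch of that calling pattern (usually called reverse communication). It is not the Fortran routine I remember. It assumes the explicit variant: all suspended state, including a hand-rolled 'program counter', lives in a struct the caller passes by reference. Every name here (DescentState, Request, descent_step, make_state, minimize_example) is made up for illustration, and the step-halving rule is just a simple choice to give the function value something to do.

```cpp
#include <cassert>
#include <cmath>

// The flag the routine 'returns' with: what it needs from the caller next.
enum class Request { NeedValue, NeedGradient, Done };

// Everything that must survive between calls, owned by the caller.
struct DescentState {
    double x;        // point at which the caller should evaluate f or f'
    double fx;       // caller writes f(x) here when asked for a value
    double gx;       // caller writes f'(x) here when asked for a gradient
    double x_prev;   // last accepted point
    double f_prev;   // f at the last accepted point
    double step;     // current step size
    double tol;      // stop when |f'(x)| < tol
    int iter;
    int max_iter;
    int resume_at;   // explicit 'program counter'
};

DescentState make_state(double x0, double step, double tol, int max_iter) {
    DescentState s;
    s.x = x0; s.fx = 0.0; s.gx = 0.0;
    s.x_prev = x0; s.f_prev = 0.0;
    s.step = step; s.tol = tol;
    s.iter = 0; s.max_iter = max_iter;
    s.resume_at = 0;
    return s;
}

// One 'resume'. Each return is a suspension point; the next call picks up
// at the case stored in resume_at.
Request descent_step(DescentState& s) {
    assert(s.step > 0.0);
    switch (s.resume_at) {
    case 0:  // start: ask for f at the initial point
        s.resume_at = 1;
        return Request::NeedValue;
    case 1:  // got f(x0): remember it, ask for the gradient
        s.f_prev = s.fx;
        s.resume_at = 2;
        return Request::NeedGradient;
    case 2:  // got f'(x): stop, or propose a trial point
        if (std::fabs(s.gx) < s.tol || s.iter >= s.max_iter) {
            s.resume_at = 4;
            return Request::Done;
        }
        s.x_prev = s.x;
        s.x = s.x_prev - s.step * s.gx;
        ++s.iter;
        s.resume_at = 3;
        return Request::NeedValue;
    case 3:  // got f at the trial point: accept it, or halve the step and retry
        if (s.fx > s.f_prev) {
            s.step *= 0.5;
            s.x = s.x_prev - s.step * s.gx;  // gx is still f'(x_prev)
            return Request::NeedValue;
        }
        s.f_prev = s.fx;
        s.resume_at = 2;
        return Request::NeedGradient;
    default: // already finished
        return Request::Done;
    }
}

// Caller side: the user's function and gradient are computed out here,
// between calls, instead of being passed in as callbacks.
double minimize_example() {
    DescentState s = make_state(0.0, 0.1, 1e-8, 1000);
    for (;;) {
        switch (descent_step(s)) {
        case Request::NeedValue:    s.fx = (s.x - 3.0) * (s.x - 3.0); break;
        case Request::NeedGradient: s.gx = 2.0 * (s.x - 3.0);         break;
        case Request::Done:         return s.x;
        }
    }
}
```

Calling minimize_example() drives the loop for f(x) = (x - 3)^2 and ends up close to x = 3. The switch on resume_at is exactly the bookkeeping a compiler would generate for a stackless coroutine, just written out by hand.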