It's definitely impressive to write a large language model, an image generator, and a listening model entirely by hand.