Show HN: Audiogen – a new take on generative music AI (audiogen.co)

3 points by elyxlz 2 hours ago | 3 comments

elyxlz 2 hours ago
Hi HN, Elio here, co-founder of Audiogen.
We've spent 2.5 years building a generative music model and a brand-new interface for making music. We are a tiny team of self-taught researchers and seasoned engineers and designers. We all love and make music, and felt unsatisfied with how existing music models were deployed and used.
We started with a long, grueling journey of building and refining the whole stack to pretrain and post-train our music model on a shoestring budget. Two key lessons along the way: first, it is a trap to think that compression and reconstruction quality in the VAE / audio codec matter most, when what really matters is the downstream learnability of the latents; second, hyperparameter-optimal scaling laws are the best way to ablate training-recipe and architecture experiments.
We finally reached a level of quality with our models that we believe punches well above its compute budget (~$75k for our flagship pretrain). The models are Aphrodite, our audio codec (~3 kbps); Apollo, our diffusion transformer; and Virgil, our synthetic audio captioner.
Our next step was to rethink the HCI layer for music models. We rebuilt the DAW / timeline experience from first principles into a beginner-friendly web GAW (Generative Audio Workstation) where inpainting, extending, remixing, and multi-track / stem editing are as intuitive as "painting" on the screen. We aimed for an experience that stays in touch with the spirit of creation, where the creator still walks away from a finished song feeling pride in what they made, yet one accessible enough for anyone to feel the joy of making music.
Come check it out!
Manifesto + launch video on X: https://x.com/audiogen/status/2059297892465062386?s=20
Audio samples + product demo: https://audiogen.co/demos
Waitlist: https://audiogen.co/waitlist
Happy to answer questions on the architecture, infra, or anything else.
delis-thumbs-7e an hour ago
Yeah, but why? To save ad budget on background music for a cereal commercial? All this is so lame, like if you want to play most of these genres just start a band with your mates…
Wouldn’t it make more sense for AI music to do something that human beings genuinely cannot do, rather than copy us? There’s nothing as soulless as an AI track that you call soul…
- elyxlz an hour ago
  These are demos meant only to show the core capabilities of the model. Check out our Twitter thread to see examples of our product; the goal is not to one-shot entire songs!