I started using it to generate songs that reinforce emotional regulation strategies - things like grounding, breathwork, staying present. Not instructional tracks, which would be unbearable, but actual songs with lyrics that reflect real practice and skills.
It started as a way to help me decompress after therapy. I'd listen to a mini-album I made during the drive home. Eventually, I'd catch myself recalling a lyric in stressful moments elsewhere. That was the moment things clicked. The songs weren't just a way for me to calm down on the way home; they were teaching me real emotional skills I could use in all parts of my life. I wasn't consciously practicing mindfulness anymore; it was showing up on its own. Since then I've been iterating: writing lyrics that reflect emotional-cognitive skills, generating songs with them, and listening while I'm in the car. It's honestly changed my life in a subtle but deep way.
We already have work songs, lullabies, marching music, and religious chants - all music that serves a purpose besides existing to be listened to. Music that exists to teach us ways of interacting is a largely untapped idea.
This kind of functional application is exactly what generative music is perfect for. Songs can be so much more than terminally romantic lyricists trying to speak to the lowest common denominator. They can teach us to be better versions of ourselves.
Still, I’m excited about the product. The composer could probably use some chain of thought if it doesn’t already, and plan larger sequences and how they relate to each other. Suno is also probably the most ripe for a functional neurosymbolic model. CPE wrote an algorithm on counterpoint hundreds of years ago!
https://www.reddit.com/r/classicalmusic/comments/4qul1b/crea... (Note the original site has been taken over, but you can access the original via way back. Unfortunately I couldn’t find a save where the generation demo works…but I swear it did! I used it at the time!)
https://suno.com/playlist/d2886382-bcb9-4d6d-8d7a-78625adcbe...
https://music.apple.com/au/album/breath-of-the-cosmos/175227...
https://open.spotify.com/track/0mJoJ0XiQZ8HglUdhWhg2F?si=tID...
But I really think they've made a mistake with direction. Realistically it should've been trained on tracker files and build songs via that method (but generate the vocals and individual instrument sounds for MIDI, obviously).
I think the quality would be higher since the track could essentially be "rendered" out, but also only then would it be a useful tool for actual musicians: they could get a skeleton file (mod, etc.) for a song built up that they can then tweak and add a human touch to.
A friend recently made a simple app/site where he can pick the BPM of the music (he likes running to 180 BPM tracks), see a bunch of matching songs, create a quick playlist, load it onto Spotify with one click, and go running.
That got me thinking it would be cool to use Suno/AI to create activity-specific songs - songs about/for running, biking, studying, working, or painting - instead of trying to curate popular hit songs to fit the task.
https://suno.com/playlist/e6c3f3d1-a746-4106-bea1-e36073d227...
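The BPM-filter idea my friend built is simple to sketch. Here's a minimal, hypothetical version in Python - the track list and field names are invented for illustration, not from any real API:

```python
# Hypothetical sketch of the BPM-filter playlist idea.
# Track data and field names are invented for illustration.
tracks = [
    {"title": "Track A", "bpm": 180},
    {"title": "Track B", "bpm": 128},
    {"title": "Track C", "bpm": 178},
]

def running_playlist(tracks, target_bpm=180, tolerance=5):
    """Keep tracks whose tempo is within `tolerance` BPM of the target."""
    return [t for t in tracks if abs(t["bpm"] - target_bpm) <= tolerance]

playlist = running_playlist(tracks)
```

In a real version you'd pull tempo metadata from a music service and push the result to a Spotify playlist, but the core is just this one filter.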
Side note: It feels a little vulnerable to be sharing these. They genuinely helped me through difficult times and I wasn't really expecting anyone else to ever listen to them.
The first time I heard it, it was incredible. The second wedding that did it, it started to feel boring. By the third, everyone hated it.
As with image generation, we're getting tired of cookie-cutter art really fast. I don't know how to feel about it.
That's not a tool issue. It just means that working on a raised floor is not the same as being able to reach a higher ceiling.
I don't know. Scrolling the Sora image generations feed is pretty fun.
It's got trendy memes, lots of mashups, and cool art. They've also managed to capture social media outrage bait on the platform: conservatives vs. liberals, Christians vs. atheists, and a whole other host of divisive issues that are interspersed throughout the feed. I think they have a social media play in the bag if they pursue it.
It feels like Sora could replace Instagram.
Weird, stupid things. Writing theme songs for TV shows that don't exist, finding ways to translate song types from culture A to culture B, BGM for a video game you want to make, a sales song for Shikoku 1889 to sell Iyo railway shares, etc...
Some of us have zero cultural influence and services like Suno mean we aren't listening to the original brainrot (popular music). Sure, you might create garbage but it's your garbage and you aren't stuck waiting for someone to throw you a bone.
I love Suno, it's a rare subscription that is fun.
I'm pretty sure that I actually could, if I really wanted to, create this cover legitimately and even put it on Spotify with royalties going to the original artists (it seems they have a blanket mechanical license for a lot of works). But it was a "gag" song that probably has a market of just me, so hiring a team of people would be a lot of time and money for 3 minutes of a giggle. I also would have to worry about whether it's changed too much to count as a cover, and about getting sued for putting in extra effort.
That being said, your idea isn't original; there's already a flood of automated AI-generated cover songs being pushed onto Spotify, and they + distributors are (allegedly) starting to actively combat this.
I'll say that this "Suno" thing makes good-sounding music to these non-musician ears right here, but trying a few of these I'm starting to notice it seems fake. But that's not very interesting. What's interesting is that they're going to get good enough to get past the phoniness.
> I don't know how to feel about it.
I know how I feel about it: I don't like it one bit.
It's personal taste, but this is significantly better than the last couple of fiction books I've read (which were both well reviewed).
I think it's good enough that it's hard to argue it's emotionally vacuous, unless you define that to mean 'it was written by a machine'.
I think increasingly we'll find AIs are extremely good at emotional 'manipulation' (which I mean in the same sense as how a good tearjerker is, in some sense, emotionally manipulative).
I have a feeling that’s by design. Firstly for computation purposes, secondly to avoid someone making a studio-quality deepfake song.
edm anti-folk is also great: https://suno.com/song/47f0585c-ca41-4002-9d7f-fe71f85e0c62
But I'll bet you anything the average ear won't care. This music is already as good as what's producible by humans, and will be available for a fraction of the cost without the licensing fees. Be ready to see it popping up everywhere - in cafes, restaurants, on TV ads, on your next Spotify playlist. These soundtracks will become ubiquitous, and these cries of protest in communities like this one will become a marginalized minority, just like with any technology that's been superseded.
Human-generated music will still exist, of course, as the deep emotional ties that humans feel towards others (artists) cannot be replaced by this technology. But there are massive use cases where that type of emotional connection is not necessary (everything I noted prior, but also game & movie soundtracks, waiting areas, TV shows, etc.), where I would place a strong bet that this will eventually become even more commonplace than human-generated music.
I can't begin to explain how taken aback I am by this comment - to call others cynical and come out with this. Do you think these currently feature the work of anonymous composers of unloved background music?
Have you heard of the likes of Nobuo Uematsu, Yoko Shimomura, Lena Raine, John Williams, Clint Mansell, Ennio Morricone, Ramin Djawadi, Max Richter, holy shit Max Richter, have you never been pierced to your very core by something like On The Nature of Daylight used to perfect effect at an emotional climax in a movie or TV show? The hair on my arms is standing up just thinking about it.
This is the shallowest sampling of a group of people who are beloved for their work in these media, and likely will be for lifetimes beyond it.
As you get higher up the ladder budget-wise and quality-wise, I think it's an open question as to what will happen. Working with humans introduces another variable: it adds expense, time, complexity, and so on. Not every producer, 100% of the time, is going to think this tradeoff is still worth it when they can generate something of similar quality without much effort. Universal has already announced they're doing something similar for script writing. It just seems natural that human-made music is also on the chopping block.
Yes, this does sound like the enshittification of everything, and I'm certainly not advocating for this course of events at all. But given how capitalism works and how the human mind works, it just seems like the direction things are likely to go, given how capable this technology is.
I can't say I fully agree with you on video game/movie soundtracks, but I think AI generated assets will make game development more accessible, especially for solo developers or small teams.
So I'd just say listen to what you like and see where technology leads us. I don't think human creators will be put out of business any time soon, but they might get competition, especially in 'functional' music.
I guess different strokes but some of the best music I've ever been turned on to just happened to be playing in some random cafe or coffee shop. Conversely if the music is bland and uninspired I'm much less likely to go back.
"Don't set out to raze all shrines - you'll frighten men. Enshrine mediocrity - and the shrines are razed."
On the contrary, this comment is peak HN: the reverse take that a machine eating human art and creativity, then selling the interpolated derivatives back to you after laundering away the royalties, is a common good.
> Let's all tell ourselves that music requires that 'special human touch' or audiences will become bored, unimpressed and uninterested.
Ah those pesky humans, as opposed to corporate subscription-based content farms as the certified non-cynical happy future of music & art.
> This music is already as good as what's produceable by humans
It’s about as good as everything else generated with AI, like say a large code base. If you can’t tell, congrats.
And you wonder why people are cynical? Do you really think the best answer to IP law and the blandness of pop music is making it so cheap that it's available to everyone?
> But, there are massive use cases where that type emotional connection is not necessary (...) game & movie soundtracks, in waiting areas, tv shows
What will be the point of watching a movie or TV show, then? What will be the point of making one?
Did I say I'm advocating for this future? I'm simply stating an observation, and a likely outcome based on plenty of precedent of similar behavior in industry.
> What will be the point of watching a movie or TV show, then? What will be the point of making one?
I'm sure Michael Bay would consider what he does to be artistic expression; whilst others would say it's a semi-shameless money-grab. Half joking, don't take me too seriously.
I'm trying to use Suno 3.5 to create low quality 90s/00s-style MIDI music (similar to Vektroid[1]) since that's my favourite genre. Ironically, what it created[2] still doesn't properly evoke the hollow, tinny, low-quality computer-generated sound that I want to hear.
Specifically, it reduced the number of instruments, so the final result still sounded good. It didn't mash a bunch of MIDI instruments together and create something just a little incoherent that implies it was based on something better.
I think humans are better at stealing/remixing other songs and making them deliberately worse.
Tastes will probably change to be into whatever AI is unable to generate effectively and this seems similar to Stable Diffusion's inability to generate ugly people.
The tech is great, it's the music it produces that's unoriginal, uninspiring and bland.
> Be ready to see it popping up everywhere—in cafes, restaurants, on TV ads, on your next spotify playlist.
So music will become even shittier going forward? Yay tech! Thanks for automating away the act of music listening!
Seriously though, now that music-as-a-product has been killed by techbros, can we go back to music-as-an-act, like before? That would be the silver lining.
Human-made music will continue to exist, but for me - just for me - a lot of it doesn't do anything at all, and I wouldn't be able to tell whether it's by a human, let alone know the story behind it or the emotional connection the author had when making that piece of music. I'm sure many people with a better ear will be able to differentiate, but many others will not. You may say it's depressing; I call it reality.
If anyone here has a subscription and they can spare the tokens, I think it would be fun if someone shared a song about Hacker News.
I'm hoping that in the future tools like Suno will allow you to produce / generate songs as projects which you can tweak in greater detail; basically a way of making music by "vibe coding". With 4.0 the annotation capabilities were still a bit limited, and the singer could end up mispronouncing things without any way to fix or specify the correct pronunciation. This blog post mentions that with 4.5 they enhanced prompt interpretations, but it doesn't actually go into any technical details nor does it provide clear examples to get a real sense of the changes.
We can do better on user instruction for sure, duly noted. In my experience a lot of different stuff works (emotions, some musical direction sometimes, describing parts/layers of the track you want to exist, music-production-ish terminology, genres, stuff like intro/outro/chorus), but I think of it more as steering the space of the generated output rather than working 100% of the time. This can go in the style tags or in [brackets] in the lyrics. Definitely makes a difference in the outputs to be more descriptive with 4.5.
Your comment inspired me to upgrade it to 4.5 because it did have that AI tinny quality. https://suno.com/s/tbZlkBL7XeLVuuN0
It sounds better but has lost some magic.
Here is the original comment - https://news.ycombinator.com/item?id=39997706
In that spirit, from the same “artist” here is your comment - https://suno.com/s/AumsIqrIovVhT0c9
And
https://suno.com/s/YGlpHptX6yXJVpHq
Not sure which I like more.
I'm getting déjà vu with that vocal part at 0:40. I've definitely heard something similar somewhere.
For comparison, here's a song where I forced myself to do everything within Suno (took less than a week):
https://www.youtube.com/watch?v=R6mJcXxoppc
And here's one where I did the manual composition, worked with session artists, and it took a couple months and cost me several hundred dollars:
I'm not sure if this is solvable, but I think it should be a bigger research topic. If anyone knows of any papers on this, I haven't found one yet (not sure what to search for).
What's the basis for this? Unfortunately it's hard to describe, but I've listened to a wide variety of popular and niche genres my whole life with a specific eye toward appreciating all the different ways people appreciate music and I know when something feels new.
Even most (or all?) pop music feels new. If it wasn't, I don't think it would be popular. Sure, it's all derivative, but what makes music enjoyable is when it combines influences in a fresh way.
"French house polka" achieved by doing a statistical blend of the two genres just isn't that interesting—it misses so many ways that things are combined by artists—specific details of the texture of the sound, how it's recorded, cultural references, tons of tiny little influences from specific artists and other genres, etc.
I've tried very specific prompts in Suno and it's not even close to useful for someone who knows what they're doing. The image generators are hardly better—things overwhelmingly trend toward a fixed set of specific styles that are well-represented in the training sets.
This critique falls down in certain areas though. Using tools like Suno to come up with songwriting ideas can be fantastic—as long as you have the taste to curate the outputs and feed it into your own creative process. It's also fantastic for creating vocal samples, and I'm sure it'll crush it for functional music (advertisements, certain types of social media) in short order
Suno currently is limited architecturally to in-distribution components, so trying to create instruments or vocal styles it has never heard won't work. But the parts you can work within form a vast and rich creative space.
https://chatgpt.com/s/m_6815c0cb1fd881919867820db3ee3850 https://chatgpt.com/s/m_6815c027e10c8191936ac1e86adb5bcd https://chatgpt.com/s/m_6815bf6174488191b2b93498059c4352 https://chatgpt.com/s/m_6815be7921b4819191544de9f06b336f https://chatgpt.com/s/m_6815bf16568081918ef3d26d0572701d https://chatgpt.com/s/m_6815c00655888191a717f4ef8f895af3 https://chatgpt.com/s/m_6815c0583198819184c4f01b8fa26ae0 https://chatgpt.com/s/m_6815c0acb6fc8191a8a2ec0c71df307a https://chatgpt.com/s/m_6815c233ef00819189e0713abd0f3080 https://chatgpt.com/s/m_6815c265b0948191801b5c970451d705
For bonus points.. the tail-less crocodile.. https://chatgpt.com/s/m_6815c29c1bf081918c791447441c400e
You just lack imagination. Like I said, there are almost no limits currently except pixel density and aspect ratio.
P.S. Do you mind sharing the prompt you used to do the crocodile?
https://chatgpt.com/share/681855ce-a9d4-8004-9a3a-deb19994d8...
Edit: I was really just testing to see how well it could do layout and out-of-distribution images. This was right when it came out and I was trying to see what the limitations were.
Like I said, you know nothing about making art. (Not a big deal, not everybody needs to.)
In an alternative parallel universe we might have gotten actual AI tools for making art - these would be various infilling plugins, basically. ("Make me a Photoshop brush that paints pine-needle looking spiky things", that sort of thing.)
What we got in this reality is not that, though. Meme making of the sort you posted is not art.
(That said, making memes, especially of the spicy and/or topical kind, is the one niche that generative AI is good at and nails perfectly.)
I don't have to justify my art to anyone. The magic is, you can't choose what is art. You can choose what YOU think is art, and that's fine. But you can't dictate to the world what is art.
You’re right!
> Even jazz uses 6-2-5-1's over and over.
You’re not even wrong! I wonder if jazz does anything else besides that?
This is a really fascinating topic, and I think it might give us new insight into the human condition. I'm excited to see where this leads us.
can AI recognize what is best? can AI create what is recognized as best?
(you know how vast majority of humans think they are above average drivers?)
I'm sometimes scared that life will lose its spark when AI is able to solve any problem better or faster than humans. I think music generation is scary because music is often created using your intuition rather than general principles or strict rules alone. Intuition feels almost magical, because it feels like you somehow just know the right thing to do without having to reason about it. If AI gained this hard-to-grasp 'intuition', that would reveal that we are just biological machines after all, and intuition is simply a sophisticated hidden pattern.
That is, you try something new (random) and you, the human, are also the verifier to see whether that random new thing was subjectively good, and then you select based on that.
In this understanding of creativity, creating a new style is a gradual process of evolution (mutation and selection) where your own brain is a necessary part of carrying out the selection. "Did the new idea (mutation) I just tried make me feel good or not (selection)?"
That activity of you being the verifier is effortless and instant, but the AI fundamentally can't tap into that verification, so it has no ability to do the selection step after the randomness. That makes it impossible for creativity to emerge no matter the architecture (unless you somehow create a proxy for a human verifier, which seems insanely hard).
The only solution I can see to this is to try to simulate this process, seems possible but hard.
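To make the mutate-and-select framing concrete, here's a toy hill-climbing sketch in Python. The `judge` function is a stand-in for the human verifier - which is exactly the part the argument says is hard to replace - and everything here (names, the numeric "idea" being evolved) is invented for illustration:

```python
import random

def evolve(seed, mutate, judge, steps=100):
    """Hill-climb: mutate an idea, keep the mutation only if the judge
    (a stand-in for the human listener) prefers it over the current best."""
    best = seed
    best_score = judge(best)
    for _ in range(steps):
        candidate = mutate(best)          # random variation (the "new idea")
        score = judge(candidate)
        if score > best_score:            # the "did it feel good?" selection step
            best, best_score = candidate, score
    return best

# Toy demo: evolve a number toward a target "taste".
random.seed(0)
target = 42.0
result = evolve(
    seed=0.0,
    mutate=lambda x: x + random.uniform(-5, 5),
    judge=lambda x: -abs(x - target),     # proxy for subjective preference
)
```

Simulating the process, as suggested above, amounts to replacing `judge` with a learned model of human preference; the loop itself is trivial, the judge is the hard part.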
I keep wanting to save some of the songs I hear. Damn, I don't think I would really be able to tell in a blind test that these were AI.
I really don't like that UI. It's hard to read, and when I find something, it slips away. Too much form over function.
> I keep wanting to save some of the songs I hear.
Just click the title of the song. If you have an account you can add to favorites, download, etc.
Oh, the joys of infinite public domain music!
I guess they are hoping for the Uber outcome where they earn enough money during the illegal phase so they can pay some tiny fine and keep going.
Suno admitted to training their models on copyrighted music and are now defending the position that music copyrights and royalties are bad for the future of music.
Fair use encompasses a lot of possible scenarios involving copyrighted works which may or may not be commercially licensed: transformative derivative works, short excerpts for purposes of commentary, backups of copies obtained legally, playback for face-to-face instruction, etc.
"Cajun synthpop chant" has no chanting or synths; it sounds more like country music with French female vocals.
For example I was trying to steer a melodic techno prompt recently in a better direction by putting stuff like this upfront:
[intro - dramatic synths, pulsing techno bass]
[organic percussive samples]
[rolling galloping pulsing gritty bassline]
[soaring experimental synths, modulation heavy, echos, sound design, 3d sound]
[lush atmosphere, variation]
[hypnotic groovy arppegiation arps]
[sampled repetitive trippy vocal]
All of this is just stuff I kind of made up and wanted in the song, but it meaningfully improved the output over tags alone. I think "steering/nudging the generation space" is a decent mental model for how this affects the output. I also often use brackets for song structure - [intro], [break], [chorus] - and even get more descriptive with these, describing things or moments I'd like to happen. Again, adherence is not perfect, but it seems to help steer things.
One of my favorite tags I've seen is [Suck the entire song through vacuum] and well... I choose to believe, check out 1:29 https://suno.com/s/xdIDhlKQUed0Dp1I
Worth playing around with a bunch, especially if you're not quite getting something interesting or in the direction you want.
Others such as [Interrupt] will produce a DJ-like fade-out / announcement ("that was <Artist name>, next up...") / fade-in - providing an opportunity to break the AI out of the repetitive loops it obsesses over.
I've used [Bridge] successfully, and [Instrumental] [No vocals] work reliably as well (there are also instrumental options, but I still use brackets out of habit I guess).
Someday, I’m sure, Suno will find a way to fix this issue. But today isn’t that day.
I increasingly feel like withdrawing from the internet as well. Half of the users are bots anyway
I-IV-V with different accents over the music and different drum sounds is fine, but that's not really music. It's pretty bad when you can pick out the chord progression in 5 seconds. Cue the infamous 4-chord-song skit by Axis of Awesome.
However, there is so much badly done human music as well; for me it's nearly impossible to tell the difference between badly done human music and high-fidelity AI music (the stock chord progressions happen just as often in human music). Moreover, I have put Suno AI on playlist mode before and it's actually been enjoyable, and I am a big AI sceptic! Sometimes it's even better than Spotify's own playlists (although they've been accused of putting AI music on playlists as well - but I am fairly sure the weak stuff that put me off was by humans - did I mention I cannot differentiate?).
In some genres especially - Japanese Vocaloid, power metal, some country - where certain genre-specific elements overwhelm the piece, AI does a very good job of mimicking the best of the best and puts meagre efforts to shame.
Here is one AI song I generated in an earlier version of Suno - let me know if anything stands out as AI: https://www.youtube.com/watch?v=I5JcEnU-x3s
and another I recorded in my studio with an artist: https://www.youtube.com/watch?v=R6mJcXxoppc
Kind of sad, especially for composers (which I am trying to be). Ah well, can only keep moving forward.
Also as we can blur the line between instrument and audio; why can’t my piano morph into an organ over the course of a piece? (I’m familiar with the Korg Morpheus and similar; I mean in a much more real sense).
And no disrespect towards anyone using AI to create music - it is here and unstoppable - but I don't currently use generative AI in music myself. I think for works performed for a live audience (at least in classical music), most people want to hear music composed by humans. Hopefully it will stay this way for a while; otherwise I've been going down a road that goes nowhere. Ah well, it wouldn't be the first time :)
Someone on HN, don’t remember who, made the observation that some artists mistake mastery of tools for the art, whereas artists who focus on the actual art can roll with changes to the tools.
But AI isn't just a tool; it's actually generating musical ideas at a highly finished level. For the first time, we have something that takes over a substantial amount of the creativity used to write a piece - a process which has always been the domain of people - and it's doing it at levels close to what the best skilled humans can do. This isn't just something that aids creation; it is doing the creating itself.
Maybe one day I'll use AI to create substantial amounts of the music I write, but I'm not nearly at that point yet. I don't think most classical concert audiences want to go to a concert hall to hear AI-generated music, but that may change. Guess we'll have to wait and see.
Music is more about the human who made it and their relation to you than the sound properties themselves. Same as other art. The more indirect the music-making process, and the further you are from the living experience of the human creator, the less it resembles art. I feel art is more of a spectrum than a binary switch, and the metric is how much direct human involvement the audio experience had, in terms you can relate to.
Remove the human completely and you just have sound. It is likely that something like bebop, gabber, or industrial synthwave would have been considered "sounds" rather than art by medieval folks or Mesopotamians if they heard them without knowing whether the source was human or not. Same with us: if we were to hear some music from the year 3200 or 4500, we would likely not consider it music.
My reasoning is that the fact that it was made by another human is really important.
Not only because you might think a piece of music is lame because it was made by AI vs a human.
But also because all the things that bring you back to a piece of art is wrapped up in the person that made it.
People who are immense fans of the Beatles, Taylor Swift or Kanye West illustrate this point.
You keep coming back because you liked this person's music before, and so you can't wait to preorder their music in the future.
Same goes for books, paintings and really all other art I can think of.
An artist develops a following that snowballs into their music being broadly consumed.
There are "AI music artists" that have been around for a decade. Miquela is the one I know about. But in that timespan, hundreds of human artists have developed followings and cultural sway that far outweigh what Miquela has done.
It seems more and more that AI is simply another tool for humans to use. Rather than a replacement altogether of humans.
This presupposes that people are still able to tell the difference between computer-generated and human-generated music, which, in my opinion (as a trained musician and former producer), is no longer the case for the majority of people.
Music and musicians undoubtedly also fulfill an "idol function." But the industry has long since provided an answer to this, for example with highly successful, artificially optimized boy or girl groups. With "Milli Vanilli," they took it to the extreme by having the "musicians" no longer sing themselves, but were chosen solely for their effect on the audience. This also works with computer-generated music, only much cheaper.
"It's AI!"
"Oh."
I've literally seen this happen, in person.
The song doesn't get passed to friends. The artist themselves is the transmission mechanism.
> With "Milli Vanilli," they took it to the extreme by having the "musicians" no longer sing themselves
This simply proves my point rather than detracting from it.
You can try to make an AI into an actual human (see Miquela) but the fakeness of that is simply insurmountable. Miquela hasn't had breakout success.
There's something very condescending in the elite culture of the past 10 years (it's starting to go away) that thinks you can get successful by being fake. It assumes the masses are stupid.
They are not.
It's not that the music itself is bad. I've listened to AI music while coding for example.
But for a piece of art to "take off". It needs to go beyond you. Not only do you need to tell your friends about it, but they need to tell their friends about it, and so on.
And oftentimes during the transmission process you don't even remember the name of the song, but you remember the name of the artist. Skrillex for example.
Humans care about what other humans do. We couldn't care less about what robots do.
Whether people will still learn an instrument or become musicians is a question that is difficult to answer today. The decline of this profession actually began with the invention of recording technology and has steadily increased since then. It is now almost impossible to make a living from it, and that was already the case before Suno and co. Services such as Spotify have taken anonymization and commodification to the extreme. Nevertheless, people still learn instruments and make music. It may well be that the creative possibilities offered by services such as Suno will even inspire people to make more music again.
It won't split into two extreme segments. It's just one segment: people who listen to music, often recommended by their friends or peer group.
Also, the profession hasn't declined.
Concerning the trends, see also e.g. https://www.numberanalytics.com/blog/decoding-music-consumpt... or https://techxplore.com/news/2025-01-experts-spotify-music-ha..., or directly https://pubsonline.informs.org/doi/full/10.1287/mksc.2022.02....
But I don't understand why it gets this wrong. If it's trained on lots of Urdu/Hindi music, no one pronounces those words like that. How does it get the a/e wrong while still singing almost correctly? It's weird.
Every time I clicked next I was subconsciously cringing, expecting an unwelcome and jarring ad to blast through my speakers - and each time relieved to actually be greeted by the next music track.
What have we done to ourselves.
It feels like only yesterday that the default was hoarding all your favourite tracks and playing them at will through Winamp.
Here's the thing about music - what makes a good song is a sense of unexpected familiarity. In other words, it's based on a set of broad, familiar rules (music theory), but surprises you by implementing those rules in a way that's extremely unique and novel. Because generative AI is kind of like an averaging machine, rounding off the outliers, what you get is an example of the rules in the most boring, typical way possible. It completely lacks the element of surprise, and I'm not sure it will ever be capable of that essential element of surprise because of how it fundamentally works. If you want to generate a bunch of background elevator music, however, this seems to be extremely useful for that.
Allow users to creatively engage by providing suggested starting places in the form of BPM, key and chord progressions, or as brief audio and/or MIDI sketches. For example, let me give the AI a simple sketch of a couple bars of melody as MIDI notes, then have it give me back several variations of matching rhythm section and harmonies. Then take my textual feedback on adjustments I'd like, but let me be specific when necessary, down to a per-section or per-instrument level. Ideally, the interface should look like a simplified multi-track DAW, making it easy for users to lock or unlock individual tracks so the AI knows what to keep and what to change as we creatively iterate. Once finished, provide output as both a full mix and separate audio stems, with optional MIDI for tracks with traditional instruments.
Targeting this use case accomplishes two crucial things. First, it lowers the bar of quality the AI has to match to be useful and compelling. Let's face it, generating lyrics, melodies, instrumental performances and production techniques more compelling than a top notch team of humans is hard for an AI. Doing it every time, and only in the form of a complete, fully produced song, is currently nearly impossible. The second thing it does is increase the tangible value the AI can contribute right now. Today it can be the rhythm section I lack, tomorrow it can fill in for the session guitarist I need, next week it can help me come up with new chord progression ideas. It would be useful every time I want to create music, whether I need backing vocals, a tight bass riff, scary viola tremolos or just some musical inspiration. And nothing it did would have to be perfect to be insanely useful - because I can tweak individual audio stems and MIDI tracks far faster than trying to explain a certain swing shuffle feel in text prompts.
Seriously, for a tool anything like what I've described, I'd be all-in for at least $30/mo if it's only half-assed. Once it's 3/4-assed, put me down for $50/mo - and I'm not even a pro or semi-pro musician or producer, just a musical hobbyist who screws around making stuff no one else ever hears. Sure, actual music creators are a smaller market than music listeners, but we're loyal, less price sensitive, and our needs for perfection in collaborators are far lower too. Plus, all those granular interactions as we iterate with your AI step-by-step towards "great" become invaluable training data - yet don't require us creators to surrender rights to our final output. For training data, the journey is (literally) the reward.
So then there's the casual end-user who's making music for themselves to listen to. IMO this is largely a novelty that hasn't worked out. I haven't heard many people regularly listen to Suno because, again, music is already incredibly cheap. Spotify is ~$15/month and it gives you access to the Beatles and Rolling Stones. The novelty of AI-generated "Korean goa psytrance 2-step" is fun for a bit, but how much will people pay for it, how many, and for how long?
I do think there's a lot of potential targeting musicians who incorporate AI-generated elements in their songs. (Disclaimer: I am a musician who has been using vocal synths for many years, and have started incorporating AI-generated samples into my workflows.) However as you point out, the functionality needed for Suno to work here is very different from the "write prompt, get fully complete song" use case.
It'll be interesting to see where it goes from here. In general, AI-based tooling does appear to be pivoting more towards "tools for creators" rather than "magic button that produces content", so I'm hopeful.
[0] One notable one is the artist "009 Sound System", who had a bunch of CC-licensed tracks that became popular due to YouTube's music swapping feature; since the list was sorted alphabetically, their tracks ended up getting used in a ton of videos and gaining popularity. https://en.wikipedia.org/wiki/Alexander_Perls#YouTube
Yeah, AI music gen is super fun to play with for a half-hour or so - and it's great when I need a novelty song made for a friend's wedding or special birthday - like, once a year maybe. But neither of those seems like a use case that leads to sustainable, high-value ARR. I'm starting to wonder if maybe most AI music generation companies ended up here because AI researchers saw a huge pile of awesome produced content that was already partially tagged and it looked like too perfect of a nail not to swing their LLM hammer at. And, until recently, VCs were throwing money at anything "AI" without needing to show product/market fit.
I'm not sure they fully thought through the use case of typical music listeners or considered the entrenched competition offering >95% of all music humans have ever recorded - for around ~$10/mo. As you said, another potential customer is media producers who need background tracks for social media videos, but between the stock music industry offering millions of great tracks for pennies each and the Fiverr-type producers who'll make a surprisingly good custom track in a day that you can own for $25 - I'm not seeing much ARR for AI music generators there either.
Currently the launch hypothesis of AI music generation puts these companies in direct competition with mature, high-quality alternatives that are already entrenched and cheap - use cases served by literally the best-of-the-best content humanity has ever created. Making that replacement their first target seems as misguided as SpaceX setting "Landing on Mars" as the goal of its first launch: there's no way to incrementally iterate toward that value proposition. Sure, targeting more modest incremental goals may be less exciting, but it also doesn't require perfection. Fortunately, music producers have needs that are more tractable but still valuable to solve - and not currently well served by cheap, plentiful, high-quality alternatives. And music producers are generally easier to reach, satisfy and retain long-term than music listeners or music licensers.
Plus Ableton Live itself has a lot of generative tools these days:
https://www.youtube.com/watch?v=_RNXVfo-oLc
But I honestly don't see the point. The journey is the whole point when making music or any art really. AI doesn't solve a problem here. There never has been one in the first place. There is more music out there than you could ever listen to. Automating something humans deeply enjoy is misguided.
If you enjoy writing music your way, great. But I strongly disagree that it’s a mistake to enable people to approach it differently.
All my artist friends were criticizing it, and I thought it was some form of neo-Luddism they were following. Why not embrace progress? No one is forcing them to use it, and if it lowers the barrier to entry, isn't that great? Surely generative AI could be used to enhance an artist's workflow?
Oh, how wrong I have been. In reality it has only been used to replace artists and to devalue their work. It has no place in an artist's pipeline.
https://aftermath.site/ai-video-game-development-art-vibe-co...
I think the use of generative AI, or at least of generalist LLMs, is something fundamentally different from artists embracing new media and new processes. Digital drawing is still roughly the same process as drawing on paper: most skills carry over, and you are still in control. Using a prompt to create images is not drawing.
I also recommend:
Artists aren’t entitled to a job any more than I am; to the extent they (or I) do work that is easily replaced, by AI or offshoring or anything, that’s a career risk.
Those same tools enable artists to be more productive, more creative, more capable. The people suffering are those who are attached to the tools they learned with. We in tech know that's a bad long-term idea, and it is surprising to see in the arts. But on the whole, human creativity and the overall output of great art will benefit from this change.
Listening to the studio mix on my headphones at home will always be better sound than being in a crowded concert.
I mean you are right to a certain degree, if it works, it works and if generative tools inspire you to make better music that is great. I am not so sure about that though.
I am forced to vibe code at work and it has not made me more creative. It has made me want to quit my job.
I'm not saying you need to use generative tools, but if it helps you make music you should do it. Ultimately what you're sharing with the world is your taste, not your technical abilities. To slightly expand on a famous quote in the music world -
> I thought using AI was cheating, so I used loops. I then thought using loops was cheating, so I programmed my own using samples. I then thought using samples was cheating, so I recorded real drums. I then thought that programming it was cheating, so I learned to play drums for real. I then thought using bought drums was cheating, so I learned to make my own. I then thought using premade skins was cheating, so I killed a goat and skinned it. I then thought that that was cheating too, so I grew my own goat from a baby goat. I also think that is cheating, but I’m not sure where to go from here. I haven’t made any music lately, what with the goat farming and all.
https://www.youtube.com/watch?v=6bNMxWGHlTI
Cuando para mucho mi amore de felice corazón...
I can't say anything about autogenerated lyrics.
I've found some okay but listening to "meaningless" music doesn't sit right with me
Which one are you referring to?
Still not totally adherent, but if you can steer it with genre, detailed descriptions of genre, and elements of the genre it's way better than v4. Some descriptions work better than others so there's some experimentation to figure out what works for what you're trying to achieve.
You can also provide descriptions in [brackets] in the lyrics that work reasonably well in my experience.
Disclaimer: I work there as a SWE.
Some examples of style descriptions I've used that generated results close to what I had in mind are "romantic comedy intro music, fast and exciting, new york city" (aiming for something like the Sex and the City theme) and "mature adult romance reality tv show theme song, breakbeats, seductive, intimate, saxophones, lots of saxophones" which did indeed produce cheesy porn music.
-----
STYLE: Earth Circuit Fusion
INSTRUMENTATION:
- Deep analog synth bass with subtle distortion
- Hybrid percussion combining djembe and electronic glitches
- Polytonal synth arpeggios with unpredictable patterns
- Processed field recordings for atmospheric texture
- Circuit-bent toys creating unexpected melodic accents

VOCAL APPROACH:
- Female vocalist with rich mid-range and clear upper register
- Intimate yet confident delivery with controlled vibrato
- Layered whisper-singing technique in verses
- Full-voiced chorus delivery with slight emotional rasp
- Spoken-word elements layered under bridge melodies
- Stacked fifth harmonies creating ethereal chorus quality

PRODUCTION:
- Grainy tape saturation on organic elements
- Juxtaposition of lo-fi and hi-fi within same sections
- Strategic arrangement dropouts for dramatic impact
- Glitch transition effects between sections
---
One thing I have noticed with the new model is that it listens to direction in the lyrics more now, for example [whispered] or [bass drop], etc.
There are clear limits. I have been unsuccessful with spatial arrangement.
EDIT: I realized I didn't specify, this is when you do custom and you specify the lyrics and the style separately.
Why do I need a phone number or use a cloud provider? I don't want to be associated with any of those.
It's great; however, it sucks in other languages, sounding like a foreigner trying to speak like a local.
Something about music? Okay. What else? Is it all fake music or something?
Right now it probably blends the styles together, taking elements from both, but not following the required restrictions.
The lyrics are way overused, sound like Eurodance, and are just overall disconnected from the music.
The music feels very generic, as if you landed in the biggest tourist trap instead of a proper nightclub.
All of these examples get ruined by the most simple and boring lyrics imaginable. Poetry is an art, and clearly the model doesn't yet grasp all of its nuances the way it does for the rest of the "composition".
At this point the only thing that gives this away as AI generated are the vocals.
Only because the bar for music is so low nowadays. Thankfully poetry hasn't been commodified yet like music has.
>Among the stars, where the dreams and freedom meet.
>Finding the ecstasy of life’s uncharted quest,
>In every pulse of the music, feel the zest.
Like... what?
Or, if I'm listening to music just for the vibe, I really don't care how it's created, as long as it doesn't offend me aurally. I'm really not listening actively. I suppose that's a bit of an indictment of myself, but I don't think it's a serious character flaw. I should probably just try to pay more attention to the people around me at all times.
I have a lot of fun putting my own poetry in here and mashing it up with the styles that I enjoy listening to, or that I think would work well with the poem. Again, I don't want to like it, but I do.
# Original record scratch contest-style song https://suno.com/s/8MvZmfkDPIPmKLtm
And this is a good example of how the "magic" is lost in a cover of that same contest entry with no attempt to curate:
# v4.5 cover https://suno.com/s/KyCZZNn6PpL4JHbO
Here's another one I put a bit of time into, but with a much simpler structure. What I appreciated about the original were the emotions it stirred up when the notes came together just-so:
# Original ambient synth https://suno.com/s/JtmmbdA2VtgO4drK
New cover, pretty decent but it lost what I liked the most (haven't had great luck with v4.5 remasters yet, but I do a lot of weird things):
# v4.5 cover https://suno.com/s/Gi8wy1QjUaHmYNKy
# Original piano piece https://suno.com/s/yj8rHRRgJEWD83GY
# v4.5 remaster https://suno.com/s/Xx5Y5SNl1MdDrLsO
When you ignore the stuff that humans shouldn't get credit for - e.g. I didn't "make" this song, or play any part in its "production", but I did "curate" it - there's still something left to give credit for, right? It's basically like a DJ digging through a mysterious crate of records.
EDIT: don't understand why this statement is downvoted; since when has HN been hostile to technology? Or if you think this is a paid bot comment, I'm real, here's my website: http://rochus-keller.ch/?cat=3
Whilst it's not clear where the training data is coming from, how can I be sure that I won't accidentally trip up something like YouTube's Content Match tool, or other companies who act on behalf of a copyright holder? Or did I miss something?