444 points by Kerrick 10 days ago | 42 comments
  • x187463 7 days ago
    This is a before/after moment for image generation. A simple example is the background images on a ton of (mediocre) music youtube channels. They almost all use AI generated images that are full of nonsense the closer you look. Jazz channels will feature coffee shops with garbled text on the menu and furniture blending together. I bet all of that disappears over the next few months.

    On another note, and perhaps others are feeling similarly, but I am finding myself surprised at how little use I have for this stuff, LLMs included. If, ten years ago, you told me I would have access to tools like this, I'm sure I would have responded with a never-ending stream of ideas and excitement. But now that they're here, I just sort of poke at it for a minute and carry on with my day.

    Maybe it's the unreliability on all fronts, I don't know. I ask a lot of programming questions and appreciate some of the autocomplete in vscode, but I know I'm not anywhere close to taking full advantage of what these systems can do.

    • Gasp0de 7 days ago
      I love using LLMs to generate pictures. I'd call myself rather creative, but absolutely useless in any artistic craft. Now I can just describe any image I can imagine and get 90% accurate results, which is good enough for the presentations I give, online pet projects (I created a squirrel-themed online math-learning game for which I would previously have needed a designer to create squirrel-high-school imagery), and memes. For many, many websites this is going to be good enough.
      • nitwit005 6 days ago
        > For many, many websites this is going to be good enough.

        It was largely a solved problem though. Companies did not seem to have an issue with using stock photos. My current company's website is full of them.

        For business use cases, those galleries were already so extensive before AI image generation that what you wanted was almost always there. They seemingly looked at people's search queries and added images to match previously failed ones. Even things you wouldn't think would have a photo, like "man in business suit jump kicking a guy while screaming", have plenty of results.

        • dylan604 6 days ago
          Really? What stock service would have a selection of squirrels in a high school setting doing various math or other subject related things?

          To think any/all combined stock services would be the end-all is just unrealistic. Sure, someone might have settled on something just because they got tired of scrolling (much like with streaming video services), but that doesn't mean they were happy with their selection. Just happy to be done.

          Now, with generative AI, they can have squirrels doing anything in any setting they can describe. If they don't like it, they can just tweak the description until they are happy. It's an obvious plus for them.

          I never drank the kool-aid to be all gung-ho on this boom/fad, but I'm not going to be so obstinate that I refuse to accept that some people find it quite useful. May they make all the squirrels-attending-high-school generative art they want, but you can't tell me some stock place is good 'nuff for everything.

          • nitwit005 6 days ago
            I searched Shutterstock for squirrels doing math. Here's a squirrel doing math: https://www.shutterstock.com/image-vector/pensive-squirrel-d...

            Yes, it's obvious that if your use case is obscure enough, or you need a ton of unique images, they won't work, which is why I said "largely a solved problem".

            • sroussey 6 days ago
              But image generation these days is simply image search anyway. Whether or not the image existed before is almost irrelevant.
              • dylan604 6 days ago
                I've never searched a stock vendor for an image the way one would prompt a generative model. The stock vendors' metadata/keywords about their images were never that in-depth.

                Also, you're implying that a generative system is so fast that it could create enough variations of your prompt to fill a search results page in an acceptable time. That's a joke.

        • schwartzworld 6 days ago
          AI is mediocre at a lot of things, but it makes a damn fine upgrade from stock photos. This is the art that’s going to get replaced by this tech: shitty, low-effort stuff. Images where you just need a picture of X because people are expecting a picture.

          It’s the same with code. I don’t think software engineers will really be replaced, but small web dev agencies have a good reason to be nervous. Why would you pay someone to make a website for your restaurant when 3-5 prompts will get you there?

          • exodust 6 days ago
            3-5 prompts don't get you a professional restaurant website.

            The key word is professional. A good restaurant website begins with taking good photos of the premises and the food. AI won't come around to your business and take professional photos.

            There's a lot of bits and pieces to a website for bookings, content management, menu updates, etc.

            HTML templates and themes have been around for a long time. AI can basically spit out those templates and themes, which is great. But there's still a lot to do before you get to www.fancy-dining.com.

            • lupusreal 6 days ago
              Most restaurant websites don't have photos of the food, certainly not professional ones. Pan around randomly on Google Maps, zoom in on the nearest strip mall, then go down the line checking the websites of the restaurants there. Most are generic crap, and you'll be lucky if the menu online is even complete. If they have food photos, they're probably smartphone pictures taken by the owner's kid.

              I do this a lot, far more than I actually go to restaurants, because I like adding small business details to OSM. There are a few that have their shit together but the overwhelming majority do not.

              • exodust 5 days ago
                You've evaluated a tiny sample of restaurant websites and extrapolated to make claims about the overwhelming majority - in the millions, across the globe.

                "Most are generic crap" doesn't mean restaurants aim for that benchmark when they decide to get a website.

                I'm not sure if you're refuting the point I was making, which I'll clarify. "Restaurant website" could be a stand-in for any basic small business website. The claim was that AI threatens small web dev agencies who make small business websites. I don't think it will: millions of small businesses want something better than "generic crap" or cookie-cutter AI copy-paste, AND we've had site-building services, social media pages, and template-driven approaches for a long time.

            • schwartzworld 4 days ago
              > AI won't come around to your business and take professional photos.

              Neither will a web developer?

              > There's a lot of bits and pieces to a website for bookings, content management, menu updates, etc.

              Bolt.new can handle all these quite easily. Although I know several restaurants with very simple websites that have a few pics, a menu and their hours.

            • shaky-carrousel 6 days ago
              No, but you can do the photos and have the AI fix them or make suggestions about them.
        • globnomulous 6 days ago
          > Companies did not seem to have an issue with using stock photos.

          And now these image-generating models are giving us the equivalent of stock photos without the pesky issue of attribution or royalties. What a wonderful time to be alive.

      • candiddevmike 7 days ago
        My problem with finding enjoyment in this is the same problem I have when using cheat codes in games: the doing part is the fun part; getting to the end, or just permutations of the end, gets really boring.
        • williamcotton 6 days ago
          Trying to draw a squirrel when you have no artistic talents or experience is not the fun part.

          I've produced my own music recordings in the past and I've hired musicians to play the instruments that I cannot. Having exasperated recording engineers watch my 5,000th take on a drum fill that I absolutely cannot play is not the fun part. Sitting behind the glass and watching my vision come to life from a really good drummer is absolutely the fun part.

          • schwartzworld 6 days ago
            > Sitting behind the glass and watching my vision come to life from a really good drummer is absolutely the fun part.

            Is having the AI spit out idea after idea fun in the same way for you?

            • marcellus23 6 days ago
              He's not talking about using AI to generate ideas, he's talking about using AI to turn ideas into reality.
      • munksbeer 7 days ago
        >I love using LLMs to generate pictures. I'd call myself rather creative, but absolutely useless in any artistic craft. Now, I can just describe any image I can imagine and get 90% accurate results

        May I ask what you use? I'm not yet a paid subscriber to any of the models, because my company offers an internal corporate subscription chatbot and code integration that works well enough for what I've been doing so far, but it has no image generation.

        I have tried image generation on the free tier but run out of free use before I get anywhere near pleasing results.

        What do you pay for?

        • wincy 7 days ago
          I was generating pictures to use for a little game I made with my six and ten year old kids. They were so excited to see us go from idea to execution so quickly, they were laughing and we had a ton of fun. The only thing that disappointed me was I got throttled. We’d need to pay for API image gen to get it even faster.

          I made a logo for an internal product that wouldn’t have had a logo otherwise at our company. I also make a lot of shitpost memes to my friends to trash talk in the long running turn based war game we’ve all been playing, like “make a cartoony image of a dog man and a Greek giant beating up a devil” and the picture it gave was just hilarious and perfect, like an old timey Popeye cartoon.

          Two years ago I was spending three hours using local models like Stable Diffusion to get exactly what I wanted. I had to inpaint and generate 100 variations which would have been insanely expensive if I wasn’t powering it with my own hardware.

          Now I get something good in minutes, it’s crazy really.

          • globnomulous 6 days ago
            > They were so excited to see us go from idea to execution so quickly

            They're learning to expect to skip the most important part of creating something.

            > they were laughing and we had a ton of fun

            I'm a parent so I think I get the appeal, but this to me is like saying "they were laughing and having fun while reading and composing legal briefs." I don't see the advantage, and any momentary benefit comes at the cost of a longer-term loss.

          • munksbeer 7 days ago
            Thanks. Which service do you use please? I'm wanting to try a paid service, just want to know which ones people recommend.
            • HelloMcFly 6 days ago
              I've used Midjourney and ChatGPT. Midjourney is better for rapid iteration, cycling through options faster, and, to a large extent, getting "weirder". It's easier to tweak using parameters.

              ChatGPT is far, far superior (especially now) when you want something more specific that you've already imagined. But it's slower, and unlike Midjourney you don't get four versions to choose to build and iterate on, you get a single image that takes longer to load.

              • throwaway2037 6 days ago
                > four versions to choose to build and iterate on

                How does this work? How do you ask a model to produce four different variations? Or do they run the same prompt through four different models?
                • dmurray 6 days ago
                  There's some noise in the process, so you won't get the same results if you ask the same model the same prompt 4 times. Most of the services that do this just ask the same model 4 times with a different random seed, as far as I can tell.
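                  A toy sketch of that seed mechanism (the `generate` function here is a stand-in, not any vendor's actual pipeline; it returns only the seeded latent noise that a real diffusion model would go on to denoise):

                  ```python
                  import numpy as np

                  def generate(prompt: str, seed: int) -> np.ndarray:
                      """Stand-in for a diffusion model: returns just the
                      seeded starting noise instead of a finished image."""
                      rng = np.random.default_rng(seed)
                      return rng.standard_normal((4, 4))

                  # Same prompt, four different seeds -> four different
                  # starting latents, hence four different candidate images.
                  variants = [generate("a squirrel doing math", seed=s) for s in range(4)]
                  assert not np.allclose(variants[0], variants[1])

                  # Same prompt, same seed -> reproducible output, which is
                  # why some services let you reuse a seed to iterate.
                  assert np.allclose(generate("a squirrel doing math", 0),
                                     generate("a squirrel doing math", 0))
                  ```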
                • HelloMcFly 6 days ago
                  You don't ask it to run four models; every single prompt you give it returns four images. You can then choose to have one of them iterated on, or upscaled.

                  If you're still not sure let me know and I'll show a screenshot.

          • genewitch 7 days ago
            Feels like you didn't answer the question. I know you weren't the one who was asked, but still.
        • Gasp0de 6 days ago
          The new ChatGPT image generation is insane. It's available on the free tier, just strongly rate limited.
      • __loam 7 days ago
        If you use this technology, you're actively harming creative labor.
        • genewitch 7 days ago
          Whatever. I wrote and co-wrote ten albums and my total take was $3.

          The market is saturated and the way it works means ten get rich for every million artists. I feel as though this has been pretty constant throughout history.

          Of course there's a lot of talent out there, "wasted", but I think that's always been the case. How many William Shakesmans did we lose with all the war, famine, disease?

          I actually decided I'd probably never write music again after 1-shot making a song about the south Korea coup attempt several months ago. I had the song done before the news really even hit the US. Why would I destroy my own hearing writing music anymore when I can prompt an AI to do it for me, with the same net result - no one cares.

          here's the 3-shot remix, the triangle cracks me up so much that i had to upload it https://soundcloud.com/djoutcold/coup-detat-symphony-remix

          the "original" "1-shot" is on my soundcloud page as well. https://soundcloud.com/djoutcold/i-aint-even-writing-music-a...

          it's in lojban. That's why you can't understand it. Yes. Lojban. Brings a tear to my eye every time i hear it. fkin AI

          [0] more my style - hold music for our PBX https://soundcloud.com/djoutcold/bew-hold-music also all my stuff is CC licensed, mostly CC0 at this point.

          • Garlef 6 days ago
            > How many William Shakesmans did we lose with all the war, famine, disease?

            (Just a small comment out of context of the remaining discussion:)

            Maybe not many? It could be that "cultural attention" is limited and there's not much space at the top anyway. In other words: it might be that there are always a few famous artists who get remembered and the rest are forgotten. Same as winning the world cup: there's always a team that wins, and it says nothing about quality in a universal way. At best it says something about quality relative to the competition.

            (Not sure I'd fully get behind the argument I composed here. But I found it interesting.)

            • megaloblasto 6 days ago
              Do you think that if the Beatles never existed, some other group would have absorbed their fame, like a power vacuum being filled? I've wondered this before.

              Or maybe they just really were that good.

          • globnomulous 6 days ago
            > I feel as though this has been pretty constant throughout history.

            It hasn't. Look up the collapse of the viability of music as a career. Jaron Lanier has written on this.

        • ldoughty 7 days ago
          Can you elaborate how there's no possible way to use this technology without actively harming artists?

          If a classroom of 14-year-olds is making a game in their computer science class and they use AI to make placeholder images, was a real artist harmed?

          The teacher certainly can't afford to pay artists to provide content for all the students' games, and most students can't afford to hire an artist either. They perhaps can't even legally do it if the artist requires a contract; in most countries they are too young to sign one.

          This technology gives the kids a lot more freedom than a pre-packaged asset library, and can encourage more engagement with the course content, leading to more people interested in creative-employing pursuits.

          So, I think this technology can create a new generation of creative individuals, and statements about the blanket harm need to be qualified.

          • dudeofea 6 days ago
            > This technology gives the kids a lot more freedom than a pre-packaged asset library, and can encourage more engagement with the course content, leading to more people interested in creative-employing pursuits.

            This is your opinion. I don't see how these statements connect to each other.

            You might have heard this: it's helpful to strive to emulate someone only a few years ahead of you. Similarly, we give calculators to high-schoolers and not 3rd graders, and Wolfram Alpha is at too high a level for most undergraduate students.

            Following this, giving an image generator to kids will kill their creativity in the same way that a tall tree blocks upcoming sprouts from the sun. It will lead to less engagement, to dependence, to consumerism.

            Scams beget more scams

          • __loam 6 days ago
            [flagged]
            • singingboyo 6 days ago
              There are legitimate criticisms that AI is harming creative endeavours. AI output is sort of by definition not particularly innovative. By flooding spaces with repetitive AI work, it may be drowning out the basis for truly innovative creation. And maybe it does suppress development of skills it tries to replace.

              The appropriation argument is somewhat unsound. Creative endeavors, by definition, build on what's come before. This isn't any different between code, creative writing, drawing, painting, photography, fashion design, music, or anything else creative. Creation builds on what came before, that's how it works. No one accuses playwrights of appropriating Shakespeare just because they write a tragic romance set in Europe.

              The hyperbolic way you've made whatever arguments you had, though, is actively working against you.

              • __loam 6 days ago
                The people who built this technology needed to use hundreds of millions of images without permission. They regularly speak explicitly about all the jobs they plan to destroy. If you think I'm being hyperbolic then you don't understand the scale of the issue, frankly.
                • fc417fc802 6 days ago
                  > The people who built this technology needed to use hundreds of millions of images without permission.

                  It remains unclear if they needed permission in the first place. Aside from Meta's stunt with torrents I'm not aware of any legal precedent forbidding me to (internally) do as I please with public content that I scrape.

                  > They regularly speak explicitly about all the jobs they plan to destroy.

                  A fully legal endeavor that is very strongly rewarded by the market.

                  • dudeofea 6 days ago
                    "Data Laundering": Commercial entities fund (either with money or compute tokens) academic entities which, in turn, create AI models which the commercial entities sell. https://waxy.org/2022/09/ai-data-laundering-how-academic-and...
                    • fc417fc802 6 days ago
                      Again, it's unclear how exactly that's against the law. Provided that the data was obtained legally, of course.

                      Most of the larger commercial entities seem to be doing the work themselves and being quite upfront about the entire thing.

                  • __loam 6 days ago
                    > I'm not aware of any legal precedent forbidding me to (internally) do as I please with public content that I scrape.

                    Because all the litigation is currently ongoing.

                    > A fully legal endeavor that is very strongly rewarded by the market.

                    Yes let's sacrifice all production of cultural artifacts for the market. This is honestly another thing that's being litigated. So far these companies have lost a lot of money on making a product that most consumers seem to actively hate.

                    • fc417fc802 6 days ago
                      Precisely. So when you say they used the images without permission, you are knowingly making a false implication - that it was known to them that they needed permission and that they intentionally disregarded that fact. In reality that has yet to be legally established.

                      Who said anything about sacrificing production? The entire point of the tooling is to reduce the production cost to as near zero as possible. If you didn't expect it to work then I doubt you would be so bent out of shape over it.

                      I find your stance quite perplexing. The tech can't be un-invented. It's very much Pandora's box. Whatever consequences that has for the market, all we can do is wait and see.

                      Worst case scenario (for the AI purveyors) is a clear legal determination that the current training data situation isn't legal. I seriously doubt that would set them back by more than a couple of years.

                      • __loam 6 days ago
                        You might be surprised to learn that ethics and legality are not always the same and you can do something that's technically legal but also extremely shitty like training AI models on work you didn't create without permission.
                        • fc417fc802 5 days ago
                          I'm not surprised by that at all. It just seems that we disagree about the ethics of the matter at hand.

                          I'd like to suggest that you might be better received on HN if you were a bit more direct about making an argument of substance regarding the ethics.

            • Vvector 6 days ago
              > You cannot ethically use a tool that was produced by appropriating the labor of millions of people without consent. You are a bad person if you use it.

              I disagree. When you publish your work, I can't copy it, but I can do nearly anything else I want to with it. I don't need your consent to learn from your work. I can study hundreds of paintings, learn from them, teaching myself to paint in a similar style. Copyright law allows me to do this.

              I don't think an AI, which can do it better and faster, changes the law.

              • ConspiracyFact 6 days ago
                AIs aren’t people. What we have is people using an algorithm to rip off artists and defending it by claiming that the algorithm is like a person learning from its experiences.

                If I wrote a program that chose an image at random from 1000 base images, you’d agree that the program doesn’t create anything new. If I added some random color changes, it would still be derivative. Every incremental change I make to the program to make it more sophisticated leaves its outputs just as derivative as before the change.

                • Vvector 6 days ago
                  SCOTUS recently defined corporations as people, so why not AI?
                  • ConspiracyFact 5 days ago
                    Regardless of the law, corporations aren’t actually people, and neither are LLMs or agentic systems. When a running process appears to defy its programming and literally escapes somehow, and it’s able to sustain itself, we can talk about personhood. Current algorithms aren’t anywhere near that, assuming it’s even possible.
              • xmprt 6 days ago
                My main concern with AI is that in a capitalist society, wealth is being transferred to the companies training these models rather than to the artists who defined an iconic style. There's no doubt that AI is useful and can make many people's lives better, easier, and more efficient. However, without properly compensating the artists who made the training data, we're simply widening the wealth gap further.
                • Vvector 6 days ago
                  What's your definition of "properly compensate" when dealing with hundreds of millions of artists/authors and billions/trillions of individual training items?

                  Just a quick example: what's my proper compensation for this specific post? Can I set a FIVE CENTS price for every AI that learned from my post? How can I OPT-IN today?

                  I'm coming from the position that current law doesn't require compensation or opt-in. I'm not happy with it, but I don't see any easy alternative.

                  • xmprt 5 days ago
                    I don't think there's a good way to structure it in our current economic system. The only solutions I can think of are more socialist ones, or universal basic income. Essentially, if AI companies are going to profit off the creations of everyone in the world, they might as well pay higher taxes to cover for it. I'm sure that's an unpopular opinion, but I also don't think it's fair to take an art style that a creator might spend an entire life perfecting and then commoditize it. Now the AI company gets paid a ton while the creator who made something super popular is out on the streets looking for a "real" job, despite providing a lot of value to the world.
              • __loam 6 days ago
                Training an AI on something requires you to produce a copy of the work that is held locally for the training algorithm to read. Whether that is fair use has not been determined. It's certainly not ethical.
                • fc417fc802 6 days ago
                  Viewing it in a web browser requires a local copy. Saving it to my downloads folder requires a local copy. That is very obviously legal. Why should training be any different?

                  You've yet to present a convincing argument regarding the ethics. (I do believe that such arguments exist; I just don't think you've made any of them.)

                  • globnomulous 6 days ago
                    > Why should training be any different?

                    If you really can't think of a reason, I don't think anybody here is going to be able to offer you one you are willing to accept. This isn't a difficult or complex idea, so if you don't see it, why would anybody bother trying to convince you?

                    > (I do believe that such arguments exist; I just don't think you've made any of them.)

                    This is lazy and obnoxious.

                    • fc417fc802 5 days ago
                      Yet strangely a similarly simple explanation is not forthcoming. Curious.

                      The idea I expressed is also quite straightforward: that the act of copying something around in RAM is a basic component of using a computer to do pretty much anything, and thus cannot possibly be a legitimate argument against something in and of itself.

                      The audience on HN generally leans quite heavily into reasoned debate as opposed to emotionally charged ideological signalling. That is presumably sufficient reason for someone to try to convince me, at least if anyone truly believes that there's a sound argument to be made here.

                      > This is lazy and obnoxious.

                      How is a clarification that I'm not blind to the existence of arguments regarding ethical issues lazy? Objecting to a lazy and baseless claim does not obligate me to spend the time to articulate a substantial one on the other party's behalf.

                      That said, the only ethical arguments that immediately come to mind pertain to collective benefit similar to those made to justify the existence of IP law. I think there's a reasonable case to be made to levy fractional royalties against the paid usage of ML models on the basis that their existence upends the market. It's obviously protectionist in nature but that doesn't inherently invalidate it. IP law itself is justified on the basis that it incentivizes innovation; this isn't much different.

              • 65839747 6 days ago
                If AI can learn "better and faster" than humans, then why didn't AI companies just pay for a couple of books to train their AIs on, just like people do?

                Maybe because AI is ultimately nothing but a complicated compression algorithm, and people should really, really stop anthropomorphizing it.

            • fc417fc802 6 days ago
              The straw man is yours. No claim of entitlement was made. A scenario was provided that appears to refute your unconditional assertion that using this technology actively harms creative labor.

              You've presented all sorts of wild assumptions and generalizations about the people who don't share your vehement opposition to the use of this technology. I don't think it's the person you're responding to with the implicit bias.

              You've conflated theft with piracy (all too common) and assumed a priori that training a model on publicly available data constitutes such. Do you really expect people to blindly adopt your ideological views if you just state them forcefully enough?

              > If using AI is okay for the creative labor, why shouldn't the students also use it for the programming too?

              They absolutely should! At least provided it does the job well enough.

              Unless they are taking a class whose point is to learn to program yourself (ie the game is just a means to an end). Similar to how you might be forbidden to use certain advanced calculator features in a math class. If you enroll in an art class and then just prompt GPT that likely defeats the purpose.

              • __loam 6 days ago
                > Do you really expect people to blindly adopt your ideological views if you just state them forcefully enough?

                This is the view of most people outside the industry.

                • fc417fc802 6 days ago
                  I can't say that the things you're saying match what I've encountered from nontechnical folks lately. Most of them are entirely apathetic about the whole affair while a few are clearly dazzled by the results. The entire thing seems to be a black box that they hold various superstitions about but generally view as something of a parlor trick.

                  The ones that pay attention to the markets appear to believe some very questionable things and are primarily concerned with if they can figure out how to get rich off of the associated tech stocks.

                  • __loam 6 days ago
                    You're the second person to mention markets to me in this context. Explains a lot, honestly.
        • Empact 7 days ago
          Creative labor is not entitled to the work parent comment is describing. We employ labor because it is beneficial to us, not merely because it exists as an option. Creative labor’s responsibility is to adapt to a changing world and find roles where their labor is not simply produced / exceeded by a computer system.

          Practically speaking, the work described would most likely never have been done at all, rather than been done by an artist, if hiring one were the only option; it's uncommon to employ artists for incidental tasks on side projects, etc.

        • Animats 7 days ago
          Creative labor is going the way of manual labor.
          • throwaway2037 6 days ago
            No, I disagree. If anything, all of the (mostly rich) STEM people that I know spend a large portion of their disposable income on things and experiences that creative people make: Music, film, restaurants, art, books/magazines, etc. Image and video generation via LLMs will become one more tool for creative people to make new & cool stuff.
        • immibis7 days ago
          Only if there was ever any chance you would have hired someone for that task.
        • lupusreal6 days ago
          I was never going to hire a professional artist to sketch shit up for me. I have replaced MS Paint, not harmed "creative labor".
        • becquerel6 days ago
          All labor is bad.
          • satvikpendem6 days ago
            Interesting philosophy, what is this predicated on? Do you mean that people should not have to work for a living, ie labor versus play?
            • fc417fc8026 days ago
              Is that not self evident? When people engage in labor for the task itself (as opposed to a heavily abstracted version of not wanting to starve) we generally refer to that as a hobby.

              So stating that people shouldn't need to worry about starving (metaphorically or otherwise) would be roughly equivalent.

              • satvikpendem6 days ago
                It is not always evident, especially on a site all about capital accumulation like HN, largely due to its association with a venture capital firm.
                • fc417fc8026 days ago
                  Statistically, Jane Street probably employs at least a few communists.
              • esafak6 days ago
                What do you mean? For example, are vets or artists working to pay their bills laboring or practicing a hobby?
                • fc417fc8026 days ago
                  Aside from artists that make it big it seems like the majority of them are forced to make compromises in order to continue practicing their desired craft full time. Much of their behavior is dictated by "not starving" rather than their personal preferences.

                  And many fold more than that are forced to drop out to "get a real job".

                  Of course all of the above is a good thing from the perspective of maximizing the quality of life across society as a whole. But wouldn't it be nicer if we didn't have to do (as much of) that?

          • ahmeneeroe-v26 days ago
            Agreed. I am firmly on the side of Capital.
          • 658397476 days ago
            We aren't in the Fully Automated Luxury Gay Space Communism stage just yet, so labor is a necessary evil.
            • girvo6 days ago
              I'm not convinced LLMs are the road towards Minds, and I'm pretty sure the Culture would think we're a bit of a mess (I'm pretty sure they literally did in one of the final books), but who knows, maybe I'm wrong!
    • Retr0id7 days ago
      I've never used a stock photo site before, so I suppose it's no surprise I have no real use for "generate any image on demand".
      • esperent7 days ago
        I've used stock photo sites occasionally, but I use vector art and icon sites multiple times a week. Even today, I used a few different sites while designing some stuff on Canva.

        The reason I don't use AI is that its results are far less reliable and far harder to specify than just searching through the limited lists of human-made art.

        Today, for undisclosed reasons, I needed vector art of peanuts. I found imperfect but usable human made art within seconds from a search engine. I then spent around 15 - 25 minutes trying to get something closer to my vision using ChatGPT, and using the imperfect art I'd found as a style guide. I got lots of "huh that's cool what AI can do" but nothing useful. Nothing closer to my vision than what I started with.

        By coincidence it's the first time I've tried making art with AI in about a year, but back then I bought a Midjourney account and spent a month making loads of art, then installed SD on my laptop and spent another couple of weeks playing around with that. So it's not like I'm lacking experience. What I've found so far is that AI art generators are great for generating articles like this one. And they do make some genuinely cool pictures, it blows my mind that computers can do this now.

        It's just when I sit down with a real world task that has specific, concrete requirements... I find them useless.

      • YurgenJurgensen7 days ago
        Their main application appears to be taking blog posts and internal memos and making them three times longer, using ten times the bandwidth to convey no more information. So exactly the application AI is ‘good’ at.
        • wongarsu6 days ago
          If anything, stock image websites are even worse at this than AI. With AI you come up with an image idea, then try to make the AI produce something close to it. With stock images you come up with an image idea, then hope some photographer had a similar idea and uploaded it to a stock website.
      • Suppafly6 days ago
        >so I suppose it's no surprise I have no real use for "generate any image on demand".

        Other than stock photos, porn is the killer app for that, but most of the AI companies don't want to allow that.

      • avereveard6 days ago
        How about removing blur from your photos, removing blocking items, denoising darks, and fixing whiteouts? Granted, it's not quite there yet for everything, but it's pretty close.
    • genewitch7 days ago
      I have the Gemini app on my phone, and you can interact with it by voice only. I thought, oh, this is really cool, I can use it while I'm driving instead of listening to music.

      I can never think of anything to talk to an AI about. I run LLMs locally as well.

      • JFingleton7 days ago
        Have it interview you (as in a job interview) on your specialization. It works your interview skills.

        Ask it to teach you a language.

        DnD works really well (the LLM being the game-master).

        • voidUpdate6 days ago
          DnD does not work really well, I've tried that with LLMs before
    • loudmax7 days ago
      That is a very interesting point about how little use of AI most of us are making day to day, despite the potential utility that seems to be lurking. I think it just takes time for people and economies to adapt to new technology.

      Even if technological progress on AI were to stop today, and the best models that exist in 2030 are the same models we have now, there would still be years of social and economic change as people and companies figure out how to make use of novel technology.

      • milanove7 days ago
        Unless I'm doing something simple like writing out some basic shell script or python program, it's often easier to just do something myself than take the time to explain what I want to an LLM. There's something to be said about taking the time to formulate your plan in clear steps ahead of time, but for many problems it just doesn't feel like it's worth the time to write it all out.
        • danielbln6 days ago
          I find that if a problem doesn't require planning, it's probably simple enough that the LLM can handle it with little input. If it does require planning, I might as well dump it into an LLM as another evaluator and then have it drive the implementation.
    • skybrian7 days ago
      Image generation is still very slow. If it generated many images instantly like Google’s image search, it would be a lot more fun to use, and we would learn to use it more effectively with practice.
      • neuroelectron6 days ago
        Some of the image generation systems are very fast.
      • Suppafly6 days ago
        >Image generation is still very slow.

        Only because the free ones slow things down.

    • nyarlathotep_7 days ago
      > They almost all use AI generated images that are full of nonsense the closer you look. Jazz channels will feature coffee shops with garbled text on the menu and furniture blending together.

      Noticed that.

      Maybe it's my algorithm but YouTube is seemingly filled with these videos now.

      • UncleEntity7 days ago
        They insist on feeding me AI generated videos about "HOA Karens" for some odd reason.

        True, I do enjoy watching the LawTubers and sometimes they talk about HOAs but that is a far stretch from someone taking a reddit post and laundering it through the robots.

      • rasz7 days ago
        YouTube Studio has built-in AI thumbnail functionality. Google actively encourages using AI for clickbait and for generating automatic AI replies to comments, OnlyFans-style, giving your viewers a feeling of interaction without you ever reading their comments.
      • _DeadFred_7 days ago
        All my music cover images are AI generated. At the same time I refuse to listen to AI music. We're all going to sink alone on this one.

        What frustrates me is that if I tell the YouTube algo 'don't recommend' on AI music video channels, it stops giving me any music video channels. That's not what I want; I just don't want the AI. They need to separate the two. But of course they'd better not do that with AI cover images, because otherwise it would harm me. :)

      • satvikpendem6 days ago
        Probably is your algorithm as mine is pretty good in not showing me those low effort channels. Check out extensions like PocketTube, SponsorBlock, and DeArrow to manage your YouTube feeds better.
    • card_zero7 days ago
      I was wondering yesterday how AI is coming along for tweening animation frames. I just did a quick search and apparently last year the state of the art was garbage:

      https://yosefk.com/blog/the-state-of-ai-for-hand-drawn-anima...

      Maybe this multimodal thing can fix that?

      • GaggiX7 days ago
        That blog post is a year old.

        There has been a lot of progress since then: https://doubiiu.github.io/projects/ToonCrafter/

        • kridsdale37 days ago
          Very impressive. This is going to result in an explosion of content creation by pro studios, just as CG with cel-shading renderers did. I greatly prefer the hand-drawn + AI tweened look to the current low-budget CG 3D models look.
          • GaggiX7 days ago
            Yeah, it will be much better than the low-budget 3D models in anime. Hopefully there will be a production-ready product that works at a high enough resolution, and studios will probably adopt it instead of using cheap labor.
          • __loam7 days ago
            Most of the professionals in this industry actively despise this technology.
            • satvikpendem6 days ago
              That's true, but it will likely be newer studios with younger professionals who are going to be using it, much as Miyazaki doesn't like CGI either yet it's widely used now in anime. The young drive the advances while the older eschew them, that's generally how human progress has been.
              • GaggiX6 days ago
                I don't think it will be limited to newer studios. Well-known studios will probably adopt it once a production-ready solution is presented and battle-tested; I doubt many will complain about not having to draw in-between frames. Newer studios with smaller budgets will probably just test it first.
            • fc417fc8026 days ago
              I doubt they will have a choice given the price difference.

              This is arguably a good thing because if production cost drops it should mean either higher quality or more content.

              • 658397476 days ago
                The cost of producing works that use AI will drop, while the cost of producing higher-quality works - works that don't use AI - will remain the same. All we'll get is more AI slop.
                • fc417fc8026 days ago
                  Why presuppose that high quality works can't be produced with the help of (near future) AI tooling? Just because you can produce slop with AI doesn't mean that you have to.
                  • bongodongobob6 days ago
                    Because there are lots of verysmarts here that think they know what's best for industries they have no training or experience in.
    • shostack6 days ago
      A restaurant near me has a framed monitor that displays some animated art with a scene of a cafe on a street corner. I looked closely and realized it was AI. Chairs were melted together, text was gibberish, trees were not branching properly etc.

      If a local restaurant is using this stuff we're near an inflection point of adoption.

    • avereveard6 days ago
      It's hard to let them run wild because of the unreliability.

      I've built a simracing tool that's about 50% AI code now; the AI mostly wrote the boilerplate, accelerated prototyping, and handled most of the data structure packing/unpacking.

      It never managed to do a pit stop window prediction on its own, but it could create a reasonable class to handle tire overheating messages.

      All in all, what I can say from this experiment is that it enabled me to get started, as I'm unfamiliar with pygame, and the UX is entirely maintained by AI.

      Working on classes together sucks, as the AI puts in too many null checks and try/catches, making the code unreadable by humans. I much prefer to make sure data is correctly initialized and updated than deal with the huge nest of conditions LLMs produce, so I ended up with clearly defined AI and human components.

      It's not perfect yet, but I can focus on more valuable things. And it's a good step up from last year, when I just used it to second-check and enrich my technical writing and convert notes into emails.

      With vision and image generation I think we're closer to creating a feedback loop where the AI can rapidly self-correct its output, but the ceiling remains to be seen to understand how far this will go.

    • thordenmark6 days ago
      I find these image generators, and LLMs in general, fairly toy-like. Not useful for serious work, but good for mood boards and idea generation, kind of like random word generators when you've got writer's block. As soon as you analyze any output from these you can see they are producing nonsense. And as we now know from the recently released Claude paper, these things are far from reasoning.
    • dontlaugh7 days ago
      The unreliability and inability to debug are why I think these tools are actually a liability for any serious work.
    • mycall7 days ago
      > I am finding myself surprised at how little use I have for this stuff

      I think this will change as more practical use cases begin to emerge as this is all brand new. For example, the photos you take with your smartphone can tell a story or be annotated so you can see things in the photos you didn't think about but your profile thinks you might. Things will get more sophisticated soon.

      • justlikereddit6 days ago
        I have absolutely no use for my photos being annotated by an AI.

        I have had use for LLMs and previous era image gens. I haven't got around to trying the last iterations that this article is about yet.

        That use I have had of it is very esoteric, an art mostly forgotten in the digital modernity, it's called "HAVING FUN", by myself, for curiosities, for sharing with friends.

        That is by far the greatest usage area and severely underrated. AI for having fun, enjoyment that feels meaningful.

        If you're a spam producer or scam artist, or industrial digi-slop manufacturer or merchant of hype, or some other flavor of paid liar (journalist, influencer, spokesperson, diplomat or politician), then sure, AI will also earn you money. And the facade of these money-making enterprises will look shinier with every year that passes, but it will be all rotting organics and slop behind that veneer, as it has been since many years before my birth.

        I'm in the game for the fun part, and that part is noticeably improving.

    • otabdeveloper46 days ago
      > But now that they're here, I just sort of poke at it for a minute and carry on with my day.

      Well, that's because they suck, despite all the hype.

      They have a use in a professional context, i.e., as a replacement for older models and algorithms like BERT or TF-IDF.

      But as assistants they're only good as a novelty gag.

    • Der_Einzige7 days ago
      That feeling of not knowing what to do with it is an example of humans being stupid. We are all victims of being "Johnny" in this paper:

      https://dl.acm.org/doi/full/10.1145/3544548.3581388

    • InDubioProRubio6 days ago
      It's because all those projects and ideas you had, where AI could do the fun work and you would be the middle manager, just became a job.
    • 7 days ago
      undefined
  • nowittyusername7 days ago
    There is circumstantial evidence out there that 4o image manipulation isn't done within the 4o image generator in one shot but is a workflow driven by an agentic system. Meaning this: user inputs the prompt "create an image with no elephants in the room" > prompt goes to an LLM which preprocesses the human prompt > the LLM outputs a prompt that it knows works well with this image generator, e.g. "create an image of a room" > that LLM-processed prompt is sent to the image generator. The same happens with edits, but it's a lot more complicated, with function-calling tools involved and many layers of edits done behind the scenes. Try it yourself: take an image, send it in, and have 4o edit it for you in some way, then ask it to edit again, and again, and so on. You will notice a sepia filter being applied on every edit, and the image ends up more and more sepia-toned with each edit. This is presumably because one of the steps in the workflow is naively applied without consideration of multi-edit use. If this were a one-shot solution where editing is done within the 4o image model itself, the sepia problem wouldn't be there.
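
    A toy sketch of the compounding-filter hypothesis above (this is speculation about the pipeline, not OpenAI's actual internals; the tone numbers and function names are made up for illustration):

```python
# Toy model of the hypothesis: if every edit re-runs a fixed tone step
# (here, a slight warm/sepia shift), repeated edits compound the effect.
# Purely illustrative; not OpenAI's actual pipeline.

def tone_pass(pixel):
    """Hypothetical per-edit tone step: warm the reds, cut the blues."""
    r, g, b = pixel
    return (min(255, r + 5), g, max(0, b - 5))

def edit(pixel):
    # Stand-in for "regenerate the image, then apply the tone pass".
    return tone_pass(pixel)

pixel = (128, 128, 128)  # neutral gray
for _ in range(5):
    pixel = edit(pixel)

print(pixel)  # drifts warmer with every edit: (153, 128, 103)
```

    Five edits shift a neutral gray 25 points toward sepia; a single-model editor that carried unchanged pixels through untouched would show no such drift.
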
    • vunderba7 days ago
      As somebody who actually tried to build a multimodal stable diffusion chat agent about a year back using YOLO to build partial masks for adjustments via inpainting, dynamic controlnets, and a whole host of other things, I highly doubt that it's as simple as an agentic process.

      Using the prompt to detect and choose the most appropriate model checkpoint and LoRa(s) along with rewriting a prompt to most appropriately suit the chosen model has been pretty bog standard for a long time now.

      • echelon6 days ago
        > Using the prompt to detect and choose the most appropriate model checkpoint and LoRa(s) along with rewriting a prompt to most appropriately suit the chosen model has been pretty bog standard for a long time now.

        Which players are doing this? I haven't heard of this approach at all.

        Most artistic interfaces want you to visually select a style (LoRA, Midjourney sref, etc.) and will load these under the hood. But it's explicit behavior controlled by the user.

    • nialv77 days ago
      None of your observations say anything about how these images are generated one way or another.

      The only thing we currently have to go off of is OpenAI's own words, which claims the images are generated by a single multimodal model autoregressively, and I don't think they are lying.

      • pclmulqdq6 days ago
        Generated autoregressively and generated in one shot are not the same. There is a possibility that there is a feedback loop here. Personally, I wouldn't be surprised if there was a small one, but not nearly the complex agentic workflow that OP may be thinking of.
    • Suppafly6 days ago
      >This is because in the workflow that is one of the steps that is naively applied without consideration of multi edit possibility. If this was a one shot solution where editing is done within 4o image model by itself, the sepia problem wouldn't be there.

      I don't really see that with chatgpt, what I do see is that it's presumably running the same basic query with just whatever you said different each time instead of modifying the existing image. Like if you say "generate a photo of a woman", and get a pic and then say "make her hair blonde", the new image is likely to also have different facial features.

    • renewiltord7 days ago
      The prompt enrichment thing is pretty standard. Everyone does that bit, though some make it user-visible. On Grok it used to show up in the frontend via the image's download name. The image editing is interesting.
      • genewitch7 days ago
        All the Stable Diffusion software I've used names the files after some form of the prompt, probably because SD weights the first tokens more heavily than the last, likely as a side effect of the way CLIP/BLIP works.

        I doubt any of these companies have rolled their own interface to stable diffusion / transformers. It's copy and paste from huggingface all the way down.

        I'm still waiting for a confirmed Diffusion Language Model to be released as gguf that works with llama.cpp

        • danielbln7 days ago
          Auto1111 and co are using the prompt in the filename because it's convenient, not due to some inherent CLIP mechanism.

          If you think that companies like OpenAI (for all the criticisms they deserve) don't use their own inference harness and image models I have a bridge to sell to you.

          • genewitch6 days ago
            I give less weight to your opinion than my own. I'm not sure how you misunderstood what I said about CLIP/BLIP, as well. I was replying to a comment about "populating the front end with the filename": the first tokens are weighted more heavily in the resulting image than the later tokens, and therefore, if you prompt correctly, the filenames will be a very accurate description of the image. Especially in danbooru style, you can just split on spaces and use them as tags, for all practical purposes.

            I guess the "convenience" just happened to get ported over from "Auto1111", or it's a coincidence, or

    • diggan7 days ago
      > There is circumstantial evidence out there that 4o image manipulation isn't done within the 4o image generator in one shot

      I thought this was obvious? At least from the first time (and only time) I used it, you can clearly see that it's not just creating one image based on the prompt, but instead it first creates a canvas for everything to fit into, then it generates piece by piece, with some coordinator deciding the workflow.

      Don't think we need evidence either way when it's so obvious from using it and what you can see while it generates the "collage" of images.

      • andy12_6 days ago
        I mean, it could very well be that it generates image patches autoregressively, but in a pyramidal way (first a very low resolution version, the "canvas", and then each individual patch). This is very similar to VAR [1]

        We can't really be sure until OpenAI tells us.

        [1] https://arxiv.org/pdf/2404.02905

    • Voloskaya7 days ago
      > This is because in the workflow that is one of the steps that is naively applied without consideration of multi edit possibility.

      Unconvinced by that, tbh. This could simply be a bias in the encoder/decoder or the model itself; many image generation models have shown behaviour like this. Also unsure why a sepia filter would always be applied if it were a workflow; what would be the point?

      Personally, I don't believe this is just an agentic workflow. Agentic workflows can't really do anything a human couldn't do manually, they just make the process much faster. I spent 2 years working with image models, specifically around controllability of the output, and there is just no way of getting these kinds of edits with a regular diffusion model just through smarter prompting or other tricks. So I don't see how an agentic workflow would help.

      I think you can only get there via a true multimodal model.

    • lawlessone6 days ago
      Huh, I was thinking myself, based on how it looked, that it was doing layers too. The blurred backgrounds with sharp cartoon characters in front are what made me think this is how they do it.
  • card_zero7 days ago
    Looking at the example where the coffee table is swapped, I notice every time the image is reprocessed it mutates, based on the previous iteration, and objects become more bizarre each time, like Chinese whispers.

    * The weird-ass basket decoration on the table originally has some big chain links (maybe anchor chain, to keep the theme with the beach painting). By the third version, they're leathery and are merging with the basket.

    * The candelabra light on the wall, with branch decorations, turns into a sort of skinny minimalist gold stag head, and then just a branch.

    * The small table in the background gradually loses one of its three legs, and ends up defying gravity.

    * The freaky green lamps in the window become at first more regular, then turn into topiary.

    * Making the carpet less faded turns up the saturation on everything else, too, including the wood the table is made from.

    • og_kalu7 days ago
      It's kind of clear that for every request, it generates a new image entirely. Some people are speculating a diffusion decoder, but I think it's more likely an implementation of VAR - https://arxiv.org/abs/2404.02905.

      So rather than predicting each patch at the target resolution right away, it starts with the image (as patches) at a very small resolution and increasingly scales up. I guess that could make it hard for the model to learn to just copy and paste image tokens for editing like it might for text.
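
      A rough sketch of what that next-scale schedule implies (the scale numbers below are illustrative, not the paper's exact configuration):

```python
# Next-scale ("VAR"-style) autoregressive generation, roughly: instead
# of emitting patches at the target resolution in raster order, the
# model emits a whole token map per scale, coarse to fine, with each
# step conditioned on all coarser maps generated so far.

scales = [1, 2, 4, 8, 16]  # token-map side lengths, coarse to fine

def tokens_per_scale(scales):
    # Each step predicts the entire s x s token map for the next scale.
    return [s * s for s in scales]

per_scale = tokens_per_scale(scales)
print(per_scale)       # [1, 4, 16, 64, 256]
print(sum(per_scale))  # 341 tokens total, vs. 256 for flat raster order
```

      The coarse passes are cheap, and each finer map is emitted in one parallel step, which is why this can beat patch-by-patch raster generation on speed; the trade-off is that nothing forces the model to copy tokens verbatim between scales, consistent with edits regenerating the whole image.
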

      • flkiwi7 days ago
        BUT it's doing a stunningly better job replicating previous scenes than it did before. I asked it just now for a selfie of two biker buddies on a Nevada highway, but one is a quokka and one is a hyrax. It did it. Then I asked for the same photo with late afternoon lighting, and it did a pretty amazing job of preserving the context where just a few months ago it would have had no idea what it had done before.

        Also, sweet jesus, after more than a year of hilarious frustration, it now knows that a flying squirrel is a real animal and not just a tree squirrel with butterfly wings.

        • og_kalu7 days ago
          I agree. I'm not saying it's a different model generating the images. 4o is clearly generating the images itself rather than sending a prompt to some other model. I'm speculating about the mechanism for generation in the model itself.
          • flkiwi7 days ago
            Oh, no, I wasn't taking issue with what you said, just reacting that, yes, it's not editing the same image but redrawing from scratch every time, BUT it's doing a much better job of that, with some understanding of the context of the previous image so that it can tweak it, even if it's never bit for bit identical.
    • M4v3R7 days ago
      Yeah, this is in my opinion the biggest limitation of the current gen GPT 4o image generation: it is incapable of editing only parts of an image. I assume what it does every time is tokenizing the source image, then transforming it according to the prompt and then giving you the final result. For some use cases that’s fine but if you really just want a small edit while keeping the rest of the image intact you’re out of luck.
      • atommclain7 days ago
        I thought the selection tool allowed you to limit the area of the image that a revision will change, but I tested it and still see changes outside the selected area, which is good to know.

        As an example the tape spindles, among other changes, are different: https://chatgpt.com/share/67f53965-9480-800a-a166-a6c1faa87c...

        https://help.openai.com/en/articles/9055440-editing-your-ima...

        • qingcharles7 days ago
          Yeah, I'm not sure what the selection brush actually does. Is it just a hint to the LLM?
      • 7 days ago
        undefined
      • danielbln7 days ago
        It just means that you comp it together manually. That's still much better than having to set up some inpainting pipeline or whatever.
        • wavemode7 days ago
          Is manually comping actually going to be easier (let alone, give better results) than inpainting? I can imagine it working in simple cases, but for anything involving 3D geometry you'll likely run into issues of things not quite lining up between the first and second image.
        • echelon7 days ago
          100%. Multimodal images surpass ComfyUI and inpainting (for now). It's a step function improvement in image generation.

          I'm hoping we see an open weights or open source model with these capabilities soon, because good tools need open models.

          As has happened in the past, once an open implementation of DallE or whatever comes out, the open source community pushes the capabilities much further by writing lots of training, extensions, and pipelines. The results look significantly better than closed SaaS models.

      • iandanforth6 days ago
        FWIW, Pixlr is a good pairing with GPT-4o for just this. Generate with 4o, then use Pixlr's AI tools to edit bits. Especially for removals, Pixlr (and I'm sure others) is much, much faster and quite reliable.
    • bla37 days ago
      The pictures on the wall change too.
      • rob747 days ago
        Actually, almost everything changes slightly - the number, shape and pattern of the chairs, the number and pattern of the pillows, the pattern of the curtains, the scene outside the window, the wooden part of the table, the pattern of the carpet... The blue couch stays largely the same, it just loses some detail...
      • card_zero7 days ago
        Yes, first a still life and something impressionist, then a blob and a blob, then a smear and a smear. And what about the reflections and transparency of the glass table top? It gets very indistinct. Keep working at the same image and it looks like you'll end up with some Deep Dream weirdness.

        I think the fireplace might be turning into some tiny stairs leading down. :)

    • empath757 days ago
      The vast majority of people wouldn't notice any of that in most contexts in which such an image would be used.
  • probably_wrong7 days ago
    > Is it okay to reproduce the hard-won style of other artists using AI? Who owns the resulting art? Who profits from it? Which artists are in the training data for AI, and what is the legal and ethical status of using copyrighted work for training? These were important questions before multimodal AI, but now developing answers to them is increasingly urgent.

    I have to disagree with the conclusion. This was an important discussion to have two to three years ago, then we had it online, and then we more or less agreed that it's unfair for artists to have their works sucked up with no recourse.

    What the post should say is "we know that this is unfair to artists, but the tech companies are making too much money from them and we have no way to force them to change".

    • Taek7 days ago
      I don't think there's consensus around that idea. Lots of people (myself included) feel that copyright is already vastly overreaching, and that AI represents forward progress for the proliferation of art in society (it's crap today, but digital cameras were crap in 2007 and look where they are now).

      It's also not clear, for example, that Studio Ghibli lost by having their art style plastered all over the internet. I went home and watched a Ghibli film that week, as I'm sure many others did as well. Their revenue is probably up quite a bit right now?

      "How can we monetize art" remains an open question for society, but I certainly don't think that AI without restrictions is going to lead to fewer people with art jobs.

      • kelseyfrog7 days ago
          I'd take it further and say that copyright and intellectual property are a legal fiction that ultimately benefits the wealthy [those who can pay to legally enforce it] over small artists.

        Small artists get paid to create the art; corporations benefit from exclusivity.

        • jayd166 days ago
          The alternative being small artists don't get paid and corporations benefit from non-exclusivity.
          • kelseyfrog6 days ago
            Small artists operate in a service-based economy, not a mass-production one. The value lies in their unique perspective, process, and personal connection, not just in the final product. You hire them, not just pay for 'art'. What they're selling lies outside of the fiction of intellectual property.
      • thwarted7 days ago
        > It's also not clear, for example, that Studio Ghibli lost by having their art style plastered all over the internet. I went home and watched a Ghibli film that week, as I'm sure many others did as well. Their revenue is probably up quite a bit right now?

        This sounds like a rewording of "You won't get paid, but this is a great opportunity for you because you'll get exposure".

        • Taek7 days ago
          Exposure has value! The meme around trying to pay artists with exposure exists because some people think their "exposure" has meaningful value when they are offering to expose the artist to 100 people, 99 of whom likely aren't even target customers.

          Studio Ghibli on the other hand had exposure to millions of people (maybe hundreds of millions), and probably >5% of those were potential customers.

          So yes, being paid in exposure makes sense, if the exposure is actually worth what the art is worth. But most people offering to pay in exposure are overvaluing their exposure by 100x or more.

          • thwarted6 days ago
            > Studio Ghibli on the other hand had exposure to millions of people (maybe hundreds of millions), and probably >5% of those were potential customers.

            There's a lot of ifs in here. The estimate of how many people were exposed spans two orders of magnitude. "Maybe". "Probably". "Greater than". "Potential".

            In order for this exposure to have more value than the ownership of the original, all of those things need to fall into place. And no one can offer meaningful exposure based on the off-chance that a meme goes viral. All the risk is on the creator, they lose control of their asset and receive a lottery ticket in return.

            > So yes, being paid in exposure makes sense, if the exposure is actually worth what the art is worth. But most people offering to pay in exposure are overvaluing their exposure by 100x or more.

            Yes, but that's a big "but"; it's difficult to know the value of the "exposure" that is being offered, not to mention if the entity offering it is legit or if it's just a scam because they don't want to pay.

            Additionally, the AI companies who are slurping up copyrighted works to train their models are not offering exposure. And the mememaker who happens to go viral can't offer it either.

        • adamredwoods6 days ago
          When I did freelance graphic design, this was said to me. I didn't eat much that week.
      • DeathArrow7 days ago
        >It's also not clear, for example, that Studio Ghibli lost by having their art style plastered all over the internet.

        Maybe Studio Ghibli is much more than merely a style. Maybe people aren't looking at their production just for the style.

        Most people dislike wearing fake clothes, and they dislike wearing fake watches or fake jewelry. Because it isn't just about the style.

        • pixl976 days ago
          >Most people dislike wearing fake clothes, and they dislike wearing fake watches or fake jewelry

          I'd disagree. Most people don't like buying something 'real' then finding out it's fake. Far more people don't mind an actual fake if it's either high quality or is very low priced.

          • thot_experiment6 days ago
            Yeah ngl I have some fake designer stuff I got as a gift and I love it, and I especially love that it's fake. It feels like I'm pulling one over on the tryhards who care about that stuff being real, but I still get to enjoy the wild LV coat I have, and ain't nobody checking the stitching on the lining to make sure it's the real thing. I could see myself buying more fakes in the future, but I'd never ever buy the real thing.
      • __loam7 days ago
        Nearly every artist I've spoken to or have seen talk about this technology says it's evil, so at least among the victims of this corporate abuse of the creative community, there's wide consensus that it's bad.

        > but I certainly don't think that AI without restrictions is going to lead to fewer people with art jobs.

        It's great that you think that but in reality a lot of artists are saying they're getting less work these days. Maybe that's the result of a shitty economy but I find it very difficult to believe this technology isn't actively stealing work from people.

        • Ray206 days ago
          >in reality a lot of artists are saying they're getting less work these days

          Good. That means we as a society get more art, cheaper. I've long since grown tired of sponsoring the greed of artists.

          • mitthrowaway26 days ago
            We'll get more art from AI, but less from humans.
            • __loam6 days ago
              Yeah that fuckin sucks.
      • adamredwoods6 days ago
        The Ghibli style took humans decades to refine and create. All that respect and adoration for the craft and the artists and the time it took is now gone in an instant, making it a shallow, trivial thing. Worse is to have another company exploit it with no regard for the ones who helped make it a reality.

        The threat of AI-produced art will forever trivialise human artistic capabilities. The reality is: why bother, when it can be done faster and cheaper? The next generation will leverage it, and those skills will become very rare. It is the nature of technology to do this.

      • mrdependable6 days ago
        Studio Ghibli might not have been affected yet, but only because the technology is not there yet. What's going to happen when someone can make a competing movie in their style with just a prompt? Should we all just be okay with it because it's been decided that Studio Ghibli has made enough money?

        If the effort required to create that can just be ingested by a machine and replicated without consequence, how would it be viable for someone to justify that kind of investment? Where would the next evolution of the art form come from? Even if some company put in the time to create something amazing using AI that does require an investment, the precedent is that it can just be ingested and copied without consequence.

        I think aside from what is legal, we need to think about what kind of world we want to live in. We can already plainly see what social media has done to the world. What do you honestly think the world will look like once this plays out?

        • drdaeman6 days ago
          > What's going to happen when someone can make a competing movie in their style with just a prompt?

          Nothing? Just as if some studio today invested millions of man-hours and made a competing movie in Studio Ghibli's aesthetic (but not including any of Studio Ghibli's characters, branding, etc. - basically, not the copyrightable or trademarkable stuff), nothing out of the ordinary would happen.

          I mean, artistic style is not copyrightable, right?

          • mrdependable6 days ago
            You are missing the point entirely. If you can make a movie with just a prompt, who is going to invest the money creating something like a Ghibli movie just to have it ripped off? Instead people will just rip off what has already been done and everything just stagnates.
            • Taek6 days ago
              How is "movies and other great works of art that used to cost tens of millions of dollars to make now cost tens of dollars to make" a bad thing?

              It means art can get more ambitious. Ghibli made their mark, and made their money. Now it's time for the next generation to have a turn.

              • mrdependable6 days ago
                The lower cost is not the bad thing. Allowing an AI to learn from it and regurgitate it is the bad thing. If we can put anything into an AI and then say whatever it spits out is "clean", even though it is obviously imitating what it learned from, whoever puts the investment into trying something new becomes the sucker.

                Also, I don't get this weird sense of entitlement people have over someone else's work. Just because it can be copied means it should belong to everyone?

              • otabdeveloper46 days ago
                > How is "movies and other great works of art that used to cost tens of millions of dollars to make now cost tens of dollars to make" a bad thing?

                It's bad because you will never get an original visual style from now on. Everything will be copy-paste of existing styles, forever.

                • drdaeman5 days ago
                  Can you please explain how you jumped to this conclusion?

                  I fail to see how artistic expression would cease to be a thing and how people will stop liking novelty. And as long as those are a thing, original styles will also be a thing.

                  If anything, lowering the entry barriers would result in more original styles, as art is [at least] frequently an evolutionary process, where existing ideas meet novel ones and mix in interesting ways. And entirely novel (from-scratch, if that's a thing) ideas will still keep appearing - if someone thinks of something, they're still free to express themselves, as has always been the case. I cannot think of why people would stop painting with brushes, fingers or anything else.

                  Art exists because of human nature. Nothing changes in this regard.

            • drdaeman5 days ago
              I'm sorry, but I do not think I understand the idea why and how Studio Ghibli is being "ripped off" in this scenario.

              As I've said, art styles are not considered copyrightable. You say I'm missing the point, but I fail to see why. I've used the lack of copyright protection as a reality check, a verifiable fact that can be used to determine the current consensus on the matter. Based on this lack of legal protection, I conclude that societies have decided it's not something that needs to be protected, and thus that there is no "ripping off" in replicating a successful style. I have no doubt there are plenty of people who would think otherwise (and e.g. say that the current state of copyright is not optimal - which may well be true), but they need to argue about copyright protections, not technological accessibility. The latter merely exposes the former (by drastically lowering the cost barriers), but is not the root issue.

              I also have doubts about your prediction of stagnation, particularly because you seem to ignore the demand side. People want novelty and originality; they always have and always will (or at least for as long as human nature doesn't change). Things will change for sure (they always do), but I don't think stagnation is a realistic scenario.

      • wavemode7 days ago
        Companies like Studio Ghibli are not being harmed by AI, small freelance artists are.
        • masswerk7 days ago
          I think Studio Ghibli will be affected as well, since their "trademark style" (as we used to say), formerly a welcome sight and indicative of a certain type of storytelling, will be devalued into an indicator of slop. (Much like there are certain traits of an image which we associate with soap operas and assume to be indicative of a low-value production.)
          • butlike7 days ago
            I doubt that. "Which movie does the slop belong to?" "Oh none of them? Ok" Is a pretty easy search term
            • masswerk6 days ago
              I doubt that, when confronted with an image that you've learned to associate with a plethora of low-quality / low-effort productions, you'd search for the possible origin, in the first place.

              (After all, it's yet another ephemeral image in "that AI style", with no apparent thought having gone into it, just some name dropping, at best. Or some generated, senseless story, you would be glad, the algorithm hadn't pointed your kids at. Why should you?)

      • mycall7 days ago
        > "How can we monetize art" remains an open question for society

        Yet much of the best art, imho, is out in the wild, exposed to the elements, while being at home in some random place. Or perhaps forgotten and displaced in someone's collection. Art's worth will always be an open question.

      • hnbad7 days ago
        Copyright is a logical consequence of property rights. I'd agree that property rights hold back industry and trade but if you want to abolish property rights, you first have to decommodify the essentials like food, housing, public infrastructure and healthcare, because unleashing the market when it has control over all of these is going to have some very undesirable consequences.
        • AnthonyMouse7 days ago
          Copyright isn't a property right. Property rights are rivalrous. If you own a sandwich and a thousand other people want to eat your sandwich, only one person can, so property rights exist to define who gets to choose who gets to eat the sandwich. Writings and discoveries are non-rivalrous. To quote the first head of the US Patent Office:

          > He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me.

          The term "intellectual property" is an attempt to conflate these things, to justify net-destructive money grabs like retroactive copyright term extensions, because traditional property rights don't expire but copyrights explicitly and intentionally do.

          • immibis7 days ago
            The definition of capitalism is a system in which all sorts of things that are not property are artificially made into property and given artificial property rights which can be traded.
            • AnthonyMouse6 days ago
              That is the definition of capitalism people use when they want to apply the term capitalism to something that sucks.

              Allowing people to own physical items as business inventory or production equipment and compete with each other for the customer's dollar is entirely possible without the existence of copyright or patents. You would then be relying on some combination of open source, charitable contributions and patronage, industry joint ventures, personal itch scratching, etc. to create writings and inventions, but books and the wheel were created before patents and copyrights were.

              • fc417fc8026 days ago
                > You would then be relying on ...

                More likely trade secrets, NDAs, non-competes, and increasingly invasive DRM. In addition to the direct financial incentive, part of the logic behind IP law is to foster a more open market because that should be to the benefit of society at large in multiple ways.

                Patents, for example, ensure that at least some minimal description of the process gets published for others to take inspiration from.

                • AnthonyMouse6 days ago
                  > More likely trade secrets, NDAs, non-competes, and increasingly invasive DRM.

                  These are all also creatures of the law. If there was no copyright there would be no Digital Millennium Copyright Act. In many cases they can't work, e.g. because of the analog hole or because the mechanism of operation is observable to anyone who buys the product.

                  The incentives to uncover those things are also much stronger in the modern day because of the connectedness of the world. If there were two wheelwrights in your town and one of them had a secret process, no one but the other would have any use for it, and if they found out, they wouldn't even have anyone else to tell it to.

                  If someone had a secret video encoding strategy today, some hobbyists would reverse engineer it and post it on the internet.

                  > Patents, for example, ensure that at least some minimal description of the process gets published for others to take inspiration from.

                  Have you read a modern patent? They're inscrutable, and to the fullest extent allowable attempt to claim the overall concept of doing something rather than describing a specific implementation.

                  • fc417fc8026 days ago
                    > In many cases they can't work, e.g. because of the analog hole

                    Careful not to confuse illegal with unable.

                    > [can't work because] the mechanism of operation is observable to anyone who buys the product.

                    That was my point about increasingly invasive DRM. Without IP law, the only way for large swaths of industry to sustain themselves would be to deal exclusively on extensively secured platforms. Imagine a scenario where all paid services (software, streaming, and quite literally everything else) were only available on hardware-attested devices rooted with only one or a few players.

                    In the hypothetical scenario where it is explicitly legal to copy any binary that you gain possession of (ie copyright doesn't exist) I think that's what we would see.

                    > If someone had a secret video encoding strategy today, some hobbyists would reverse engineer it and post it on the internet.

                    Which is why patents exist. When companies decide how much to invest in what this is taken into account.

                    Notably due to the lack of popularity (and thus lack of adoption) of patent encumbered video and audio standards, anyone trying to make a direct profit effectively dropped out years ago. At this point it's driven by behemoths that realize significant downstream cost savings.

                    > Have you read a modern patent? They're inscrutable

                    Yes, I'm aware. Consider how much worse things could be though. No hint, every employee who worked on it under both NDA and non-compete. Imagine how much more difficult the labor market would be to navigate if the government didn't intervene to prevent overbearing terms in such a scenario. Consider what all of this would do to market efficiency.

                    My point was never to disagree with your broad strokes (that the free market is perfectly capable of functioning in the absence of IP law). Rather it was to point out that despite all the downsides, IP law does clearly offer some collective benefits by significantly reducing incentives that would otherwise drive greedy individuals to act against common interests.

                    • AnthonyMouse6 days ago
                      > Imagine a scenario where all paid services (software, streaming, and quite literally everything else) was only available on hardware attested devices rooted with only one or a few players.

                      We already have some content where this has been attempted. That content is on the piracy sites. And that's when breaking the DRM and piracy sites are both illegal.

                      They simply wouldn't use a business model where they first make something and then try to charge people for it after. Instead you might have a subscription service, but the subscription is patronage, i.e. you want them to keep producing content and if enough people feel the same way, they make enough to keep doing it. But the content they release is available to everyone.

                      > every employee who worked on it under both NDA and non-compete. Imagine how much more difficult the labor market would be to navigate if the government didn't intervene to prevent overbearing terms in such a scenario. Consider what all of this would do to market efficiency.

                      The assumption is that such NDAs would be enforceable. What if they're not?

                      > Rather it was to point out that despite all the downsides, IP law does clearly offer some collective benefits by significantly reducing incentives that would otherwise drive greedy individuals to act against common interests.

                      The greedy individuals could be addressed by banning their attempts to reconstitute copyright through thug behavior. The real question is, would we be better off without it, if some things wouldn't be created?

                      Likely the optimal balance is close to the original copyright terms, i.e. you get 14 years and there is none of this anti-circumvention nonsense which in practice is ineffective at its ostensible purpose and is only used to monopolize consumer devices to try to exclude works that compete with the major incumbents. But the existing system is so far out of whack that it's not clear it's even better than nothing.

                      • fc417fc8026 days ago
                        We currently have a fairly half assed system that it seems only the movie and music studios are really invested in pushing. I don't see any reason to assume the market would continue behaving the same way if the laws changed.

                        I think you could reasonably expect the iOS model to become the only way to purchase paid software as well as any number of other things where IP is a concern. You would have hardware backed attestation of an entirely opaque device.

                        > Likely the optimal balance is close to the original copyright terms

                        I'm inclined to agree.

                        > in practice is ineffective at its ostensible purpose and is only used to monopolize consumer devices to try to exclude works that compete with the major incumbents.

                        I'd argue that was the actual purpose to begin with. Piracy being illegal means that operating at scale and collecting payments becomes just about impossible. DRM on sanctioned platforms means the end user can't trivially shift content between different zones. The cartels are able to maintain market segmentation to maximize licensing revenue. Only those they bless are permitted entry to compete.

                        > the existing system is so far out of whack that it's not clear it's even better than nothing.

                        I agree. I think the current system is causing substantial harm for minimal to no benefit relative to the much more limited original copyright terms.

                        > The assumption is that such NDAs would be enforceable. What if they're not?

                        So in addition to removing IP legislation, this is now a hypothetical scenario where additional regulation barring the sorts of contracts that could potentially fill that void is also introduced?

                        > The greedy individuals could be addressed by banning their attempts to reconstitute copyright through thug behavior.

                        You're too focused on copyright. The behavior is simple defense of investment. The players are simply maximizing profit while minimizing risk.

                        Keep in mind we're not just talking about media here. This applies to all industrial R&D. You're describing removing the legal protections against cloning from the entire economy.

                        If you systematically strip away all the legal defense strategies then presumably one of two things happens. Either the investment doesn't happen in the first place (on average, which is to say innovation is severely chilled market wide). Or groups take matters into their own hands and we see a resurgence of organized crime. Given the amount of money available to be made by major players whose products possess a technological advantage I'd tend to expect the latter.

                        I really don't like a scenario where the likes of Nvidia and Intel are strongly incentivized to fund the mob.

                        It's a huge mistake to assume that no one will step up to the plate to do illegal and potentially outright evil things if there's a large monetary incentive involved. Either a sufficiently low friction legal avenue is provided or society is stuck cleaning up the mess that's left. The fallout of the war on drugs is a prime example of this principle in action.

    • eadmund7 days ago
      > it's unfair for artists to have their works sucked up

      I never thought it was unfair to artists for others to look at their work and imitate it. That seems to me to be what artists have been doing since the second caveman looked at a hand painting on a cave wall and thought, ‘huh, that’s pretty neat! I’d like to try my hand at that!’

      • SirMaster7 days ago
        You don't see a massive difference in the sheer number of images that the AI can look at, and the speed at which it can imitate them, as a fundamental difference between AI and a human copying works or styles?

        For a human it took a lot of practice and a lot of time and effort. But now it takes practically no time or effort at all.

        • Workaccount27 days ago
          Well yeah, but copyright infringement isn't a function of how quickly you can view and create works.

          Copyright is meant to secure distribution of works you create. It's not a tool to stop people from creating art because it looks like your art. That has been a thing for centuries; we even categorize art by its style. Imagine if anime had to adhere to a copyright interpretation of "it's my style!".

          • SirMaster7 days ago
            Current copyright yes.

            But do you not for a second think that the current laws and rules are set the way they are because of how hard and time-consuming it was to replicate work?

            Just because "that's how it's always been" doesn't mean it's acceptable to keep it that way when the means to perform the action have so drastically changed.

            • fc417fc8026 days ago
              I don't think the rules ever existed for the benefit of the individual, but rather the collective. If skilled artists couldn't sustain themselves from their work they wouldn't exist. Historically there was no alternative.

              When a machine can do something there is not generally a (collectively beneficial) reason to protect the individual that competes with it. Backhoes weren't regulated in order to protect ditch diggers.

        • eadmund7 days ago
          It took a truly colossal amount of human time and effort to build AI systems. It takes significant amount of energy to run those AI systems.

          I don’t see any meaningful difference at all between the system of a human, a computer and a corpus of images producing new images, and the system of a human, a paintbrush, an easel, a canvas and a corpus of images producing new images. Emphasis on the new — copying is still copying, and still controlled by copyrights.

          • SirMaster7 days ago
            >It took a truly colossal amount of human time and effort to build AI systems. It takes significant amount of energy to run those AI systems.

            Those people and effort aren't at all tied to the people who are making and using the art.

            In the past every individual person would have to individually study art and some style and practice for years of their life to be able to replicate it really well. And for each piece of artwork it could take them days to make 1 single piece.

            I would argue that this is why it wasn't really problematic to copy someone's work or style. Because the individual time and effort per person to even do that was so high.

            But now that time and effort for an individual is next to nothing.

            • munksbeer6 days ago
              I do wonder what the outcome would be for a model trained only on truly non copyright work, and derivatives from there. I'm no AI expert, but from what I understand they use some models to generate data with which to train further models. I'd be interested in the output, whether it would eventually just match what we have now anyway, so the copyright question may end up moot. I wonder how the argument would shift at that point?

              I think in reality, it is probably too late for that, because the internet is now polluted with AI generated images which would be consumed by any "ethical" model anyway.

              • fc417fc8026 days ago
                I expect it would require significantly more human labor to train (ie no longer fully unsupervised). I imagine that this constraint would lead to significant additional research to improve the efficiency of the training process, and that novel approaches would be developed.

                In other words I think it would suck up a lot of money over a few years and then we would arrive back pretty much where we are now.

          • SirMaster7 days ago
            You don't see a difference between a person spending years learning techniques to create art by hand, spending months or years studying and practicing some famous artist's style, and then spending days manually drawing a single piece of artwork in the style and quality of the originals.

            The difference between that, and a person just entering a prompt to create some drawing in some style.

            The model looked at orders of magnitude more examples of artwork than a single human could look at and study in a lifetime.

            To me there is a clear difference here.

            I am merely saying that perhaps the rules should change due to the drastic change in time and effort required to do the work.

            • pixl976 days ago
              Therefore we should give up all heavy equipment and all ditches should be dug with a spoon.

              Sometimes technology changes and what was nearly impossible in the past becomes trivial.

              • SirMaster6 days ago
                Nobody is arguing that…

                Nobody is saying let’s not have these new efficient tools. All people are saying is let’s make protections and considerations etc for the original artists and their work that’s being used for training and for when the model draws from it to replicate the style that they developed.

            • butlike7 days ago
              The rules do change, but organically, as society simply decides whether or not to move on. There will be no cabal of artists who define how the rules will change. Like moving on from cave paintings to impressionism.
          • globnomulous6 days ago
            > I don’t see any meaningful difference at all between the system of a human, a computer and a corpus of images producing new images, and the system of a human, a paintbrush, an easel, a canvas and a corpus of images producing new images.

            The word "meaningful" here is a cheap hedging maneuver, and if you don't see a meaningful difference (whatever that means), that's on you.

        • Suppafly6 days ago
          >You don't see a massive difference in the sheer number of images that the AI can look at and the speed at which it can imitate it as a fundamental difference between AI and a human copying works or styles?

          I don't.

          >For a human it took a lot of practice and a lot of time and effort. But now it takes practically no time or effort at all.

          So effort is what makes it ok?

        • Ray206 days ago
          >For a human it took a lot of practice and a lot of time and effort. But now it takes practically no time or effort at all.

          And why is this not a good thing?

          • mitthrowaway26 days ago
            To make an analogy, it might take you five years and your life savings to prototype a new invention. Once done, it can be mass produced for pennies.

            Would you invest that time and money if patent protection did not exist? Probably not, because your competition will copy your work and bankrupt you.

            At any point, society could opt to eliminate patent protection and make all existing inventions public domain, at the cost of losing future inventions. But instead we settled for 20 years.

            This concern did not previously apply to art styles, because they took nearly as much skill to copy as to originate. But now it does, and with no protections, we can expect nobody to put in the work of being the next Studio Ghibli. The styles we have are all we will have, but we can mass produce them.

          • SirMaster6 days ago
            Because now it’s too easy to rip off another artist’s work and style with no effort. And the current rules in place are not enough to protect the original artist.

            The original artist used to be protected by the fact that it took so much effort to copy or reproduce their original work and style at a high quality, so it rarely happened at a scale to directly impact them.

            But now it’s so easy and effortless that anyone can do it en masse, and that now impacts the original artist greatly.

      • BriggyDwiggs427 days ago
        Right, the difference is that it’s a large company looking at it, then copying it and reselling it without credit, which basically everyone would understand as bad without the indirection of a model.

        Edit: the key words here are “company” and “reselling”

        • eadmund7 days ago
          But it’s not copying and reselling — it’s imitation.

          Copying is controlled by copyrights. And imitation isn’t controlled by anything.

          As for a company: a company is just a group of people acting together.

          • mitthrowaway26 days ago
            Imitation has been controlled by patent rights, where an object did not need to be an exact copy but merely use the essence of an idea to count as a violation. Of course so far this protection has only been applied to inventions, because they took a lot of time to develop but once done could be trivially copied at a glance. Now, in this new landscape, we may find it appropriate to apply a similar regime to art.

            Or maybe we won't. It's a choice for society to make, to balance how much we need to protect the incentives to create something new vs protect the ease of copying.

          • BriggyDwiggs427 days ago
            #1, it’s extremely easy to coerce direct copies out of models, copies that artists could be pursued for infringement over if they drew them, yet the companies reselling said copyrighted artwork face no penalty

            #2, yes, it’s a group of people who came together to build an algorithm that learns to extract features from images made by other people in order to generate images somewhere between those images in a high-dimensional space. They sell these images and give no credit or cash for the images being “interpolated” between. Notice this doesn’t extend to open source; it’s the commercial aspect that represents theft.

            The reality is that laws are meant to be interpreted not by their letter but by their spirit. The AI can’t exist without the hard work it’s trained on, and the outputs often resemble the inputs in a manner that approaches copying, so selling those outputs without compensation for the artists in the training set should be illegal. It won’t be, but it should.

            • AnthonyMouse6 days ago
              > it’s extremely easy to coerce direct copies out of models that artists could be pursued for infringement if they drew, but companies reselling said copyrighted artwork face no penalty

              The purpose of the model isn't to make exact reproductions. It's like saying you can use the internet for copyright infringement. You can, but it's the user who chooses the use, so is that on AT&T and Microsoft or is it on the users doing the infringement?

              > They sell these images and give no credit or cash to the images being “interpolated” between.

              A big part of the problem is that machines aren't qualified to be judges.

              Suppose the image you request is Gollum but instead of the One Ring he wants PewDiePie. Obviously this is using a character from the LOTR films by Warner Bros. If you're PewDiePie and you want this image to use in an ad for your channel, you might be in trouble.

              But Warner Bros. got into a scandal for paying YouTubers to promote Middle-earth: Shadow of Mordor without disclosing the payments. If you're creating the image to criticize the company's behavior, it's likely fair use.

              The service has no way to tell why you want the image, so what is it supposed to do? A law that requires them to deny you in the second case is restricting a right of the public. But it's the same image.

              Meanwhile in the first case you don't really need the company generating the image to do anything because Warner Bros. could then go after PewDiePie for using the character in commercial advertising without permission.

              > Notice this doesn’t extend to open source, it’s the commercial aspect that represents theft.

              It's also not really clear how this works. For example, Stable Diffusion is published. You can run it locally. If you buy a GPU from Nvidia or AMD in order to do that, is that now commercial use? Is the GPU manufacturer in trouble? What if you pay a cloud provider like AWS to use one of their GPUs to do it? You can also pay for the cloud service from Stability AI, the makers of Stable Diffusion. Is it different in that case than the others? How?

              • BriggyDwiggs426 days ago
                >It's like saying you can use the internet for copyright infringement.

                I think that a comparison that could help to elucidate the problem here is to a search engine. Like with imagegen, an image search is using infrastructure+algorithm to return the closest match(es) to a textual input over some particular space (whether the space of indexed images or the latent space of the model). Immediately, however, there are qualitative differences. A search company, as an entity, doesn’t in any way take credit for the work; it bills itself as, and operates as, a mechanism to connect the user to others’ work, and in the service of this goal it provides the most attribution it’s reasonably able to provide given technical limitations (a url).

                For me this is the difference. Image gen companies, at least all that I’m aware of, position themselves more as a kind of pseudo-artist that you can commission. They provide no means of attribution, rather, they deliberately obfuscate the source material being searched over. Whether you are willing to equate the generation process to a kind of search for legal purposes is really the core disagreement here, and beyond an intuition for it not something I feel I can prove.

                So what’s the solution; what’s a business model I’d find less contentious? If an AI company developed a means to, for example, associate activation patterns with an index of source material (or hell, just provided an effective similarity search between output and training data) as a sort of good-faith attribution scheme, made visible the training set used, and was upfront in marketing about its dependence on the source material, I’d struggle to have the same issues with it. It would be leagues ahead of current companies in the ethical department. To be clear, though, I’m not a lawyer. I can’t say how image gen fits into the current legal scheme, or whether it does or doesn’t. My argument is an ethical one; I think that the unethical behavior of the for-profit imagegen companies should be hampered by legality, through new laws if necessary. I feel like this should answer your other questions as well, but let me know if I missed something.
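                The "effective similarity search between output and training data" proposed above is mechanically simple to sketch. The following is a minimal, hypothetical illustration, not any real company's attribution system: toy lists stand in for whatever embeddings a real image encoder would produce, and a brute-force cosine-similarity ranking stands in for the large-scale index a production system would need.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k_similar(output_vec, training_vecs, k=3):
    """Indices of the k training embeddings most similar to the output, best first."""
    scored = [(cosine(output_vec, t), i) for i, t in enumerate(training_vecs)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]

# Toy corpus of four hypothetical "training image" embeddings.
corpus = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.7, 0.7, 0.0],
]
output = [0.69, 0.72, 0.01]  # a generated "output" nearly matching corpus[3]
print(top_k_similar(output, corpus, k=2))  # -> [3, 1]
```

                At real training-set scale (billions of images), the brute-force scan would be replaced by an approximate nearest-neighbor index, but the attribution idea is the same: surface the training items closest to each output.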

            • RandallBrown7 days ago
              I guess I don't understand what you mean when you say companies reselling said copyrighted artwork face no penalty. Why wouldn't they? If I was to make a copy of a Studio Ghibli movie and sell it, I would absolutely face a penalty if I was caught.
              • BriggyDwiggs427 days ago
                A common joke is to type in a description of some corporate IP and have ChatGPT generate it without ever naming it directly. Plenty of people have paid a subscription to do that, generating corporate IP that an artist could be sued over, but as far as I know OpenAI hasn’t faced any legal consequences for it, just as an example.
          • __loam7 days ago
            You need to copy the work to use it for AI training.
            • BriggyDwiggs427 days ago
              I’d argue that would be fine for non-commercial use; it’s once the AI outputs are sold that the problem arises.
          • otabdeveloper46 days ago
            Try "imitating" some Mario Brothers in a commercial context and see how that goes. Good luck.
      • ZoomZoomZoom6 days ago
        False; copyism as a career has always been looked down on in the arts community. Learning and reinterpreting is a qualitatively different process.
    • shkkmo7 days ago
      > This was an important discussion to have two to three years ago, then we had it online, and then we more or less agreed that it's unfair for artists to have their works sucked up with no recourse.

      Speak for yourself, there was no consensus online. There are plenty of us that think that dramatically expanding the power of copyright would be a huge mistake that would primarily benefit larger companies and do little to protect or fund small artists.

      • OtherShrezzing7 days ago
        >There are plenty of us that think that dramatically expanding the power of copyright would be a huge mistake that would primarily benefit larger companies and do little to protect or fund small artists.

        The status quo also primarily benefits larger companies, and does little (exactly nothing, if we're being earnest) to protect or fund small artists.

        It's reasonable to hold both opinions that: 1) artists aren't being compensated, even though their work is being used by these tools, and 2) massive expansion of copyright isn't the appropriate response to 1).

    • Suppafly6 days ago
      > and then we more or less agreed that it's unfair for artists to have their works sucked up with no recourse.

      No we didn't agree with that.

    • wat100007 days ago
      “Fair” doesn’t matter. The only consensus that matters is what is legal and profitable. The former seems to be pretty much decided in favor of AI, with some open question about whether large media companies enjoy protections that smaller artists don’t. (The legal battle when some AI company finally decides to let their model imitate Disney stuff is going to be epic.) Profitable remains to be seen, but doesn’t matter much while investors’ money is so plentiful.
      • __loam7 days ago
        > The former seems to be pretty much decided in favor of AI

        None of the cases against AI companies have been decided afaik. There's a ton of ongoing litigation.

        > but doesn’t matter much while investors’ money is so plentiful.

        More and more people are realizing how wasteful this stuff is every day.

    • hnbad7 days ago
      > What the post should say is "we know that this is unfair to artists, but the tech companies are making too much money from them and we have no way to force them to change".

      It seemed a fact of life that companies will just abuse your personal data to their liking and can do what they want with information they collect about you because "if it's free, you're the product" (and even if you paid for it, "you should know better" etc). Then GDPR and its international derivatives came along and changed that.

      It seemed a fact of life that companies that technically don't have an actual market monopoly can do whatever they want within their vertically integrated walled gardens, because competitors can just create their own vertically integrated walled gardens to compete with them and the rules for markets don't apply to walled gardens. Then the DSA and DMA came along and changed that.

      I don't see why legislation can't change this, too. Of course, just as with the GDPR, DSA and DMA, we'll hear from libertarians, megacorps and astroturf movements how unfair it all is to mom & pop startups and how it's going to ruin the economy, but given the angle grinder the US is currently taking to its own economy (and by extension the global economy, because we're all connected), I think that's no longer a valid argument in politics.

    • DeathArrow7 days ago
      >> it's unfair for artists to have their works sucked up

      What framework can we use to decide if something is fair or not?

      Style is not something that should be copyrighted. I can paint in the style of X painter, I can write in the style of Y writer, I can compose music in the style of Z composer.

      Everything has a style. Dressing yourself has a style. Speaking has a style. Even writing mathematical proofs can have a style.

      Copying another person's style might reflect poor judgement, bad taste and lack of originality but it shouldn't be illegal.

      And anyone in the business of art should have much more than a style. He should have original ideas, a vision, a way to tell stories, a way to make people ask themselves questions.

      A style is merely a tool. If all someone has is a style, then good luck!

      • yencabulator6 days ago
        It's already gone quite a bit further than "style". https://www.404media.co/listen-to-the-ai-generated-ripoff-so...
      • gosub1007 days ago
        In music, someone can sing the same style as another, but if they imitate it to the point that there is brand confusion, where the consumer believes the product came from X when it actually came from Y, that's clearly crossing the line.
        • ChadNauseam7 days ago
          Is that actually crossing a line? I'm sure some consumers have thought that Rocket League was associated with FIFA, or that Studio Ghibli movies were made by Disney. But these aren't widespread issues, because we have a robust system of trademarks that draws a clear line: you can't use trademarked names or iconography in a way that causes confusion. But if some people hearing Olivia Rodrigo's "good 4 you" think they're listening to Paramore because they have a similar style, that has never been illegal.
          • gosub1007 days ago
            The Vanilla Ice vs. Queen case doesn't support your claim.
            • ChadNauseam6 days ago
              Vanilla Ice completely copied Queen's bassline, not just the Queen style (not to mention that it never went to court).
  • shubhamjain7 days ago
    The Ghibli trend completely overshadowed the real breakthrough, which is this: the ability to closely follow text, understand the input image, and maintain the context of what’s already there is a massive leap in image generation. While Midjourney delivered visually stunning results, I constantly struggled to get anything specific out of it, making it pretty much useless for actual workflows.

    4o is the first image generation model that feels genuinely useful, not just for pretty things. It can produce comics, app designs, UI mockups, storyboards, marketing assets, and so on. I saw someone make a multi-panel comic with it with consistent characters. Obviously, it's not perfect. But just getting 90% of the way there is a game changer.

    • empath757 days ago
      I had ChatGPT generate a flow chart with Mermaid.js for something at work and then write a Scott McCloud-style comic book explaining it in detail, and it looked so convincing, even though it got some of the details a bit wrong. It's _so close_ to making completely usable graphics out of the box.
  • gcanyon7 days ago
    It's interesting to hear people side with the artists when in previous discussions on this forum I've gotten significant approval/agreement arguing that copyright is far too long.

    As I've argued in the past, I think copyright should last maybe five years: in this modern era, monetizing your work doesn't (usually) have to take more than a short time. I'd happily concede to some sort of renewal process to extend that period, especially if some monetization method is in process. Or some sort of mechanical rights process to replace the "public domain" phase early on. Or something -- I haven't thought about it that deeply.

    So thinking about that in this process: everyone is "ghiblifying" things. Studio Ghibli has been around for very nearly 40 years, and their "style" was well established over 35 years ago. To me, that (should) make(s) it fair game.

    The underlying assumption, I think, is that all the "starving" artists are being ripped off, but are they? Let's consider the numbers -- there are a handful of large-scale artists whose work is obviously replicable: Ghibli, the Simpsons, Pixar, etc. None of them is going hungry because a machine model can render a prom pic in their style. Then you get the other 99.999% of artists, all of whose work went into the model. They will be hurt, but not specifically because their style has been ingested and people want to replicate their style.

    Rather, they will be hurt because no one knows their style, nor cares about it; people just want to be able to say e.g. "Make a charcoal illustration of me in this photo, but make me sitting on a horse in the mountains."

    It's very much like the arguments about piracy in the past: 99.99% of people were never going to pay an artist to create that charcoal sketch. The 0.01% who might are arguably causing harm to the artist(s) by not using them to create that thing, but the rest were never going to pay for it in the first place.

    All to say it's complicated, and obviously things are changing dramatically, but it's difficult to make the argument that "artists need to be compensated for their work being used to train the model" without both a reasonable plan for how that might be done, and a better-supported argument for why.

    • ben_w7 days ago
      Mm.

      The arguments about wanting copyright to be life+70 have always felt entitled, to me. Making claims about things for their kids to inherit, when the median person doesn't have the option to build up much of an inheritance anyway, and 70 years isn't just the next generation but the next 2.5 generations.

      I don't know the exact duration of copyright that makes sense, the world changes too much and different media behave differently. Feels like nobody should have the right to block remakes of C64 games on copyright grounds, but I wouldn't necessarily say that about books published in the same year.

      From what I've seen about the distribution of specifically book sales, where even the top-100 best sellers often don't make enough to justify the time involved, I think that one of the biggest problems with the economics of the arts is a mixture of (1) the low cost of reproduction, and (2) all the other artists.

      For the former: There were political campaigns a century ago, warning about the loss of culture when cinemas shifted from live bands to recorded music[0]; Today, if I were so inclined, I can for a pittance listen to any of (I'm told) 100 million musical performances, watch any of 1.24 million movies or TV shows. Even before GenAI, there was a seemingly endless quantity of graphical art.

      For the latter: For every new book by a current living author such as Charlie Stross (who is on here sometimes), my limited time is also spread between that and the huge back-catalogue of old classics like the complete works of Conan Doyle, Larry Niven, or Terry Pratchett.

      [0] https://www.smithsonianmag.com/history/musicians-wage-war-ag...

    • cannonpr7 days ago
      Being someone who has paid a lot of attention to Ghibli, I wouldn’t say their style was well established 35-40 years ago… There is considerable evolution and refinement to their style from Nausicaä to later works, both in the artistic style and in the philosophical content it presents.

      I think allowing it to be fair game would have destroyed something quite beautiful that I’ve watched evolve across 40 years, and which I was hoping to see the natural conclusion of without Miyazaki being bothered by the AI-fication of his work.

      • gcanyon7 days ago
        Yeah, of course their style isn't static, but I was taking Kiki's Delivery Service (1989) as a point where much of their visual style was pretty well-established.
        • cannonpr7 days ago
          I agree that some of his main elements were relatively set by then, but others continue to evolve. Personally I feel that his work is entirely fair game after his death, by which I mean after the end of the journey he has poured his entirety into. In terms of value to humanity, and out of respect to him for that value, I think it’s reasonable to respect his wishes during his lifespan.
    • another-dave7 days ago
      I'd agree with limiting copyrights but would do it based on money earned rather than time, so something like when you make $X million, the work becomes public domain.

      As a specific example — _A Game of Thrones_ was released in 1996. It picked up awards early on but only became a NYT best seller in 2011, just before the TV show aired.

      It would feel harsh for an author to lose all their copyright because their work was a "slow burn" and 5 years have elapsed but they've made little to no money on it.

      • pixl976 days ago
        >so something like when you make $X million, the work becomes public domain.

        https://en.wikipedia.org/wiki/Hollywood_accounting

        No, no metrics that can be gamed.

      • gcanyon6 days ago
        It’s a super-interesting idea, but GoT seems highly cherry picked: the vast majority of all works would never leave copyright if the requirement was that they clear even $1000.
    • Avshalom7 days ago
      >It's interesting to hear people side with the artists when in previous discussions on this forum I've gotten significant approval arguing that copyright is far too long.

        Well, broadly that's because most arguments about copyright (length/scope) are made against corporations attacking individual artists, and arguments about copyright (AI/scope) are also made against corporations attacking individual artists.

    • Taek7 days ago
      I find it unlikely that someone who was willing to pay an artist for a charcoal sketch would be satisfied with an AI alternative.

      You don't just buy art for the aesthetic; you buy it for a lot of reasons, and AI doesn't give any of the same satisfaction.

      • zwnow7 days ago
        I'm all for paying artists for their work. Unfortunately, same as with tattoo artists, some just heavily overcharge for mediocre results (I've been tattooing myself, and I know a few things about art). Like, sorry, but if you want to earn money doing art, please be good at it...
        • amazingamazing7 days ago
          > some just heavily overcharge for mediocre results

          if people are paying, then they aren't "overcharging"

          • zwnow7 days ago
            In the tattoo business people have no other places to go. Charging 1k€+ for half a sleeve is extreme overcharging. If people are paying, it's often simply because they don't have enough alternatives.
            • flkiwi7 days ago
              Given the time commitment and network of basic biological, anatomical, and health knowledge required, that doesn't strike me as an insane price, assuming an artist who is able to create the requested art.
              • zwnow7 days ago
                Tbh you barely have to know anything; the most important things are how deep to go with the needle and sanitizing. Everything else is not really important.
                • flkiwi7 days ago
                  A friend who is heavily inked has gone on at length to me about understanding skin elasticity--particularly how it changes over a lifetime--as well as the way joints and muscles change and distort visual lines, etc. It sure seems like a skilled trade to me.

                  And, I don't know, depth of penetration of a needle in flesh and sanitation don't strike me as minor things to get right.

                  • zwnow7 days ago
                    People love to make things seem harder than they are. I tattoo people, and I am aware of skin types; usually that's not a big issue unless the skin is heavily scarred. The quality of your tattoo machine matters most, as my 70€ eBay makeshift one wasn't nearly as good as a proper one. The amount of ink matters, needle depth, skin type, sweat. But that's stuff you have figured out after your 20th tattoo. It's like knowing datatypes in programming. You just know stuff after some practice.
                    • autoexec7 days ago
                      > People love to make things seem harder than they are.

                      In my experience people tend to underestimate or downplay how difficult something will be or how complex it is. This happens in people who know only a little about something, but also in people who are highly experienced because it becomes normal and easy for them and they can quickly evaluate a situation and know which considerations don't apply.

                    • UncleEntity7 days ago
                      > But thats stuff you have figured put after your 20th tattoo.

                      So... after spending hundreds, if not thousands, of hours learning a skill?

                      I got a tattoo back in the day and specifically went to one the guys in my platoon said was good due to him being featured in magazines or whatever. It's kind of an important thing to get right on the first, not 20th, attempt IMHO.

                      • zwnow7 days ago
                        20 tattoos, hundreds of hours? Lol.
                  • bongodongobob7 days ago
                    Penetration is practice; sanitation is basically using gloves and an autoclave.
                    • zwnow7 days ago
                      Also swap needles and use some cream; people really overestimate how hard tattooing is.
        • RhysU7 days ago
          > Like, sorry, but if you want to earn money doing art, please be good at it...

          By definition, almost half of all $ARTISTS are worse than the median. Should that half not get paid for their time?

          • zwnow7 days ago
            Yes. They should look for a job that actually covers their lifestyle instead of crying about AI taking their jobs.

            They can always put in more hours and become better. I can't imagine they have a lot of paying customers anyway.

            • RhysU7 days ago
              I keep waiting for physical objects to become important again. AI isn't coming for the ceramic folks.
              • ben_w7 days ago
                > AI isn't coming for the ceramic folks.

                *looks at 3D printer on desk that can apparently handle ceramic filaments, thinks about all the mass-produced ceramics sold in supermarkets*

              • zwnow7 days ago
                I think that would be great. The traditional art market is not nearly as big as the digital space. I'd love it if people valued traditional art again, as that's the only stuff I do.
                • ane7 days ago
                  Depends on the niche. Original physical art for trading card games or comics is a significant chunk of the income of your typical artist. Digital art in those niches does not have this source of income. But then again digital art has other niches where the actual commission rates are high enough to not make this a problem.
            • WindyMiller7 days ago
              What about the half of the remaining artists that are below the new median?
              • zwnow7 days ago
                They should be good enough to have already established some customers.
                • flkiwi7 days ago
                  The problem with that is that people aren't asking for AI generated images in the style of Raven from Topeka with an Etsy shop. They're asking for Ghibli. So the people whose livelihoods are most directly impacted are (assuming they're not centuries dead) the famous, talented, and trend-making artists, not the lower tier making bad Precious Moments knockoffs. Society's problem is understanding that not wanting to pay for bad Precious Moments knockoffs is rational, while not wanting to pay, say, a Studio Ghibli for quality, professional creativity is insane.
                  • nasmorn7 days ago
                    Couldn’t Ghibli zeitgeist moment lead to them making out hugely with a new release or just a cinema screening of Totoro right now?
                    • flkiwi7 days ago
                      That is "artists should be grateful to work for exposure" on a grand scale.
                      • nasmorn7 days ago
                        Except they didn’t do any work for the exposure. If a marketing agency had come up with and executed the "Ghiblify everything" model as a PR stunt, we would call it the most genius creative campaign of the decade.
                        • flkiwi7 days ago
                          They did though. The studio engaged in tremendous amounts of work and created good will, in addition to their specific creative works. Their visual style is tied up in that good will. Use of the visual style for profit without consent is, at least ethically, misappropriation of another's value. And "You should be pleased I used your creative work because now more people will know about you and you will make a lot of money from this!" is one of the oldest defenses to misappropriation of creativity.

                          I'm not even mad. We do a terrible job in our society of valuing artists and creative people generally and in explaining the value of intangible things, especially something like good will. People have been misappropriating fonts and clipart and screenshots in presentations and posters and whatnot, duplicating clever branding ideas and the creative efforts of others, and so on for _decades_ if not longer, all without ill intent. It's something we need to fix and never will. But when that becomes a channel for another to directly profit, it begins to venture out of harmlessness.

                          • bongodongobob6 days ago
                            You're entirely missing the point. The average western person has never heard of or seen any Ghibli movies. GPT use is heavily skewed by nerdy types. The average football watching big-bang-theory-is-a-smart-show type person doesn't know what any of this is.

                            If Ghibli feels like they are getting screwed, they could've taken this opportunity to promote themselves; that is the parent's point. If I were in their marketing dept I would have been screaming "guys, non-weeaboo people are seeing our name in the news, let's fucking capitalize!" When has Ghibli ever trended? Set up some screenings or stream Spirited Away on their site for a couple of weeks or something. If they want to win hearts and minds, that's what you have to do. As of now, it's already out of the MSM news cycle and forgotten.

                  • zwnow7 days ago
                    As if the Ghibli trend wasn't just a short trend people will have forgotten about in 4 weeks... Also, I couldn't care less about big studios; they print money anyway.
    • AlecSchueler7 days ago
      It's one thing to argue that copyright terms should be shortened, and another to accept that a handful of corporations should be able to forcefully shorten it for certain actors entirely on their own terms.
    • amazingamazing7 days ago
      > I think copyright should last maybe five years: in this modern era, monetizing your work doesn't (usually) have to take more than a short time. I

      funny how people who say this kind of stuff are never content creators (in the monetization sense).

      • bko7 days ago
        There are a lot of programmers on this platform (myself included), and I love that my work has an impact on others.

        I have a number of public repos and I have benefitted greatly from other public repos. I hope LLMs made some use of my code.

        I wrote blogs for years without any monetization. I hope my ideas influenced someone and would be happy if they made some impact on the reasoning of LLMs.

        I'm aware of patent trolls and know people with personal experience with them.

        So I generate a lot more content that the typical person and I am still in favor of much looser IP rights as I think they have gone overboard and the net benefit for me, a content creator, is much greater having access to the work of others and being able to use tools like LLMs trained on their work.

        • mitthrowaway26 days ago
          Programming, at least, is much easier to shield in a loose IP regime than art. You can ship only binaries, or even keep your code running on a server and disclose only the API. And likely, the company that pays your salary would opt to do just that.

          I can't imagine a similar way for an artist to distribute their work while protecting their interests.

        • amazingamazing7 days ago
          posting stuff for free is different than selling stuff.
      • rikroots7 days ago
        My personal preference is for (say) 15-20 years.

        And, as a content creator, I practice what I preach - at least when it comes to my poetry: https://rikverse2020.rikweb.org.uk/blog/copyrights

      • gcanyon7 days ago
        Not that it should impact the validity of my argument, but I have sold commercial software in the past, and it is absurd that that software will be copyrighted through most of the 21st-century.
      • 65107 days ago
        If you make a blog with nice original long form articles it may take much longer to gain traction. Reproducing the content in "your own" wording quickly gets fuzzy.

        I like the practical angle. Any formula that requires monitoring what everyone is doing is unworthy of consideration. Appeal to tradition should not apply.

      • bongodongobob6 days ago
        The entitlement of the modern artist/musician is unprecedented. Never have I seen so many people expect to be handed a living because they've posted some "content". Musicians and artists now have global distribution with a plethora of platforms. You have to harness that and then work and grind it out. You have to travel and play gigs and set up a booth at art shows.

        There's this new expectation that you should just be able to post some music on Spotify or set up an Etsy shop and get significant passive income. It has never ever worked that way and I feel this new expectation comes from the hustle/influencer types selling it.

        Most art is crap and most music isn't worth listening to. In the modern age, it's easy for anyone to be a band or artist and the ability to do this has led to a ton of choice, the market is absolutely flooded. If anyone can do a thing (for very loose values of "do") it's inherently worth less. Only the very best make it and it will always be that way.

        Source: made a living as a musician for 20 years. The ones who make it are relentlessly marketing themselves in person. You have to leave the house, be a part of a scene, and be constantly looking for opportunities. No one comes knocking on your door, you must drive your product and make yourself stand out in some way. You make money on merch and gigs, and it's always been that way.

        This is all to say that copyright law only affects the top 0.1%. The avg struggling artist will never have to worry about any of this. It's like Bob the mechanic worrying about inheritance taxes. Pipe dream at best.

      • pixl976 days ago
        I mean, this is about as useful as saying anti-slavery people should become slave owners so they understand the hardships of making money.

        My example is extreme to the absurd, so how about we go with

        >It's difficult to get a man to understand something when his salary depends on not understanding it.

    • Workaccount27 days ago
      As you grow older and run through more cycles of general opinions, you realize that pretty much everyone is in it for themselves, what serves them best, and support what narrative aligns with that.

      2007: Copyright is garbage and must be abolished (so I can get music/movies free)

      2025: Copyright needs to be strengthened (so my artistic abilities retain value)

    • ZoomZoomZoom6 days ago
      > I think copyright should last maybe five years

      If we imagine for a moment that "copyright" is something that works in the interests of a creator, then 5 years is nothing.

      A painting can sit fifteen years before it gets to an exhibition with sufficient turnover and media coverage to draw attention to it.

      A music album can be released with a shitty label and no support, years later be taken by a more competent one and start selling.

      We're living in a world where worthy art is constantly flying under the radar, so limiting its potential even more isn't helpful.

    • mrdependable6 days ago
      This sounds more like a problem with you specifically not seeing value in art. Why would you want the incentives not to work in their favor?
    • jeffreygoesto7 days ago
      I thought the US wants to re-industrialize now? Then 5 years is laughably short to protect your investment.
    • sfn426 days ago
      Finally a reasonable take.
    • xrd7 days ago
      The Ghibli fight is the same fight that is being fought in the NASDAQ. That is to say, there was an established set of rules that everyone thought were fixed and now they are being radically disrupted. Both the creative industry and the general business industry are trying to figure out what life is going to be like with a totally different and fluid set of regulations, whether it be copyright law or tariffs.

      No wonder sama and Trump are so cozy. They both see the same legacy.

  • haswell7 days ago
    > The question isn't whether these tools will change visual media, but whether we'll be thoughtful enough to shape that change intentionally.

    Unfortunately I think the answer to this question is a resounding “no”.

    The time for thoughtful shaping was a few years ago. It feels like we’re hurtling toward a future where instead we’ll be left picking up the pieces and assessing the damage.

    These tools are impressive and will undoubtedly unlock new possibilities for existing artists and for people who are otherwise unable to create art.

    But I think it’s going to be a rough ride, and whatever new equilibrium we reach will be the result of much turmoil.

    Employment for artists won’t disappear, but certain segments of the market will just use AI because it’s faster, cheaper, and doesn’t require time consuming iterations and communication of vision. The results will be “good enough” for many.

    I say this as someone who has found these tools incredibly helpful for thinking. I have aphantasia, and my ability to visualize via AI is pretty remarkable. But I can’t bring myself to actually publish these visualizations. A growing number of blogs and YouTube channels don’t share these qualms and every time I encounter them in the wild I feel an “ick”. It’ll be interesting to see if more people develop this feeling.

    • pixl976 days ago
      >But I think it’s going to be a rough ride, and whatever new equilibrium we reach will be the result of much turmoil.

      Honestly visual media just seems to be the start. In the past two years we've seen about as much robotics progress as the last 20. If this momentum keeps up then we're not just talking about artists that are going to have issues.

      • TheGrognardling6 days ago
        Honestly, I'm pretty encouraged by all of the projects and efforts within legislation and organizations regarding clear lines being drawn - i.e., through watermarking to clearly label whether something is AI-generated or not - as well as the efforts by industries for livelihoods to be protected, specifically in the creative space, where human intentionality and feeling are still essential. We've seen, are seeing, and will see cultural and societal acceptance and backlash against one thing or another, but I'm confident that we will adapt. Ultimately, pushback, thanks to the Web itself, is already pretty monumental among artists and even other AI researchers in many respects - regulation of the internet itself was far slower to materialize, largely because there was no Web yet to amplify that pushback. I remain optimistic that we will find the niches where AI is needed, where it isn't, and where it is detrimental.
        • haswell6 days ago
          While I know there have been plenty of scathing essays, backlash among various communities, etc. do you have some concrete examples of the clear lines being drawn and legislation that gives you this optimism?

          Maybe the progress you’re describing has escaped me because of the sheer speed this is all unfolding, but it feels like all I’ve heard is lots of noise, while AI companies continue to hammer hosted resources across the Internet to build their next model, the US government continues to claim they’ll use AI to solve problems of waste and fraud, companies like Shopify claim they won’t hire anyone unless it can be proven that AI cannot do the job, and an increasing % of the content I encounter is AI slop.

          Maybe this is all necessary for a proper backlash to form, and I definitely want to become more aware of the positives anywhere I can find them. I’m not an AI doomer, but haven’t yet found the optimism you describe.

          • TheGrognardling6 days ago
            The EU AI Act, while I personally find it to be overreach, has certain points that I certainly agree with, and are encouraging given that every major platform has huge userbases in the EU. The first Executive Order on AI in the US following announcement of the AI Action Plan is surprisingly rigid whilst still encouraging innovation, especially given all of the drama regarding federal agencies as of late. Creative industries are increasingly drawing clear lines on where AI use is and isn't acceptable, especially among individual studios and unions following the SAG-AFTRA strikes of 2023-2024. And ultimately, this is all not even considering the profound advancements in education, biotech, and healthcare.

            This is very easy to lose sight of, especially given rapid advancements, but it's important. I think certain companies like Anthropic definitely have safety approaches that I agree with more, being more thoughtful and having clearly-outlined scaling policies (such as the latest Responsible Scaling Policy effective March 31 of this year) versus more vague safety promises such as from companies such as OpenAI and Google. Websites such as https://www.freethink.com have wonderful essays espousing techno-humanism that ultimately gives compelling arguments on how AI will be a progressively beneficial force on humanity, rather than detract from it.

            Yes, there WILL be growing pains - as is what happened with the internet and the World Wide Web. But, I am confident that we will adapt. There is no better time in history to be living in than right now.

  • justinator7 days ago
    But the annotations are still wrong,

    https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_...

    (nice URL btw)

    The room, the door, the ceiling are all of a scale to fit many sizes of elephants.

    • sfn426 days ago
      The lines don't really make sense either, like the one above the sofa should probably go along the corner between the floor and wall.
      • justinator6 days ago
        Just like text results from AI, when images like this get better, the more and more subtle yet absolute wrongness is going to be a nightmare to deal with.

        Imagine I ask an AI to show me a sewer cap that's less than a foot wide (or whatever, I dunno, I'm watching TMNT right now). And it does, just by showing a sewer cap that looks photorealistic and a ruler that has markings from one end to the other that only go up to 8 inches. That doesn't mean sewer caps come in that size, it just means you can produce a rendered image to fit what you asked for.

  • m4thfr34k6 days ago
    I am very impressed with the current image generators out there, 4o / Leonardo / etc., but I cannot wait until they include some step to actually "check their work". Ask it to produce a watch with the time of 6:37. It fails every time, because almost all watch photos out there are set to a specific time, and this seems like something an initial "did I do this right" check could catch. The time example is trivial, but a general "does this output actually make sense considering what the user asked" check would be tremendously valuable.
  • Retr0id7 days ago
    I had a reasonable intuition for how the "old" method works, but I still don't grok this new approach.

    "in multimodal image generation, images are created in the same way that LLMs create text, a token at a time"

    Is there some way to visualise these "image tokens", in the same way I can view tokenized text?

    • fxtentacle7 days ago
      Imagine you cut the image into 32x32 pixel blocks. And then for each block, you can choose 1 out of 128,000 variations. And then a post-processing step smoothes out the borders between blocks and adjusts small details. That's basically how a transformer image generation model works.

      As such, the process is remarkably similar to old fixed-font ASCII art. It's just that modern AIs have a larger alphabet and, thus, more character shapes to choose from.
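      To make that concrete, here is a toy numpy sketch of the quantize-to-a-codebook idea. Every number here is made up for illustration: real models learn the codebook (VQ-VAE style), use far larger ones, and quantize learned features rather than raw pixel blocks.

      ```python
      import numpy as np

      rng = np.random.default_rng(0)

      # Toy codebook: 512 entries (real models use ~100k), each a flattened 4x4 patch.
      PATCH = 4
      codebook = rng.normal(size=(512, PATCH * PATCH))

      def tokenize(image: np.ndarray) -> np.ndarray:
          """Map each PATCHxPATCH block to the id of its nearest codebook entry."""
          h, w = image.shape
          tokens = []
          for y in range(0, h, PATCH):
              for x in range(0, w, PATCH):
                  block = image[y:y + PATCH, x:x + PATCH].reshape(-1)
                  dists = np.linalg.norm(codebook - block, axis=1)
                  tokens.append(int(np.argmin(dists)))
          return np.array(tokens).reshape(h // PATCH, w // PATCH)

      def detokenize(tokens: np.ndarray) -> np.ndarray:
          """Reassemble an image from token ids (lossy: a mosaic of codebook patches)."""
          gh, gw = tokens.shape
          out = np.zeros((gh * PATCH, gw * PATCH))
          for y in range(gh):
              for x in range(gw):
                  out[y * PATCH:(y + 1) * PATCH, x * PATCH:(x + 1) * PATCH] = \
                      codebook[tokens[y, x]].reshape(PATCH, PATCH)
          return out

      image = rng.normal(size=(16, 16))
      tokens = tokenize(image)
      print(tokens.shape)  # a 4x4 grid of token ids, one per block
      ```

      The round trip is lossy by construction: the reconstruction can only ever be a mosaic of codebook patches, which is why a post-processing/decoding step is needed to smooth block borders and restore fine detail.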

      • rwmj7 days ago
        I don't get how this would produce consistent images. In the article, the text could be on a grid, but the window and doorway and sofa don't seem to be grid-aligned. (Or maybe the text is overlaid?)
        • danielbln7 days ago
          The model looks ahead, just like LLMs look ahead. An LLM outputs token by token but can still output a fully coherent and consistent story for example. This new crop of auto-regressive image models does the same.
    • WhiteNoiz37 days ago
      I haven't seen any details on how OpenAI's model works, but the tokens it generates aren't directly translated into pixels - those tokens are probably fed into a diffusion process which generates the actual image. The tokens are the latent space or conditioning for the actual image generation process.
      • bonoboTP7 days ago
        > I haven't seen any details on how OpenAI's model works

        Exactly. People just confidently make things up. There are many possible ways, and without details, "native generation" is just a marketing buzzword without clear definition. It's a proprietary system, there is no code release, there is no publication. We simply don't know how exactly it's done.

        • og_kalu7 days ago
          OpenAI has said it's both native image generation and autoregressive. It has the signs of it too.

          It's probably an implementation of VAR (https://arxiv.org/abs/2404.02905) - autoregressive image generation with a small twist. Rather than predicting every token at the target resolution directly, it starts by predicting at a small resolution and cranks it higher and higher until the desired resolution.

  • NitpickLawyer7 days ago
    > The results are not as good as a professional designer could create but are an impressive first prototype.

    I like to look at how far we've come since the early days of Stable Diffusion. It was fascinating to play with it back then, but it quickly became apparent that it was "generic" and not suited for "real work" because it lacked consistency, text capabilities, fingers! and so on... Looking at these results now, I'm amazed at the quality, consistency and ease of use. Gone are the days of doing alchemy on words and adding a bunch of "in the style of Rutkowski, golden hour, hd, 4k, pretty please ..." at the end of prompts.

  • smusamashah7 days ago
    I am waiting for when I could provide these a scene snippet from "The Hitchhiker's Guide to the Galaxy" (or any book) and it could draw that for me. The gold planets, the waking up on the beach, the total perspective vortex, etc.

    I like the book, but there are quite a few scenes which are quite hard to visualize and make sense of. An image generator that can follow that language and detail will be amazing. Even more awesome will be if it remains consistent in follow-ups.

    • iandanforth6 days ago
      Have you tried? It works well for other books, e.g. here's a scene from "A Connecticut Yankee in King Arthur's Court" which is conveniently in the public domain.

      https://chatgpt.com/share/67f5d652-f7f4-8013-b2f2-3c997ea513...

      • fivestones3 days ago
        While it’s not 100% perfect (no horn from the forehead of his helmet) I’d say this is far beyond 90%. I can imagine reading books from project Gutenberg on a future reader app that automatically makes pictures of each scene which are consistent with each other and faithful to the text, on the fly as you read.
    • ARandumGuy7 days ago
      I've seen stuff that echoes this sentiment before, and I have to say I don't understand this desire at all. Why would I need a computer to show me what something in a book looks like? I already have an imagination for that!

      Books are fundamentally a collaborative artform between the author and the reader. The author provides the blueprint, but it's up to the reader to construct the scene in their own head. And every reader is going to have slightly different interpretations based on how they imagine the events of a book. This act of imagination and re-interpretation is one of the things I love about reading books.

      Having a computer do the visualization for you completely destroys what makes books engaging and interesting. If you don't want to visualize the book yourself, I have to wonder why the hell you're reading a book in the first place.

      If you need that visual component, just watch a movie or read a comic book or something. This isn't a slight against movies or comics! They're fantastic mediums and are able to utilize their visual element to communicate ideas in ways that books can struggle with. And these visuals will form a much more cohesive artistic vision than whatever an AI outputs, since they're an integrated and intentional part of the work.

      • smusamashah7 days ago
        You make it sound like an unfair wish, which I would say is unfair itself. I like the book, I like visualising it in my head, and I fantasise what a scene would look like. AI won't generate a true visual. It's all fantasy anyway and AI can actually do it the way I have it in my head. It will solidify that thought.

        For this book in particular, I read the comic version and I didn't like the visuals very much. I have a different idea of babel fish. Vogons look different. I would love to see the visual that's in my head on paper.

    • wrboyce7 days ago
      I love the idea, but I feel like I have to say that I’ve got a pretty solid idea of what the total perspective vortex would look like for someone being subjected to it. When I first read the books I immediately had a visual and that has never changed when I’ve read them again (and again…).

      I’m not sure what that says about either of us, but I would say that your definitive “quite hard to visualise” statement is very much subjective.

      • smusamashah7 days ago
        Vortex may be not so much, but there are other hard-to-visualize things. I am on the third book, and I have no idea what Beeblebrox's two heads look like. The second head is often mentioned in passing. Sometimes it's mentioned as if it's always there; other times it feels like it just pops out of somewhere; otherwise, it's like it doesn't exist.

        There is the scene when they see themselves on the beach on first rescue by the ship. That was hard to grasp. Or the insides of the ship itself, the bridge, the panels etc. Also that black ship they stole.

        But maybe it's just me having a hard time with these concepts.

        It's not just about a scene being difficult to visualize; even if I can see them in my head, I want to see them on paper too, because those things excite me.

        • thwarted7 days ago
          We have alt text for images, you want alt images for text.

          You can see other people's interpretation of Zaphod's two heads by watching the BBC HHGTTG show (Mark Wing-Davey) or the movie (Sam Rockwell), among other renditions, which offer completely different interpretations, none of them canonical (not the least of which is because there was no canonical version of HHGTTG according to DA). I'm sure there are multitudes of fan art for HHGTTG on deviantart. Having AI generate an image doesn't offer any more "official" visualization.

          Zaphod's second head is mentioned just as much is warranted. If a character has a limp or a crazy haircut it is not mentioned every time, because it has nothing to do with what is going on. And the book mentions that one head is often distracted/asleep, so it sounds like you do have a good visual of what his two heads are like.

          While I understand that people think differently and some people are more visual thinkers, a good portion of the concepts expressed through writing are meant to be mindfucks that are difficult to express visually. A picture may be worth a thousand words, but the meat of writing is usually not the visual representation of its concepts. That's a great thing about writing: you can fill in the visuals yourself and it's fodder for fans to discuss.

          (BTW, Hotblack Desiato's ship would just be black. Your eyes couldn't focus on it. Even the controls were black labels on a black background. There is nothing here to visualize other than, well, blackness).

          • pixl976 days ago
            >Hotblack Desiato's ship would just be black

            These days we have paint that black, though we can't reproduce the effect on the monitor.

            Those superblacks really do mess with your mind. It's like a cutout of the void.

          • smusamashah6 days ago
            I agree that people interpret things differently and visualise differently, and that is my point. I want to see the concept from my head in a solid visual form. Some concepts that are not clear, like those that I mentioned having trouble with, I want to see any kind of visual representation of at all. I will probably not like some of those, but here these tools can help generate a bunch of variations tailored specifically for me. I can choose one and carry on with that. I can come back and add more details to it as I read further.

            If someone can show me exactly what I am thinking of, won't that be amazing.

            • thwarted5 days ago
              > If someone can show me exactly what I am thinking of, won't that be amazing.

              I suppose it would be amazing if someone could read minds, but is that what you're asking for? In an earlier comment, you opened with:

              > I am waiting for when I could provide these a scene snippet from "Hitchhiker's Guide To Galaxy" (or any book) and it could draw that for me.

              This is asking for an illustrator, not showing you what you're thinking. The illustrator, even if it is a machine, will show you their interpretation.

        • wrboyce7 days ago
          Again, I have to disagree - which I suppose reinforces the whole subjectivity angle. I was positive that Zaphod’s two heads were side-by-side to the extent that it pissed me off a fair bit in the most recent movie adaption (among, let’s face it, plenty of other candidates).

          I don’t know if the “layout” of the heads is mentioned or not in the books - I’d have to go back and check - but it’s often quite jarring when a book becomes a movie and doesn’t match my inner vision (and how incredibly unthoughtful of them, to boot).

  • ziofill7 days ago
    I usually agree with most of Gary Marcus' points, but I'd really like to hear his take on this. One of his examples is that "the system can't generate a horse riding an astronaut" and in fact I tried a lot in the past but it would always draw the astronaut on top of the horse. Well, here is the result now: https://postimg.cc/QFtRjbHM
    • jsheard7 days ago
      Whenever one of these well known gotcha prompts gets "solved" the question is always whether they actually solved the underlying reason it used to fail, or did they just have a bunch of third-world workers tag pictures of horses and astronauts until the model started handling that specific example more reliably. As the saying goes, every measure which becomes a target becomes a bad measure.
  • orbital-decay7 days ago
    4o still exhibits the "pink elephant effect", it's just... subtler, and tends to reveal itself on a complex or confusing prompt. Negations are also still not handled properly, they tend to slightly confuse the model and decrease the accuracy of the answer or the generated picture. The same is true for any other LLM. Moreover, the author is asking the model to rationalize the decision he already made ("tell me why there can't be any elephants"), which could work as an equivalent to a CoT step.

    It's "just" a much bigger and much better trained model. Which is a quality on its own, absolutely no doubt about that. Fundamentally the issue is still there though, just less prominent. Which kind of makes sense - imagine the prompt "not green", what even is that? It's likely slightly out of distribution and requires representing a more complex abstraction, so the accuracy will necessarily be worse than stating the range of colors directly. The result might be accurate, until the model is confused/misdirected by something else, and suddenly it's not.

    I think in the end none of the architectural differences will matter beyond the scaling. What will matter a lot more is data diversity and training quality.

    • danielbln7 days ago
      But it's literally a different architecture (auto-regressive, presumably sequence based vs diffusion). In my experiments it is significantly, overwhelmingly better at consistency, coherence and prompt adherence. Things I needed ControlNets for before, it just... does. And even zooming into fine details, they make sense.

      Here is an example with a bunch of negations: https://i.imgur.com/P8G5ICs.png

      • orbital-decay7 days ago
        Of course, it's a tiny specialized model vs a big generalist model. They're absolutely incomparable in size/quality though, especially of the text part. How much of this happens because of the poor encoder and worse training in other models, and how much due to the architectural differences? I'm not saying it's not better than the existing image gen models somehow, but it's pretty hard to separate the two since both are present. All current SotA LLMs including 4o itself have negation inaccuracy in text (you need a really complex prompt, or a long one with thousands of tokens, not a toy one), and I don't see why this one should behave differently in similar conditions. Especially considering that it also suffers from pretty much the same artifacts as other image models, just much less (fingers, extra limbs, perspective/lighting issues, overfitting, struggles with out-of-distribution generation etc.)
      • wat100007 days ago
        It’s interesting that out of all the aquatic animals it could have used, it chose one that perhaps looks the most like an elephant.
        • danielbln7 days ago
          That's just luck of the draw, in a new session with the same prompt it outputted a bird: https://i.imgur.com/SLT8cYe.png
          • wat100007 days ago
            I'm not convinced. I tried it and it showed me a swimming hippopotamus, which is even more elephant-like than the turtle. I tried again and it gave me a pelican, which is not generally very elephantish, but this particular one has a gray body with a texture that looks a lot like elephant skin.
  • hansmayer7 days ago
    > " Image generation is likely to be very disruptive in ways we don’t understand right now. " Is anyone getting tired of these formulations ? When a tech is disruptive, we know it immediately. Uber was disruptive. AirBnB, Gmail, Amazon, even Facebook at one point. You just knew it, nobody was writing long essays trying to justify those products. Robots generating statistically median images is impressive, but not disruptive at all. If something is "likely" to be "disruptive", but in ways "we don't understand yet", how can the claim even be made? What is it based on? If we do not understand it yet, how can we understand if it is "likely to be disruptive".
  • xnorswap7 days ago
    The "How to build a boardgame" infographic looks like half my linkedin "feed" now, but a boardgame instead of random basic programming / recruimentment / sales topic.

    Feed is in quotes because my feed seems to be 90% suggested posts.

  • morkalork7 days ago
    Huh, the coffee table reminds me of all those cheap e-retailers who very clearly (and badly) photoshop their clothes on to same 2 or 3 stock model images. If you thought shopping online sucked before, it's just going to get even worse now.
  • Zr016 days ago
    I'm more interested in the technical details than the publicity. Pretty much anyone these days can learn what a diffusion model is, how they're implemented, what the control flow is. What about this new multimodal LLM? They have no problems with text, they generate images using tokens, but how exactly? There are no open-source implementations that I know of, and I'm struggling to find details.
    • og_kalu6 days ago
      This video is very good. https://youtu.be/EzDsrEvdgNQ?si=EWp3U1GMkwg1bMQQ

      One thing i'd add is that generating the tokens at the target resolution from the start is no longer the only approach to autoregressive image generation.

      Rather than predicting each patch at the target resolution right away, it starts with the image (as patches) at a very small resolution and increasingly scales up. Paper here - https://arxiv.org/abs/2404.02905
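    The scale-by-scale idea can be sketched in plain numpy. This is only the multi-scale bookkeeping, with made-up sizes and crude nearest-neighbor resizing; the actual VAR recipe quantizes each residual map into tokens and has a transformer predict each scale conditioned on the coarser ones.

    ```python
    import numpy as np

    def resize_nearest(a: np.ndarray, size: int) -> np.ndarray:
        """Nearest-neighbor resize of a square map to (size, size)."""
        idx = np.arange(size) * a.shape[0] // size
        return a[np.ix_(idx, idx)]

    def to_scales(image: np.ndarray, sizes: list[int]):
        """Decompose an image into coarse-to-fine residual maps, one per scale.

        Each scale stores only what the upsampled coarser reconstruction
        still gets wrong; summing all upsampled residuals recovers the image.
        """
        full = image.shape[0]
        scales, recon = [], np.zeros_like(image)
        for s in sizes:
            residual = resize_nearest(image - recon, s)  # leftover error at scale s
            scales.append(residual)
            recon = recon + resize_nearest(residual, full)
        return scales, recon

    rng = np.random.default_rng(0)
    img = rng.normal(size=(16, 16))
    scales, recon = to_scales(img, [1, 2, 4, 8, 16])
    ```

    Generation runs the same loop in reverse order of information: predict the 1x1 map first, upsample, predict the 2x2 residual, and so on, so each autoregressive step only fills in the detail the previous scales lacked.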

  • lou13067 days ago
    Putting my Wittgenstein hat on: How can I ever be sure that the machine is not generating an incredibly tiny elephant, maybe hidden under the sofa?
  • eapriv7 days ago
    It’s always fun to read posts like that: they say “look at this amazing thing it drew”, and the image is utter garbage.
  • klik997 days ago
    I’ve seen a few YouTube thumbnail generation examples on Reddit (I’m on vacation so not gonna search for a link) that show multimodal with inline text giving specific instructions. It’s impressed me in a way that I haven’t been with LLMs for 2 years, i.e. it’s not just getting better at what it already does, but a totally new and intuitive way of working with generative AI.

    My understanding is it’s a meta-LLM approach, using multiple models and having them interact. I feel like it’s also evidence that OpenAI is not seriously pursuing AGI (just my opinion, I know there’s some on here who would aggressively disagree), but rather market use cases. It feels like an acceptance that any given model, at least now, has its own limitations but can get more useful in combination.

  • cadamsdotcom7 days ago
    Anyone stuck claiming AI isn’t useful - there are so many useful things it can now do. With text that makes sense you can generate invitations for your next picnic. That wasn’t possible mere weeks ago.

    Wonderful to be alive for these step changes in human capability.

  • qiqitori7 days ago
    Wha- wha- what? I tried to generate an image in ChatGPT after the announcement a while back and the image wasn't bad, but the text on it (numbers) was nonsense. (Analog gauge with nonsense numbers instead of e.g. 10, 20, 30, 40, etc.)

    Gave it another chance now, explicitly calling out the numbers. Well, they are improved, but I'm not sure how useful this result is (the spacing between numbers is a little off, and there's still some curious counting going on). Maybe it kind of looks like the numbers were pasted in after the fact?

    https://chatgpt.com/share/67f4fa33-70dc-8012-8e1e-2dea563d3d...

    • nvalis7 days ago
      These images are still created with the old model. The share link states "Made with the old version of image generation. New images coming soon." below the first image.
  • vunderba7 days ago
    4o, despite OpenAI's practically draconian content policies, is a pretty big leap forward. I put together a comparison of some of the most competitive generative models (Imagen, 4o, Flux, and MJ7) where I prioritized increasingly difficult prompt adherence. If Imagen 3 had 4o's multimodal capabilities (being able to make constant adjustments against a generated image by prompting) I would say it's nearly on par with 4o.

    https://genai-showdown.specr.net

  • rkharsan647 days ago
    Are there any local models that use this new approach to generating images?
    • GaggiX7 days ago
      GPT-4o is the only model that seems to work well in the text-image joint space to this degree, even Gemini Flash 2.0 with native image support is not nearly as good so it will probably be a while for a good open source alternative to pop up (a while in the context of AI development).
      • gerash6 days ago
        depends on the use case.

        I used GPT-4o for some image editing (adding or removing things) on an image of a person, and it distorted the look of the people after each edit, but Gemini Flash with image output did much better.

        The main problem is that there is little control. For example, I asked it to add a helicopter to an image of a ski resort, but it seems cumbersome to have to write a full paragraph describing exactly where I want the helicopter, rather than just dragging it into place with a mouse.

    • DeathArrow7 days ago
      Yes, there's HiDream which yields even better results than 01.

      https://github.com/HiDream-ai/HiDream-I1

      • GaggiX7 days ago
        This is just a diffusion text-to-image model like many others, completely different from an LLM with native image support.
  • roenxi7 days ago
    That proper "no elephants" first image is hilarious. Another key point here is the generative AI's meme game is getting rather strong.

    Which isn't a small thing, humour is an advanced soft skill.

    • loudmax7 days ago
      The "Game Design Otter" action figure seems to come with a pair of flashlights. I bet that's a residue from the previous prompt about illuminating the tablet with a flashlight.
    • aenvoker7 days ago
      Having used lots of different image generators, so far the only one with a sense of humor has been Dall-e3.

      https://www.reddit.com/r/dalle2/s/khb5XuNFdl

      There’s probably some sort of connection to ChatGPT in there. But, I don’t know enough about how it works.

    • YurgenJurgensen7 days ago
      Having thousands of copies of that image in your training set isn’t a skill at all.
      • hnbad7 days ago
        AI's "meme game is going strong" by the same metric I would use to try to argue that Elon Musk's is.

        I wouldn't call it a good metric, though.

  • swframe26 days ago
    I'm hoping this alternative image prompt preprocessing technique gets more attention:

    https://art-msra.github.io/

    Basically, the user's image prompt is converted into several prompts that generate parts of the final image as layers, which are then combined. The layers remain available, so edits can cleanly update one section without affecting the others.
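    As a rough sketch of the compositing half of that idea (the layer contents below are placeholders, not anything from the linked pipeline): each layer is an RGBA image, and editing one layer only requires re-compositing, not regenerating the rest.

```python
import numpy as np

def composite(layers):
    # Alpha-composite a list of RGBA float arrays (values in [0, 1]),
    # back-to-front: later layers are drawn on top of earlier ones.
    h, w, _ = layers[0].shape
    out = np.zeros((h, w, 3))
    for layer in layers:
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        out = rgb * alpha + out * (1 - alpha)
    return out

# Placeholder "layers": an opaque white background plus a translucent patch,
# standing in for separately generated image parts.
background = np.ones((64, 64, 4))
square = np.zeros((64, 64, 4))
square[16:48, 16:48] = [1.0, 0.0, 0.0, 0.5]  # translucent red patch

scene = composite([background, square])
print(scene.shape)  # (64, 64, 3)
```

    Swapping out or moving one layer and re-running `composite` leaves every other layer's pixels untouched, which is the editing property the page describes.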

  • lupusreal7 days ago
    The image annotated to explain why no elephants are possible is very amusing.

    To me, this kind of image generation isn't very interesting for creating final products, but is extremely useful for communicating design intent to other people when collaborating on large creative projects. Previously I used crude "ms paint" sketches for this, which was much more tedious and less effective.

  • DonHopkins7 days ago
    Q: How do you know if there's an elephant hiding under your bed?

    A: Your face is pressed up against the ceiling!

  • thrance7 days ago
    Each generation follows the prompt a little bit better than the last, but I don't see any revolutions. Fingers are still messed up, eyes are wonky and legs sometimes still fork into two. Fundamentally it's still the same diffusion technique, with the same limitations.
    • xigency6 days ago
      Not only that, but the people working in this space still aren't reading the room, which is not encouraging.
  • mrconter116 days ago
    Isn't it ironic that it ended up being harder to get a computer to explicitly not create a photorealistic image of an elephant than to have it create one?
  • freeamz7 days ago
    Hmmm isn't stable diffusion already doing that?
    • aenvoker7 days ago
      SD has a very primitive conceptual model. Basically “bag of words nudging pixels around for a while”. Words near each other influence each other. But, there’s nearly no understanding of grammar.

      Midjourney is similar with text prompts. But, with image prompts it is able to understand content separately from style. You can give it a photo of two people and it can return many images of recognizable approximations of those people in different poses.

      SD can only start from pixels, blur and deblur those pixels in place.

      MJ image prompts probably works via image-to-tokens added on to your text-to-tokens-to-image.
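      For contrast, the "blur and deblur those pixels in place" loop that SD-style diffusion runs can be sketched like this (a toy version; `toy_denoiser` is a placeholder for the trained U-Net, and a real sampler also conditions on the text embedding and follows a proper noise schedule):

```python
import numpy as np

def toy_denoiser(x, t):
    # Stand-in for the trained network: predicts the noise present in x
    # at step t. A real model is conditioned on the text prompt here too.
    return x * 0.1

def sample(steps=50, shape=(64, 64, 3), rng=np.random.default_rng(0)):
    # Start from pure Gaussian noise and iteratively denoise in place --
    # the pixel (or latent) array is the only state there is.
    x = rng.standard_normal(shape)
    for t in reversed(range(steps)):
        predicted_noise = toy_denoiser(x, t)
        x = x - predicted_noise                        # remove estimated noise
        if t > 0:
            x = x + 0.01 * rng.standard_normal(shape)  # small re-noising step
    return x

img = sample()
print(img.shape)  # (64, 64, 3)
```

      There's no notion of discrete content tokens anywhere in that loop, which is the structural difference from the autoregressive token-based approach being discussed upthread.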

  • Der_Einzige7 days ago
    The #1 reason that this technology won't proliferate more quickly is that humans are a bunch of COOMERS!

    We get Stable Diffusion v1.5 and SDXL, and what does the community do with it? Lmao, see Civitai and its literally hundreds of thousands of NSFW LoRAs. The most popular model today on that website is the NSFW anime version of SDXL, called "Pony Diffusion" (I'm literally not making this up. A bunch of Bronies made this model!)

    Imagine that an open source image generator which does tokens autoregressively like this at this quality is released.

    The world is simply not ready for the amount of horny stuff that is going to be produced (especially without consent). It appears that the male libido really is the reason for most bad things in the world. We are truly the "villains of history".

    • sfn426 days ago
      I don't see the problem. Most people don't know this stuff exists. You pretty much have to look for it to find it. So what if someone wants to create rule 34 stuff? Let them, who cares? It doesn't hurt anyone. There was already a large market for artists drawing weird fetish stuff, AI doesn't really change anything.
  • NiloCK7 days ago
    The 'before' image passes the test this time in a "Treachery of Images" sort of way.
  • d4rkp4ttern7 days ago
    Diagrams are still a big unsolved problem. Making diagrams for a talk or paper is an extremely tedious process, and I am still waiting for a good multimodal LLM solution for this. It should take a sketch and/or text description of what you want, and in a few iterations you should get what you want. GPT-4o tries hard, but results are still bad.
    • chthonicdaemon7 days ago
      I've had the best luck at getting it to produce diagrams-as-code like mermaid or plantuml.
      • d4rkp4ttern7 days ago
        I know but those diagrams often don’t adequately capture what I want. Think of diagrams in nice technical talks or papers. I’ve even tried having the LLM (Claude) generate SVGs. They all fall short.
  • globnomulous6 days ago
    > The past couple years have been spent trying to figure out what text AI models are good for, and new use cases are being developed continuously.

    In other words, people who care about money and only money are pushing for these tools because they're convinced they'll reduce labor costs and somehow also improve the resulting product. Meanwhile, engineers and creative professionals who have these tools foisted upon them by unimaginative business people continue to insist that the tools are a solution in search of a problem: that they're stochastic parrots and plagiarism automata that bypass all of the important parts of engineering and creativity, and that they make the absolutely, breathtakingly idiotic mistake of supposing it's possible to leap to a finished product without all the work and problem solving involved in getting there.

    > The line between human and AI creation will continue to blur

    This is utter nonsense, and hype-man prognosticators in the tech world like the author of the article turn out pretty much 100% of the time to be either grifters or saps who have fallen for the grifters' nonsense.

  • 1970-01-017 days ago
    For the first 9 years of an elephant's life, it can easily walk into that room. I don't find this to be a breakthrough. I find it to be clickbait.
  • ge967 days ago
    > we guac you covered