3 points by nate 3 hours ago | 2 comments
  • BoredPositron 2 hours ago
    We mainly do full finetunes on diffusion models and their text encoders (like z-image and flux2 klein) to adapt them to our clients' visual styles, and we train LoRAs for people and products. Quality goes up immensely if the model has a better grasp of professional visual terms. Training the right kind of leather or plastic (mainly for the pattern) helps when you are scaling to 12-16k and want 99.9% reproduction: everything becomes a texture at that size, and if you don't have those textures trained it's a mess.
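(The LoRA approach mentioned above can be sketched in a few lines. This is a minimal, generic illustration of the low-rank adapter idea, not the commenter's actual pipeline: instead of updating the full weight matrix, you train a small rank-r delta. All names and dimensions here are invented for illustration.)

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4                  # r << min(d_out, d_in)

W = rng.standard_normal((d_out, d_in))      # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection (init 0)
alpha = 16                                  # LoRA scaling hyperparameter

def lora_forward(x):
    # base path plus low-rank adapter path, scaled by alpha / r
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialised to zero, the adapter starts as an exact no-op:
assert np.allclose(lora_forward(x), W @ x)

# Trainable parameters drop from d_out*d_in to r*(d_out + d_in):
print(d_out * d_in, "->", r * (d_out + d_in))   # 4096 -> 512
```

(That parameter reduction is why you can keep one small LoRA file per person or product instead of a full model copy.)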
    • nate 2 hours ago
      Ah, that makes sense. Is this something where you do it once and you're done? Or do you re-finetune based on performance or reviews you get back from the client, i.e. the client doesn't like something, so you go back for another cycle of finetuning?

      Also, is managing multiple versions of the model a pain in the ass? One (maybe more in draft mode) for each client?

      • BoredPositron 2 hours ago
        We do one finetune on the base model to iron out a few of its problems, like plastic skin and its poor understanding of visual terms and reproduction. It also really helps it understand the normal maps we use for perspective templating.

        What we are mostly producing are LoRAs, and we put them through a staged training process. The first stage is all about the textures, the second stage focuses on the product itself, and the last stage dials in the exact perspectives we need.

        Despite what the research out there says, we actually get better results sticking with LoRAs instead of LoKRs. The painful part is generating the dataset, because you have to adapt it for every product. The actual training is basically fire and forget.
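(The three-stage LoRA schedule described above could be sketched as a simple curriculum: consecutive passes over different dataset slices. Stage names, step counts, learning rates, and dataset labels below are invented for illustration, not the commenter's actual configuration.)

```python
# Hypothetical staged schedule: textures -> product -> exact perspectives.
STAGES = [
    {"name": "textures",     "steps": 2000, "lr": 1e-4, "dataset": "texture_crops"},
    {"name": "product",      "steps": 1500, "lr": 5e-5, "dataset": "product_shots"},
    {"name": "perspectives", "steps": 500,  "lr": 2e-5, "dataset": "perspective_templates"},
]

def stage_for_step(step):
    """Map a global training step to its stage (earlier stages run first)."""
    for stage in STAGES:
        if step < stage["steps"]:
            return stage
        step -= stage["steps"]
    raise ValueError("step beyond schedule")

assert stage_for_step(0)["name"] == "textures"
assert stage_for_step(2000)["name"] == "product"
assert stage_for_step(3999)["name"] == "perspectives"
```

(A training loop would pull `lr` and `dataset` from the current stage each step, so only the dataset prep changes per product while the schedule itself stays fixed.)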
