Today, we’re headed to the frozen north. Despite the snow on the ground, the sun is out and the light is perfect for a brisk shoot at the weather-worn cabins of Colter.
Two months ago, I fell into the trap that is Stable Diffusion. Today, I released my first trained model, based on the snowbound town of Colter from Red Dead Redemption 2. For anyone interested in SD image generation, you can grab a copy at CivitAI: https://civitai.com/models/137327. I’d appreciate you taking a look and giving it a like or a rating if you’re so inclined. The LoRA model is stylistically versatile, and I made a bunch of SFW examples to show its range.
As always, images link to full-size PNGs that contain prompt metadata.
I…have got to learn more about this AI shit….
deleted by creator
Too time consuming?
Yes
Thanks for your contributions to the community!
I have questions if you don’t mind.
I’m really trying to get into LoRA training in general, and there are a lot of things I can’t intuitively work out or find solid answers to.
For example, with this, what is your “class”? And did you use regularization images? (I want to make a habit of using them). If you did, what did you use for them? Like places that aren’t this? Like deserts and forests, etc?
Would you consider elaborating on batch size, repeats, epochs, etc, too?
Thanks again!
deleted by creator
Oof. Dude. You’re not wrong about what is and isn’t available online. But it’s okay. New frontier or whatever. Haha.
I’ve been mulling over the regularization image thing, so I created a Reddit post asking about it. Basically, I asked: are these images supposed to represent what the model thinks “this” thing is, in which case regularization images would serve the role of being “this, but not this”? Or is it more like they fill in the gaps where the LoRA is lacking?
I suspect it’s more like the first. That said, it might actually make sense to include all the defective and diverse images for the purpose of basically instructing the LoRA/model to be like, “I know you think I’m asking for ‘this,’ but in reality, that’s not what I want.”
If that’s the case, it might make sense to ENSURE your regularization images are way off base and messed up or whatever. Or at least anything in the class that you know you def don’t want.
I don’t have confirmation of any of this. I’m VERY new here (like ran my first LoRA training yesterday).
I like the idea of your batch size.
Ah. The captioning is something I REALLY need to think about. I’m guessing that with the cabin caption approach you used, you basically lost flexibility but gained accuracy? I wonder if you could tag it ‘cabin, church’ and retain some of both?
The steps, to me, sound very high, but I can’t say, for sure. Ahaha. Because for people, I’ve heard 1500 to 3000.
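For a rough sense of where a step count comes from, here is a minimal sketch of the arithmetic many LoRA trainers appear to use: images times repeats, divided by batch size, times epochs. The thread doesn’t say which trainer or settings were used, so the formula is an assumption about the tool, and the function name and every number below are hypothetical.

```python
import math

def total_steps(num_images, repeats, epochs, batch_size):
    """Estimate optimizer steps for a training run.

    Assumes the trainer counts one step per batch and rounds a
    partial final batch up. Regularization images are ignored here;
    some trainers count them too, which raises the total.
    """
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs

# Hypothetical settings, not the ones used for the Colter model.
print(total_steps(num_images=40, repeats=10, epochs=10, batch_size=2))
```

Under this assumption, doubling the batch size halves the step count for the same number of images seen, so step numbers quoted by different people are only comparable when the batch size is known too.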
I’ll be sure to come back and share findings once I have more. I think to really “do this right” you HAVE to train some of your own shit, but to do it well, as you’ve quickly realized, you’ve got to understand the methodology/philosophy of how it’s done.
Well. Maybe scratch some of what I said above. As with many things, the answer is simply more complicated than that.
I found this video fairly useful in helping understand the process. I hope it helps.
Thanks for always sharing the process. These are really good. The backgrounds look great. Always impressed you pull these from games.
deleted by creator
First, another great album, thank you!
One thing I always find interesting is to look at the clothes generated. They’re usually kinda bizarre, but I think the jacket/jumper thing from #6 would actually look pretty cool in real life.
deleted by creator
This is really amazing work!
Hey, these look great, and that 1st one is just… wow. What model do you use?
deleted by creator