StableDiffusion


/r/StableDiffusion is back open after the protest of Reddit killing open API access, which will bankrupt app developers, hamper moderation, and...

founded 1 year ago
551
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/eimas_dev on 2024-09-10 22:30:57+00:00.

552

The original was posted on /r/stablediffusion by /u/theJunkyardGold on 2024-09-10 20:22:58+00:00.

553

The original was posted on /r/stablediffusion by /u/Tokyo_Jab on 2024-09-11 01:21:18+00:00.

554

The original was posted on /r/stablediffusion by /u/dal_mac on 2024-09-10 21:03:43+00:00.

555

The original was posted on /r/stablediffusion by /u/TheLatentExplorer on 2024-09-10 20:33:51+00:00.


A month ago, u/nrehiew_ posted a diagram of the Flux architecture on X, which was later reposted by u/pppodong on Reddit here.

It was great, but a bit messy, and it lacked some details I needed to better understand Flux.1, so I decided to make one myself and thought I could share it here; some people might be interested. Laying out the full architecture this way helped me a lot to understand Flux.1, especially since there is no actual paper about this model (sadly...).

I had to make several representation choices, and I would love to read your critiques so I can improve it and make a better version in the future. I plan on making a cleaner one using TikZ, with full tensor shape annotations, but I needed a draft beforehand because the model is quite big, so I made this version in draw.io.
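To give an idea of what the TikZ version could look like, here is a minimal sketch of two annotated blocks; the node names and the tensor shapes $(B,\,L,\,D)$ are placeholders, not the actual Flux.1 dimensions:

```latex
\documentclass[tikz]{standalone}
\usetikzlibrary{positioning}
\begin{document}
\begin{tikzpicture}[block/.style={draw, rounded corners, minimum width=2.8cm, minimum height=0.9cm}]
  % Two placeholder blocks of the architecture
  \node[block] (enc) {Text encoder};
  \node[block, right=1.6cm of enc] (dit) {Transformer block};
  % Edge annotated with a placeholder tensor shape
  \draw[->] (enc) -- node[above] {\scriptsize $(B,\,L,\,D)$} (dit);
\end{tikzpicture}
\end{document}
```

Each edge in the full diagram would carry its own shape annotation in the same way.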

I'm afraid Reddit will compress the image too much, so I uploaded it to GitHub here.

Flux.1 architecture diagram

556

The original was posted on /r/stablediffusion by /u/Ecstatic_Bandicoot18 on 2024-09-10 19:39:05+00:00.


Back when I was really into it, we were all on SD 1.5 because it had more celeb training data etc in it and was less censored blah blah blah. ControlNet was popping off and everyone was in Automatic1111 for the most part. It was a lot of fun, but it's my understanding that this really isn't what people are using anymore.

So what is the new meta? I don't really know what ComfyUI or Flux or whatever really is. Is prompting still the same or are we writing out more complete sentences and whatnot now? Is StableDiffusion even really still a go to or do people use DallE and Midjourney more now? Basically what are the big developments I've missed?

I know it's a lot to ask but I kinda need a refresher course. lol Thank y'all for your time.

Edit: Just want to give another huge thank you to those of you offering your insights and preferences. There is so much more going on now since I got involved way back in the day! Y'all are a tremendous help in pointing me in the right direction, so again thank you.

557

The original was posted on /r/stablediffusion by /u/Agreeable_Effect938 on 2024-09-10 15:40:12+00:00.

558

The original was posted on /r/stablediffusion by /u/Murky_Cheetah_7993 on 2024-09-10 14:34:31+00:00.

559

The original was posted on /r/stablediffusion by /u/Natural_Reserve_8197 on 2024-09-10 13:25:37+00:00.

560

The original was posted on /r/stablediffusion by /u/NarrativeNode on 2024-09-10 12:47:48+00:00.

561

The original was posted on /r/stablediffusion by /u/mrfofr on 2024-09-10 08:32:14+00:00.

562

The original was posted on /r/stablediffusion by /u/terrariyum on 2024-09-10 02:36:48+00:00.


1. A free resource is around the corner

SD1 has several free ones. Midjourney has midlibrary.io. For SDXL, I made a free and open-source webapp to explore SDXL artist styles, and there are others.

.

2. What about Flux, Auraflow, etc.?

Someone will make them for free soon. I encourage anyone to fork my webapp, which uses this free and open-source database of artists and style tags. The code is easily adaptable for any model, and it's well documented. The database is the best open-source artist and style tags database available. I'm not bragging, I'm challenging the community to one-up me.

.

3. Time and effort alone don't justify a pay-wall

I can say that having personally spent hundreds of hours developing my webapp code, hand-creating the database of tags, generating thousands of images, and vetting which artists actually work in SDXL: I don't want anything in return, because this information simply needs to be free.

.

4. Don't encourage pay-only diffusion

We know the big companies would love to ban open weights so that we all have to pay them a recurring fee forever. If you pony up for a pay-wall, even if it's just a one-person, small-time outfit, it sends a message. It adds a brick to the wall of closed-source and pay-only diffusion. Instead, donate to those who donate their time to the community. None of us could be diffusing if not for the open-weight models and free tutorials and information exchange. Let's keep it going.

.

5. Pay-walling artist style data and artist imitation is unjustifiable

Diffusion models and any resources related to them wouldn't be possible if not for the artists who made the artwork that the models are trained on. Scraping artwork to train is ethical if the resulting model is free and open source, just like I can ethically and legally look at any artist's work and manually imitate their style. But trying to earn money off of an artist's brand name or their specific works isn't legal or ethical. That's like selling a course called "Learn how to paint like Rutkowski".

563

The original was posted on /r/stablediffusion by /u/Jackledead on 2024-09-10 01:04:27+00:00.

564

The original was posted on /r/stablediffusion by /u/AiAdventurer on 2024-09-09 23:31:46+00:00.

565

The original was posted on /r/stablediffusion by /u/Bobsprout on 2024-09-09 22:40:25+00:00.

566

The original was posted on /r/stablediffusion by /u/Thin_Ad7360 on 2024-09-10 05:39:58+00:00.


train/infer:

project page:

567

The original was posted on /r/stablediffusion by /u/Fun-Complaint1023 on 2024-09-09 22:54:57+00:00.

568

The original was posted on /r/stablediffusion by /u/screean on 2024-09-10 00:30:16+00:00.

569

The original was posted on /r/stablediffusion by /u/Patient-Librarian-33 on 2024-09-09 23:13:33+00:00.

570

The original was posted on /r/stablediffusion by /u/cogniwerk on 2024-09-09 19:11:04+00:00.

571

The original was posted on /r/stablediffusion by /u/High_Sleep3694 on 2024-09-09 20:54:21+00:00.

572

The original was posted on /r/stablediffusion by /u/dal_mac on 2024-09-09 20:21:28+00:00.

573

The original was posted on /r/stablediffusion by /u/ReidDesignsPro on 2024-09-09 18:41:24+00:00.

574

The original was posted on /r/stablediffusion by /u/Starkeeper2000 on 2024-09-09 18:33:34+00:00.


I didn't expect it to be possible, but it worked! I've trained my first LoRA for Flux Dev locally on an RTX 4070 (Mobile), and I want to share the result with you.

575

The original was posted on /r/stablediffusion by /u/Iory1998 on 2024-09-09 17:02:21+00:00.


Hi,

A few weeks ago, I made a quick comparison between FP16, Q8, and NF4. My conclusion then was that Q8 is almost like FP16 but at half the size. Find attached a few examples.

After a few weeks of playing around with different quantization levels, I made the following observations:

  • What I am concerned with is how close a quantization level is to the full-precision model. I am not discussing which version provides the best quality, since that is subjective, but which generates images closest to the FP16.
  • As I mentioned, quality is subjective. A few times, lower-quantized models yielded aesthetically better images than the FP16! Sometimes, Q4 generated images that were closer to FP16 than Q6 did.
  • Overall, the composition of an image changes noticeably once you go to Q5_0 and below. Again, this doesn't mean that the image quality is worse, but the image itself is slightly different.
  • If you have 24GB, use Q8. It's almost exactly like the FP16. If you force the text encoders to be loaded in RAM, you will use about 15GB of VRAM, giving you ample space for multiple LoRAs, hi-res fix, and generation in batches. For some reason, it's faster than Q6_KM on my machine. I can even load an LLM alongside Flux when using Q8.
  • If you have 16GB of VRAM, then Q6_KM is a good match for you. It takes up about 12GB of VRAM (assuming you are forcing the text encoders to remain in RAM), and you won't have to offload any layers to the CPU. It offers high accuracy at a lower size. Again, you should have some VRAM space left for multiple LoRAs and hi-res fix.
  • If you have 12GB, then Q5_1 is the one for you. It takes 10GB of VRAM (assuming you are loading the text encoders in RAM), and I think it's the model that offers the best balance between size, speed, and quality. It's almost as good as Q6_KM. If I had to keep only two models, I'd keep Q8 and Q5_1. As for Q5_0, it's closer to Q4 than Q6 in terms of accuracy, and in my testing it's the quantization level where you start noticing differences.
  • If you have less than 10GB, use Q4_0 or Q4_1 rather than NF4. I am not saying NF4 is bad. It has its own charm. But if you are looking for the model that is closest to the FP16, then Q4_0 is the one you want.
  • Finally, I noticed that NF4 is the most unpredictable version in terms of image quality. Sometimes the images are really good, and other times they are bad. I feel that this model has consistency issues.
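The VRAM guidance above can be condensed into a small helper. This is only a sketch of the post's recommendations (the function name and the thresholds-as-code are mine), and it assumes the text encoders are kept in system RAM:

```python
def pick_flux_quant(vram_gb: float) -> str:
    """Map available VRAM (in GB) to the quant level suggested above.

    Assumes text encoders are loaded in system RAM, not VRAM.
    """
    if vram_gb >= 24:
        return "Q8_0"   # ~15GB used; nearly identical to FP16
    if vram_gb >= 16:
        return "Q6_KM"  # ~12GB used; high accuracy, no CPU offload
    if vram_gb >= 12:
        return "Q5_1"   # ~10GB used; best size/speed/quality balance
    return "Q4_0"       # closest to FP16 among the small quants
```

For example, `pick_flux_quant(12)` returns `"Q5_1"`.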

The great news is, whatever model you are using (I haven't tested lower quantization levels), you are not missing much in terms of accuracy.
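Since "closeness to FP16" is the criterion here, one simple way to quantify it is a per-pixel difference between renders of the same prompt and seed. This metric choice is mine, not from the original post; it's just a rough sketch of how such a comparison could be scored:

```python
import numpy as np

def quant_drift(img_a: np.ndarray, img_b: np.ndarray) -> float:
    """Mean absolute per-pixel difference between two renders of the
    same prompt and seed (e.g. FP16 vs. Q8). 0.0 means identical images;
    larger values mean the quantized render drifts further from FP16."""
    a = img_a.astype(np.float32)
    b = img_b.astype(np.float32)
    return float(np.abs(a - b).mean())
```

Comparing each quant level against the FP16 render with a score like this would make the "Q8 ≈ FP16, Q5_0 starts to drift" observation measurable instead of purely visual.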

Flux.1 Model Quants Levels Comparison
