StableDiffusion

/r/StableDiffusion is back open after the protest of Reddit killing open API access, which will bankrupt app developers, hamper moderation, and...

901
 
 
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/efwufh9 on 2024-08-23 10:47:16+00:00.

902
 
 
The original was posted on /r/stablediffusion by /u/akilmaf on 2024-08-23 07:47:45+00:00.

903
 
 
The original was posted on /r/stablediffusion by /u/rolux on 2024-08-23 07:23:10+00:00.

904
 
 
The original was posted on /r/stablediffusion by /u/Bobsprout on 2024-08-23 06:46:05+00:00.

905
 
 
The original was posted on /r/stablediffusion by /u/3deal on 2024-08-23 05:57:21+00:00.

906
 
 
The original was posted on /r/stablediffusion by /u/GabratorTheGrat on 2024-08-23 06:34:49+00:00.

907
 
 
The original was posted on /r/stablediffusion by /u/Natural_Reserve_8197 on 2024-08-23 03:27:41+00:00.

908
 
 
The original was posted on /r/stablediffusion by /u/lazyspock on 2024-08-23 03:26:21+00:00.

Original Title: I know this was already said but... Flux's prompt adherence is CRAZY. First image: a real photo. Second image: the prompt ChatGPT created from the photo. Third image: Flux's render from the prompt (Euler, 20 steps)

909
 
 
The original was posted on /r/stablediffusion by /u/jinofcool on 2024-08-23 00:31:05+00:00.

910
 
 
The original was posted on /r/stablediffusion by /u/Pretend_Potential on 2024-08-22 18:49:37+00:00.


According to the post Emad made on Twitter this morning, Stable Diffusion was open sourced on Aug 22, 2022.

Can you believe it has only been 2 years? Look at all the changes, not just with generative AI but also with the entire world that grabbed the technology and exploded with advances.

911
 
 
The original was posted on /r/stablediffusion by /u/Major_Specific_23 on 2024-08-22 22:35:58+00:00.

912
 
 
The original was posted on /r/stablediffusion by /u/AstraliteHeart on 2024-08-22 22:15:32+00:00.

913
 
 
The original was posted on /r/stablediffusion by /u/Agreeable_Effect938 on 2024-08-22 22:05:34+00:00.

914
 
 
The original was posted on /r/stablediffusion by /u/Pultti4 on 2024-08-22 20:50:55+00:00.

915
 
 
The original was posted on /r/stablediffusion by /u/X3ll3n on 2024-08-22 20:33:40+00:00.

916
 
 
The original was posted on /r/stablediffusion by /u/PixarCEO on 2024-08-22 19:57:42+00:00.

917
 
 
The original was posted on /r/stablediffusion by /u/arcanite24 on 2024-08-22 19:12:34+00:00.

918
 
 
The original was posted on /r/stablediffusion by /u/gigglegenius on 2024-08-22 18:57:12+00:00.


I just had this idea of typing in street names and directions for prompting, like "north", "south", etc., to create a model of my little German town. I ran the pictures through my auto-cropping scripts (which are in effin PHP, don't ask why), but I noticed that PHP's standard bilinear algorithm is crappy af (fences are a pixelated mess). I am too lazy to port it to Python, but yeah, I will get there.

And I took all of the pictures on a single day with consistent weather conditions.

I don't know, I think it is kind of neat to be able to type a real location and sky direction and get pictures. How useful this is remains questionable, but I like to do it just for learning. Did you ever attempt crazy LoRAs, or have innovative ideas for captioning techniques or training methods?
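
For anyone curious about the pixelated fences: one plausible cause (an assumption, the post does not say how much the images are shrunk) is that bilinear resampling only looks at a few neighbouring source pixels per output pixel, so thin repeating detail aliases when an image is downscaled a lot. A minimal sketch of the alternative, averaging every source pixel in each block, in plain Python. The function name `box_downscale` and the grayscale list-of-rows format are made up for illustration and are not the poster's script.

```python
def box_downscale(pixels, factor):
    """Downscale a grayscale image (a list of equal-length rows) by an
    integer factor, averaging every factor x factor block of pixels."""
    out_h = len(pixels) // factor
    out_w = len(pixels[0]) // factor
    out = []
    for oy in range(out_h):
        row = []
        for ox in range(out_w):
            total = 0
            # Sum every source pixel that falls inside this output pixel.
            for dy in range(factor):
                for dx in range(factor):
                    total += pixels[oy * factor + dy][ox * factor + dx]
            row.append(total / (factor * factor))
        out.append(row)
    return out


# A "fence": alternating black and white one-pixel columns.
fence = [[0, 255] * 4 for _ in range(4)]
small = box_downscale(fence, 4)
```

Here the stripes average out to a flat mid-gray instead of a random mix of black and white pixels. In an actual Python port, Pillow's `Image.resize` accepts a resampling filter such as `Image.Resampling.BOX` or `Image.Resampling.LANCZOS`, which would do this kind of filtering without hand-written loops.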

919
 
 
The original was posted on /r/stablediffusion by /u/Takeacoin on 2024-08-22 18:10:51+00:00.

920
 
 
The original was posted on /r/stablediffusion by /u/applied_intelligence on 2024-08-22 18:49:14+00:00.

921
 
 
The original was posted on /r/stablediffusion by /u/JackKerawock on 2024-08-22 16:06:38+00:00.

922
 
 
The original was posted on /r/stablediffusion by /u/Parogarr on 2024-08-22 14:27:59+00:00.

923
 
 
The original was posted on /r/stablediffusion by /u/lonewolfmcquaid on 2024-08-22 14:20:00+00:00.

Original Title: Now we have sorta conquered prompt adherence, what's the next big frontier you'd like to see image gen model makers to take on. i think scene consistency should be next big thing, like going from wide shot to a medium shot in various angles and vice versa would be the game over feature.

924
 
 
The original was posted on /r/stablediffusion by /u/zeekwithz on 2024-08-22 14:21:51+00:00.

925
 
 
The original was posted on /r/stablediffusion by /u/ArmadstheDoom on 2024-08-22 13:05:19+00:00.


So, I love Flux. I think it's a huge advancement. But I've noticed that there are things it seems to just... not know how to do well? It's entirely possible that I don't know how to prompt, but these are things I've noticed, and I figure that at the very worst, I'll learn how to fix these problems, and maybe other people can share their own issues.

  1. There's a weird attachment to hands in pockets? When generating people, there's an odd thing I've noticed that the model just loves to put people's hands in their pockets, unprompted. I mean, I think this might be something done to avoid having to generate hands, but I've noticed it?
  2. Emotions are really hard to generate? Some it seems to understand, like 'happy' or 'laughing' but others it just doesn't seem to understand at all, like 'surprised' in my experience. It's really hard to generate expressions such as 'confused' or 'worried' because it just doesn't seem to parse it well?
  3. Styles are kind of hit and miss? LoRAs can solve this, though I've found that it can be kind of picky about whether or not it uses LoRAs in the first place. It doesn't seem to understand, say, the difference between a comic art style and an anime art style, and will often conflate the two.
  4. The lack of a negative prompt means that it's really hard to dissuade the model from things you don't want. For example, I noticed that a lot of images had earrings and watches and wristbands unprompted. But putting things like 'no earrings' in the prompt only made it add them more, and made the prompt longer.
  5. It seems to only really grasp the first 75 tokens or so? One thing I've noticed is that while the training prompts are usually very long, in practice it often ignores prompt tokens towards the end of the prompt, and that moving them ahead in the prompt makes them more likely to be accepted.
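
The reordering trick from point 5 can be sketched in a few lines of Python. This is only an illustration of moving the phrases you care about to the front of a comma-separated prompt; the helper name `front_load` is made up, and whether front-loading actually helps Flux is the observation above, not something this code verifies.

```python
def front_load(prompt, priority):
    """Move the given phrases to the front of a comma-separated prompt,
    in the order they are listed in `priority`."""
    parts = [p.strip() for p in prompt.split(",") if p.strip()]
    wanted = [p.lower() for p in priority]
    # Phrases that match a priority entry, sorted by priority order.
    first = sorted(
        (p for p in parts if p.lower() in wanted),
        key=lambda p: wanted.index(p.lower()),
    )
    # Everything else keeps its original relative order.
    rest = [p for p in parts if p.lower() not in wanted]
    return ", ".join(first + rest)


reordered = front_load(
    "a woman in a park, golden hour, surprised expression",
    ["surprised expression"],
)
```

The result puts "surprised expression" first and leaves the remaining phrases in their original order.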

Those are the ones off the top of my head; maybe you have others, or fixes? I feel like there's so much hype around what it can do that there's not as much discussion of the things it doesn't do or how to make certain things easier.
