StableDiffusion


/r/StableDiffusion is back open after the protest of Reddit killing open API access, which will bankrupt app developers, hamper moderation, and...

676

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/zer0int1 on 2024-09-03 15:51:11+00:00.

677

The original was posted on /r/stablediffusion by /u/aneryx on 2024-09-03 13:26:16+00:00.

678

The original was posted on /r/stablediffusion by /u/ThinkDiffusion on 2024-09-03 12:40:07+00:00.

679

The original was posted on /r/stablediffusion by /u/renderartist on 2024-09-03 12:25:16+00:00.

680

The original was posted on /r/stablediffusion by /u/luckyyirish on 2024-09-03 11:35:49+00:00.

681

The original was posted on /r/stablediffusion by /u/Dear-Spend-2865 on 2024-09-03 07:58:43+00:00.

682

The original was posted on /r/stablediffusion by /u/aHotDay_ on 2024-09-03 05:12:16+00:00.

683

The original was posted on /r/stablediffusion by /u/raekwonda_patreon on 2024-09-03 07:19:00+00:00.

684

The original was posted on /r/stablediffusion by /u/Previous_Power_4445 on 2024-09-03 06:11:15+00:00.


Major updates:

  • Flux ControlNet support for XLabs and InstantX ControlNets
  • Improvements to the queue UI: you can now view all outputs generated by the same task
  • Bookmarking in the node library, plus recursive expanding/collapsing of node library folders

Full blog post

685

The original was posted on /r/stablediffusion by /u/renderartist on 2024-09-03 04:53:00+00:00.

686

The original was posted on /r/stablediffusion by /u/tintwotin on 2024-09-02 21:56:30+00:00.

687

The original was posted on /r/stablediffusion by /u/terminusresearchorg on 2024-09-02 21:30:21+00:00.


release:

Left: Base Flux.1 Dev model, 20 steps

Right: LoKr with configure.py default network settings and --flux_attention_masked_training

This is a chunky release; the trainer was majorly refactored.

But for the most part it should feel like nothing has changed, and you can likely continue without making any changes.

You know those projects you always want to get around to but you never do because it seems like you don't even know where to begin? I refactored and deprecated a lot to get the beginnings of a Trainer SDK started.

  • the config.env files are now deprecated in favour of config.json or config.toml
    • the env files still work; most of this is backwards-compatible
    • any kind of shell scripting you had in config.env will no longer work, e.g. the $(date) call inside TRACKER_RUN_NAME will no longer 'resolve' to the date-time
    • please open a ticket on GitHub if something you depended on no longer works; e.g. for datetimes, we could add a special string like {timestamp} that gets replaced at startup
  • the default settings that were previously overridden in a hidden manner by train.sh have been integrated, as best I could, into the defaults for train.py
    • in other words, some settings / defaults may have changed, but there is now just one source of truth for the defaults: train.py --help
  • for developers, there's now a Trainer class to use
    • additionally, for aspiring developers or anyone who would like a more interactive environment to play with SimpleTuner, there is now a Jupyter Notebook that lets you peek deeper into the process of using this Trainer class through a functional training environment
    • it's still new, and I've not had much time to extend it with a public API, so these internal methods are likely to change; don't rely on it fully just yet if that concerns you
      • but future changes should be easy enough for seasoned developers to integrate into their applications
    • I'm sure it could be useful to anyone who wishes to make a GUI for SimpleTuner, but remember that Windows users currently need WSL2
  • bugfix: multi-GPU step tracking in the learning rate scheduler was broken but now works. Resuming will correctly start from where the LR last was, and its trajectory is properly deterministic
  • bugfix: the attention masking we published in the last releases had an input-swapping bug, where the images were being masked instead of the text
    • upside: the fine details and text-following of a properly masked model are unparalleled, and really make Dev feel more like Pro with nearly zero effort
    • upside: it's faster! the new code places the mask at the end of the sequence, which seems to suit PyTorch's kernels; my guess is that they simply "chop off" the masked tail and stop processing it, rather than having to "hop over" the initial positions as they did when we masked at the front on the image embeds
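Why end-of-sequence masking is cheaper can be sketched in plain Python (illustrative only, not SimpleTuner's actual kernel code):

```python
# Toy illustration of mask placement. With padding at the END, the valid
# tokens form a contiguous prefix, so a kernel can just truncate the
# sequence and stop early. With padding at the FRONT, valid tokens sit
# behind a masked prefix, forcing a gathered/offset access pattern.
seq_len, n_valid = 8, 5
tokens = list(range(seq_len))

# End-padding: valid tokens are a plain slice, no gather needed.
mask_end = [True] * n_valid + [False] * (seq_len - n_valid)
valid_end = tokens[:n_valid]

# Front-padding: every access has to hop over the masked prefix.
mask_front = [False] * (seq_len - n_valid) + [True] * n_valid
valid_front = [t for t, keep in zip(tokens, mask_front) if keep]

print(valid_end)    # [0, 1, 2, 3, 4]
print(valid_front)  # [3, 4, 5, 6, 7]
```

The same content is masked either way; only the layout changes, which is why the fix can be both correct and faster.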

The first example image at the top used attention masking, but here's another demonstration:

Steampunk inventor in a workshop, intricate gadgets, Victorian attire, mechanical arm, goggles

5,000 steps on the new masking code, without much care for the resulting model quality, led to a major boost in the outputs. It didn't require 5,000 steps, but I think a higher learning rate is needed for training a subject in with this configuration.

The training data is just 22 images of Cheech and Chong, and they're not even that good. They're just my latest test dataset.

Alien marketplace, bizarre creatures, exotic goods, vibrant colors, otherworldly atmosphere

a hand is holding a comic book with a cover that reads 'The Adventures of Superhero'

a cybernetic anne of green gables with neural implant and bio mech augmentations

Oh, okay, so, I guess Cheech & Chong make everything better. Who would have thought?

I didn't have any text / typography in the data:

A report on the training data and test run here, from a previous go at it (without attention masking):

Quick start guide to get training with Flux:

688

The original was posted on /r/stablediffusion by /u/Aplakka on 2024-09-02 21:10:49+00:00.

689

The original was posted on /r/stablediffusion by /u/No-Connection-7276 on 2024-09-02 20:06:40+00:00.

690

The original was posted on /r/stablediffusion by /u/knluong1 on 2024-09-02 16:48:14+00:00.

691

The original was posted on /r/stablediffusion by /u/haofanw on 2024-09-02 15:08:36+00:00.

692

The original was posted on /r/stablediffusion by /u/zhenghaoz on 2024-09-02 12:48:19+00:00.


More than a year ago, Qualcomm demonstrated the possibility of fast Stable Diffusion on the Snapdragon NPU (a.k.a. Hexagon):

World’s first on-device demonstration of Stable Diffusion on an Android phone

Today, I have made it come true as a publicly available Android app supporting 6 checkpoints (more will be added in the future).

LiteDiffusion

It takes less than 1 minute to generate an image at 40 steps on my Xiaomi 13 (Snapdragon 8 Gen 2).

693

The original was posted on /r/stablediffusion by /u/Jujarmazak on 2024-09-02 06:50:02+00:00.

694

The original was posted on /r/stablediffusion by /u/Amadeus_AI on 2024-09-02 06:42:52+00:00.


  • Hires-fix upscale right after the img2img process, using its latent.
  • Inspired by wcde/custom-hires-fix-for-automatic1111, which was archived and hasn't functioned since 2023.
  • A simple ~150-line extension with minimal features and dependencies.
  • It also works in text2img; you can check the consistency between the original "Hires. fix" and this extension.
  • I made this because I need this feature🤣
  • (Normally, if you upscale using/before img2img, the anatomy will be broken at the higher resolution.)
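The core idea (upscaling the img2img latent itself before a second denoising pass, rather than upscaling the decoded image) can be sketched as a toy example; this is not the extension's actual code, and upscale_latent is a hypothetical helper:

```python
# Toy sketch of a latent "hires fix": upscale the latent from the first
# img2img pass, then (in the real pipeline) hand it back to the sampler
# for a short second denoising pass. Real SD latents are 4-channel
# tensors; a single 2x2 channel stands in for one here.
def upscale_latent(latent, factor):
    """Nearest-neighbor upscale of a 2D grid by an integer factor."""
    out = []
    for row in latent:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

latent = [[0.1, 0.2],
          [0.3, 0.4]]
upscaled = upscale_latent(latent, 2)
print(upscaled[0])                      # [0.1, 0.1, 0.2, 0.2]
print(len(upscaled), len(upscaled[0]))  # 4 4
```

The second denoising pass over the upscaled latent is what repairs anatomy at the higher resolution; upscaling the decoded image alone skips that correction step.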

695

The original was posted on /r/stablediffusion by /u/PixarCEO on 2024-09-02 16:23:10+00:00.

696

The original was posted on /r/stablediffusion by /u/Nuckyduck on 2024-09-02 13:55:34+00:00.

697

The original was posted on /r/stablediffusion by /u/Secret_Ad8613 on 2024-09-02 10:15:40+00:00.

698

The original was posted on /r/stablediffusion by /u/Takeacoin on 2024-09-02 07:20:14+00:00.

699

The original was posted on /r/stablediffusion by /u/ThunderBR2 on 2024-09-02 08:01:27+00:00.

700

The original was posted on /r/stablediffusion by /u/BlipOnNobodysRadar on 2024-09-02 07:56:43+00:00.


Hey, just to be transparent and in a good-faith attempt to have a dialogue for the good of the SD subreddit: we need to talk about a new mod who has a history of strange behavior and is already engaging adversarially with the community. Hopefully they don't ban this post before the other mods see it.

It's fine to have personal opinions, but their behavior is quite unstable and erratic, and very inappropriate for a moderator position. Especially since it is now supposed to be neutral and about open models in general. They're already being controversial and hostile to users on the subreddit, choosing to be antagonistic and deliberately misinterpreting straightforward questions/comments as "disrespectful" rulebreaking rather than clarify their positions reasonably. (Note I don't disagree with the original thread in question being nuked for NSFW, just their behavior in response to community feedback).

The mod "pretend potential" is crystalwizard. I remember them from the StableDiffusion discord. They were hardcore defending SAI during the SD3 debacle, deriding anyone who criticized the quality of SD3. I got the (perhaps mistaken) impression that they were an SAI employee, given how deeply invested they were. Whether or not that's the case, their behavior seems very inappropriate for a supposedly neutral moderator position.

I'll post a few quick screenshots to back this up, but I didn't dig too deep; these are just some quick references from what I remembered. They claimed anyone who criticized the SD3 debacle was a "shill" and got very weird about it, spinning conspiracy theories that anyone who spoke out was a single person on alt accounts or a shill (calling me one directly). They also claimed that Civitai's banning of SD3 and questions about the original SD3 license were "misinformation".
