StableDiffusion

401
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/zazaoo19 on 2024-09-20 04:01:52+00:00.

402

The original was posted on /r/stablediffusion by /u/theninjacongafas on 2024-09-20 11:38:06+00:00.

403

The original was posted on /r/stablediffusion by /u/mrfofr on 2024-09-20 10:14:56+00:00.

404

The original was posted on /r/stablediffusion by /u/R34vspec on 2024-09-20 05:47:21+00:00.

405

The original was posted on /r/stablediffusion by /u/4-r-r-o-w on 2024-09-20 05:49:04+00:00.

406

The original was posted on /r/stablediffusion by /u/dewarrn1 on 2024-09-20 02:44:52+00:00.

407

The original was posted on /r/stablediffusion by /u/FoxBenedict on 2024-09-20 04:50:34+00:00.


An astonishing paper was released a couple of days ago showing a revolutionary new image-generation paradigm. It's a multimodal model with a built-in LLM and a vision model that gives you unbelievable control through prompting. You can give it an image of a subject and tell it to put that subject in a certain scene, and you can do that with multiple subjects — no need to train a LoRA or any of that. You can prompt it to edit part of an image, or to produce an image with the same pose as a reference image, without the need for a ControlNet. The possibilities are so mind-boggling that, frankly, I'm having a hard time believing this could be possible.

They are planning to release the source code "soon". I simply cannot wait. This is on a completely different level from anything we've seen.

408

The original was posted on /r/stablediffusion by /u/ZootAllures9111 on 2024-09-20 02:27:39+00:00.

Original Title: FYI if you're using something like JoyCaption to caption images: Kohya does not support actual newline characters between paragraphs, it stops parsing the file after the first one it hits, your caption text needs to be separated only by spaces between words (meaning just one long paragraph)


I noticed this was the case a while ago and figured I'd point it out. You can confirm it by comparing the metadata stored in a LoRA file against captions that contained newlines: for a given image, any text after the first newline simply won't be present in that metadata.
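If your captioner emits multi-paragraph text, one workaround is to collapse the newlines into spaces before training. A minimal sketch (the `flatten_caption` helper is hypothetical, not part of Kohya or JoyCaption):

```python
import re

def flatten_caption(text: str) -> str:
    # Replace each run of newlines (plus surrounding spaces) with a single
    # space, so the caption becomes one long paragraph that Kohya will parse
    # in full instead of stopping at the first newline.
    return re.sub(r"\s*\n\s*", " ", text).strip()

# Example: a two-paragraph JoyCaption-style output
caption = "A portrait of a woman.\n\nShe is standing in a garden."
print(flatten_caption(caption))
# A portrait of a woman. She is standing in a garden.
```

Running this over your caption .txt files before a training run sidesteps the truncation without changing the caption content.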

409

The original was posted on /r/stablediffusion by /u/flyingdickins on 2024-09-19 22:37:03+00:00.

410

The original was posted on /r/stablediffusion by /u/an303042 on 2024-09-19 21:05:45+00:00.

411

The original was posted on /r/stablediffusion by /u/371830 on 2024-09-19 17:53:26+00:00.


After using Flux for over a month now, I'm curious what your combo is for best image quality. Since I only started local image generation last month (I was an occasional MJ user before), it's been pretty much constant learning. One of the things that took me a while to realize is that it's not just the choice of model that matters, but also all the other bits like the CLIP model, text encoder, sampler, etc., so I thought I'd share this — maybe other newbies will find it useful.

Here is my current best-quality setup (photorealistic). I have 24 GB of VRAM, but I think it will also work with 16 GB.

  • flux1-dev-Q8_0.gguf

  • clip: ViT-L-14-TEXT-detail-improved-hiT-GmP-TE-only-HF.safetensors - until last week I didn't even know you could use different CLIP models. This one made a big difference for me and works better than ViT-L-14-BEST-smooth. Thanks u/zer0int1

  • te: t5-v1_1-xxl-encoder-Q8_0.gguf - not sure if it makes any difference vs t5xxl_fp8_e4m3fn.safetensors

  • vae: ae.safetensors - don't remember where I got this one from

  • sampling: Forge Flux Realistic - best results of the few sampling methods I tested in Forge

  • scheduler: simple

  • sampling steps: 20

  • DCFG 2-2.5 - with PAG (below) enabled, it seems I can bump DCFG higher before skin starts to look unnatural

  • Perturbed Attention Guidance: 3 - this adds about 40% inference time, but I see a clear improvement in prompt adherence and overall consistency, so I always keep it on. When going above 5, images start looking unnatural.

  • Other optional settings in Forge did not give me any convincing improvements, so I don't use them.
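For reference, the whole stack above can be jotted down as a plain config mapping (the key names are mine and purely illustrative — they are not Forge API names; the values are as listed in the post):

```python
# Illustrative summary of the poster's Forge setup, not a Forge config file.
flux_quality_config = {
    "model": "flux1-dev-Q8_0.gguf",
    "clip": "ViT-L-14-TEXT-detail-improved-hiT-GmP-TE-only-HF.safetensors",
    "text_encoder": "t5-v1_1-xxl-encoder-Q8_0.gguf",
    "vae": "ae.safetensors",
    "sampler": "Forge Flux Realistic",
    "scheduler": "simple",
    "steps": 20,
    "distilled_cfg": 2.5,               # 2-2.5; PAG lets it go higher
    "perturbed_attention_guidance": 3,  # ~40% slower, better adherence
}

print(len(flux_quality_config), "settings recorded")
```

A mapping like this is handy for keeping generation settings next to the images they produced.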

412

The original was posted on /r/stablediffusion by /u/simpleuserhere on 2024-09-19 15:30:21+00:00.

413

The original was posted on /r/stablediffusion by /u/SmaugPool on 2024-09-19 14:10:52+00:00.

414

The original was posted on /r/stablediffusion by /u/Patient-Librarian-33 on 2024-09-19 12:20:16+00:00.

415

The original was posted on /r/stablediffusion by /u/tintwotin on 2024-09-19 09:56:48+00:00.


Image-to-video for CogVideoX-5b, implemented in the diffusers library by zRdianjiao and Aryan V S, has now been added to the free and open-source Blender VSE add-on Pallaidium.

416

The original was posted on /r/stablediffusion by /u/JBOOGZEE on 2024-09-19 08:56:14+00:00.

417

The original was posted on /r/stablediffusion by /u/an303042 on 2024-09-19 08:15:52+00:00.

418

The original was posted on /r/stablediffusion by /u/wonderflex on 2024-09-19 06:27:25+00:00.

419

The original was posted on /r/stablediffusion by /u/zazaoo19 on 2024-09-19 03:36:44+00:00.

420

The original was posted on /r/stablediffusion by /u/Pultti4 on 2024-09-19 02:33:07+00:00.

421

The original was posted on /r/stablediffusion by /u/EcoPeakPulse on 2024-09-19 02:27:02+00:00.

422

The original was posted on /r/stablediffusion by /u/Junior_Economics7502 on 2024-09-18 19:54:38+00:00.

423

The original was posted on /r/stablediffusion by /u/Angrypenguinpng on 2024-09-18 22:00:39+00:00.

424

The original was posted on /r/stablediffusion by /u/ScarletEnthusiast on 2024-09-18 17:21:57+00:00.

425

The original was posted on /r/stablediffusion by /u/Old_Reach4779 on 2024-09-18 16:18:06+00:00.


Hugging Face:

Hugging Face Space:

GitHub:

ComfyUI node: (kijai just added an i2v example workflow 😍)

License: Apache-2.0!
