StableDiffusion


/r/StableDiffusion is back open after the protest of Reddit killing open API access, which will bankrupt app developers, hamper moderation, and...

276
 
 
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/stbl_reel on 2024-09-29 12:15:43+00:00.

277
 
 

The original was posted on /r/stablediffusion by /u/reditor_13 on 2024-09-29 08:49:35+00:00.

278
 
 

The original was posted on /r/stablediffusion by /u/rolux on 2024-09-29 08:16:48+00:00.

279
 
 

The original was posted on /r/stablediffusion by /u/kozakfull2 on 2024-09-29 00:02:25+00:00.

280
 
 

The original was posted on /r/stablediffusion by /u/ryanontheinside on 2024-09-28 21:51:32+00:00.

281
 
 

The original was posted on /r/stablediffusion by /u/NeededMonster on 2024-09-29 00:11:27+00:00.

282
 
 

The original was posted on /r/stablediffusion by /u/LocoMod on 2024-09-28 22:39:45+00:00.

283
 
 

The original was posted on /r/stablediffusion by /u/renderartist on 2024-09-28 22:36:15+00:00.

284
 
 

The original was posted on /r/stablediffusion by /u/cgpixel23 on 2024-09-28 17:54:59+00:00.

285
 
 

The original was posted on /r/stablediffusion by /u/Major_Specific_23 on 2024-09-28 21:05:00+00:00.

286
 
 

The original was posted on /r/stablediffusion by /u/TabCompletion on 2024-09-28 15:54:01+00:00.

287
 
 

The original was posted on /r/stablediffusion by /u/TableFew3521 on 2024-09-28 11:03:53+00:00.


To save time while still getting good results, I tried three different trainers (Kohya_SS, ComfyUI/Kohya and Ai-toolkit). I still think Ai-toolkit is way better than Kohya, and I think it's because of the "Flowmatch" scheduler, which is the only config difference. Even with bad-quality images you can achieve amazing skin texture on LoRAs. In Kohya, even though I save like 5 hours (which is crazy), I get good results but with that plastic Flux skin texture, no matter the resolution of the images I use. What is your experience? Do you agree or disagree with me? Do you think there's a better trainer than the ones I mentioned?

288
 
 

The original was posted on /r/stablediffusion by /u/camenduru on 2024-09-28 16:03:48+00:00.

289
 
 

The original was posted on /r/stablediffusion by /u/an303042 on 2024-09-28 13:45:35+00:00.

290
 
 

The original was posted on /r/stablediffusion by /u/kastmada on 2024-09-28 11:34:36+00:00.

291
 
 

The original was posted on /r/stablediffusion by /u/AdQuirky7106 on 2024-09-28 02:14:31+00:00.


I already liked Steve Mould...a dude that's appeared on Numberphile many times. But just now watching a video on a certain kind of dumb little visual illusion, he unexpectedly launched into the most thorough and understandable explanation of how CLIP-inferred diffusion models work that I've ever seen. Like, by far. It's just incredible. For those that haven't seen this, enjoy the little epiphanies from connecting diffusion-based image models, LLMs, and CLIP, and how they all work together with cross-attention!!

Starts at about 2 minutes in.

292
 
 

The original was posted on /r/stablediffusion by /u/blankey1337 on 2024-09-27 13:56:33+00:00.


Download it at civitai:

Have fun!

293
 
 

The original was posted on /r/stablediffusion by /u/NunyaBuzor on 2024-09-27 23:46:54+00:00.


Code:

Project Page:

Note: All the information you see below comes from the project page, so please take the claimed quality of the results with a grain of salt.

Example

Ctrl-X is a simple tool for generating images from text without the need for extra training or guidance. It allows users to control both the structure and appearance of an image by providing two reference images—one for layout and one for style. Ctrl-X aligns the image’s layout with the structure image and transfers the visual style from the appearance image. It works with any type of reference image, is much faster than previous methods, and can be easily integrated into any text-to-image or text-to-video model.

Ctrl-X works by first taking the clean structure and appearance data and adding noise to them using a diffusion process. It then extracts features from these noisy versions through a pretrained text-to-image diffusion model. During the process of removing the noise, Ctrl-X injects key features from the structure data and uses attention mechanisms to transfer style details from the appearance data. This allows for control over both the layout and style of the final image. The method is called "Ctrl-X" because it combines structure preservation with style transfer, like cutting and pasting.
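To make the description above a bit more concrete, here is a toy NumPy sketch of the two ideas it mentions: adding noise to a clean input, and using attention to pull style statistics from the appearance features into the output features. This is not the Ctrl-X code. The function names (`add_noise`, `transfer_appearance`), the flat `(tokens, channels)` feature layout, and the attention-weighted mean/std normalization are all assumptions made for illustration; the real method works on features extracted from a pretrained diffusion model.

```python
import numpy as np


def add_noise(x0, alpha_bar, rng):
    """Standard forward-diffusion step: mix clean data with Gaussian noise.

    alpha_bar is the cumulative signal level in [0, 1] (1.0 means no noise).
    """
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def transfer_appearance(out_feats, app_feats, eps=1e-6):
    """Hypothetical attention-based style transfer on toy features.

    out_feats: (N, C) features of the image being generated.
    app_feats: (M, C) features of the appearance (style) image.
    Each output token attends over the appearance tokens, and the
    attention-weighted mean/std of the appearance features replace the
    output's own per-channel statistics.
    """
    d = out_feats.shape[1]
    attn = softmax(out_feats @ app_feats.T / np.sqrt(d), axis=-1)  # (N, M)
    mean = attn @ app_feats                                        # (N, C)
    var = attn @ (app_feats ** 2) - mean ** 2                      # (N, C)
    std = np.sqrt(np.clip(var, 0.0, None) + eps)
    # Normalize the output features per channel, then re-style them.
    normed = (out_feats - out_feats.mean(axis=0)) / (out_feats.std(axis=0) + eps)
    return normed * std + mean
```

In this sketch the layout of the result still comes from `out_feats` (which token is above or below the channel average), while the values are rescaled to match the appearance features that each token attends to. Structure control would correspond to seeding `out_feats` from the structure image's features before this step.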

Results of training-free and guidance-free T2I diffusion with structure and appearance control


Ctrl-X is capable of multi-subject generation with semantic correspondence between appearance and structure images across both subjects and backgrounds. In comparison, ControlNet + IP-Adapter often fails at transferring all subject and background appearances.

Ctrl-X also supports prompt-driven conditional generation, where it generates an output image complying with the given text prompt while aligning with the structure of the structure image. Ctrl-X continues to support any structure image/condition type here as well. The base model here is Stable Diffusion XL v1.0.

Results: Extension to video generation

294
 
 

The original was posted on /r/stablediffusion by /u/jenza1 on 2024-09-27 18:07:49+00:00.

295
 
 

The original was posted on /r/stablediffusion by /u/Much_Can_4610 on 2024-09-27 14:56:14+00:00.

296
 
 

The original was posted on /r/stablediffusion by /u/zazaoo19 on 2024-09-27 11:20:49+00:00.

297
 
 

The original was posted on /r/stablediffusion by /u/rwbronco on 2024-09-27 21:17:11+00:00.

298
 
 

The original was posted on /r/stablediffusion by /u/lhg31 on 2024-09-27 21:08:37+00:00.

299
 
 

The original was posted on /r/stablediffusion by /u/SideMurky8087 on 2024-09-27 14:33:32+00:00.

300
 
 

The original was posted on /r/stablediffusion by /u/jjjnnnxxx on 2024-09-27 09:11:55+00:00.
