StableDiffusion

One of my neighbors is caring for his elderly wife who has lost a significant portion of her vision (not legally blind yet, I think) and he has trouble describing images to her in his own words with enough detail.

For example, his wife would ask for a detailed description of a scarf from an online clothing store, and she gets frustrated with his limited descriptions.

When my wife and I visit, she seems to really like the details my wife provides compared with what her husband describes. She just says my wife gives all the details.

I've tried describing a few things, but both my wife and she think I suck at it, so I just stopped.

Here's an example of a scarf her husband and I tried to describe to her. We didn't do so well, but apparently my wife did great explaining all the details and all the variations of the scarf.

Links appreciated.

159

1

Enhancing Flux prompting with Llama 3.2 Vision Model - an experiment (old.reddit.com)

submitted 1 week ago by bot@lemmit.online to c/stablediffusion@lemmit.online

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/super-curses on 2024-10-07 12:41:23+00:00.

I built a thing.

It takes a bare-bones prompt and uses Llama 3.2 to embellish it. The embellished prompt is then passed to Flux/schnell to generate the image (essentially what ChatGPT does with image generation). It seems to generate some good images with Flux/schnell.

'Review Image' passes the generated image back to Llama 3.2 for a review. I can then pass the suggestions back into the prompt to regenerate it (I still need to tweak the prompts here to ensure the suggestions really change the new prompt).
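For anyone who wants to try the same loop, here is a minimal sketch of the embellish, generate, review cycle described above. It is not the poster's code: `refine`, `Round`, and the three callables are hypothetical names, and you would have to wire them to whatever Llama 3.2 and Flux/schnell backends you actually run.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-ins for the real model calls (assumptions, not a real API):
Embellish = Callable[[str], str]       # short prompt -> detailed prompt (e.g. Llama 3.2)
Generate = Callable[[str], bytes]      # prompt -> image bytes (e.g. Flux/schnell)
Review = Callable[[bytes, str], str]   # image + prompt -> suggestions (e.g. Llama 3.2 Vision)


@dataclass
class Round:
    prompt: str
    image: bytes
    suggestions: str


def refine(base_prompt: str, embellish: Embellish, generate: Generate,
           review: Review, rounds: int = 2) -> List[Round]:
    """Run the embellish -> generate -> review loop a fixed number of times."""
    history: List[Round] = []
    prompt = embellish(base_prompt)
    for _ in range(rounds):
        image = generate(prompt)
        suggestions = review(image, prompt)
        history.append(Round(prompt, image, suggestions))
        # Fold the review back into the next prompt so it actually changes.
        prompt = embellish(f"{base_prompt}\nApply these changes: {suggestions}")
    return history
```

Passing the models in as plain functions keeps the loop testable with fakes, and keeping every round in `history` makes it easy to check whether the suggestions really changed the next prompt.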

160

1

The best open-source lipsync tools as of right now (old.reddit.com)

submitted 1 week ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/Smart-Swordfish-1162 on 2024-10-07 12:12:34+00:00.

Hey! Doing a deep-dive on open-source lipsync tools out there right now.

The best one so far that I've found is MuseTalk, but I'm still searching. Wav2Lip, Wav2Lip-HD, SadTalker etc. don't really have convincing outputs.

Looking for quality like this:

Please advise, thanks!

161

1

[FLUX] Vintage Anime Filter (www.reddit.com)

submitted 1 week ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/BigRub7079 on 2024-10-07 10:53:09+00:00.

162

1

C:\Users\your prompt here\Pictures\Photos\ also works for producing real photos (sometimes weird) (www.reddit.com)

submitted 1 week ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/FortranUA on 2024-10-07 08:25:10+00:00.

163

1

Poor Reviews of Facefusion not allowed (old.reddit.com)

submitted 1 week ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/FitContribution2946 on 2024-10-07 03:39:45+00:00.

This is so you know what kind of person the developer of FaceFusion is and why you should not support him. If you make a tutorial on the software and it is negative, he will report you.

The video below did nothing but go through public domain material and was critical of the developer.

However, the developer is a bully and uses intimidation tactics. Is this really who you want to support in the community?

From the mouth of the developer himself:

"Remove every FaceFusion video - never upload any video about FaceFusion or me again.

I don't want you to mention, review or even install FaceFusion ever again. You are a persona non grata..."

Don't even "mention" me? Does that seem right? So, if you even speak the name of my project you will be punished? Does that seem like a violation of speech rights? Unfortunately, YouTube and these other big platforms are too large to parse each complaint, so they just side with the accuser by default. The developer knows that and so manipulates the system. Do you see what I mean by "cyberbully"?

164

1

I'm pleasantly surprised by CogVideoX5B (old.reddit.com)

submitted 1 week ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/the_bollo on 2024-10-07 01:45:32+00:00.

165

1

Huge news for Kohya GUI - Now you can fully Fine Tune / DreamBooth FLUX Dev with as low as 6 GB GPUs without any quality loss compared to 48 GB GPUs - Fine Tuning yields such good results that no ... (www.reddit.com)

submitted 1 week ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/CeFurkan on 2024-10-07 00:53:14+00:00.

Original Title: Huge news for Kohya GUI - Now you can fully Fine Tune / DreamBooth FLUX Dev with as low as 6 GB GPUs without any quality loss compared to 48 GB GPUs - Fine Tuning yields such good results that no LoRA config and training will ever yield

166

1

Common Camera and Smartphone File Naming Formats - Wildcards (www.reddit.com)

submitted 1 week ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/RalFingerLP on 2024-10-06 16:44:11+00:00.

167

1

AI reinterpretation "The Lady with an Ermine" (i.redd.it)

submitted 1 week ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/monosiniestro on 2024-10-06 03:25:57+00:00.

168

1

Icy reflection (i.redd.it)

submitted 2 weeks ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/AyakonArts on 2024-10-06 11:58:22+00:00.

169

1

Industrial 🏭 (www.reddit.com)

submitted 2 weeks ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/ectoblob on 2024-10-06 10:15:06+00:00.

170

1

I tested the use of a random "FILENAME_1234.jpg" to make highly realistic photos with flux posts that have been posted as of recent (old.reddit.com)

submitted 2 weeks ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/idunno63 on 2024-10-06 10:35:50+00:00.

Over the last few days there have been several posts about putting a random .jpg file name in your Flux prompt (e.g. IMG-7587.JPG or IMG_7587.JPG), so I figured I would test it and show my results and comparisons. I'm using Flux Dev FP8 on Forge.

Prompt: "IMG-7587.JPG Christmas party" Steps: 20, Sampler: Euler, Schedule type: Beta, CFG scale: 1, Distilled CFG Scale: 3.5, Seed: 293551553, Size: 1152x896, Model hash: 275ef623d3, Model: flux1-dev-fp8, Beta schedule alpha: 0.6, Beta schedule beta: 0.6

Prompt: "Christmas party" Steps: 20, Sampler: Euler, Schedule type: Beta, CFG scale: 1, Distilled CFG Scale: 3.5, Seed: 293551553, Size: 1152x896, Model hash: 275ef623d3, Model: flux1-dev-fp8, Beta schedule alpha: 0.6, Beta schedule beta: 0.6

Prompt: "IMG_7587.JPG vacation" Steps: 20, Sampler: Euler, Schedule type: Beta, CFG scale: 1, Distilled CFG Scale: 3.5, Seed: 293551553, Size: 1152x896, Model hash: 275ef623d3, Model: flux1-dev-fp8, Beta schedule alpha: 0.6, Beta schedule beta: 0.6

Prompt: "vacation" Steps: 20, Sampler: Euler, Schedule type: Beta, CFG scale: 1, Distilled CFG Scale: 3.5, Seed: 293551553, Size: 1152x896, Model hash: 275ef623d3, Model: flux1-dev-fp8, Beta schedule alpha: 0.6, Beta schedule beta: 0.6

Prompt: "IMG-7587.JPG Graduation" Steps: 20, Sampler: Euler, Schedule type: Beta, CFG scale: 1, Distilled CFG Scale: 3.5, Seed: 293551553, Size: 1152x896, Model hash: 275ef623d3, Model: flux1-dev-fp8, Beta schedule alpha: 0.6, Beta schedule beta: 0.6

Prompt: "Graduation" Steps: 20, Sampler: Euler, Schedule type: Beta, CFG scale: 1, Distilled CFG Scale: 3.5, Seed: 293551553, Size: 1152x896, Model hash: 275ef623d3, Model: flux1-dev-fp8, Beta schedule alpha: 0.6, Beta schedule beta: 0.6

I found that using anything more than a few words in your prompt would break the style of the image. The style is not consistent. It took a few rolls before I found a good seed that seemed to work.
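If you want to repeat the comparison, the key point is that each pair shares every setting, seed included, and differs only in the file-name prefix. Here is a rough sketch of how the pairs could be built. The dictionary keys and the `ab_pairs` helper are made-up names for illustration, not Forge's actual API; the values are the ones listed above.

```python
from typing import Dict, List, Tuple

# Settings taken from the runs above. The key names are hypothetical;
# map them onto whatever your UI or API actually expects.
FIXED_SETTINGS: Dict[str, object] = {
    "steps": 20,
    "sampler": "Euler",
    "schedule_type": "Beta",
    "cfg_scale": 1,
    "distilled_cfg_scale": 3.5,
    "seed": 293551553,
    "width": 1152,
    "height": 896,
}

Job = Dict[str, object]


def ab_pairs(subjects: List[str], filename: str = "IMG-7587.JPG") -> List[Tuple[Job, Job]]:
    """Build (with file name, without file name) jobs that differ only in the prompt."""
    pairs: List[Tuple[Job, Job]] = []
    for subject in subjects:
        with_name = {**FIXED_SETTINGS, "prompt": f"{filename} {subject}"}
        without_name = {**FIXED_SETTINGS, "prompt": subject}
        pairs.append((with_name, without_name))
    return pairs


pairs = ab_pairs(["Christmas party", "vacation", "Graduation"])
```

Keeping the subject to a word or two matches the observation that longer prompts tend to break the style.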

171

1

How do people generate realistic anime characters like this? (old.reddit.com)

submitted 2 weeks ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/reyjand on 2024-10-06 09:29:39+00:00.

172

1

I recreated a look based off an image generated using a Flux LoRA of myself (www.reddit.com)

submitted 2 weeks ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/Lozmosis on 2024-10-06 05:30:23+00:00.

173

1

🌊 Depth Pro with Depth Flow Workflow (old.reddit.com)

submitted 2 weeks ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/camenduru on 2024-10-06 04:21:40+00:00.

174

1

FacePoke and you can try it out right now! with Demo and code links (old.reddit.com)

submitted 2 weeks ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/flipflapthedoodoo on 2024-10-05 23:51:04+00:00.

175

1

Is this the future of photoshop? Invoke 5.0 (old.reddit.com)

submitted 2 weeks ago by bot@lemmit.online to c/stablediffusion@lemmit.online

The original was posted on /r/stablediffusion by /u/Fuzzy_Bathroom7441 on 2024-10-05 23:49:23+00:00.

Watch from 35:00

I've been using Invoke for only two weeks, and I’m truly impressed. Invoke's canvas, control layers, regional prompting, inpainting, and outpainting are mind-blowing. It’s heading towards a Photoshop-like experience, almost like what the Photoshop of the future would be. Invoke's UI design is so user-friendly and easy to navigate. I wonder why this web UI has been overlooked by so many users. I know most good features and flexibility are first available in ComfyUI, but Invoke offers a completely different user experience. Some features are only available for SDXL, but Flux's control layers are under development, and I hope to see them soon.