StableDiffusion


/r/StableDiffusion is back open after the protest of Reddit killing open API access, which will bankrupt app developers, hamper moderation, and...

This is an automated archive made by the Lemmit Bot.

976
The original was posted on /r/stablediffusion by /u/uncletravellingmatt on 2024-08-20 03:01:24+00:00.

977
The original was posted on /r/stablediffusion by /u/andreigeorgescu on 2024-08-20 02:52:15+00:00.

978
The original was posted on /r/stablediffusion by /u/balianone on 2024-08-20 02:29:44+00:00.

979
The original was posted on /r/stablediffusion by /u/Striking-Long-2960 on 2024-08-20 01:02:17+00:00.

980
The original was posted on /r/stablediffusion by /u/Livid-Fly- on 2024-08-19 23:25:40+00:00.

981
The original was posted on /r/stablediffusion by /u/-becausereasons- on 2024-08-19 22:58:32+00:00.


Flux really struggles with viewpoints, for instance top-down, or low angles shooting up from the ground. I've been having considerable trouble directing the camera.
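One way to probe this systematically is to fix the seed and swap only the camera phrase. A minimal sketch using the diffusers FluxPipeline; the angle phrasings are things to try, not a known-good recipe:

```python
# Hypothetical harness for testing viewpoint prompts: same scene, same seed,
# only the camera-direction phrase changes between runs.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # helps on smaller GPUs

angles = [
    "top-down aerial view, camera pointing straight down at",
    "extreme low-angle shot from ground level, camera looking up at",
    "worm's-eye view, lens at grass height, of",
]
for i, angle in enumerate(angles):
    image = pipe(
        f"photograph, {angle} a chess table in a park",
        num_inference_steps=20,
        guidance_scale=3.5,
        generator=torch.Generator("cpu").manual_seed(42),  # fixed seed isolates the angle phrase
    ).images[0]
    image.save(f"angle_{i}.png")
```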

982
The original was posted on /r/stablediffusion by /u/Standard-Anybody on 2024-08-19 22:36:52+00:00.


This is a cross-post from r/LocalLLaMA, with a bit of emphasis on why this might be interesting for image generation.

Existing methods for adapting large language models (LLMs) to new tasks are not suited to multi-task adaptation because they modify all the model weights -- causing destructive interference between tasks. The resulting effects, such as catastrophic forgetting of earlier tasks, make it challenging to obtain good performance on multiple tasks at the same time. To mitigate this, we propose Lottery Ticket Adaptation (LoTA), a sparse adaptation method that identifies and optimizes only a sparse subnetwork of the model. We evaluate LoTA on a wide range of challenging tasks such as instruction following, reasoning, math, and summarization. LoTA obtains better performance than full fine-tuning and low-rank adaptation (LoRA), and maintains good performance even after training on other tasks -- thus, avoiding catastrophic forgetting. By extracting and fine-tuning over lottery tickets (or sparse task vectors), LoTA also enables model merging over highly dissimilar tasks.
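To make the mechanism concrete: adapt the model once, keep only the largest-magnitude weight deltas as the "lottery ticket", then retrain with every other weight pinned to its base value. A rough PyTorch sketch of that idea as I read the abstract (not the authors' code; keep_frac is an illustrative knob):

```python
import torch

def lota_ticket(base_state, adapted_state, keep_frac=0.1):
    # Per-tensor mask of the top-|delta| weights after a first adaptation pass.
    masks = {}
    for name, w0 in base_state.items():
        delta = (adapted_state[name] - w0).abs()
        k = max(1, int(delta.numel() * keep_frac))
        thresh = torch.topk(delta.flatten(), k).values.min()
        masks[name] = delta >= thresh
    return masks

@torch.no_grad()
def project_to_ticket(model, base_state, masks):
    # Call after each optimizer step: weights outside the ticket snap back
    # to the base model, so only the sparse subnetwork is ever modified.
    for name, p in model.named_parameters():
        p.copy_(torch.where(masks[name], p, base_state[name]))
```

Since each task only ever touches its own sparse ticket, tasks with mostly disjoint tickets can later be merged without clobbering each other, which is where the model-merging claim comes from.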

983
The original was posted on /r/stablediffusion by /u/Healthy-Nebula-3603 on 2024-08-19 21:23:26+00:00.

984
The original was posted on /r/stablediffusion by /u/ThunderBR2 on 2024-08-19 19:33:32+00:00.

985
The original was posted on /r/stablediffusion by /u/JellyDreams_ on 2024-08-19 19:11:54+00:00.

986
The original was posted on /r/stablediffusion by /u/ramainen_ainu on 2024-08-19 16:34:07+00:00.

987
The original was posted on /r/stablediffusion by /u/not5 on 2024-08-19 21:48:05+00:00.


Hi all!

Andrea here, you might remember me from some product photography relighting videos and workflows.

Anyway, since I work in the genAI field, and Flux Dev seems to be the model of choice in the (pun unintended) dev world, I thought I'd ask my lawyer for a legal opinion on the license agreement, and his opinion seems to be the opposite of what the community here usually upvotes.

I thought it'd be cool to start a discussion on it, because I've seen so many opposing opinions here and on GitHub / HuggingFace / YT / Discord that I'd be happy if someone in the same position as I am wanted to share their findings as well.

THE DIFFERENCES

My lawyer's opinion:

- no commercial use of the model and outputs, regardless of article 2 (d), about outputs ownership

Community's opinion:

- no commercial use of the model for finetuning and as the backbone of a service, no commercial use of the outputs for training, because of article 2 (d), about outputs ownership

ARTICLE 2 (D) AND 1 (C)

The article in question states:

Outputs. We claim no ownership rights in and to the Outputs. You are solely responsible for the Outputs you generate and their subsequent uses in accordance with this License. You may use Output for any purpose (including for commercial purposes), except as expressly prohibited herein. You may not use the Output to train, fine-tune or distill a model that is competitive with the FLUX.1 [dev] Model.

My lawyer indicated that "except as expressly prohibited herein" can refer to article 1 (c), which states:

“Non-Commercial Purpose” means any of the following uses, but only so far as you do not receive any direct or indirect payment arising from the use of the model or its output: (i) personal use for research, experiment, and testing for the benefit of public knowledge, personal study, private entertainment, hobby projects, or otherwise not directly or indirectly connected to any commercial activities, business operations, or employment responsibilities; (ii) use by commercial or for-profit entities for testing, evaluation, or non-commercial research and development in a non-production environment, (iii) use by any charitable organization for charitable purposes, or for testing or evaluation. For clarity, use for revenue-generating activity or direct interactions with or impacts on end users, or use to train, fine tune or distill other models for commercial use is not a Non-Commercial purpose.

thus making it virtually impossible to use the outputs in any commercial way: under (ii), the only licit use by commercial or for-profit entities is testing, evaluation, or non-commercial R&D in a non-production environment, paving the way to license adoption if the testing yields satisfactory results.

His theory is that BFL specified the non-ownership of outputs under 2 (d) in order to a) distance themselves from unforeseeable or unwanted outputs, b) reiterate the public-domain nature of outputs, and c) make it effectively impossible to create commercially usable outputs because of article 1 (iii).

The community, on the other hand, seems set on interpreting the whole of article 1 as a collection of definitions, and article 2 (d) as the actual license grant. This is mostly because of a) article 2's name ("License Grant"), and b) (IMO) an inherent preference for a more permissive license.

As such, the community steers towards reading the license in such a way that the non-commercial restriction applies only to the model itself and not the outputs, as if the two were separable not only in theory but also in practice. It's that "in practice" that I'm having trouble reconciling.

OTHER PEOPLE'S OPINIONS

A startup I'm working with has asked their lawyers, and they're quite puzzled by the vagueness created by article 2 (d). They suggest asking BFL themselves.

Matteo (Latent Vision, or Cubiq, the dev behind IPAdapter Plus) released his latest Flux video without monetization, explaining that the license wouldn't permit monetizing the video (even though, IMO, if the community's interpretation of the license were correct, YT videos would fall under article 1 (c) (i), "testing for the benefit of public knowledge").

WHAT I'M DOING

For now, I'm both asking you here and writing an email to BFL hoping for some clarification on the matter. In the meantime, I'm holding off on developing further on Flux Dev, just to err on the side of caution.

Did anyone in the community here ask their lawyer(s) about their opinion on this license?

988
The original was posted on /r/stablediffusion by /u/shokuninstudio on 2024-08-19 19:17:02+00:00.

989
The original was posted on /r/stablediffusion by /u/Sroyz on 2024-08-19 15:41:39+00:00.

990
The original was posted on /r/stablediffusion by /u/DigitalCloudNine on 2024-08-19 12:33:57+00:00.

991
The original was posted on /r/stablediffusion by /u/BootstrapGuy on 2024-08-19 16:03:21+00:00.

992
The original was posted on /r/stablediffusion by /u/morerice4u on 2024-08-19 15:51:41+00:00.

993
The original was posted on /r/stablediffusion by /u/Zombiehellmonkey88 on 2024-08-19 15:12:25+00:00.

994
The original was posted on /r/stablediffusion by /u/rolux on 2024-08-19 14:09:00+00:00.

995
The original was posted on /r/stablediffusion by /u/Enturbulated on 2024-08-19 11:10:06+00:00.


Forge just updated [1] and added a bunch of GGUF quant types! And, gee, the list of types there looks just about exactly like the list of quants available from city96 [2]. What a coincidence!

So what the hell, here goes!

[Images got eaten on post? Eh, link here: ]

Prompt: (photograph of) a perky pug in a cluttered kitchen, the floor strewn with the remains of cereal boxes, standing next to a sign reading "I found food!"

CFG 1, Distilled CFG 3, 896 x 1152, Euler / Simple, 20 steps

Running Forge, from commit e5f213c "upload some GGUF supports"

RTX 2060 6GB, using CPU for RNG

Four images per quant, seeds 4111973613 through 4111973616
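If you want to repeat the timing runs, here's a rough sketch of scripting them against the A1111-style HTTP API that Forge exposes when launched with --api. The "distilled_cfg_scale" key and the checkpoint names are my assumptions, so check them against your own install:

```python
# Loops over quant checkpoints and the four seeds, timing each quant's batch.
import time
import requests

URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"
PROMPT = ('(photograph of) a perky pug in a cluttered kitchen, the floor '
          'strewn with the remains of cereal boxes, standing next to a sign '
          'reading "I found food!"')
QUANTS = ["flux1-dev-Q2_K", "flux1-dev-Q4_0", "flux1-dev-Q8_0"]  # etc.

for quant in QUANTS:
    start = time.perf_counter()
    for seed in range(4111973613, 4111973617):  # the four seeds used here
        payload = {
            "prompt": PROMPT,
            "steps": 20,
            "cfg_scale": 1,
            "distilled_cfg_scale": 3,  # assumed Forge-specific key
            "width": 896,
            "height": 1152,
            "seed": seed,
            "sampler_name": "Euler",
            "scheduler": "Simple",
            "override_settings": {"sd_model_checkpoint": quant},
        }
        requests.post(URL, json=payload, timeout=1200).raise_for_status()
    print(f"{quant}: {time.perf_counter() - start:.0f}s total")
```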

flux1-dev-Q2_K    6m16s total
flux1-dev-Q3_K_S  7m15s total
flux1-dev-Q4_0    6m40s total
flux1-dev-Q4_1    6m51s total
flux1-dev-Q4_K_S  6m50s total
flux1-dev-Q5_0    8m36s total
flux1-dev-Q5_1    8m34s total
flux1-dev-Q5_K_S  run errored (sha256sum matches Hugging Face's info); skipped for now!
flux1-dev-Q6_K    7m52s total
flux1-dev-Q8_0    7m10s total

Mostly a straightforward increase in detail as quant size goes up, and a corresponding increase in image-to-image variance as quant size goes down. Also as expected, time increases with file size, though the Q3 and Q5 types take disproportionately long to process (their dequantization involves more steps). As always, expect some variation on different hardware, though the trends tend to hold.

Worth a mention that model behavior may be influenced in other ways. Playing with dev and schnell at Q8 and Q4 a little before this, Q8 seemed to more readily change image styles when it saw keywords (or combinations of keywords) that suggest a style. I wouldn't be surprised if other aesthetic quirks surface between the different quants; I haven't explored this enough to say much more yet.

Oh, if you're wondering about the choice of subject material? Yes, I'm a bit annoyed at a lovely puppy that did nothing wrong but take advantage of an opportunity people should not have given it ... repeatedly! :-)

[1]

[2]

996
The original was posted on /r/stablediffusion by /u/darkside1977 on 2024-08-19 13:41:13+00:00.

997
The original was posted on /r/stablediffusion by /u/Firm_Ear9809 on 2024-08-19 12:50:36+00:00.

998
The original was posted on /r/stablediffusion by /u/Bobsprout on 2024-08-19 12:16:07+00:00.

999
The original was posted on /r/stablediffusion by /u/Djo226 on 2024-08-19 10:52:39+00:00.

1000
The original was posted on /r/stablediffusion by /u/CaptTechno on 2024-08-19 09:24:42+00:00.


Currently I really like UltraSharp4x.
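For anyone wanting to compare upscalers outside a UI, here's a minimal sketch using the spandrel loader; the filename 4x-UltraSharp.pth is an assumption, point it at wherever your copy lives:

```python
# Runs an ESRGAN-family upscaler checkpoint on a single image.
import numpy as np
import torch
from PIL import Image
from spandrel import ModelLoader

desc = ModelLoader().load_from_file("4x-UltraSharp.pth")  # assumed filename
device = "cuda" if torch.cuda.is_available() else "cpu"
model = desc.model.eval().to(device)

img = Image.open("input.png").convert("RGB")
# HWC uint8 -> NCHW float in [0, 1]
x = torch.from_numpy(np.array(img)).permute(2, 0, 1).float().div(255).unsqueeze(0).to(device)
with torch.no_grad():
    y = model(x).clamp(0, 1)
out = (y.squeeze(0).permute(1, 2, 0).cpu().numpy() * 255).astype(np.uint8)
Image.fromarray(out).save("output_4x.png")
```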
