this post was submitted on 27 Jan 2025
209 points (90.0% liked)

[–] kromem@lemmy.world 3 points 1 month ago

Some reluctance to discuss this does exist at the weight level. This thread graphs refusals to criticize different countries across different models:

https://x.com/xlr8harder/status/1884705342614835573

But the OP's refusal is happening at the provider level, and that kind of refusal still intercepts the response even after the model itself relaxes in longer contexts (which happens with nearly every model).

At a weight level, nearly all alignment lasts only a few pages of context.

But intercepted refusals occur across the context window.
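
To make the distinction concrete, here's a toy sketch. Everything in it is an assumption for illustration: `ToyModel`, `provider_filter`, `served_reply`, the blocklist, and the 2000-token threshold are all made up and don't describe how any real model or provider works. It only shows the shape of the argument: a refusal that lives in the model fades as context grows, while a check that runs outside the model applies on every call.

```python
from dataclasses import dataclass

# Hypothetical blocklist, purely for illustration.
BLOCKED_TERMS = {"forbidden topic"}


@dataclass
class ToyModel:
    """Stand-in for model weights: refuses only while the context is short."""

    relax_after_tokens: int = 2000  # made-up threshold, not a measured value

    def generate(self, context_tokens: int, prompt: str) -> str:
        sensitive = any(term in prompt.lower() for term in BLOCKED_TERMS)
        if sensitive and context_tokens < self.relax_after_tokens:
            return "I can't discuss that."  # weight-level refusal
        return f"Answer about: {prompt}"


def provider_filter(prompt: str, output: str) -> str:
    """Stand-in for a provider-side check that runs outside the model."""
    text = f"{prompt} {output}".lower()
    if any(term in text for term in BLOCKED_TERMS):
        return "[response withheld by provider]"
    return output


def served_reply(model: ToyModel, context_tokens: int, prompt: str) -> str:
    """What the user sees: model output passed through the provider check."""
    return provider_filter(prompt, model.generate(context_tokens, prompt))
```

In this toy setup, calling `ToyModel().generate(...)` directly with a large `context_tokens` gets an answer, but `served_reply(...)` with the same arguments is still withheld, because the filter never looks at context length.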