this post was submitted on 21 Jul 2023

95 points (97.0% liked)

Why are several instances down simultaneously? Is it just me? (lemm.ee)

submitted 1 year ago* (last edited 1 year ago) by OhNoMyInstanceIsDown@lemm.ee to c/fediverse@lemmy.world

Can't post images because they're too big so here's imgur: https://imgur.com/a/Fm52ZTB

Edit: lemmy.ml and lemmy.world seem to have come back, I'm just a bit worried that it's another one of those hacks.

Edit 2: Most of those I've tried came back. reddthat.com and sh.itjust.works seem to still be down.

[–] badmin@lemm.ee 37 points 1 year ago (1 children)

sh.itjust.works admin via matrix:

"looks like a bunch of instances are under attack at the moment"

[–] Shadow@lemmy.ca 4 points 1 year ago* (last edited 1 year ago)

Not an attack.

See https://github.com/LemmyNet/lemmy/issues/3649 and https://github.com/LemmyNet/lemmy/issues/3165

[–] kid2908@lemmynsfw.com 37 points 1 year ago (2 children)

Here is the reason for lemmy.fmhy.ml

https://very.bignutty.xyz/notes/9hf13it1ced3b2za

.ml domains (the TLD fmhy.ml was on) have been reclaimed by the Mali government

Freenom is also being sued by Meta (and has been for the past few months)

Both of these have resulted in fmhy, along with a lot of other domains, becoming unresolvable

Changing domains will cause us to have to refederate and start mostly from scratch (although we might be able to transfer posts and users)

[–] kid2908@lemmynsfw.com 24 points 1 year ago

And it seems to be getting worse for lemmy fmhy

https://very.bignutty.xyz/notes/9hg4dquksvbha67h

All services, except Lemmy, are up and running again via the new domain: https://fmhy.net

Lemmy itself will require a significant database cleanup to get users transferred; transferring posts and communities may not be possible

Cleanup will have to be done either manually or with a custom script

And now to answer some common questions I've seen floating around:

Why is lemmy.ml not affected? WE DON'T KNOW. My assumption is that it's popular enough (or lucky enough) to not be affected by this change. The only difference between our two domains is that lemmy.ml doesn't use CF, and AFAIK, multiple other domains (with and without CF) are still unreachable.

Why can't you just change the domain? That's not how federation works. Most services (including Mastodon and Lemmy) do not support changing URL properly, as doing so could potentially break the whole network.

Why is it taking so long to get back up? Not all staff members have access to the server, CF, or domain registrar. Not to mention that this is a complicated task that requires a lot of effort, and one that we didn't even know we had to do until the following morning. (We were under the assumption that the .ml TLD didn't just explode.)

[–] kmkz_ninja@lemmy.world 7 points 1 year ago (1 children)

What is Freenom and why are we supposed to know what that is?

[–] floridaman@lemmy.blahaj.zone 8 points 1 year ago* (last edited 1 year ago)

Freenom is a domain registrar that gives out 'free' domains for up to a year. They were the registrar for .ml domains until this happened.

Edit: spelling

[–] OverfedRaccoon@lemmy.world 22 points 1 year ago* (last edited 1 year ago) (1 children)

Not saying it was a coordinated attack (per your edit), but anything popular is a prime target for various types of attacks, especially easier stuff like DDoS. But with every attack, the developers and various admins/owners of instances learn something new and how to mitigate it. So while it's annoying, it's just as much a blessing as it is a curse - better to patch things quickly than leave an exploitable hole open for who knows how long with access to who knows what.

[–] Linuturk@lemmy.onitato.com 12 points 1 year ago (2 children)

I run my own single user instance, and it was down as well. Not sure why someone would target a single user instance. Not ruling it out, but it seems unlikely.

[–] mrmanager@lemmy.today 2 points 1 year ago* (last edited 1 year ago)

All instances using federation are publicly visible and it's simple to script attacking all of us.

However it's even easier to just attack Lemmy.world since almost everyone is there and it will have maximum disruption on everyone. People have centralized on one server. :)

But it's just some denial of service attacks right now. Eventually they'll probably get tired of it too. There is no point to it really.

[–] ApathyTree@lemmy.dbzer0.com 2 points 1 year ago (1 children)

I’m thinking about doing this for making my main account, since I never look at my local feed anyway.

How’s the experience? And do you know of a good starting point?

[–] Linuturk@lemmy.onitato.com 5 points 1 year ago (1 children)

The experience is pretty good except for discoverability of new communities. My Subscribed and All feeds are the same. I started with the official local development docker-compose file and massaged it into place for my setup.

[–] ApathyTree@lemmy.dbzer0.com 1 points 1 year ago

Ah, yeah that could be problematic. Maybe I’ll hold off until accounts can be migrated, and port this one over to expand the content. Start subbing to everything I even tangentially like 😅. Tho maybe they will fix the discoverability issue - sounds like it happens kinda a lot.

Thanks for replying!

[–] YoungPrinceAmmon@lemmy.world 11 points 1 year ago

Yeah, for me too. Seems like the whole fediverse has issues, problems with loading posts etc., at least for me.

[–] Blaze@iusearchlinux.fyi 11 points 1 year ago (2 children)

Same here, sh.itjust.works is currently down:

https://www.isitdownrightnow.com/sh.itjust.works.html

[–] dandroid@dandroid.app 24 points 1 year ago (2 children)

sh.itdont.work

[–] Blaze@iusearchlinux.fyi 8 points 1 year ago (2 children)

It does most of the time, to be honest

[–] dandroid@dandroid.app 13 points 1 year ago

I know, but I couldn't pass up this opportunity to make that joke.

[–] A10@kerala.party 3 points 1 year ago (1 children)

Nice Domain Blaze 😀

[–] Blaze@iusearchlinux.fyi 2 points 1 year ago

Thank you! 😄

[–] OhNoMyInstanceIsDown@lemm.ee 1 points 1 year ago

sh.itjustdoesn'tfucking.work

[–] badmin@lemm.ee 2 points 1 year ago

Which is not explained by the .ml fiasco, it is worth pointing out.

[–] hitagi@ani.social 10 points 1 year ago (2 children)

There is a GitHub issue on it and I experienced the exact same thing with my instance. A timeout occurs and the only way to fix it, it seems, is to restart. Like everyone else, I find it strange that it all happened at the same time.

[–] zalack@kbin.social 5 points 1 year ago* (last edited 1 year ago) (2 children)

It's not that strange. A timeout occurs on several servers overnight, and maybe a bunch of Lemmy instances are all run in the same timezone, so all their admins wake up around the same time and fix it.

Well it's a timeout, so by fixing it at the same time the admins have "synchronized" when timeouts across their servers are likely to occur again since it's tangentially related to time. They're likely to all fail again around the same moment.

It's kind of similar to the thundering herd, where a bunch of things getting errors will synchronize their retries in a giant herd and strain the server. It's why good clients will add exponential backoff AND jitter (a little bit of randomness in when the retry is done, not just every 2^x seconds). That way if you have a million clients, it's less likely that all 1,000,000 of them will attempt a retry at the exact same time just because they all got an error from your server at the same time when it failed.
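To make the backoff-plus-jitter idea concrete, here is a rough Python sketch. It is not taken from Lemmy or any particular client; the names `backoff_delays` and `call_with_retries`, and the default `base` and `cap` values, are made up for illustration. It uses the "full jitter" variant, where each wait is drawn uniformly between 0 and an exponentially growing ceiling.

```python
import random
import time


def backoff_delays(max_retries, base=1.0, cap=60.0, rng=random):
    """Yield one randomized wait (in seconds) per retry attempt.

    base and cap are illustrative defaults, not values from any real client.
    """
    for attempt in range(max_retries):
        # Exponential ceiling: base * 2^attempt, capped so waits stay bounded.
        ceiling = min(cap, base * 2 ** attempt)
        # Jitter: pick a random point below the ceiling so that clients which
        # failed at the same moment do not all retry at the same moment.
        yield rng.uniform(0, ceiling)


def call_with_retries(operation, max_retries=5, sleep=time.sleep):
    """Call operation(); on failure wait a jittered delay and try again.

    Re-raises the last error once max_retries retries have been used up.
    """
    delays = backoff_delays(max_retries)
    while True:
        try:
            return operation()
        except Exception:
            delay = next(delays, None)
            if delay is None:
                raise  # out of retries: surface the most recent error
            sleep(delay)
```

Something like `call_with_retries(fetch_page)` (with `fetch_page` being whatever request you want to retry) would then try once and retry up to five times, with each client waiting a different random amount instead of everyone hitting the server in lockstep.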

Edit: looked at the ticket and it's not exactly the kind of timeout I was thinking of.

This timeout might be caused by something that's loosely a function of time or resources usage. If it's resource usage, because the servers are federated, those spikes might happen across servers as everything is pushing events to subscribers. So, failure gets synchronized.

Or it could just be a coincidence. We as humans like to look for patterns in random events.

[–] hitagi@ani.social 3 points 1 year ago

Interesting. Never thought of it that way.

[–] Blaze@iusearchlinux.fyi 1 points 1 year ago

Interesting

[–] badmin@lemm.ee 3 points 1 year ago (1 children)

wrong issue lol

[–] hitagi@ani.social 1 points 1 year ago (1 children)

This probably makes more sense although the issue I was experiencing earlier had similar logs as the issue I linked and others have commented on it too around the same time. I'm guessing they're related.

[–] Shadow@lemmy.ca 2 points 1 year ago (1 children)

The original issue is just a symptom of all database threads being tied up. People just don't know how to follow an error message to the root cause.

The real source of the issue is db locking from triggers and cascading deletes on a major user change.

My report in https://github.com/LemmyNet/lemmy/issues/3649 has the offending query.

[–] hitagi@ani.social 2 points 1 year ago

Thanks for clarifying.

[–] ryry1985@lemmy.world 6 points 1 year ago* (last edited 1 year ago)

I had that too. Tried multiple instances myself. Nice username btw.

[–] ArchmageAzor@lemmy.world 4 points 1 year ago (1 children)

Another DDOS? There's been a few of them lately

[–] Blaze@iusearchlinux.fyi 8 points 1 year ago

Probably a sign that the platform is getting more popular, nobody wants to DDoS a place without any users

[–] C3D@lemmy.world 4 points 1 year ago

It seems like it's more of a bug that spread than a DDoS.

Restarting the backend of an instance should fix the problem. Otherwise, web users could try clearing their cookies.

[–] whiskers@lemmy.world 3 points 1 year ago

I was facing issues with lemmy.world too, now seems to be okay

[–] rglullis@communick.news 3 points 1 year ago* (last edited 1 year ago) (1 children)

For those affected by these outages in the larger servers and who'd be interested in helping spread around: my instance will be free for the first 250 registrations. There are ~220 spots still left.

The catch is that my registration process is (purposefully) difficult to avoid squatters/spammers/bots. So you need to do one of the following:

sign up to my main portal site first. If you do that, please give your email address so that I can send a confirmation with the first credentials.
sign up to communick.news directly, and send me a DM (on reddit (/u/rglullis), here, or on mastodon) with the username you used to sign up. If you don't send the DM, I will assume it's a bot and will deny the application.

Edit: downvoters, please don't be so cynical. I've been offering this even before the reddit blackout. What is so bad about it?

[–] Blaze@iusearchlinux.fyi 4 points 1 year ago

Thanks for sharing this, don't mind the downvotes

[–] donut4ever@lemm.ee 3 points 1 year ago* (last edited 1 year ago) (1 children)

Only one working for me is ~~lemme~~lemm.ee. Lemmy.world and sh.itjust.works don't work

[–] YoBuckStopsHere@lemmy.world 2 points 1 year ago (1 children)

Lemmy.world was only down for ten minutes

[–] donut4ever@lemm.ee 1 points 1 year ago

Everything is back up now. I used connect for Lemmy and it actually tells you why an instance is not working. Lemmy.world was under maintenance, and the other one was just down.

[–] null@zerobytes.monster 2 points 1 year ago* (last edited 1 year ago) (1 children)

Overloaded servers (CPU, RAM, disk), not enough bandwidth, endless possibilities...

[–] HedonismB0t@lemmy.ml 2 points 1 year ago (1 children)

Occurring on multiple instances at the same time? Unlikely.

[–] null@zerobytes.monster 1 points 1 year ago* (last edited 1 year ago)

idk, my server is in good shape 😅 But it could be some bug in the code which overloads them.

[–] flyingcloud11@lemmy.world 2 points 1 year ago (2 children)

That’s the problem with the fediverse. They tend to go down a lot. Same thing happens with mastodon servers.

[–] master5o1 6 points 1 year ago

So does reddit.

[–] Blaze@iusearchlinux.fyi 4 points 1 year ago (1 children)

That's fine by me. Opportunity to do something else.

[–] morganpurr@lemmynsfw.com 5 points 1 year ago

But infinite contenttttt engagementtttt shareholderssssd

[–] jsveiga@feddit.nl 2 points 1 year ago* (last edited 1 year ago) (1 children)

When vlemmy.net disappeared (and it was the only one I had registered to), I registered at lemmy.world, sh.itjust.works, and while I was trying to register at lemmy.ML, I registered at feddit.NL by mistake. (then I requested a login at lemmy.ml, but never got a confirmation).

At the moment feddit.NL is the only instance I have a login at that I can use.

That was a happy mistake :-)

Maybe the problem is me. If feddit.nl goes down, I'll know for sure.

[–] Blaze@iusearchlinux.fyi 2 points 1 year ago

Ha ha ha

[–] freamon@endlesstalk.org 1 points 1 year ago

I get that message pretty much every time I visit a Community that I haven't visited before (or maybe one that no one else from my instance has visited before).

It gets fixed on a refresh (like the message suggests), so I'm guessing it's a problem with time-out settings when my instance has to pull in a lot of new data.
