Lemmy World is down once again.

favrion@lemmy.ml to Fediverse@lemmy.ml – 189 points –

Ugh.


Move to a new, smaller instance. You can still use lemmy.world as though it was still running at full speed. You can still post to lemmy.world or other federated communities, and your experience won’t be so painful.

Lemmy.world is experiencing an influx of Redditors, and with us good Redditors come our awful trolls. Growth, along with DDoS attacks, has plagued the site since I began using it.

And if the thought of setting up another account annoys you, I've made a tool that will migrate your account settings, subscriptions, and blocks: https://github.com/CMahaff/lasim

Now that does require the source instance to be up long enough to download your profile, but after that you can upload to any instance you want and be running like nothing changed in like 2 minutes.

Tbh, I kinda like that we have these growing pains. Helps folks leave out older expectations of monolithic profit-oriented social platforms. And actually put some money down to help host the specific niche community they really want to exist.

I'll probably eat these words later but, as of this moment, I stand by it.

Yeah, this is the fediverse's way of telling us to break our old habit of piling on a single site and to spread out.

Except...it's being DDOS'd, so no, it isn't.

If anything this is basically establishing to everyone out there that if you want to kill an instance or encourage people to move to a different one (with different admins who might have different..."styles"), just DDOS it and promote your alternative instance as a refuge.

I'm sticking with .world because the admins there are chill. Don't feel like rolling the dice on a new instance where some power mods probably set up shop.

You do you, but I wouldn't say that lemm.ee, sh.itjust.works, sopuli.xyz or reddthat.com are run by power mods.

Particularly sh.itjust.works. They're letting the community vote on instance rules and the Admin is really chill

A gaming instance would take a lot of load.

Isn't https://lemmy.zip/ gaming focused?

If you can't tell from the name it's not going to work well.

Blahaj is queer focused, but there is no way you can get that from the name

It's a tumblr meme and IKEA reacted to the association by using the shark as a prop in their pro-gay marriage ads in Switzerland. F1nnster has like twenty in a pile in the background as people keep buying them for him.

I didn't know blahaj was LGBTQ-related. That's so fucking cool!

I love IKEA.

I wish people would stop giving this advice without some caveats. The instance you choose is also about the admins you're choosing to have your account under.

I'll stay on Lemmy.world because I trust the admin there. Any time you jump to a new instance, you better hope it's run by levelheaded, fair-minded people.

You don't have to jump to tiny < 100 user instances though; any instance within the top 10 is a good alternative to lemmy.world. If everyone thought the same as you do, there would be no point in federation.

Any instance in the top 10 is not a “newer smaller instance.”

Y'know what, you're right, but in an ideal federated world, it is probably for the best if people branch out further than just the top 10 as well. The instance I'm on probably is not even within the top 50, but it's fast, performant, and has all my subscriptions. Not sure about the admins, but I also have alts on lemm.ee and sh.itjust.works in case this instance goes bust.

Yeah. There is little reason to sign up on the biggest server since you can see and interact with content from any of the servers.

There’s plenty of reason for a new user, but otherwise agreed. Big instances are training wheels… I say from the biggest kbin instance.

along with DDoS attacks

Is lemmy.world getting DDoSed? Who would profit from that? (honest question. I hadn't read this before.)

DDoS are sometimes just people thinking "because I can", not necessarily motivated by profit.

A smallish-scale service like a lemmy server run by volunteers seems like an easy target, so it wouldn't be surprising if that were the case.


For all the annoyance, a silver lining is that lemmy.world is testing lemmy at a scale lemmy doesn’t see anywhere else, and so is aiding the development of the software and of architectural guidelines for instance management.

Yes. These are growing pains. That’s a good thing.

If you want a fast, stable, and well-funded instance, try some of the porn ones.

Only if it's Furry porn.

There is an instance centered around that actually, though I have no idea about the state of its stability or funding. I've made my account on a more general-content furry-themed instance and found it to be pretty well run though, don't really have any complaints.

They're defederated with many of the instances though

Can you recommend any that are good? Lemmynsfw seems to only be celebrities and it sucks.

You have to be signed in to see the actual porn lol

Try pornlemmy.com

Are the porn instances not federated either? Seems like a mistake on their part

The porn instances got defederated by most of the "mainstream" instances, including this one, lemmy.ml.

That strikes me as a bad move, because free porn == user growth

I meant, are they defederated from each other - I have taken a look at lemmynsfw.com and don't recall seeing any content from pornlemmy.com. Though, I don't remember what my view settings were at the time, so maybe I was just looking at local.

pornlemmy and lemmynsfw are both linked to each other, but there isn't much content on pornlemmy so far.

Honestly? That's great. It's stress-testing Lemmy and, to an extent, ActivityPub.

Growing pains. It's just gonna improve Lemmy in the long term.

If you don't like it, use another smaller instance like lemmy.zip or lemm.ee. You know, the entire point of decentralization.

I totally agree with your outlook and made a pretty similar post to yours a couple minutes ago. My only addition would be some concern as to why it seems like attacks are causing the downtime. The attacks do encourage improvement, but why do it in the first place? I'm hoping it's bored enthusiasts. At least then it wouldn't be BS corporate attacks trying to eliminate competition.

Suffering from success.

Don't worry though they'll figure it out. The early days of Reddit were pretty unstable. Definitely sucks for now but it'll get better

That's why I use lemm.ee

I have accounts on a few instances, and lemm.ee is the quickest and most stable of them all. I don't know what they're doing, but it's great.

Can't find the post right now but sunaurus said that the main difference is that lemm.ee is the only instance using a horizontal setup, that is, there are multiple lemmy processes running on multiple servers behind a load balancer, all sharing a database (postgres itself clusters very well). The code isn't actually made for that so it's all rather custom and possibly specific to his hoster.
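To make "horizontal setup behind a load balancer" concrete, here is a minimal sketch of round-robin request distribution. It is only an illustration of the general idea: the backend names (`lemmy-1` and so on) and the `RoundRobinBalancer` class are made up for this example, and nothing here describes how lemm.ee is actually configured.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in strict rotation (hypothetical helper, not Lemmy code)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self._backends = cycle(backends)

    def pick(self):
        # Each call returns the next backend in the rotation.
        return next(self._backends)


def route(request_count, balancer):
    """Send request_count requests through the balancer and tally them per backend."""
    counts = {}
    for _ in range(request_count):
        backend = balancer.pick()
        counts[backend] = counts.get(backend, 0) + 1
    return counts


# Assumed backend names; all of them would talk to the same shared database.
balancer = RoundRobinBalancer(["lemmy-1", "lemmy-2", "lemmy-3"])
print(route(9, balancer))
```

With three backends, nine requests end up split three each, so no single process has to absorb the whole load. The shared database remains a single point of contention, which is why database-side bugs still hurt a setup like this.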

Less technical: sunaurus happens to be a beast of a devop and prolific contributor to the lemmy codebase. As such lemm.ee quite often runs code that's ahead of the release schedule, addressing stuff that he stumbled across while wearing his sysop hat.

lemdro.id also runs via horizontal scaling behind a load-balancer, soon to expand globally to keep response times down for people everywhere. We're very resilient :)

I got accounts on lemm.ee, sh.itjust.works and kbin.social. I had one on .world in the beginning, but the performance wasn't great. Probably too many users.

Probably too many users.

if local.lemmyusers > 15, crash constantly because of PostgreSQL nonsense logic and Rust ORM.

That’s why I use lemm.ee

1993: God, how we would love it if someone could tell us anything was “just that simple”, and then of course when you see a pie chart you go “Oh, a pie chart…”. I mean, it has more religious meaning now than a crucifix to see a pie chart. I mean, because…. why is that so popular? Because it reduces complexity. The complexity is very real but his little soundbites - 1993

@garpunkal@lemm.ee - do you know of the history of site_aggregates PostgreSQL table?

huh?

huh?

Please explain in detail what "huh" means in this context.

As I said in the comment you replied to: do you know of the history of site_aggregates PostgreSQL table?

Not OP, but I feel like it was "Huh?" as in "what the heck are you talking about, and why was it a reply to their comment?"

no tell me more?

lemmy.ca staff was so frustrated with performance problems a couple weekends ago that they cloned a copy of their database. Running AUTO_EXPLAIN revealed that the site_aggregates logic in Lemmy was doing comment = comment + 1 counting against 1500 rows, one for every known Lemmy instance in the database, instead of just writing 1 row.
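A rough way to see that kind of bug in miniature: the sketch below uses SQLite (via Python's standard `sqlite3` module) rather than PostgreSQL, and a made-up two-table schema, so it is not Lemmy's real trigger or table layout. The assumption is simply that a counting trigger without a `WHERE` clause touches every row, while one with a `WHERE` clause touches only the local site's row (taken here to be `site_id = 1`).

```python
import sqlite3

# Hypothetical trigger bodies; the real Lemmy trigger was PostgreSQL, not SQLite.
BUGGY = "UPDATE site_aggregates SET comments = comments + 1"
FIXED = "UPDATE site_aggregates SET comments = comments + 1 WHERE site_id = 1"


def rows_touched_per_comment(trigger_body, instances=1500):
    """Insert one comment and report how many site_aggregates rows the trigger changed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE site_aggregates (
            site_id  INTEGER PRIMARY KEY,
            comments INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE comment (id INTEGER PRIMARY KEY, body TEXT);
        """
    )
    # One aggregate row per known instance, all counters starting at zero.
    conn.executemany(
        "INSERT INTO site_aggregates (site_id) VALUES (?)",
        [(i,) for i in range(1, instances + 1)],
    )
    conn.executescript(
        f"CREATE TRIGGER count_comment AFTER INSERT ON comment "
        f"BEGIN {trigger_body}; END;"
    )
    conn.execute("INSERT INTO comment (body) VALUES ('hello')")
    (touched,) = conn.execute(
        "SELECT COUNT(*) FROM site_aggregates WHERE comments > 0"
    ).fetchone()
    conn.close()
    return touched


print("rows written without WHERE:", rows_touched_per_comment(BUGGY))
print("rows written with WHERE:   ", rows_touched_per_comment(FIXED))
```

Under these assumptions a single new comment rewrites all 1500 aggregate rows with the first trigger and exactly one row with the second, which is the difference described above.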

Dude just move to a small, updated instance with good uptime. I joined aussie.zone and it's never down, plus it feels so much snappier.

i tried this, is it normal to never receive the verification email? it says verification sent, i tried 4 diff smaller instances, and its been like 10 hours. i checked the spam too

🤣🤣 it's just you dude. I have switched instances multiple times and never had an issue like you described. Try the aussie.zone instance

i am not australian but why not lol

edit: meh nvm, dont wanna wait to be accepted. im american anyway,

Doesn't matter. Even I am not Australian but I am using aussie.zone

That’s why I joined my local “area of my country” instance. I just need to subscribe to all my communities here and I’ll be good. The app I use (Memmy) lets you easily switch between profiles on any instance too.

Until that other one goes down, then memmy really struggles to let you in. At least that’s what happened to me.

I guess I’m betting they don’t go down at once. I guess it can happen tho.

No I mean there’s some weird bug in Memmy. My second account went down and I had to reinstall Memmy to access my first one

I wanted to stay with World. But they are literally constantly having severe problems (errors, voting, comments). Gave them many chances. I'm sorry World, but I probably would abandon Lemmy if I had to stay, it's just not an enjoyable experience.

Maybe it's not even possible to keep a large (targeted) instance working with current limitations and tools? Hope they figure it out. I'm rooming with Stux at Geddit, he's cool. So many cats...

Having glanced at the code and taken my own instance down a few times cleaning up a surprisingly small number of automated posts, it's definitely the combination of some design choices in the code and the scale of lemmy.world. Keeping an instance up that has so many posts and communities has been difficult on my instance, and I'm basically the only user. I can imagine with the scale and lemmy.world load and publicity, it's nearly impossible until some improvements to the data layer are made.

For one example, purging a community with 1k posts and 30k comments (I was messing with a bot) took my instance down for 2 hours with the postgres database pegged at a full core minimum. And then it took down my instance. And then I restarted the database but presumably this was done in a transaction so no progress had been made.

I'm personally impressed with the amount of uptime lemmy.world has managed. And I'm also impressed with lemmy overall, but it's pretty clear there has been some rapid growth that, as it usually does, exposed some of the limits of the design and requires some improvements for the current scale.

As a user with pretty limited knowledge of servers, I appreciate how easy it is to flip from instance to instance. One goes down and I use a backup account somewhere else. I'm really not paying close attention to which one I'm on at the moment. It gives me access and lets me comment/post. I'm not trying to build any profile so it doesn't matter. I could see a major poster or mod wanting multiple accounts with the same name for consistency's sake though.

You can check lemmy.world's status page to see when it’s down.

Problem is that many times it will say "partial outage" but the website doesn't even work, so technically it's a full outage. I assume it's to keep the uptime % as high as they can. So that 98.XX% uptime isn't very accurate at all.

What is their motivation for lying about uptime? It isn't a business with advertisers, it is some dude's hobby server and some people who are donating regardless of what the uptime percentage is

correct, I dont know if it's automatic to partial outage and manual trigger to full or how that works in their backend. But almost every time I've seen a partial (orange) outage, it's a full blown outage.

I moved to sopuli.xyz because of this. I can still subscribe to all the communities I like so no point in staying in an instance that's constantly down.

Problem is, so many communities are on .world now, so it hurts even from another instance

I hope people on the Fediverse will finally learn not to choose the biggest instance all the time

I think it's more like the previous commenter said. It's the communities more than the users. Every post, comment, like needs to be sent to every other instance that subscribes to the community. I suspect it's definitely connected to federation. The reason being, at 20:00 utc yesterday lemmy.world stopped sending my instance anything (previously it was between 2 and 5 messages a second). It only started again at around 00:00 utc. I wonder if they were slowly adding instances back to federation?

In any case the load for that many communities with that many other instances must be huge. The advantages of the fediverse require that communities AND users are spread between instances. In the current climate, the super instances have most of both and it must be becoming exponentially harder to keep up with hardware requirements for this.
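As a back-of-the-envelope sketch of that fan-out: if every activity in a community has to be delivered once to each subscribing instance, the outbound load on the hosting instance is roughly activities times subscribing instances. The numbers and host names below are invented for illustration, and the model ignores retries, batching and anything else the real federation code does.

```python
def deliveries_by_host(communities):
    """Sum outbound deliveries per hosting instance.

    communities: iterable of (host, activities, subscriber_instances) tuples,
    where each activity is assumed to be sent once to every subscribing instance.
    """
    load = {}
    for host, activities, subscriber_instances in communities:
        load[host] = load.get(host, 0) + activities * subscriber_instances
    return load


# Hypothetical scenario: four equally busy communities, 500 subscribing instances each.
concentrated = [("big.example", 100, 500)] * 4
spread = [(f"host-{i}.example", 100, 500) for i in range(4)]

print(deliveries_by_host(concentrated))  # one host carries all four communities
print(deliveries_by_host(spread))        # same total, split across four hosts
```

The total number of deliveries is the same in both cases (200,000 in this made-up example), but when the communities are spread out each host only has to send a quarter of them.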

That's a very valid point. Sometimes I question if very small instances (1-10 users) are not more detrimental than anything to the general performance

Whose fault is it though? If an instance is capable of 100 concurrent users but everyone flocks to the two or three big instances, what to do? Block instances so they shut down? Then when the shit really hits the fan there's nowhere to distribute users to.

In the case of lemmy.world I might suggest they split the instance. Original lemmy.world keeps the communities but has no users. Create a new instance and transfer the users. That way the first instance is dedicated to federating the communities, moving the real time user database hits to a separate database. I'd also suggest preventing the creation of new communities on that instance.

In real terms it'd have been better if the communities were shared between instances more. Making a more even spread of the one to many distribution efforts.

sounds like a cool idea. hit Ruud up once they're less busy.

Ya still down for me. Even says my account doesn't even exist.

Can confirm, my main account and half my subscribed communities are down. Possibly unrelated, but my All is also screwy even though I'm on lemm.ee.

That's unfortunate. Hope that they are back soon.

Now that I've found a workable userstyle that gives kbin the same information density as old reddit (Narwhal) it may be time to switch over here. For better or worse kbin's funding situation seems a bit more ironclad. Also the fact that I can check Lemmy communities and do Mastodon at the same time is pretty attractive.

BACK ONLINE! YEY!

As of the time of this comment, now crashing

AGAIN Down

Name a more iconic duo, I'll not wait

Ruqqus (a former reddit alternative) and outages.

It was perhaps the single biggest unifying meme among its user base that their backend was an absolute dumpster fire (tm)

Right Now

Working, this comment time

yep, another big outage.

5801ms, terrible

This is what happens when people don't understand federation.

Yep. Sitting on Lemmy.today browsing Lemmy.world posts right now...so I don't know. Really advise people to not have just one account. :)

Do you know of the site_aggregates federation TRIGGER issue lemmy.ca exposed?

No. Care to explain please?

No. Care to explain please?

On Saturday July 22, 2023... the SysOp of Lemmy.ca got so frustrated with constant overload crashes they cloned their PostgreSQL database and ran AUTO_EXPLAIN on it. They found 1675 rows being written to disk (massive I/O, PostgreSQL WAL activity) for every single UPDATE SQL to a comment/post. They shared details on Github and the PostgreSQL TRIGGER that Lemmy 0.18.2 and earlier had was scrutinized.

You've become fixated on this issue but if you look at the original bug, phiresky says it's fixed in 0.18.3

The issue isn't who fixed it, the issue is the lack of testing to find these bugs. It was there for years before anyone noticed it was hammering PostgreSQL on every new comment and post to update data that the code never read back.

There have been multiple data overrun situations, wasting server resources.

But now Lemmy has you and Phiresky looking over the database and optimizing things so things like this should be found a lot quicker. I think you probably underestimate your value and the gratitude people feel for your insight and input.

In layman's terms please?

Every time you perform an action like commenting, you expect it to maybe update a few things. The post will increase the number of comments so it updates that, your comment is added to the list so those links are created, your comment is written to the database itself, etc. Each action has a cost, let's say it costs a dollar every update. Then each comment would cost $3, $1 for each action.

What if instead of doing 3 things each time you posted a comment, it did 1300 things. And it did the same for everyone else posting a comment. Each comment now costs $1300. You would run out of cash pretty quickly unless you were a billionaire. Using computing power is like spending cash, and lemmy.world are not billionaires.

What if instead of doing 3 things each time you posted a comment, it did 1300 things. And it did the same for everyone else posting a comment.

Yes, that is what was happening in Lemmy before lemmy.ca called it out with AUTO_EXPLAIN on PostgreSQL on Saturday, 8 days ago.

What are you asking for? lemmy.ml is the official developers' server, and it crashes constantly; every 10 minutes it errors out, for 65 days in a row.

I don't know that it's a DB design flaw if we're talking about federation messages to other instances' inboxes (which created rows of that magnitude for updates does sound like federation messages outbound to me). Those need to be added somewhere. On kbin, if installed using the instructions as-is, we're using rabbitmq (but there is an option to write to db). But failures do end up hitting sql still and rabbit is still storing this on the drive. So unless you have a dedicated separate rabbitmq server it makes little difference in terms of hits to storage.

It's hard to avoid storing them somewhere, you need to be able to know when they've been sent or if there are temporary errors store them until they can be sent. There needs to be a way to recover from a crash/reboot/restart of services and handle other instances being offline for a short time.

EDIT: Just read the issue (it's linked a few comments down) it actually looks like a weird pgsql reaction to a trigger. Not based on the number of connected instances like I thought.

(which created rows of that magnitude for updates does sound like federation messages outbound to me)

rows=1675 from lemmy.ca here: https://github.com/LemmyNet/lemmy/issues/3165#issuecomment-1646673946

It was not about outbound federation messages. It was about counting the number of comments and posts for the sidebar on the right of lemmy-ui to show statistics about the content. site_aggregates is about counting.

Yep I read through it in the end. Looks like they were applying changes to all rows in a table instead of just one on a trigger. The first part of my comment was based on reading comments here. I'd not seen the link to the issue at that stage. Hence the edit I made.

Latest, at the time of this comment: still over 4 SECONDS

Fresh as of comment time: