Lemmy World outages
Hello there!
It has been a while since our last update, but it's about time to address the elephant in the room: downtimes. Lemmy.World has been having multiple downtimes a day for quite a while now. And we want to take the time to address some of the concerns and misconceptions that have been spread in chatrooms, memes and various comments in Lemmy communities.
So let's go over some of these misconceptions together.
"Lemmy.World is too big and that is bad for the fediverse".
While one thing is true, we are the biggest Lemmy instance, we are far from the biggest in the Fediverse. If you want actual numbers you can have a look here: https://fedidb.org/network
The entire Lemmy fediverse is still in its infancy and even though we don't like to compare ourselves to Reddit it gives you something comparable. The total number of Lemmy users across all instances is currently 444,876, which is still nothing compared to a medium-sized subreddit. There are some points that can be made that it is better to spread the load of users and communities across other instances, but let us make it clear that this is not a technical problem.
And even in a decentralised system, there will always be bigger and smaller blocks within; such would be the nature of any platform looking to be shaped by its members.
"Lemmy.World should close down registrations"
Lemmy.World is being linked in a number of Reddit subreddits and in Lemmy apps. Imagine if new users land here and they have no way to sign up. We have to assume that most new users have no information on how the Fediverse works and making them read a full page of what's what would scare a lot of those people off. They probably wouldn't even take the time to read why registrations would be closed, move on and not join the Fediverse at all. What we want to do, however, is inform the users before they sign up, without closing registrations. The option is already built into Lemmy but only available on Lemmy.ml - so a ticket was created with the development team to make these available to other instance Admins. Here is the post on Lemmy Github.
Which brings us to the third point:
"Lemmy.World can not handle the load, that's why the server is down all the time"
This is simply not true. There are no financial issues to upgrade the hardware, should that be required; but that is not the solution to this problem.
The problem is that for a couple of hours every day we are under a DDOS attack. It's a never-ending game of whack-a-mole where we close one attack vector and they start using another one. Without going into too much detail and exposing too much, there are some very 'expensive' SQL queries in Lemmy - actions or features that take seconds instead of milliseconds to execute. By executing them thousands of times a minute, you can overload the database server.
So who is attacking us? One thing that is clear is that those responsible for these attacks know the ins and outs of Lemmy. They know which database requests are the most taxing and they are always quick to find another as soon as we close one off. That's one of the only things we know for sure about our attackers. Being the biggest instance and having defederated with a couple of instances has made us a target.
"Why do they need another sysop who works for free"
Everyone involved with LW works as a volunteer. The money that is donated goes to operational costs only - so hardware and infrastructure. And while we understand that working as a volunteer is not for everyone, nobody is forcing anyone to do anything. As a volunteer you decide how much of your free time you are willing to spend on this project, a service that is also being provided for free.
We will leave this thread pinned locally for a while and we will try to reply to genuine questions or concerns as soon as we can.
Have you guys contacted law enforcement? It may surprise you. A startup I worked for had the same issue and contacted the FBI. They were able to quickly (within hours) find the person doing it despite him using VPNs and other tools for OpSec.
I’d imagine that there are a lot of users and communities on here that want law enforcement as far away from the Fediverse as possible…
And yet, and this will shock and amaze you, they're probably here already. Lemmy isn't a secret.
Found the fed… ;)
No doubt, but there’s a difference between a van trundling down the street and a welcome mat and a tray of tea cooling in the living room.
I get you. There's good and bad in law enforcement, especially when it comes to tech and social media. On the one hand, there's pretty serious crime happening online that needs to be stopped. On the other, wild invasions of privacy. There's no easy answer at this point and governments obviously won't police themselves.
Illegal activity is actually easier to track on the Fediverse than on closed-source websites. It's easy to program bots to run through open source code looking for it.
I assure you that the FBI knew of lemmy and had watchers here before we hit 5 digit user numbers
I hate to break the illusion but cybersecurity experts already know about every Fediverse instance and it gets scanned regularly. Just like they do discord, FB, twitter, etc.
aint no way
Lemmy isn't a private space. It's less private than Reddit in many regards.
I don't see why when illegal things are happening the government's offered services shouldn't be made use of
Given that the goal of this instance is to serve as a reference of the Fediverse, it is expected that it will continue to grow, and in turn, attract more attention, which due to a game of numbers also involves more trolls and enemies. Thus, the fact that the instance is being DDOS'ed right now shouldn't be seen as a conjunctural problem, but rather a challenge that is here to stay and sometimes be a problem.
While I think it's a good idea for lemmy.world to do it this time, relying on a police force to routinely come to our call and do something means periods during which the instance will be out while we wait for them to work. The instance, and Lemmy in general, should have more robust defenses so that calling for external help is only required at exceptional times.
Did it result in charges for the person doing it?
For this, I want to see the motivation for DDOSing Lemmy lol.
There was a user who made hundreds of communities and got pissy when they were banned, there's heavy speculation that it's them.
That, or it could be right-wing neo-nazi chuds from the detonating-craniums instance that are butthurt that nobody wants to federate with them.
Could be reddit , hiring people to kill the competition 😅 (jk)
You joke, but I wouldn't be surprised in the least.
You don't need motive to convict. Just the correct mental state (mens rea) and the commission of the relevant elements (actus reus). Motive helps, but it's not necessary.
But a DDOS attack would probably fall under the CFAA, possibly some other criminal statutes depending on the facts.
I know, I just want to know what the motive is.
In all seriousness, we all appreciate your work. These are the growing pains that are to be expected, and your hard work and transparency (and writing it up at a level that even I can understand) is welcome.
I'm a data engineer with 20+ years of experience in SQL and various databases, and I do performance tuning on a daily basis. How can I help? Please message me if you think you can use me. I'd be very happy to help where I can!
Possibly not ideal for you as a data engineer, but you could try skimming down the GitHub database issues?
I'd rather someone just point me to the problematic query, obviously. That would be much easier and a better use of my time than running my own instance and faking data into tables to see where the bottleneck is …
I'm sure if you got in touch with the lemmy.world team and they made sure you weren't sketchy, they would help you get what you need to effectively find the root of the issues.
Also, hello from a less-experienced, not-sure-if-I-can-call-myself-a-data-engineer-yet data... person? Props for considering lending a hand to the instance!
People like you are the saviors
If you can start poking around in their GitHub.
I have huge respect for data engineers. Talk about unsung heroes. Thank you for everything you do.
Oracle gives me a headache thinking about it and once things get complicated with an enormous amount of tables and data, I leave it to people who know better. I will go back to programming PLCs, explaining how a warehouse control system works, and writing code in too many languages at once. That is my happy place. The big bad database can stay over there while I make machinery do my bidding.
Besides the actual developers of lemmy, no one has done more for the lemmiverse than the maintainers of lemmy.world. When the Reddit shitstorm started and other leading servers shut down user registration, you guys held the ship steady and didn’t flinch from the sudden flood of new users. Discovering new bottlenecks in lemmy code, helping to resolve them and deploying hot fixes. All in super fast reaction time. About the “lemmy.world shouldn’t be the largest server” crap - it’s good for lemmy that one server is the easy entry point to lemmy. This is where the “mainstream” communities could/should be and new users will have an easier landing. Having dedicated servers with their own communities (like Star Trek, piracy, etc) is great but it’s not mandatory for all communities.
Hey! Lemmy.dbzer0.com stayed open as well! :)
🫡
You definitely did!
I did this before in this thread but there are actually some others who helped us quite a bit. Lemm.ee's admin @sunaurus@lemm.ee and Lemmy contributor @phiresky@lemmy.world to name a few.
The main issues I have with Lemmy.world is basically how culturally tied to Reddit some users are, and I just want to get away from that.
I hang out on smaller instances because there's less people trying to uphold reddit standards and BS. Stuff like keeping track of defederations, but then claiming they're all based around some drama. Stuff like that is ultimately unhealthy for the site and was a root cause of reddit becoming more and more toxic over time.
Lemmy is more or less a Reddit clone, at least in how users interact with the site/apps. The more people migrate, the more this will happen. Admittedly, that’s why I’m here but I’m not sure what you mean by upholding Reddit standards. Reddit was/is community operated and, minus Reddit's moderation, the users here will shape the future of the site regardless of the instance in the same way. Subs get too big and create more serious or niche ones, until those get too big.
/r/gaming and r/games come to mind as an example
Reddit has major issues of power hungry mods & admins demanding others think like they do or suffer the consequences.
It's also good that these attacks are being leveled at an instance with enough technical talent, time, and money to deal with them. My understanding is that what's happening to LW could happen at ANY Lemmy instance. Assuming that's true whoever is doing this could crush smaller instances that are unable to deal with it.
Y'all are motherfucking gangsters. Appreciate the work you're putting in. I don't do your kind of code or I'd pitch in. Much love. ♥️
Imagine having the free time to engineer attacks on a site. Fucking loser.
Or, they have a commercial interest or are paid by someone who does. Fucking losers either way
I've got my bets on who it is.
As the post pointed out: these are people who know how Lemmy works. There are a few troll-websites that have been defederated from Lemmy.world, and those troll-websites (and their culture) are well known to retaliate in the form of DDOS attacks.
It sucks, but we shouldn't let them bully us. Instead, we can go to https://sh.itjust.works/c/lemmyworld@lemmy.world and... hey look, bringing down Lemmy.world temporarily doesn't actually stop us from talking or sharing our posts?
They're relying upon the fact that people are "used to" going to https://lemmy.world and don't know that every single member of the federation (sh.itjust.works, lemmy.ca, etc. etc. etc.) all serve as backups to Lemmy.world proper. Neither the posts nor the server is ever really down.
I couldn't care less. You provide a great forum at no charge to me. I thank you for your contribution to discourse, communication with the community, and look forward to the growth of lemmy.world
I’m with you. When LW is down, I just take that as a sign to go outside for a bit or do something else. The entitlement people are showing is so annoying. Lemmy is not some kind of vital infrastructure.
Thanks for being so transparent with us. Lemmy really does feel like home now to me. I wish the maintainers all the best as they continue to fight the forces of evil.
Reddit was down a lot too, and they stuck ads in my face. It’s not like I have a pacemaker that needs Lemmy.world to be up in order to function. Keep up the good work and I hope whoever is behind the attacks steps on a Lego.
usually my reaction when a website I visit daily goes down is to probably visit that website less or think the backend team behind it is lazy. but when lemmy.world goes down or is under attack, I sympathize and just open it when it's back up. y'all prove that you're hardworking by providing clear communication and explanation on what's happening everytime. shout out lemmy team, you deserve the world!!
Thank you for your work 🫡
Have you guys tried NOT getting attacked? Might work.
Seriously, thanks for all your effort!
Have you heard of something called The Cloud? It sounds possible this will solve all of our issues!
(/s in case it's not terribly obvious.)
Blockchain!
No no they need Ai generated NFTs
Bitcoin solves this!
Just download more RAM bro
You wouldn't download ~~a car~~ more RAM would you?
Why not?
Great stuff, thank you for all the good work.
btw, as a tip: please resize https://lemmy.world/pictrs/image/14f857e5-703a-4513-9c1a-f23031675be1.png in an image editor. It's on the homepage, and it's a frikking 4.5 megabyte image file.
I resized it. It's 1,2MB now
Well done!
I think you should take 5% of donations to pay yourselves personally. I appreciate your work!
Definitely need to pay themselves. Doing this for free is not sustainable over long periods.
I’m more than okay with the old Jimmy Wales treatment once or twice a year.
I would be happy to support a special fundraiser to get the admins some beers.
What else can we do to help Lemmy.world besides donate?
Asking for nothing but patience :)
Here you go.
And before Pipedbot comes in and berates me for not using it, I just wanna say Calm Down.
Here is an alternative Piped link(s): https://piped.video/ErvgV4P6Fzc
Piped is a privacy-respecting open-source alternative frontend to YouTube.
I'm open-source, check me out at GitHub.
See?
Doesn't seem like even that would help. Patience seems to be the most useful option right now.
Definitely won't HURT though
If you got any programming skills, Lemmy's code is open source and improvements to these expensive calls (or just any call) would most likely help the server. I'm also sure moderation tools would probably make their job easier and just improvements to the platform as a whole would probably help (more users, more possible donations, especially if it gets closer to platforms like reddit)
But without any technical skills like that, probably just helping communicate stuff like this, like if someone's complaining, explaining this, is probably the best you can do (and it ain't much)
Cheers for the good work guys
Thanks for the update and the hard work behind the scenes to keep things online!
If you think it might help I've got a bit of a hack I've used in the past to cache a sql database in a compressed ramdisk using zram and bcache. Imagine stuffing a 50G DB into 20G of memory.
It won't fix the inefficient SQL queries but it would make it so frequently accessed tables get cached in a ram disk cutting query time significantly.
This might be enough to reduce the impact of these attacks until queries can be optimized.
This assumes your database isn't running on something like RDS though.
What is RDS?
Different meaning for different roles, but in this case, I'm guessing they mean Relational Database Service. What I'm not sure of is the limitation: whether it's that it's a relational database, or that RDS would indicate that it's hosted on a PaaS (Platform as a Service) and thus you cannot run the script because you don't have OS access. My money's on the latter.
RDS in this case refers to AWS's managed database service which doesn't let you touch the underlying software or do the level of configuration this would take
Ah so the Lemmy World server isn’t a Raspberry Pie? Nice.
No, it is an arduino with an ethernet shield.
Decapitated ancient Lenovo laptop
It's better; a PSP 2000.
PS2 running linux
I have to wonder why expensive SQL queries in Lemmy operations even exist. As Lemmy scales, won't those queries get executed more often just as part of normal operation? That would say to me that the Lemmy software needs optimization. Otherwise there will be scaling issues even if the attacks stop.
the version number is 0.18.4 that should give you a hint.
it's entirely possible that these simply haven't been optimized yet.
That's fascinating, I would have expected it to be more like 1.18.4-beta. I thought zeros were meant for unreleased product.
As the other reply mentioned, there are different versioning schemes, but traditionally version 1.0 means "feature complete," and of course one would traditionally wait until feature-complete to release to the public.
But this is a Free Software project, so it's different. You and I aren't "the public;" we're participants in the project. Essentially, everyone in the lemmyverse is a beta-tester -- except we're not testing a beta, or an alpha, or even some sort of "developer preview release," for that matter. We're testing an extremely early experiment!
Honestly it's all arbitrary and there are several possible standards they could be following. Or they could be yoloing it since all of a sudden people started flying over (which you can't ever really account for)
That's exactly what is happening now. Lemmy is a very young codebase and up until very recently only had a tiny user base, so optimisation wasn't that important.
Over the last few months the Devs have been working hard to improve things, but there is a lot of ground to cover
I wonder whether writing the backend in Rust was a premature optimization in its own right in that case. Lemmy can be seen as a fairly simple CRUD app whose work is mostly in the database, plus some network communications with federated instances.
I think too that's the case, as it turns out the bottleneck was really the SQL queries and the DB design, not much the programming language.
Yet it's not optimizing prematurely...
Everyone who has to do a little bit more with databases, knows that it's often the database which is the bottleneck.
Rust is a great language not just because of its performance.
Why should writing something in Rust be a premature optimization? I don't choose Rust because of its performance (at least that's not the first thing that comes to my mind) but because of language ergonomics and because of its strictness which makes maintenance much less painful.
The devs aren't DB experts (no harm in saying this), as for example a while ago someone spotted an SQL query where Lemmy used to do filtering after a huge join, instead of joining after filtering. SQL experts need to help them here.
I'm not an expert either unfortunately, but using EXPLAIN on slow queries can go a long way.
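To make the EXPLAIN suggestion a bit more concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN purely so that it runs anywhere; Lemmy itself sits on PostgreSQL, where the equivalent statement is EXPLAIN (or EXPLAIN ANALYZE). The `post` and `comment` tables, the `plan` helper and the index name are made up for illustration and are not Lemmy's real schema.

```python
import sqlite3

# Hypothetical schema for illustration only; not Lemmy's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, community_id INTEGER, title TEXT);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
""")

def plan(sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

query = "SELECT body FROM comment WHERE post_id = ?"

# Without an index on post_id the planner has to scan the whole table.
before = plan(query, (1,))

# After adding an index the same query can become an index search.
conn.execute("CREATE INDEX idx_comment_post ON comment(post_id)")
after = plan(query, (1,))

print(before)
print(after)
```

The exact wording of the plan differs between SQLite versions, but the first plan should mention a SCAN of `comment` and the second a SEARCH using `idx_comment_post`. Reading plans like this for the slowest queries is usually how a missing index or an oversized join gets spotted.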
The most demystifying documents I know of about SQL query planning are actually from SQLite. Understanding them can help figure out how to optimize SQL in general, since they explain how SQL execution engines work:
I'm pretty good with SQL... well I used to be, been using a noSql db for a while now.
but is there a list somewhere of the worst queries?
I'm too busy to contribute to a project rn but I can optimise queries.
Here's one: https://github.com/LemmyNet/lemmy/issues/3845
They need all the help they can get
It sucks but there will always be some labor-intensive queries to execute. They can be limited and restricted, though, which I'm sure they are already on top of: for example, caching and security controls that enforce limits like "this type of request from this IP can only happen 1x per hour" or something along those lines.
If I had to guess, without having looked into the source code yet and with the limited information provided, I'd assume it's mass account creation, image uploading and/or exploiting how the instance syncs with the fediverse. It's most certainly something that can be mostly prevented once the holes are found and then patched.
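A limit like "this type of request from this IP can only happen 1x per hour" could be sketched roughly as below. This is an in-memory, single-process illustration only: the `RateLimiter` class, the `create_community` request type and the example IP addresses are all hypothetical, and nothing here reflects how Lemmy or lemmy.world actually implement rate limiting.

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Sliding-window limiter: at most `limit` calls per `window` seconds per key."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock                 # injectable so the example can fake time
        self.hits = defaultdict(deque)     # (ip, kind) -> timestamps of accepted calls

    def allow(self, ip, kind):
        now = self.clock()
        accepted = self.hits[(ip, kind)]
        # Forget accepted calls that have aged out of the window.
        while accepted and now - accepted[0] >= self.window:
            accepted.popleft()
        if len(accepted) >= self.limit:
            return False
        accepted.append(now)
        return True

# Usage with a fake clock: one request of this type per IP per hour.
fake_now = [0.0]
limiter = RateLimiter(limit=1, window=3600, clock=lambda: fake_now[0])

first = limiter.allow("203.0.113.7", "create_community")     # accepted
second = limiter.allow("203.0.113.7", "create_community")    # rejected, same hour
other_ip = limiter.allow("198.51.100.9", "create_community") # a different IP has its own budget
fake_now[0] = 3600.0
third = limiter.allow("203.0.113.7", "create_community")     # accepted again an hour later
```

Keying on both the IP and the request type keeps a strict limit on one expensive action from throttling ordinary browsing. A real deployment would also need shared state across servers, which this sketch leaves out.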
Also, I'm sure in the future something more efficient than SQL will be used.
I have to wonder what those queries actually do. Why is mass account creation a thing? Image uploading shouldn't cause significant db activity (add a row saying where the image is, don't put the image into a BLOB or anything like that). Syncing is no big deal either, given the quite low amount of traffic. I know that some websites use Postgres for fulltext search and I don't know how well that works under heavy loads. I've mostly used Solr (solr.apache.org, thus my username) but I think that is now considered old fashioned.
PostgreSQL itself is quite performant and should be able to handle high loads once the queries and schemas are optimized, there is some caching of obvious things, etc. One antipattern I've noticed is pagination: saying "page=5" like Lemmy does to get to the 5th page of /all is done with an OFFSET clause which is expensive because it has to count off that many rows. It is better to use timestamps or other markers like Reddit does, that can be an indexed column that can be accessed quickly.
Anyway thanks.
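The pagination point above can be sketched as follows. The example uses Python's built-in sqlite3 module with a made-up `post` table so that it is self-contained; the table, column and function names are assumptions for illustration, not Lemmy's schema or API. It compares OFFSET paging with "keyset" paging, where the client passes back a marker from the last row it saw.

```python
import sqlite3

# Hypothetical table for illustration; not Lemmy's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, published INTEGER, title TEXT)")
conn.execute("CREATE INDEX idx_post_published ON post(published, id)")
conn.executemany(
    "INSERT INTO post (id, published, title) VALUES (?, ?, ?)",
    [(i, 1000 + i, f"post {i}") for i in range(1, 101)],
)

PAGE_SIZE = 10

def page_by_offset(page):
    """OFFSET paging: the database still has to step over every skipped row."""
    return conn.execute(
        "SELECT id, published FROM post ORDER BY published DESC, id DESC "
        "LIMIT ? OFFSET ?",
        (PAGE_SIZE, (page - 1) * PAGE_SIZE),
    ).fetchall()

def page_after(cursor=None):
    """Keyset paging: resume right after the last (published, id) pair seen."""
    if cursor is None:
        return conn.execute(
            "SELECT id, published FROM post ORDER BY published DESC, id DESC LIMIT ?",
            (PAGE_SIZE,),
        ).fetchall()
    published, post_id = cursor
    return conn.execute(
        "SELECT id, published FROM post "
        "WHERE published < ? OR (published = ? AND id < ?) "
        "ORDER BY published DESC, id DESC LIMIT ?",
        (published, published, post_id, PAGE_SIZE),
    ).fetchall()

# Walk to the fifth page with keyset paging and compare with OFFSET paging.
rows = page_after()
for _ in range(4):
    last_id, last_published = rows[-1]
    rows = page_after((last_published, last_id))
keyset_page_5 = rows
offset_page_5 = page_by_offset(5)
```

Both approaches return the same rows, but the keyset query can jump straight to its starting point through the index on `(published, id)`, while the OFFSET query has to walk past the skipped rows first, which gets slower the deeper the page. The trade-off is that keyset paging cannot jump directly to an arbitrary page number.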
Thank you Dudes for all your hard work!
Love this transparency post and info, much appreciated
And rational (not emotion/greed-driven), mature (not emotion-driven), responsible (not emotion/greed-driven), adult (not emotion-driven) attitude towards problems. Thanks for your hard work.
I'm sure they don't want to reveal too much but I'm curious if the attackers were authenticated. If not it seems reasonable to rate limit anonymous users.
Rate limiting only goes so far. 10 requests for a 1 second operation is the same thing as 1 request for a 10 second operation. Any CDN, like CloudFlare, can't do too much about web requests that are super taxing on the database.
The bot nets that are doing this can be worse than a hydra. If you block one bot, several more pop up to take over where the other left off. Even worse, the requests that the bots are making are legitimate. If you start throttling the specific requests that are too taxing, you are likely going to cause issues for legitimate users that need the same data.
Additionally, the number of NAT'ed egress IP addresses is much higher than you might think. Blocking just one IP address could mean that you are blocking thousands of users behind that address.
Sometimes, the best option is to absorb the traffic and ensure that your application is running extremely efficiently.
This is a complex problem, for sure.
I found that LMAO/Angled (guy who was angry about being banned for community name squatting) has a YouTube that does techy stuff, he's always in the back of my mind as someone who could be contributing to the DDoS, total speculation though but the threat of "ruining your site" and then coming back to spam the trending communities with spam makes me suspicious lol
I mean, making threats does put a target on one’s back.
Cmon half the users here are tech nerds, get to work you lazy bastards, I'll be there as soon as I close this sprint--
Agree. But this sort of thing might actually make Lemmy way more resilient and nimble once optimised
They are inadvertently helping Lemmy become more robust
Yes! Same goes for those saying "Lemmy.world is too big". Having a large instance is good real world case for addressing scaling issues that might impact more and more instances as the overall Fediverse grows.
In the future a small Lemmy instance may be the size of today's Lemmy.world.
Thanks for all you guys do! While the lack of reliability can be frustrating your efforts do not go unnoticed. Thanks again.
A fantastic job is being done by you folks - obviously in the face of adversity. Given the amount of users on the instance is at a critical point, would it not be possible to 'move' accounts off it onto other less populated instances ?
Keep up the great work folks - I sympathise for ya.
The thing is, it's not. The admins are literally saying that lemmy.world is not down because "it can't handle the load". It actually can handle the load, the hardware is pretty badass and it has the most resources out of all instances currently thanks to the donations. It's down because of one guy or group DDOSing this instance, normal user activity is not what's overloading the database.
Hopefully all the attacks you guys endure end up helping lemmy patch those attack vectors and make lemmy an overall safer and more robust place.
Thank you for the update. Good work.
I guess it's that guy who said he was going to break the site. Remember that guy? Something about not being allowed to open loads of communities or something.
I was wondering why the CloudFlare protection doesn't work, this makes sense. Does CF have any point then? Lots of people don't like it.
It's weird someone would spend so much time to target LW. Ah well.
Cloudflare can only protect so much from the number of requests. The bad actors are spamming inefficiencies in the SQL backend not blasting the front end with web requests
Yes it makes sense after the admin's explanation.
There are plenty of different ways to DDoS. Judging by the post it's an entity which is currently sending specifically crafted requests to use as many system resources as possible, targeting Lemmy the application.
Cloudflare blocks other less knowledgeable DDoS attacks. So yes, Cloudflare does have a point but it can't protect against everything
Take your time bros I don't need this shit 24/7 the downtime is fine and expected
I’m imagining spez is sending his flying monkeys and they’ve been trying to shut it all down. Doesn’t matter that you’re smaller than Reddit, Egos like spez’s can’t take even a minor rumble. Just look at how he has to ‘win’ against all his own users. Should tell you all you need to know on his motives.
Thank you for everything you do. You guys are doing a fantastic job, and a lot of us sincerely appreciate all your efforts!
Thanks for the transparency and the update! Downtime to me is useful, it prevents me from using Lemmy too much.
Glad you guys resisted the call to close signups. I think that's what they want in the end, to harm lemmy.world by killing its growth.
Thank you for your hard work
Thank you for your time & efforts in maintaining this platform. I (and many others I'm sure) have great respect for the work you do in trying to combat this menace. The community is completely behind you and appreciates the value of this resource.
You guys are trying your best, and that's what matters. Thank you LW Team.
Very grateful for your focus and dedication. Bummer about the DDOS bullshit. Your efforts mean a lot to the communities.
Come on ddosers, we can always solve our differences with dialogue
That looks a lot like an autoexec.bat
Dialogue Tool for the Gacha Player
You're managing this well. Good work folks.
I wonder if reddit, the company, are the ones ddosing.
This is a conspiracy theory level hot take.
Of course it is, the problem is that it is very unlikely
I doubt it, lemmy is hardly worth its time
Lol that's a wild one. Compared to Reddit Lemmy has a completely negligible userbase and is (at this point) no competition at all. Why would Reddit waste any resources on this?
Because people suck, and CEOs suck more than normal.
Spez is a petty fuckface?
I'd guess it's related to unfederating. Someone is butt hurt and out for revenge
Doubt it as it's criminal
Another heartfelt thanks—both for the hard work, and for the transparency.
In a way, the ddos attacks are helping highlight the slow parts of the system causing a reason to optimise them? It's kind of a double edged sword
What doesn't kill you makes you stronger I guess.
Well thanks for the update and your hard work. I am currently using lemm.ee as a backup account so that I can at least have my fix.
Hope the bastard(s) who are ddossing the server get some nice tropical diseases.
Lemmy.world also was my first step into the fediverse.
In terms of the "expensive" SQL queries - is this an issue that the lemmy devs are working on? I.e. is this a problem that might solve itself in time?
It's being worked on, but the Lemmy core devs have another problem - their funding will actually get pulled if they don't meet some deliverables by the end of the year, and they're woefully off track for meeting them.
I am confused. What deliverable, and to whom? I thought they were funded by donations.
Lemmy is currently generously funded by NLNet, under the condition they meet some development goals they promised to meet back at the start of the year: https://nlnet.nl/
From what I've seen in the GitHub repo, some of the queries have been optimized recently by an experienced SQL developer (a volunteer), but a lot of them still need work
Endless DDOS attacks. Sigh.
appreciate the transparency!
It's really annoying that it's down but I've found another instance to use when I'm not able to use this one. I hope you're able to stop these losers at some point. It's very frustrating what's happening but at the same time Lemmy is young and I think and hope it will be optimised so that it won't be an issue in the future.
Stay strong fellow lemmies, we're going to get through this. For those of you who are very annoyed now: make a new account at some other instance. I've already got 3 accounts across 3 different instances. Check what instance to join here: https://join-lemmy.org/instances
As a note, there are also instances available that are not on join-lemmy. For example, my instance lemmy.thesanewriter.com is open for sign ups but not yet on join-lemmy.
I've hopped around a few times, really wanted to get into sh.itjust.works but either they are having technical difficulties or someone opted to not send me the email confirmation, so now I'm stuck in "please verify email" with no method to resend said email. Definitely checked spam folder of course.
I've found lemmy.zip to be stable so far but it hasn't even been a whole day yet... still, that's more than I was getting from lemmy.world.
I didn't bother to migrate etc, my account is pretty new and all I've done is block a bunch of porn subs so idc that much about migration. I DO wish there were separate nsfw/porn categories.. i don't want to remove all NSFW content but fml i don't need that much porn for fucks sake. The internet is full of it already, why do we need more lmao
The downtime is causing an issue with posting content from other instances - I've seen this a handful of times from kbin. I post something to a lemmy.world community, and kbin thinks it's there, but lemmy.world doesn't see it. But, the delete request seems to need to go through lemmy.world, which doesn't agree that the content exists. So my profile is filled with posts people on kbin can see, but no one else can, and I can't delete them. Is there any kind of catch-up mechanic for instances to try to agree on what content should be present if content was altered during downtime? I can see this becoming a lot more confusing as people look at a community from multiple different instances and see different content, not realizing this is unintended behavior.
The biggest misconception I've seen on Reddit and elsewhere is that you need an account on every single instance if you want to interact with content on that instance, and it's not supposed to be true but while this bug continues, it kind of is true.
Thank you guys for the write up and for helping to keep things running.
How do I as a developer:
I’m an SRE by trade and would be happy to contribute my time in some way
Best bet is to check out the GitHub repo
As with most open-source software, the usual method is to go to the GitHub repo, check out the README, the Contributing guidelines and whatever else the documentation offers in that regard, and then look through the open issues to see if there's something you'd like to solve. Sorting by 'good first issue' can be helpful too. You should definitely join the Matrix instance so you can ask something if you're stuck.
There are quite a few InfoSec people here. While I have never held an official InfoSec job I do have a degree. However, my degree is debatable about whether it actually educates me as intended.
Point being, there are a lot of people who have more knowledge and experience than me, but I want to learn. As someone who is always listening to security podcasts like Hacking Humans or Darknet Diaries, naked hacking, or even InfoSec journalism around popular ongoing issues in the world like Click Here, I always want to learn and get experience.
I currently work in IT for a hospital. Is there any way to help with this kind of thing to learn and build on knowledge to help? To volunteer time to potentially see what is going on?
IF you were a bad actor, this is exactly the argument to use to get more inside information to use in the next attack.
Establishing trust is the first problem to be overcome.
When will /u/spez just accept that he lost?
Nah imagine it's spez personally attacking us 💀
keep fighting the good fight <3
Ironic that they're effectively proving that you were right to not trust them...
Would it be possible to have the error page when you are being attacked/there is an outage point to some other lemmy instances to go to?
I think that could be a big help if there is an issue when a new user tries to check out .world for the first time. They will at least have a link to click on to check out what lemmy is like on another instance and maybe sign up there too.
I've been waiting for this response from you guys. You have been a fantastic admin team so far. I still don't agree with some of the de-federating, but overall you guys truly show you care about this instance and the lemmy fediverse as a whole.
I know I won't be wavering because of butt hurt idiots in other instances. I will hold my ground and stick to Lemmy.World.
Keep it up and I hope that in due time, you guys can get the DDOS attacks under control.
There's nothing wrong with making an alt account for when .world is down. In fact, it's very much in the spirit of the fediverse to do so.
Remember, thanks to federation, LW still appears when you go through another instance, like this:
https://lemmy.world/c/lemmyworld@lemmy.world
Save it as a backup link.
Great explanation! And thanks for the many many hours you guys put in.
keep up the good work team; you're the linchpin to this renaissance
Are DDoS protection services like those from Akamai, Arbor Networks, Link22 etc an option? Those are tested as ok by the German Federal Office for Information Security.
I don't believe it would work for this case. A typical DDoS is just sending a ton of junk packets at a server at the max bandwidth of the network of bots an attacker has at their disposal. Very easy to block for a large cloud provider with multi-terabit connections and multiple redundant data centers. This is different: they're asking the server to send them large amounts of information on repeat, or to process massive amounts of data. The attacker is targeting the server's hardware itself through legitimate processes, so a third party wouldn't really be able to do much.
Is Lemmy not throttling requests to APIs based on how computationally expensive they are? Or is it that many IP addresses are hitting those APIs and are within the throttling limits?
The first D in DDOS means distributed, as in the requests are distributed across many different machines and IPs; so the second option.
I understand what DDOS is. It could be both options.
What I am curious about in the second case is why they aren't throttling unauthenticated requests in a single bucket.
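To make the question above concrete, here is a minimal sketch of cost-weighted throttling where every unauthenticated request draws from one shared bucket. It is not how Lemmy actually rate-limits; the class names, endpoint names and costs are all made up for illustration.

```python
import time
from typing import Dict, Optional

# Hypothetical per-endpoint costs: expensive queries spend more tokens.
ENDPOINT_COST: Dict[str, float] = {"get_post": 1.0, "search": 10.0}


class TokenBucket:
    """Classic token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, capacity: float, rate: float, now: Optional[float] = None):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, cost: float, now: Optional[float] = None) -> bool:
        """Spend `cost` tokens if available; False means reject the request."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


class Throttle:
    """One bucket per logged-in user, one shared bucket for everyone else."""

    def __init__(self, capacity: float, rate: float, now: Optional[float] = None):
        self.capacity = capacity
        self.rate = rate
        self.anonymous = TokenBucket(capacity, rate, now)
        self.per_user: Dict[str, TokenBucket] = {}

    def allow(self, endpoint: str, user: Optional[str] = None,
              now: Optional[float] = None) -> bool:
        cost = ENDPOINT_COST.get(endpoint, 1.0)
        if user is None:
            bucket = self.anonymous  # all unauthenticated traffic shares this
        else:
            if user not in self.per_user:
                self.per_user[user] = TokenBucket(self.capacity, self.rate, now)
            bucket = self.per_user[user]
        return bucket.allow(cost, now)
```

The trade-off with a single anonymous bucket is that an attacker can exhaust it for every legitimate logged-out visitor too, which may be one reason it is not an obvious default.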
Great work guys! Keep going!
keep up the good work
Question, can we configure the nginx to return cached responses for all non-logged in queries for predetermined periods of time? (1min for example?)
Yeah some sort of rate limiting and caching seems like the first line of defense here. I'm sure they are aware of that though.
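As a rough illustration of the idea (in Python rather than actual nginx configuration), a short-lived cache for logged-out requests could look like the sketch below. The class name, the `render` callback and the 60-second TTL are assumptions for the example, not anything lemmy.world is known to run.

```python
import time
from typing import Callable, Dict, Tuple


class AnonymousResponseCache:
    """Serve the same rendered page to all logged-out visitors for `ttl_seconds`."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # path -> (stored_at, body)

    def get(self, path: str, logged_in: bool, render: Callable[[str], str]) -> str:
        # Logged-in users get personalised pages, so they always bypass the cache.
        if logged_in:
            return render(path)
        now = self.clock()
        hit = self._store.get(path)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: no backend or database work at all
        body = render(path)
        self._store[path] = (now, body)
        return body
```

With something like this in front of the backend, a thousand anonymous hits on the same expensive page within a minute would cost one render instead of a thousand.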
People should stick with the instance otherwise you're just encouraging those tankies and nazis to use DDOS attacks again to bring down instances that defederate with them, don't let them know that they're successful. This opportunistic concern trolling around lemmy.world's downtime needs to stop. As the admins said, sooner or later "small" instances would have 100k users and would start having these issues all at once if it weren't for lemmy.world experiencing them first hand. Some DB optimizations were pushed to Lemmy thanks to lemmy.world.
I know you're a prolific poster around here, so I feel like I need to tell you of a better solution.
All federated servers are part of an alliance. This means that https://sh.itjust.works/c/lemmyworld@lemmy.world is where you can go if Lemmy.world goes down. Eventually, sh.itjust.works will copy your data to lemmy.world (as per the Federation protocol). As long as Lemmy.world comes back up every now and then (and long enough for the posts to sync), then sh.itjust.works/c/whatever@lemmy.world will function as an effective proxy / secondary location to have discussions.
Or https://lemmy.ca/c/lemmyworld@lemmy.world, or whatever other community is hosted here.
Simply make your alt-account on a 2nd server, then promote this 2nd account to moderator for any communities you're a part of. Bam, now you've got a 2nd place to work your communities even as the main site goes down. Leverage federation to our advantage.
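The URL pattern used in the links above is simple enough to generate. A small sketch, with hypothetical helper names, for building backup links to a community through other instances:

```python
from typing import List


def community_url(viewing_instance: str, community: str, home_instance: str) -> str:
    """Build the address of `community@home_instance` as seen from another instance."""
    return f"https://{viewing_instance}/c/{community}@{home_instance}"


def backup_links(community: str, home_instance: str, mirrors: List[str]) -> List[str]:
    """One backup link per instance you have an alt account on."""
    return [community_url(m, community, home_instance) for m in mirrors]
```

For example, `backup_links("lemmyworld", "lemmy.world", ["sh.itjust.works", "lemmy.ca"])` gives the two links mentioned above. The other instance only shows the community if it already federates with it.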
Yeah, I'm with you. I've been wanting to see a post like this from the admins because if it turned out the issue was something related to resources or infrastructure, I'd use a different account as my main. But I didn't want to do that if it was resulting from attacks because I refuse to be manipulated by assholes and I refuse to let them win.
I'm staying with .world out of solidarity. I have another account on .ca that I use when world is down, but this one stays my main.
The conversation gets a bit scrambled/broken up by disruptive/toxic people but this is a comment chain on lemmy.ml two weeks ago about SQL issues and challenges in getting the Lemmy Dev team to address them that might be worth reading:
https://lemmy.ml/comment/2100093
The Lemmy Dev team have long ago stated they're no experts in PostgreSQL tuning, and that any help is welcome.
In the thread you linked, a guy is just accusing them of what they themselves admitted, then refusing to help. Meanwhile, others have been submitting SQL related PRs all the time, which have been merged.
I wonder what motivated any DOS attacks.
Cyber-jackasses or cyber terrorists, likely the first.
A cyberpirate wants money.
A cyber terrorist has an ideology or wants to watch the world burn.
Most actually successful cyber attacks globally are just trolls who want to have fun. This is why many, with their automated attack patterns, try to avoid children's hospitals and critical infrastructure, while cyber terrorists with an ideology, or who want the world to burn, attack those.
Given that Lemmy is not that important yet, there's a ton of alternatives outside the fediverse, and it's all volunteer, it would be cyber-jackasses or want-to-watch-the-world-burn cyber terrorists. Not pirates, not governments, not corpos.
Some people just like to watch the world burn.
I think it would be good to not close registration and if once a month or something there could be a post by admins about migrating to smaller instances (this is made easy with the LASIM tool) so new users can easily sign up with no hurdle but we also prevent too much centralization.
Once again, thank you for the transparency and for keeping us (the users) informed, as well as for all the work you do to keep lemmy.world going.
For my part, I wish to offer a sincere thanks to the Lemmy.world admins for their efforts up to now and as they continue to deal with the targeted DDOS attacks. You are Lemmy heroes who are helping to create this promising future. Sometimes it's a fight against those who are struggling with their internal daemons and taking it out on others. Bright days are ahead for the Fediverse. Keep marching forward!
Thank you for your dedication! 🤓
The fun thing about the Fediverse is that when this goes down the other instances stay up, so whoever is doing the attacks isn't really doing much except prompting people to create accounts on multiple instances. Which makes the numbers look really big.
I just want to know why someone or someones are taking so much time and effort to DDOS lemmy.world. Is it a grudge? What is driving it?
No way to know for sure but someone(s) is either big mad at being de-federated or they are uptight at LW becoming the biggest Lemmy instance. I'd guess that the first one is the most likely.
I bet it's the person mentioned before who had been making thousands of junk communities and got banned. We already know they do troll stuff and have the technical aptitude for scripting.
Would it be possible to ratelimit connections/requests? Some sort of AI-based blocking? What are current technologies to battle such DDOS attacks?
Something something limited by cloudflare probably
Idk man, the days when almost every other action on Kbin was blocked by a Cloudflare CAPTCHA was infuriating
Appreciate it
This has been pinned a few days now. Site health was pretty dire with several long outages.
But subjectively in the last 48 hours things seem to be great. Noticeably responsive and login and activities haven't missed a beat.
StatusPage.io still looks very red though... Is the worst now mitigated?
Thanks to the sterling admins (and friends) for their work on this. Vive la Lemmy.World!
To my understanding Datadog is not FOSS. Would you guys consider using a FOSS alternative for monitoring the status of lemmy.world such as Uptime Kuma? That way your whole stack is closer to being FOSS.
https://uptime.kuma.pet/
The ship of "Lemmy must be entirely FOSS" has sailed. You can invest either time or money, and even then there are some tradeoffs and things that can't be swapped out. Datadog and Cloudflare are two such things.
Lemmy (including lemmy.world) is at a critical juncture: continue to grow or lose momentum. These DDOSs are one thing that could cost it that momentum, and everyone going "FOSS, FOSS, FOSS!" is another. If they have time in the future there may be a possibility, but when playing the growth game sometimes you have to go with the best tool available even if it doesn't meet your ideals.
Sync for Reddit is another such tool. I've seen so much hate for it because it's not pure FOSS, pay no mind to the sheer number of people that have downloaded it, are using it and have helped drive traffic to Lemmy and the Fediverse in general.
Nothing is stopping you from using a fully FOSS front end with your own server, that's the beauty of the Fediverse, you can choose what you want and still interact with others, but don't get on their case when they select something you don't like.
Uptime Kuma is in no way comparable to what Datadog offers. The best FOSS alternative would be the whole Prometheus/Grafana/Thanos/Loki/etc stack, and that would require at least a whole volunteer just to set up and administer.
There's nothing wrong with DD, they're a staple in the cloud industry and are absolutely trustworthy.
DataDog is far more comprehensive than Uptime Kuma. It would be more useful to compare the specific capability inside DataDog, considering they have so many services. In this case RUM or Synthetics from DataDog would be a comparable offering. For the SQL stuff, maybe DBM? I don’t have any preference either way, just wanted to bring light to the depth of DataDog’s offering since I live that life at the office.
Edit to add that DataDog isn’t FOSS, but has some components they’ve acquired over the years that are. Vector is a good example. They’re offering a paid version called Observability Pipelines, but it rides on top of Vector and they’ve (so far) committed to keeping it FOSS.
Nam flashbacks to DALNet getting DDOSed to death for no reason
Thank you for all the work you do!
Thank you to the admins for all of your hard work maintaining Lemmy.World through the downtime. A lot of us are already so comfortable here that we rush to the Discord server to check in when it's down.
Point being, the members of Lemmy.World are really grateful to the admins, the mods, and fellow Lemmings who have been posting interesting content and participating in deep discussions!
Thank you so much for explaining the reason for the downtimes. I just thought it was some temporary issue caused by unforeseen popularity. Knowing it was malicious does make me more understanding of how difficult this must be. I will continue to be patient. I am sadly not good enough with anything other than basic powershell scripts and learning proprietary software configurations.. 10 years of software support does that to a guy. I'll still check if there is anything I can do to help. I do want this project to succeed.
When I learned about the whole fediverse thing, I wanted to join but was hesitant due to the many instances. But I realized that lemmy.world is the largest Lemmy instance by a HUGE margin so I just signed up. Thank you for keeping this place alive and kicking!
I think I initially signed up on your instance and then figured it out, signed up for a more local instance but then figured I made a mistake and ended up where I am.
Thank you again for being available to let me through the door. Once I figured out that there's lots of doors, it was much better.
Lemmy.world will always be a special place and you and anyone who volunteers for work here are fuckin awesome. Thanks again ♥️
I would love to see this grow to the point where a full time sysadmin could be hired! Would need a lot of subscribers though
All support to Y'all, Keep Going!
Thanks for the update and keep up the good work! It seems like reddit went down a few times a week regularly for years. I have to think that some state sponsored actors are responsible for some of this. I’m sure that some topics being discussed here are not in line with the values of many regimes.
While I agree that certain state sponsored actors and private interest groups are most definitely involved in discourse manipulation on reddit, Lemmy simply isn’t big enough for this yet.
If we go by the numbers stated in the original post, the whole of Lemmy has fewer than 500k users at this point, who are overwhelmingly under-40, tech-affine early-adopter nerds from the United States and Western Europe.
Too insignificant to spend resources on, and also largely sceptical of corporate interests and authoritarian governments (except the tankies of course); so by default critical of the two top potential manipulators.
It's a shame you are having to go through this but it was bound to happen to an instance sooner or later. It's better that it happen to a large instance with the time, talent, and money to work through the challenges because this kind of consistent attack would bury any smaller instance pretty quickly.
With that said is there anything that users can do to help?
keep doing what you're doing. If it makes any difference, I'm with you guys and I appreciate what you do to keep this place running
Guess it's time to create an alt account on a different instance
Are you guys using a load balancer at all? How about a tool like CrowdSec?
I use that and the nginx Bad Bot Blocker to stop malicious shits on the sites I operate (medium-large e-commerce) to great success. We used to get scraped heavily by competitors but now they get the middle finger.
I presume you have fail2ban too?
Are bad actors able to access the database to execute queries or is it through the main front end site and accessing API endpoints over and over? Then surely they can be blocked at this point?
These attacks are just through the public API, not malicious SQL-injection attacks. They are just non-optimized queries regular users can execute that will bog down the system enough to make it crawl, at which point intervention is needed to either kill the running slow queries or just restart the db.
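The "kill the running slow queries" step could in principle be automated. The sketch below only shows the selection logic on an in-memory snapshot; the `RunningQuery` type and the 30-second threshold are assumptions. On PostgreSQL such a snapshot would typically come from the `pg_stat_activity` view, and a cancel would go through `pg_cancel_backend(pid)`, but nothing here says the admins do it this way.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunningQuery:
    pid: int                # backend process id of the query
    seconds_running: float  # how long it has been executing
    sql: str                # query text, kept for logging


def queries_to_cancel(snapshot: List[RunningQuery],
                      max_seconds: float = 30.0) -> List[int]:
    """Return pids of queries running longer than `max_seconds`, slowest first."""
    slow = [q for q in snapshot if q.seconds_running > max_seconds]
    slow.sort(key=lambda q: q.seconds_running, reverse=True)
    return [q.pid for q in slow]
```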
Lemmy.world should just start charging to use the API. That'll stop them /s
Thanks for all your amazing work! I know just enough about SQL to know I know next to nothing, but could someone intelligent explain how databases are publicly accessible for anyone to be able to make queries?
They don't need to be. When you're posting a comment, that's a database query. Not from you directly, but you're submitting a comment, which tells the frontend to tell the backend to tell the database to save that comment.
Now do that a thousand times and you created a thousand database queries. Now do something more elaborate, like filtering search results or something, and you put a bit more load on the database.
And apparently there seem to be some queries that a user can create that cause issues if submitted by the thousands.
🙏 🙌
Thanks for the update and the work you do! Lemmy.world introduced me to the Fediverse and it’s been awesome watching the site grow.
The Great Lemmy Wars
It's difficult to fix, and not without changes in the code. Most solutions involve fixing those heavy SQL queries: tuning them, caching them in Redis or memcached, or refactoring the whole process from scratch.
Thinking on the DDoS part, implement short circuits so reaching those queries must follow a session pattern. It doesn't stop it but you force those script kiddies to make real connections. If they are anonymous then all the heavy queries should be cached due to lack of custom vars. If not, it's a matter of identifying users and banning them automatically.
@lwadmin Lemmy developers have features hardcoded for their own instance? WTF?
They're likely running beta or unmerged patches rather than 'exclusive' features.
It's a check if the hostname is 'lemmy.ml' and if yes then it will display a box with information about federation and link to Join-Lemmy.org on the signup page - so yes it is hardcoded for their instance. I'm not shaming the devs here I'm sure they would at one point make this an option for all instances since half the work is done. And it's something we at LW could use but I'm sure it's not interesting to everyone running an instance.
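The difference between the hardcoded check and a per-instance option can be sketched as follows. This is Python for illustration only, not the actual Lemmy source, and `show_federation_notice` is a hypothetical setting name.

```python
from dataclasses import dataclass


@dataclass
class SiteSettings:
    hostname: str
    # Hypothetical admin-controlled option; off unless an instance enables it.
    show_federation_notice: bool = False


def show_notice_hardcoded(hostname: str) -> bool:
    """Current behaviour as described above: only one hostname gets the box."""
    return hostname == "lemmy.ml"


def show_notice_configurable(settings: SiteSettings) -> bool:
    """Requested behaviour: any instance admin can switch the box on."""
    return settings.show_federation_notice
```

Under the configurable version, lemmy.world would set the option itself, and instances that are not interested would simply leave it off.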
Lemmy.ml says they are running BE 0.18.4, which is the same as lemmy.world, which mostly eliminates it as a beta version. It really can't be an unmerged patch either since it's in the Lemmy source code.
I doubt it's nefarious though, probably just a leftover from when lemmy.ml was the primary lemmy instance.
Ah, this is a much needed response! I switched servers but lemmy.world is still my go-to server; it was down so much that I had to try alternative ones! Good to know it's not a load issue!
Is there any update on the instances that were unintentionally defederated from lemmy.world? I know that one of the fanaticus.social admins was trying to get that sorted out.
Thanks for the hard work!