Thoughts on Lemmy sorting options?

WatTyler@lemmy.sdf.org to Programming@beehaw.org – 226 points –

Hello all,

Wanted to open a discussion on Lemmy's post sorting options right now. I don't have any experience with implementing this type of thing but right now the algorithm appears... Off? For example, 'Active' gives me a lot of posts over a day old but 'Hot' may as well be 'New' i.e. more recent posts with little engagement.

I don't know if it's due to Lemmy still picking up steam or a fundamental flaw with the algorithm. Like I said, I'm really curious to hear the opinions of those more knowledgeable.

80

I think having "Active" as default is unfortunate too, if a new user sees the top post was from two days ago they might assume the whole platform is inactive, since they can't see that a new comment was added just seconds ago from that page view. Having "Hot" as the default would probably be better since those are usually newer posts with at least a few comments.

None of the megaguides I've seen actually show how to use the mobile apps/sites. I've been told what the Fediverse is and how it works enough times to build the backend myself, but I have no clue how to navigate.

That might be true. I'm glad you can set your own defaults, but having the starting config be a generally useful one is a good idea. I'm curious if people find Local to be a good default? Subscribed doesn't make sense, I guess, because you start out with no subscriptions. But maybe All is nice to find new communities and explore?

Maybe instance owners could have a recommended subscriptions option that new users could default to. That way new users can experience seeing content on other instances right away. Different instances with different standards could customize their new user experience.

I noticed one of the Lemmy instances recommending, in its site rules sidebar, that everyone change their settings to use 'Hot' by default. Makes sense; many instances look a lot more lively that way.

I don't know a lot about Lemmy's implementation but a difficult thing to deal with is how do you "rank" a post? Like you have a small community of a few active people, but there's federation with a massive community with lots of users - which posts are "better"?

Worse still, there's an inherent lag/delay with the federated posts, a post that was very active in the last hour might have only been federated to the server in the last 5mins - so what do you do, do you bubble up all those posts or ignore it because there's more recent and relevant things?

The kicker is that these decision points aren't instant either, any system that's doing this kind of ranking will have an algorithm as you describe, but that algorithm will take time to process all the data, while the data is coming in batches as each server federates with each other. It's a difficult problem to solve.

I'm aware, what I am getting at is that there's multiple "Right" answers to solving what is essentially a very difficult problem.

I see this as a really clear win over the likes of Reddit and Facebook of making the algorithm more understandable to users so they see WHY they're being fed the information they're getting.

Like you have a small community of a few active people, but there’s federation with a massive community with lots of users - which posts are “better”?

I think this is where federation will get the most interesting. It would be cool to add/remove weight to some instances and communities. And each instance's software could handle this differently with different algorithms.
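To make the weighting idea concrete, here is a minimal sketch. Everything in it is hypothetical: the `Post` fields, the instance names and the multiplier values are made up for illustration and are not taken from Lemmy's code.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    instance: str
    score: int


# Hypothetical per-instance multipliers picked by a user or admin.
# Any instance not listed keeps a neutral weight of 1.0.
INSTANCE_WEIGHTS = {"big.example": 0.5, "small.example": 2.0}


def weighted_score(post: Post, weights: dict[str, float]) -> float:
    """Scale the raw score by the weight assigned to the post's home instance."""
    return post.score * weights.get(post.instance, 1.0)


def rank(posts: list[Post], weights: dict[str, float]) -> list[Post]:
    """Return posts ordered from highest to lowest weighted score."""
    return sorted(posts, key=lambda p: weighted_score(p, weights), reverse=True)


posts = [
    Post("Huge thread", "big.example", 100),
    Post("Niche thread", "small.example", 30),
    Post("Unweighted thread", "other.example", 55),
]
ranked = rank(posts, INSTANCE_WEIGHTS)
```

With these made-up weights the niche thread outranks the huge one even though its raw score is lower, which is the kind of knob an instance or a user could tune.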

I'm completely speculating because I'm high. I have no idea what Lemmy's roadmap looks like. Would be cool though man.

Also, it has to be designed in a way that limits federated servers from gaming the system.

My best guess from the way it looks is that Active is like a forum or imageboard sort, where new comments "bump" old threads to the top. Hot appears to be more like "rising", new posts that are interacted with more than others. Both probably make more sense with greater volume, but neither seem perfect for the structure of the site. I would also think "tuning the algorithms" is probably in the "when we get around to it" bucket for most of the people who would be handling that.
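A rough sketch of the two behaviours guessed at above, purely to show how they differ. This is not Lemmy's actual implementation; the `Thread` fields and the "interactions per hour" formula are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    title: str
    created_hour: float       # hours since an arbitrary starting point
    last_comment_hour: float  # newest comment time (equals created_hour if none)
    interactions: int         # votes plus comments


def active_order(threads):
    # Forum-style "bump": the newest comment wins, regardless of thread age.
    return sorted(threads, key=lambda t: t.last_comment_hour, reverse=True)


def rising_order(threads, now_hour):
    # "Rising"-style: interactions per hour of age, so young busy threads win.
    def rate(t):
        age = max(now_hour - t.created_hour, 1.0)  # clamp so very new posts don't divide by ~0
        return t.interactions / age

    return sorted(threads, key=rate, reverse=True)


threads = [
    Thread("Two-day-old megathread", 0.0, 47.9, 300),
    Thread("Fresh post, some replies", 46.0, 47.0, 20),
    Thread("Brand new, untouched", 47.5, 47.5, 0),
]
now = 48.0
by_active = active_order(threads)
by_rising = rising_order(threads, now)
```

The bump sort puts the two-day-old megathread first because someone just commented on it, while the rate-based sort puts the fresh post with a few replies first.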

I'm on jerboa, and like others are saying "active" is pretty ass and 'hot' is okay. My main actual complaint is the inability to sort comments! Please God, I'm tired of seeing the rare hateful message with 1/2 up votes on top, and then a more normal good comment with 50 up votes beneath it. Going to be a real problem for onboarding people and optics.

Yup, this is my biggest issue with Jerboa as well. Using the browser, it has a comment sort but it also has some scrolling bug that was making me unable to read headlines or click on posts haha.

There is an option to hide posts that you've already read in settings, I know that's not exactly the same thing, and I haven't tested it out, but that might help solve the problem of seeing the same posts every time we open the app

I use "Top Day" and seem to be quite happy with it.

I just switched to this and it seems to be what I expect from 'Hot'. Hot seems to include popular posts from weeks to months ago and that's just not the definition of hot, really.

Yeah, I'm feeling a bit lost here too. I've been using mostly 'hot' (found this post using it), but there's also the issue with dynamic posts, that suddenly there's a cascade of posts being loaded and you lose where you were on the feed.

It is unintended, a bug that is already known and will possibly disappear when support for websockets is removed (i.e. only plain HTTP traffic from your web browser to the server will remain).

Thanks for that info, I just complained about this issue in a different thread. Guess I'll just have to be more patient during this phase.

My biggest issue is that posts stream in regardless of the sorting option you have picked. So, even if I am sorting by 'hot' posts I'll get a stream of posts that were made less than a minute ago.

Ranking is hard. My biggest gripe with this however, is that the front page doesn't seem to cache my filter/sorting options between page reloads. I've resorted to replacing the anchor href on the logo with my own preferred pre-filtered search result href, via JS injection.

I think you can set default sort and subbed/local/all in settings

Oh you're absolutely right! Thank you! I swear I checked for it a couple of days ago 😅

I'm a bit baffled - I love the active sort so far. Sure, it's different from reddit, but in a good way. It means the exact timing of a post, to line up perfectly with the peak active hours of U.S. users, or similar, isn't critical anymore, because a post can go relatively ignored for a day or two and then get picked up and filled with comments and discussion.

It gives people more time to get around to reading a long linked article or watching a linked long video first, before commenting, too, while still leaving them able to participate in discussion after.

It means that people who comment "late" still get replies. Commenting even slightly late on reddit and not getting any replies or votes felt awful, like you were talking to a wall. And the people who sorted by new and commented first tended to stick to the top of threads because they accumulated more upvotes by sheer force of time.

I hope we can avoid re-creating the constant, over-hurried content-churn of reddit, and keep this more patient feel.

That said, I'll concede that active sort is best for discussion posts, like on Chat or Gaming (e.g. threads that ask people to share favorite xyz games and so on), or those based around longread articles, and not necessarily as good for Breaking News. But even then... I think it does us good to allow time to discuss news posts, too, and active sort does that.

Tldr I use active sort as default for everything and I absolutely love it like this, please don't get rid of it just to make the site more like reddit.

Maybe some kind of bathtub curve could be applied. Lots of activity at the start is weighted because the topic is exciting, but that dies down for a bit, and then if someone picks up the thread much later, it gets more of a bump because the topic has longevity.

Though there’s also the issue of two users basically having an ongoing discussion that’s gotten off track, but is keeping a stale post alive.

Can someone explain why if I choose All/Hot the first page will look "normal" for a bit, but then it starts updating/scrolling constantly? Like I can't keep my place, new items are continuously being added and I can't even finish reading a headline. I don't see any settings to prevent that. What am I doing wrong?

I'm not sure about the algos, but I'd kill if it would at least remember my choices between page refreshes.

Although it's a little less convenient than just remembering across refreshes, you can set a default sort in your profile if there's a particular sort you like.

That being said I find myself hopping between Top Day and Hot so I can only start on one or the other. Would be nice if it just remembered.

Remembering “subscribed” instead of “local” would be nice too.

I want to make 100% sure that in “Hot” you don’t just see big communities. I want to always see that one post from that one community with 1 user right at the top. So they should weight by community size.
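One way to sketch "weight by community size" is to divide the score by the community's user count raised to a tunable exponent. The function name, the `alpha` parameter and the numbers below are all hypothetical, not anything Lemmy is known to do.

```python
def size_adjusted_score(score: int, community_users: int, alpha: float = 1.0) -> float:
    """Normalise a post's score by community size.

    alpha = 0.0 leaves the raw score untouched (big communities dominate);
    alpha = 1.0 gives score per user (tiny communities can reach the top).
    """
    users = max(community_users, 1)  # guard against empty or unknown communities
    return score / users ** alpha


# (title, score, users in the community) - illustrative numbers only
candidates = [
    ("Big community post", 500, 10_000),
    ("Mid community post", 40, 200),
    ("One-user community post", 1, 1),
]


def order(alpha: float) -> list[str]:
    ranked = sorted(
        candidates,
        key=lambda c: size_adjusted_score(c[1], c[2], alpha),
        reverse=True,
    )
    return [title for title, _, _ in ranked]


per_user_order = order(1.0)
raw_order = order(0.0)
```

With `alpha = 1.0` the single upvote in the one-user community ranks first; values in between would be a compromise so that tiny communities don't take over the whole feed.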

I wish I could change the default comment sorting

Also defaulting to "All" communities would be nice :)

That's configurable. I've switched mine to Subscribed. I rarely ever look at All.

Biggest problem I think right now is how it auto-appends new posts to the top - browsing /new/ is a simply impossible task. Maybe just tell me there are new posts to fetch and let me refresh when I'm ready? Right now I can't read stuff before it yeets off the bottom of the page
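The "tell me there are new posts and let me refresh when I'm ready" behaviour could look something like this. It's a toy model with made-up names (`Feed`, `receive`, `banner`, `refresh`), not how the Lemmy UI is actually structured.

```python
class Feed:
    """Buffers incoming posts instead of pushing them straight into the view."""

    def __init__(self, visible=None):
        self.visible = list(visible or [])  # posts currently rendered, newest first
        self.pending = []                   # posts received but not yet shown

    def receive(self, post):
        # New posts wait in the buffer; the rendered list does not move.
        self.pending.append(post)

    def banner(self) -> str:
        # Text for a "new posts available" notice; empty when nothing is waiting.
        count = len(self.pending)
        return f"{count} new post(s) - click to load" if count else ""

    def refresh(self):
        # Only an explicit user action moves buffered posts into the view.
        self.visible = self.pending[::-1] + self.visible
        self.pending = []


feed = Feed(visible=["a"])
feed.receive("b")
feed.receive("c")
before = list(feed.visible)
notice = feed.banner()
feed.refresh()
after = list(feed.visible)
```

The page stays put while posts arrive, and the reader decides when the list is allowed to change.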

So I think it's just growing pains of the software still being at a pretty early stage; I agree it's not ideal right now but I'm fairly confident it'll get worked out. That said, there's one option that I'd really like that may not be in the works: I'd like the ability for posts I've seen to get bumped way down in the ranks, so when I refresh the front page it's mostly new stuff. Mostly the reason people go next -> next -> next is from wanting more stuff; it'd be nice if we could cut out the middleman and just show them new stuff.

The "Top Day" sorting option does this, but posts fall off a cliff rather than falling off gradually. My understanding is that they'll remain on the page from hour 0 to hour 23, but then completely disappear starting in hour 24.

Instead of that, it would be ideal to implement a mathematical formula that pushes posts higher in the rankings with every upvote, comment, or view they get, but pushes them lower in the rankings with every additional hour passed. You have to tweak the specific parameters of that formula to get it right, but it essentially forces posts off the page after enough time has passed, while introducing new posts to replace the old. Unlike the "Top Day" sort where things are a step function, the idea with this is to make it gradual so that a popular post falls from #1 overall down to #2, then #3, etc. over the course of a day.
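Here is a small sketch contrasting the two shapes described above: a 24-hour cliff versus a gradual decay. The formula and the parameter values (`comment_weight`, `gravity`, the `+ 2` offset) are illustrative assumptions that would need tuning; they are not claimed to be what Lemmy uses.

```python
def top_day_score(votes: int, age_hours: float) -> float:
    # Step function: full weight for the first 24 hours, then nothing at all.
    return float(votes) if age_hours < 24 else 0.0


def decayed_score(
    votes: int,
    comments: int,
    age_hours: float,
    comment_weight: float = 2.0,  # assumed: a comment counts for more than a vote
    gravity: float = 1.8,         # assumed: higher values make posts sink faster
) -> float:
    # Engagement pushes the score up; every passing hour pulls it down smoothly.
    engagement = votes + comment_weight * comments
    return engagement / (age_hours + 2) ** gravity


# The same 100-vote post sampled at a few ages.
cliff = [top_day_score(100, h) for h in (1, 23.9, 24)]
smooth = [decayed_score(100, 0, h) for h in (1, 12, 30)]
```

The first list drops straight from 100 to 0 at the 24-hour mark, while the second shrinks steadily and never hits a hard cutoff, so a popular post slides down the page instead of vanishing.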

I always sort by "New Comments". "Active" just gives me the same posts over and over again.

I think it's because a few of the posts appear at the top for everyone, so people keep adding comments to those, which "bumps" them to the top of the feed like it used to work with forums back in their heyday 15-20 years ago.

And that made sense for Lemmy when it was a lot lower volume than it is now. But Lemmy has grown a bit in the last few days and there's more content to deal with now.

They need mark as read too. I keep seeing the same stuff.

If it helps, there's a "Show Read Posts" toggle in your account settings- I believe "Read" is based on having ⬆️ / ⬇️ed the post.

Hopefully the algorithm will improve over time - compared to the site which shall not be mentioned, I find that Hot or Active tend to show the same articles for days (despite there being plenty of newer content). Top Day works well the first time I log in for the day, but then that gets stale too. On the front page I feel like I need more like a "Top Hour" option ... so I can see what is new without the "drinking from the fire hose" that is "new".

These are the early days. I'm not unhappy with where we are, I just hope this remains a work in progress and improves.

Agreed. Something like Top for the last 4 hours would be super easy to implement because Top for the last day already exists (just change 24 hours to 4 hours in the code that fetches comments). However, for those that are used to checking the site multiple times in a day, you don't want to get served up the same content every time you check. Top for the past 4 hours would seemingly be a decent balance between giving posts that have some type of traction while not giving posts that are stale.
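As a sketch of why this is a small change: if "Top" is just "filter by a time window, then sort by score", the window length is a single parameter. The post dictionaries and the `top_within` helper are hypothetical; Lemmy's real query is not shown in this thread.

```python
from datetime import datetime, timedelta, timezone


def top_within(posts, hours, now=None):
    """Return posts published in the last `hours` hours, highest score first."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=hours)
    recent = [p for p in posts if p["published"] >= cutoff]
    return sorted(recent, key=lambda p: p["score"], reverse=True)


# Fixed reference time so the example is reproducible.
now = datetime(2023, 6, 15, 12, 0, tzinfo=timezone.utc)
posts = [
    {"title": "A", "score": 5, "published": now - timedelta(hours=1)},
    {"title": "B", "score": 40, "published": now - timedelta(hours=3)},
    {"title": "C", "score": 200, "published": now - timedelta(hours=10)},
]

top_4h = [p["title"] for p in top_within(posts, 4, now)]
top_day = [p["title"] for p in top_within(posts, 24, now)]
```

Switching from 24 to 4 hours drops the ten-hour-old post that would otherwise sit at the top all day.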

I'd really like to see an "Unread" filter that shows posts that have not already been up/downvoted or clicked

It is already an option, go to profile settings and unmark "Show Read Posts".

Sorting by new comments gives posts with no comments! Pretty big annoyance for me tbh.

A lot of the discussion around sorting specifics makes me wonder if a plugin-like system for user created sorting algorithms would be useful.

It could allow you to curate your own feed in a way, based on age, activity, filters, basically any post metadata. The algorithms could be shared or maybe even federated through lemmy itself. I have a suspicion this would be closer to "neat" than "worth it" though. This is really just a brain dump of a random idea.
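For what it's worth, the core of such a plugin system can be tiny: a registry of named scoring functions over post metadata. This is only a toy sketch; the `SORTERS` registry, the `register` decorator and the metadata keys are invented for illustration.

```python
from typing import Callable

# Registry mapping a sort name to a function that scores one post (higher = earlier).
SORTERS: dict[str, Callable[[dict], float]] = {}


def register(name: str):
    """Decorator that adds a user-defined scoring function to the registry."""
    def decorator(fn: Callable[[dict], float]):
        SORTERS[name] = fn
        return fn
    return decorator


@register("new")
def by_age(post: dict) -> float:
    return -post["age_hours"]  # younger posts get a higher key


@register("top")
def by_score(post: dict) -> float:
    return post["score"]


@register("chatty")
def by_comments_per_hour(post: dict) -> float:
    return post["comments"] / (post["age_hours"] + 1)


def sort_feed(posts: list[dict], name: str) -> list[dict]:
    """Order posts with whichever registered algorithm the user picked."""
    return sorted(posts, key=SORTERS[name], reverse=True)


posts = [
    {"id": 1, "age_hours": 1, "score": 3, "comments": 4},
    {"id": 2, "age_hours": 10, "score": 50, "comments": 11},
    {"id": 3, "age_hours": 0, "score": 0, "comments": 0},
]
```

Sharing an algorithm would then amount to sharing one small function, though running other people's code safely is a separate problem this sketch ignores.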

Reddit's top hourly was one of my favorite ways to browse. They need to implement that here. Just show me the top posts from the last hour.

You can do Top Day with Lemmy now. Maybe not enough content yet for hourly.

There's always enough content for top hourly. Top hourly is not enough

And the way it automatically shows new posts added at the top makes using it a little bit janky

Personally I usually sort by "hot", works good for /all, but mixed results on subscribed

So far hot has worked pretty well for me. Haven't really run into the issues you mentioned with posts not having that many interactions near the top of hot, unless it's from a smaller community. My main problem I've had is when opening Lemmy after a bit or refreshing I'll get the same posts over again even if they're marked as read, which is annoying and usually means I end up sorting by new. Would be nice if there was an option to hide already read posts. (Or if there already is one I'd be very interested in knowing where it is, also I use Jerboa mainly on Android so not sure if that changes anything either.)

Definitely an option on jerboa I think it's towards the bottom of settings.

Just found it! Wasn't in the settings menu I was expecting, was under the options for my account, but I found it. Thanks for letting me know about it!

Something feels wrong but I normally just browsed "best" on reddit and that was a curated list of top subs. So it's weird to me to sort Lemmy by "Hot" and get a ton of posts from random communities with no upvotes or replies. Which they are obviously not HOT as no one has engaged with them except the OP.

I feel like something is "off" with the way it is curating things.