Can Lemmy posts be indexed by Google or other search engines?

GammaScorpii@lemmy.world to No Stupid Questions@lemmy.world – 144 points –

One of the best things about reddit was finding answers, or other users with the same problem as you. Since Google no longer really helped with that and instead insisted on giving you business results, the best practice was to enter your search terms followed by 'reddit', and you'd find your answer.



I'm working on a specialized search engine just for the fediverse. https://github.com/marsara9/lemmy-search

If anyone wants to help out, feel free to reach out, but I hope to have something ready to release soon.

The idea with my version is that it'll search as much of Lemmy / the fediverse as it can and you can select the preferred instance that you want to open any link with.

If you are looking to return relevant, well-ranked results based on freeform queries, you'd be better off indexing into something like Elasticsearch. Otherwise you'll be reinventing solutions to well-understood problems, with stemming as a very basic example.
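To make the stemming point concrete, here is a toy sketch of why exact word matching misses obvious hits. The `naive_stem` helper is hypothetical and deliberately crude; it is not how Elasticsearch or any real stemmer (such as a Porter stemmer) works, it only strips a few common English suffixes.

```python
def naive_stem(word: str) -> str:
    """Toy stemmer (illustrative only): strip a few common English suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


post_words = {"searching", "lemmy", "posts"}

# Exact matching: a query for "search" does not hit "searching".
exact_hit = "search" in post_words

# Stemmed matching: both sides are reduced to the same root first.
stemmed_hit = naive_stem("search") in {naive_stem(w) for w in post_words}
```

Here `exact_hit` is false while `stemmed_hit` is true. A production search index handles many more cases (irregular forms, other languages), which is the argument for not hand-rolling it.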

For the initial release the search is still fairly basic, but A LOT better than the built-in search here.

Right now I just check whether ANY of the individual words in the query match ANY of the words in the post title or body, and then rank based on the number of upvotes that the post has.

Future versions may look at using Elasticsearch, etc. But for the MVP it just looks at the number of hits plus the score of the post: I assume the higher the score, the more trustworthy the post, and obviously the more words that match your query, the more relevant the post is.
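A minimal sketch of that kind of ranking, for anyone curious. This is not the code from the lemmy-search repo; the `Post` class, `tokenize`, and `search` are made-up names, and sorting by hit count first and post score second is just one assumed way of combining the two signals.

```python
import re
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    body: str
    score: int  # net upvotes (hypothetical field name)


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(posts: list[Post], query: str) -> list[Post]:
    """Return posts matching ANY query word, best match first."""
    terms = tokenize(query)
    ranked = []
    for post in posts:
        words = tokenize(post.title + " " + post.body)
        hits = len(terms & words)  # how many distinct query words appear
        if hits:
            ranked.append((hits, post.score, post))
    # Assumption: more matching words wins; the post score breaks ties.
    ranked.sort(key=lambda r: (r[0], r[1]), reverse=True)
    return [post for _, _, post in ranked]


posts = [
    Post("Best Linux distro", "I like Debian", 10),
    Post("Linux gaming", "Steam on my Linux distro works", 50),
    Post("Cooking", "Pasta tips", 100),
]
results = search(posts, "linux distro debian")
```

With this data the Debian post comes first (three matching words), the gaming post second (two), and the cooking post is dropped despite having the highest score, since it matches nothing. Word order in the query does not matter here, because both sides are compared as sets.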

How is this different from just searching for posts on the original "seed instance"? Presumably you're crawling through everything on all of the instances that it's aware of, as opposed to the Lemmy built-in search which would only search communities that have a subscriber?

Search isn't working well for me at all, I never find anything.

So the built-in search here is VERY basic and slow. For example, if I search for "How this is", it wouldn't find your comment here, as the word order has to match as well.

One of my main goals is that you'll be able to use my search engine the way you would use Google with 'reddit' added to the end of the query. Then, from the search results, the link you open would open in your preferred instance instead of the instance Google happened to crawl. Lastly, if you want to Google Lemmy posts today, you have to add every known Lemmy instance to your search query, and even then Google will still open the link on whichever instance it happened to find it on rather than the instance you have an account on.
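To show what "add every known Lemmy instance to your search query" looks like in practice, here is a rough sketch that builds such a Google query using the `site:` operator joined with `OR`. The function names and the two-instance list are only examples; a real list of instances would be far longer, which is exactly the problem.

```python
from urllib.parse import quote_plus


def build_site_query(terms: str, instances: list[str]) -> str:
    """Append one site: filter per instance, joined with OR."""
    sites = " OR ".join(f"site:{host}" for host in instances)
    return f"{terms} {sites}"


def google_url(query: str) -> str:
    """URL-encode the query into a Google search link."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# Example instance list (an assumption; extend with whatever instances you know).
instances = ["lemmy.world", "lemmy.ml"]
query = build_site_query("jellyfin setup", instances)
url = google_url(query)
```

Here `query` is `jellyfin setup site:lemmy.world OR site:lemmy.ml`. Even with this, each result still opens on whichever instance Google crawled, not the one you have an account on.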

I’ve been testing this and it’s the real deal!

Does it only index Lemmy instances? What about kbin and others?

If Kbin federates with the "seed" Lemmy server it'll pick up the posts that way, but at the moment you'll only be able to open links to Lemmy instances.

In the future I hope to have it working with Kbin and others as well.
