marsara9

@marsara9@lemmy.world
3 Posts – 152 Comments
Joined 1 year ago

"Buy Me A Coffee"

I'm working on a specialized search engine just for the fediverse. https://github.com/marsara9/lemmy-search

If anyone wants to help out, feel free to reach out, but I hope to have something ready to release soon.

The idea with my version is that it'll search as much of Lemmy / the fediverse as it can and you can select the preferred instance that you want to open any link with.


So I've been working on a solution for this.

As I see it, Google and others are going to have a hard, if not impossible, time incorporating the fediverse, given that the same content can exist on multiple servers.

So I'm working on a search engine built specifically for this, for Lemmy at least. It'll take you to whatever your preferred instance is when you tap on a search result.

I hope to have an MVP up and running in a few more days.


IMHO federation doesn't bring any real benefits to git and introduces a lot of risks.

The git protocol, if you will, already allows developers to back up and move their repositories as needed. And the primary concern with source control is having a stable and secure place to host it. GitHub already provides that, free of charge.

Once you introduce federation, how do you control who can and cannot make changes to your codebase? How do you ensure you maintain access if a server goes down?

So while it's nice that you can self host and federate git with GitLab, what value does that provide over the status quo? And how do those benefits outweigh the risks outlined above?


I don't mean to keep self promoting but: https://github.com/marsara9/lemmy-search

It's still a work in progress, but it's going to be a search engine specifically built for Lemmy, and maybe eventually the entire fediverse.

Idea is that you'll be able to search just like you normally would in Google but

  1. the links it shows you would take you to your preferred instance
  2. it will search as many Lemmy instances as it can find, so it won't find anything outside of the fediverse.
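
To illustrate the "preferred instance" idea for community links, here is a minimal sketch. It relies on the `/c/<community>@<instance>` URL form that the Lemmy web UI uses for remote communities; the function name `community_link` is made up for this example and is not taken from the lemmy-search code.

```python
def community_link(home_instance: str, community: str, origin_instance: str) -> str:
    """Build a community URL that opens on the reader's home instance.

    Hypothetical helper for illustration only.
    """
    if origin_instance == home_instance:
        # The community is hosted on the home instance, so no suffix is needed.
        return f"https://{home_instance}/c/{community}"
    # A remote community is addressed as name@origin on the home instance.
    return f"https://{home_instance}/c/{community}@{origin_instance}"


print(community_link("lemmy.ml", "nostupidquestions", "lemmy.world"))
```

Links to individual posts are harder than this, because a post's numeric ID differs from instance to instance.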

Thanks for the shout-out.

But FYI I've run into some bugs that are preventing new content from being indexed. So you won't see anything new (from about a week ago) until I can find a new method to fetch new posts.


With ActivityPub, all of the primary IDs contain the domain of the hosting server. So if you lose your domain, none of the other instances know that you're the authority on those communities, posts, comments or users. So essentially federation breaks with all of the old data.
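
A small sketch of why the domain matters: an ActivityPub ID is a URL, so the authority for an object is simply the host part of that URL. The ID below is a made-up example and the helper names are hypothetical.

```python
from urllib.parse import urlparse


def authority_of(ap_id: str) -> str:
    """Return the host that an ActivityPub ID names as its authority."""
    return urlparse(ap_id).netloc


def is_authoritative(server_domain: str, ap_id: str) -> bool:
    """A server can only speak for objects whose ID lives on its own domain."""
    return authority_of(ap_id) == server_domain


post_id = "https://lemmy.world/post/12345"  # made-up ID, for illustration only
print(authority_of(post_id))
print(is_authoritative("lemmy-world.example", post_id))
```

The same server under a new domain fails this check for every object it created earlier, which is the breakage described above.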


Yes but that search doesn't take you to the instance that you are logged into already. Which is one of my main goals with this site. While that did give me the inspiration for this and has the power of Google behind it, it lacks knowledge about how the fediverse actually works.

https://www.search-lemmy.com

http://www.github.com/marsara9/lemmy-search

Just add community:!nostupidquestions@lemmy.world at the end of your query.

I'm doing tests in the next couple days. But I'm trying to build a search engine specifically for Lemmy.

  • It should in theory work similar-ish to Google / Bing.
  • You can filter by instance, community or author.
  • It only indexes Lemmy posts and it won't keep duplicates.
  • It'll also open any link you find in your instance.
  • You'll be able to self host it and point it to any instance you want as well.

I'm hoping I can open it to the public in a week or so.


Not yet but I can add this feature


Ya, now if everyone can stop finding bugs! So I can take some time off. /jk


https://www.search-lemmy.com/

https://www.github.com/marsara9/lemmy-search

It only works for Lemmy, for now. And please feel free to post any feature requests or bugs to GitHub as it's still fairly new.

You can also check my comment/post history for more details.

Check out my post history.

But try https://www.search-lemmy.com. It has a few bugs, but it should work for you, especially if you set your home instance to something large like lemmy.world.

Edit: if you want to help contribute: https://www.github.com/marsara9/lemmy-search

Unless you have an account there's no easy way to get access to the content on the page. Once you have an account there's technically nothing stopping you from just saving the HTML file to your computer.

Something else you can try though, assuming you don't have an account, is to just turn off JavaScript. If the site lets you partially load the content and then asks you to create an account to read more, they usually just block the content by having JavaScript add an opaque overlay. With JavaScript disabled, obviously it's not there to add the overlay and you're able to keep reading.


It might not have been crawled yet. The search engine will periodically search for new content but this isn't instant. So it may take a day or two to find it.


I've already got some complaints about that. You can see one of the issues raised on GitHub.

At the moment, I'm only picking up Mastodon posts that are federated to Lemmy, but you can't choose Mastodon as a preferred instance yet. If and when I decide to add Mastodon support, I'll reach out to the admins over there to get feedback first.

Edit and note to any server admin: If you want to block the crawler from hitting your site, just add lemmy-search to your robots.txt and crawling will be prevented. But this doesn't stop cross-federation posts from being picked up on another instance.
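
As a sketch of what such a rule could look like and how a crawler would honor it, the snippet below uses Python's standard `urllib.robotparser`. The exact robots.txt contents are an assumption (a `User-agent: lemmy-search` group with a blanket `Disallow`); only the `lemmy-search` name comes from the note above.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents; adjust the Disallow path to taste.
ROBOTS_TXT = """\
User-agent: lemmy-search
Disallow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed(ROBOTS_TXT, "lemmy-search", "https://example.org/post/1"))
print(allowed(ROBOTS_TXT, "some-other-bot", "https://example.org/post/1"))
```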


Best one I heard so far has just been "Lemmy Search" or "Let me Search".

You'll be able to get an initial set of data from federation but you won't get any data beyond that.

Federation works via "pushes", but since your instance would be behind a VPN the other instances in the federation wouldn't be able to see it to push content updates.

Sadly not yet. I'm in desperate need of a frontend dev, as the HTML/JavaScript needs some serious work. But if I can't find someone soon I'll see what I can get put together and get it up and running soon enough.

Not necessarily. I have several servers behind Cloudflare for free. I'm just limited on analytics, some advanced firewall settings, advanced cache management and maybe a few other features that I don't use. But the basic service is free.

https://www.cloudflare.com/plans/free/


See one of my other replies. But that was a thought originally. Just hook into the original database instead of crawling using the APIs. Problem is, the table structure required to search is much different than that of a community forum. At least if you want to do searches quickly. It takes me almost 5-10 seconds just to process 50 posts at the moment, and I'm doing those in batches... but ya maybe in the future I can talk to the Lemmy devs and see about merging these two projects?

For the initial release the search is still fairly basic, but A LOT better than the built in search here.

Right now I just check whether ANY of the individual words in your query match ANY of the words in the post title or body, and then rank based on the number of upvotes that the post has.

Future versions may look at using Elasticsearch, etc... But for MVP it just looks at the number of hits + the score of the post, as I assume the higher the score the more trustworthy the post, and obviously the more matches to your query the more relevant the post is.
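
The ranking described above can be sketched roughly as follows. This is an illustration only, not the project's code: the `Post` class and function names are made up, and sorting by hit count first and score second is just one assumed way of combining "hits + score".

```python
import re
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical minimal post record for this sketch.
    title: str
    body: str
    score: int


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of distinct words."""
    return set(re.findall(r"\w+", text.lower()))


def rank(query: str, posts: list[Post]) -> list[Post]:
    """Keep posts that match at least one query word, best match first."""
    query_words = tokenize(query)
    ranked = []
    for post in posts:
        hits = len(query_words & tokenize(post.title + " " + post.body))
        if hits:
            ranked.append((hits, post.score, post))
    # Assumption: more matching words wins, then the higher score breaks ties.
    ranked.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [post for _, _, post in ranked]


posts = [
    Post("Lemmy search tips", "", 5),
    Post("Search", "how to search lemmy", 50),
    Post("Cats", "meow", 999),
]
print([p.title for p in rank("lemmy search", posts)])
```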

I just got a PR merged today that might help with this. I'll start experimenting with it more over the next week or so.

Basic theory is that I can detect at least Lemmy posts in the comment bodies and then rewrite those to your local instance. The primary question is going to be performance, as remote network calls will be necessary.
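
A rough sketch of that rewrite step, under a few assumptions: post links look like `https://<host>/post/<id>`, and a post's numeric ID differs between instances, which is why a lookup is needed. The `resolve` callable stands in for that remote lookup and is hypothetical; it is injected so the example runs without any network access.

```python
import re
from typing import Callable, Optional

# Assumed shape of a Lemmy post link: https://<host>/post/<numeric id>
POST_URL = re.compile(r"https://([\w.-]+)/post/(\d+)")


def rewrite_post_links(
    body: str,
    home_instance: str,
    resolve: Callable[[str], Optional[int]],
) -> str:
    """Rewrite remote post links so they open on home_instance.

    resolve(url) should return the post's ID on the home instance, or None
    if the home instance does not know the post. In practice this is the
    slow part, since it implies a network call per link.
    """

    def replace(match: re.Match) -> str:
        if match.group(1) == home_instance:
            return match.group(0)  # already a local link
        local_id = resolve(match.group(0))
        if local_id is None:
            return match.group(0)  # unknown locally, leave the link alone
        return f"https://{home_instance}/post/{local_id}"

    return POST_URL.sub(replace, body)


known = {"https://lemmy.ml/post/42": 9001}  # made-up mapping for the demo
print(rewrite_post_links("See https://lemmy.ml/post/42", "lemmy.world", known.get))
```

Caching the resolved IDs would be one way to limit the cost of the remote calls.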


A couple of options in my opinion, as I just did this myself:

You can use the CLI tool to "upload" them. You can even do this from the server itself. So upload times would be as fast as your network card can process or however fast your server is, whichever is slower. It does require that you create an API key for the user in question though.

Otherwise you can create an external library and link that to your account. Now Immich will still index this library but it won't move or manage the actual files. I'm not sure though if it looks at those files for duplicates (i.e. if you try and upload the same photo from your phone to the server). This external library will also prevent deleting photos as well, FYI.

There might be other options that I'm not aware of, as I've only been using Immich for about a month now.

Edit: link to the CLI documentation: https://immich.app/docs/features/command-line-interface/

Look up Overseerr. https://overseerr.dev/


Yep that's the new idea. The sad part is that with this method there's no way to get historical data. Only new posts. So if a server goes down, gets DDoSed, etc... I'll lose posts forever.

Also building an ActivityPub implementation from scratch isn't trivial either. So that'll take some time.

I've got a few other ideas I'm playing with as well. Like just assuming that internal post IDs are all sequential and literally fetching them one by one. Or maybe some combination of both?
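
The sequential-ID idea could look something like the sketch below. It assumes IDs increase by one and that gaps (deleted or missing posts) are short; `fetch_post` is a hypothetical callable that returns a post or `None`, and the `max_misses` cutoff is an arbitrary choice for the example.

```python
from typing import Callable, Iterator, Optional


def crawl_sequential(
    fetch_post: Callable[[int], Optional[dict]],
    start_id: int,
    max_misses: int = 20,
) -> Iterator[dict]:
    """Yield posts by walking IDs upward from start_id.

    Stops after max_misses consecutive IDs return nothing, on the
    assumption that the end of the ID range has been reached.
    """
    post_id = start_id
    misses = 0
    while misses < max_misses:
        post = fetch_post(post_id)
        if post is None:
            misses += 1
        else:
            misses = 0  # a hit resets the gap counter
            yield post
        post_id += 1


# Stand-in for an instance: a dict lookup instead of a network call.
store = {1: {"id": 1}, 2: {"id": 2}, 4: {"id": 4}}
print([p["id"] for p in crawl_sequential(store.get, 1, max_misses=3)])
```

Remembering the last ID seen would let a later run pick up where this one stopped.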

  1. Yes most trackers have something on their website to let you know what your ratio is, what you're downloading and how long you've been seeding those files.
  2. With the trackers I'm familiar with yes -- seeding for 9d 23h 59m and 59s is the same as seeding for 0s. You'll still get tagged with an HnR (Hit and Run).
  3. You can shut down as much as you like. But, again, the trackers that I'm familiar with have a cap on the number of HnRs you can have on your account. So you might have action taken against you if you're seeding 5 different torrents and decide to shut down.
  4. Don't know.
  5. The rest don't appear to be questions so not sure how to respond.

I'm guessing Samsung. Google seems to show a handful of posts online from Samsung users asking about it. Or at least asking what it is, and how to uninstall it.

Pixel doesn't seem to have it pre-installed anyway.

I can't say anything about other manufacturers though.

Yep and I'm one of them. Go look me up on Reddit and I think I have maybe 20 posts over the 14+ years I was on the site. ...joined Lemmy and immediately got frustrated that I couldn't find anything. So I figured I'd take a crack at it. Especially since I couldn't see how Google would ever be able to link me to my instance. Let alone make it easy to search the entire fediverse without having to write out every possible site, with new ones popping up every day.

I'm also running Ubuntu as my main machine at home. (I have a Mac and do Android development for my day job).

But at home, I do a lot of website and backend dev.

  1. Code in VSCode
  2. Build using docker buildx
  3. Test using a local container on my machine
  4. Upload the tested code to a feature branch on git (self hosted server)
  5. Download that same feature branch on a RaspberryPi for QA testing.
  6. Merge that same code to develop.
  6a. That kicks off a CI build that deploys a set of docker images to DockerHub.
  7. Merge that to main/master.
  8. That kicks off another CI build.
  9. SSH into my prod machine and run docker compose up -d

But the API is instance specific.

The only ways I see this working is one of:

  1. You'd have to type in your instance name upon tapping the button
  2. You could have a settings page that lets you set your instance name, and then Twitch (or whatever service you're using) would store that alongside your other user data.
  3. It just assumes one of the most popular instances.

But without some central registry there's no way to know what is your home instance.

Edit: something like what Android does for Activities could work as well. But not sure how to handle that on a PC. ... In Android they could just start a generic Intent to view Lemmy and it could then launch whichever app you have installed to handle that intent.

Sounds good, and thanks for the hard work!

Let me introduce you to https://sense.com/ and help you create a new obsession.

P.s. it's not perfect as it uses machine learning to determine your appliances and it can't find electronics like your computer or TV but it'll help you find what might be chipping away at your power bill.


Thanks. If you do some digging you can find the project on GitHub but note that it's a work in progress still. The UI is lacking and it's rough around the edges but it's "working". And I still need to do some optimizations on the crawler itself, etc....

It's also going to be completely self-hostable just like Lemmy, etc...


How do you account for the duplicate Riker in TNG? Who's the real one and where did the extra matter come from then to assemble William vs Tom?

(It's been a long time since I've seen that episode so I don't remember if they covered that bit on-screen)

A similar question could be raised for the Rascals episode...


"some search string instance:lemmy.world".

Keywords are:

instance:<instance name>

community:!<community name>@<instance name>

and

author:@<author name>@<instance name>.
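
To show how these keywords could be separated from the free-text part of a query, here is a minimal parsing sketch. It is not the site's actual parser: the `Query` class and `parse_query` function are made up, and it assumes each keyword appears at most once and that its value contains no spaces.

```python
import re
from dataclasses import dataclass
from typing import Optional

# A filter is one of the three keywords, a colon, then a run of non-space text.
FILTER = re.compile(r"(?<!\S)(instance|community|author):(\S+)")


@dataclass
class Query:
    # Hypothetical container for a parsed search query.
    text: str
    instance: Optional[str] = None
    community: Optional[str] = None
    author: Optional[str] = None


def parse_query(raw: str) -> Query:
    """Split a raw query into free text plus optional keyword filters."""
    filters = {key: value for key, value in FILTER.findall(raw)}
    # Remove the filters, then collapse the leftover whitespace.
    text = " ".join(FILTER.sub(" ", raw).split())
    return Query(text=text, **filters)


print(parse_query("some search string instance:lemmy.world"))
print(parse_query("cats community:!nostupidquestions@lemmy.world"))
```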

If an instance goes down (permanently), federation of all of the communities hosted by that instance essentially stops. The content that has already been posted remains, but anything new added to those communities only remains on your home instance. The only way for federation to resume is for that instance to come back online with the same domain it started with.


I can't give a timeframe on Kbin yet, as I want to get it as stable as possible with Lemmy first. But I think Kbin will probably be next on my radar as the overall structure of the two platforms is very similar.

Assuming the fediverse becomes mainstream one thing I hope to actually see is that existing company forums start to join the fediverse.

Think if you no longer needed to login to EA's website to post about bugs to the Sims. Or if Prusa's 3d printer community forums could also be found right here... Or any other existing community help forum.

The problem, though, is that in order to get there, Meta and others have to bring the users and essentially show the way first.

It's just cheaper to get a .com than most of the other TLDs. Especially since I already have a registrar that I'm using for other sites.