Yacy integration with Paperless-ngx

aGeN@sh.itjust.works to Selfhosted@lemmy.world – 14 points –

I'm having a play about with Yacy today. It would be great to be able to search through my paperless-ngx documents with it. Is it doable? Has anyone got it working?


You can run a search on Paperless and get JSON output from http://192.168.1.34:8001/api/documents/?query=test, but Yacy can't crawl Paperless?
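For anyone who wants to poke at that endpoint from a script, here is a rough sketch in Python. It assumes the response is a paginated JSON object with a `results` list whose entries carry `id` and `title`, and that the instance accepts an `Authorization: Token ...` header. The function names and the token value are made up for illustration, so check them against your own instance before relying on this.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url, query):
    """Build the full-text search URL used in the post above."""
    params = urllib.parse.urlencode({"query": query})
    return f"{base_url.rstrip('/')}/api/documents/?{params}"


def extract_hits(payload):
    """Pull (id, title) pairs out of a parsed response.

    Assumes a paginated object with a "results" list; missing keys
    fall back to None or an empty string instead of raising.
    """
    return [(doc.get("id"), doc.get("title", "")) for doc in payload.get("results", [])]


def search(base_url, query, token):
    """Run the search over HTTP (token auth is an assumption here)."""
    request = urllib.request.Request(
        build_search_url(base_url, query),
        headers={
            "Authorization": f"Token {token}",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_hits(json.load(response))
```

Calling `search("http://192.168.1.34:8001", "test", "my-token")` would then return a list of `(id, title)` tuples, provided the assumptions above hold. Keeping the URL building and the parsing separate from the network call makes it easy to reuse them in whatever ends up doing the querying.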

Maybe you could add it to SearXNG as its own engine?

That seems like it could work. I saw that SearXNG has a template engine to use. I just need to figure out how to use it with Paperless :)

I'd be curious how well it works if you try it. I kind of want to, but I'm not sure how I feel about letting something unauthenticated (SearXNG) access my Paperless instance with some personal docs in it.

YaCy indexes HTTP content, so if your documents are all reachable via an HTTP interface, they can be indexed.

Paperless will store documents in plain text. Maybe one could write a small webserver or extend Paperless to serve an RSS feed that could be consumed by Yacy.
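A very rough sketch of what such a small webserver might look like, using only the Python standard library. Everything here is an assumption for illustration: the `docs` list of dicts with `id`, `title` and `content` keys, the `/documents/<id>/` link format, and the function names are hypothetical, and you would still need to fill `docs` from wherever Paperless keeps the text.

```python
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_rss(docs, base_url):
    """Render a list of document dicts as an RSS 2.0 feed (bytes)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Paperless documents"
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Plain-text document content for indexing"
    for doc in docs:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = doc["title"]
        # Hypothetical link format; adjust to whatever URL should be indexed.
        ET.SubElement(item, "link").text = f"{base_url.rstrip('/')}/documents/{doc['id']}/"
        ET.SubElement(item, "description").text = doc.get("content", "")
    return ET.tostring(rss, encoding="utf-8", xml_declaration=True)


def make_handler(docs, base_url):
    """Create a request handler class that serves the feed on every GET."""

    class FeedHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = build_rss(docs, base_url)
            self.send_response(200)
            self.send_header("Content-Type", "application/rss+xml; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return FeedHandler


def serve(docs, base_url, port=8080):
    """Serve the feed on localhost only; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), make_handler(docs, base_url)).serve_forever()
```

Running `serve(docs, "http://192.168.1.34:8001")` would expose the feed at http://127.0.0.1:8080/, which could then be handed to Yacy as a feed URL. Binding to 127.0.0.1 is deliberate, since the feed would contain the full text of personal documents without any authentication in front of it.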