For those self-hosting RSS feeds... what options do you have when the source doesn't have an RSS feed?

Showroom7561@lemmy.ca to Selfhosted@lemmy.world – 36 points

With Twitter being worse than ever, I can no longer pull local news and municipal events through Nitter's RSS feature.

Since so many groups have stopped using RSS to deliver news, and have put all their eggs in the social media basket, it leaves a void that can't be replaced by signing up to a dozen newsletters.

Do you guys have any other solutions for maybe scraping websites to generate RSS feeds or something like that?

I'm using FreshRSS. It has web scraping, but it seems to require a lot of manual syntax entry, and it errors out regardless.

Don't know if this will achieve what you want, but I selfhost ChangeDetection.io to check if webpages have been updated, then subscribe to changedetection's RSS feed with FreshRSS.

Interesting option! I don't think it will suit my needs for this particular request, but I do have other uses for it =) Thank you for the suggestion.

Fairly simple using Python locally with no need for a server: use requests-html to get the website's front page, loop through the articles and add each one to a feed object with feedgenerator, then write the feed out as XML to a file.

Obviously this is not simple at all, but it does work. I have been consuming an RSS-free site by RSS every day for the last year. Provided you ensure the guid for each item is its URL, the RSS reader will keep track of what you have seen already, in order, which of course is the magic feature of RSS.
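
For anyone who wants to see the shape of that approach, here is a rough sketch. To keep it dependency-free it swaps in the standard library (html.parser and xml.etree.ElementTree) for requests-html and feedgenerator, so it is not the exact setup described above. The sample HTML, the example.org URL, the assumption that each story is a link inside an <article> tag, and the names ArticleLinkParser and build_feed are all made up for illustration; a real site will need its own selectors.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# Hypothetical front page. In practice this would be the HTML you fetched.
SAMPLE_HTML = """
<html><body>
  <article><a href="/news/road-closure">Road closure on Main St</a></article>
  <article><a href="/news/council-meeting">Council meeting moved</a></article>
</body></html>
"""


class ArticleLinkParser(HTMLParser):
    """Collect (url, title) for each link found inside an <article> tag.

    Assumption: the site wraps each story in <article>; adjust for real markup.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.items = []
        self._in_article = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._in_article = True
        elif tag == "a" and self._in_article:
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links so every item has an absolute URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        # Only collect text while inside a link we are tracking.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append((self._href, "".join(self._text).strip()))
            self._href = None
        elif tag == "article":
            self._in_article = False


def build_feed(html, base_url, feed_title):
    """Return an RSS 2.0 document (as a string) for the links found in html."""
    parser = ArticleLinkParser(base_url)
    parser.feed(html)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = f"Scraped feed for {base_url}"

    for url, title in parser.items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
        # The guid is the article URL, so a reader can tell which items it has seen.
        ET.SubElement(item, "guid", isPermaLink="true").text = url

    return ET.tostring(rss, encoding="unicode")
```

Calling build_feed(SAMPLE_HTML, "https://example.org", "Example local news") gives a string you can write to a file and point your reader at. Run it on a schedule (cron or similar) and the reader should only surface items whose guid it has not seen before.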

For Lemmy (which doesn't have a native RSS feed) I'm using Open RSS. It might be worth entering the sites you're trying to access into that and seeing if it can produce feeds for you.

Lemmy does have RSS feeds; just click the RSS icons that appear in various places.

I usually just resort to web scraping.