‘Reddit can survive without search’: company reportedly threatens to block Google
AnActOfCreation@programming.dev to Technology@lemmy.world – 1224 points – 1 year ago
theverge.com
Speaking of this, what parts of the fediverse have added rules to their respective robots.txt to block crawlers that train generative AI?
https://blog.google/technology/ai/an-update-on-web-publisher-controls/ https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers https://techcrunch.com/2023/09/28/medium-hints-at-a-nascent-media-coalition-to-block-ai-crawlers/
It looks like there are a handful of these lines you'd have to add to robots.txt.
Is there anywhere that keeps a comprehensive list of these?
I've been trying to find a list as well to no avail. The ones I do know are on my own robots.txt, at volcanolair.co/robots.txt
Someone should make a GitHub repo just to make it easier for people to find them all in one place, with sources, and update the list as we get new ones.
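For anyone wondering what those robots.txt lines look like in practice, here is a rough Python sketch that generates them from a list of crawler tokens and checks the result with the standard library's `urllib.robotparser`. The three tokens are ones I believe their operators have published (Google-Extended is the one from the Google post linked above), but treat the list as an illustrative assumption and not a comprehensive one. The helper names `build_robots_txt` and `is_blocked` and the example.org URL are made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Illustrative, NOT comprehensive: tokens believed to be published by their
# operators. Verify each against the operator's own documentation.
AI_CRAWLER_TOKENS = [
    "Google-Extended",  # Google's control token for generative AI use
    "GPTBot",           # OpenAI's crawler
    "CCBot",            # Common Crawl's crawler
]


def build_robots_txt(tokens):
    """Return robots.txt text that disallows every path for each token."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in tokens]
    return "\n\n".join(groups) + "\n"


def is_blocked(robots_txt, user_agent, url="https://example.org/post/1"):
    """Check with the stdlib parser whether user_agent may not fetch url.

    The URL is a placeholder; any path works because the rule is "Disallow: /".
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


robots_txt = build_robots_txt(AI_CRAWLER_TOKENS)
print(robots_txt)
```

Keep in mind robots.txt is only a request: it works for crawlers that choose to honor it. Crawlers not named in the file, such as a regular search crawler, are unaffected by these groups.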