Chad scraper — dramaticcat@sh.itjust.works to Lemmy Shitpost@lemmy.world – 1014 points – 1 year ago
Everyone loves the idea of scraping, no one likes maintaining scrapers that break once a week because the CSS or HTML changed.
I loved scraping until my IP was blocked for botting lol. I know there are ways around it, it's just work though
I successfully scraped millions of Amazon product listings simply by routing through TOR and cycling the exit node every 10 seconds.
That's a good idea right there, I like that
This guy scrapes
lmao, yeah, get all the exit nodes banned from amazon.
That’s the neat thing, it wouldn’t because traffic only spikes for 10s on any particular node. It perfectly blends into the background noise.
Cue Office Space-style error and scrape for 10 hours on each node.
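The exit-node cycling described above can be sketched roughly like this — a hedged, minimal version assuming a local Tor daemon (SOCKS proxy on 9050, control port on 9051), the `stem` library for the control port, and `requests[socks]` for the proxied fetch; the function names and the 10-second period are just taken from the comment, not from any real scraper:

```python
ROTATE_EVERY = 10.0  # seconds per exit node, as in the comment above

def should_rotate(last_rotate: float, now: float,
                  period: float = ROTATE_EVERY) -> bool:
    """Return True once `period` seconds have passed since the last rotation."""
    return now - last_rotate >= period

def new_identity(control_port: int = 9051) -> None:
    """Ask the local Tor daemon for a fresh circuit (and a new exit node)."""
    from stem import Signal
    from stem.control import Controller
    with Controller.from_port(port=control_port) as ctl:
        ctl.authenticate()
        ctl.signal(Signal.NEWNYM)

def fetch(url: str):
    """Fetch a URL through the local Tor SOCKS proxy."""
    import requests
    proxies = {"http": "socks5h://127.0.0.1:9050",
               "https": "socks5h://127.0.0.1:9050"}
    return requests.get(url, proxies=proxies, timeout=30)
```

A driver loop would call `should_rotate()` between requests and fire `new_identity()` when the window elapses. Note that Tor rate-limits NEWNYM signals, so a real 10-second cycle may be throttled in practice.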
You guys use IPs?
Token ring for me baybeee
I'm coding baby's first bot over here lol, I could probably do better
Or in the case of wikipedia, every table on successive pages for sequential data is formatted differently.
Just use AI to make changes ¯_(ツ)_/¯
Here take these: \\
¯_(ツ)_/¯\\ Thanks
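For the inconsistently formatted tables complaint above, one common workaround is to look cells up by header text instead of column position, so tables whose columns are reordered from page to page still parse. A stdlib-only sketch (the sample HTML snippets are invented for illustration):

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect each table row as a list of cell strings (th and td alike)."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._cell = [], None, None
    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = []
    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

def extract(html, wanted):
    """Return rows as dicts keyed by the headers in `wanted`, by name not position."""
    p = TableParser()
    p.feed(html)
    if not p.rows:
        return []
    headers = p.rows[0]
    if not all(w in headers for w in wanted):
        return []  # table is missing a column we need
    idx = {w: headers.index(w) for w in wanted}
    return [{w: r[idx[w]] for w in wanted}
            for r in p.rows[1:] if len(r) == len(headers)]

# Two pages with the same data but swapped column order:
page_a = "<table><tr><th>Year</th><th>Title</th></tr><tr><td>1999</td><td>Foo</td></tr></table>"
page_b = "<table><tr><th>Title</th><th>Year</th></tr><tr><td>Bar</td><td>2001</td></tr></table>"
```

`extract(page_a, ["Year", "Title"])` and `extract(page_b, ["Year", "Title"])` both yield rows with the same keys despite the reordered columns. It won't survive renamed headers, of course — nothing does.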