Discord is a black hole for information
Traditional reasoning says you should prefer open forums like Lemmy that are available and searchable on the open web. After all, you’re posting to help people, and that helps people the most. The platform (like Reddit) may profit off of it, but that’s fine, they’re providing the platform for you to post. Fair deal.
Plus, people coming for high-quality information gives back to the community and the topic. You attract other high-quality contributors, more people use or take part in the topic you are discussing, the platform often improves with the revenue, etc. It’s not perfect, but it worked.
AI scrapers break all that. The company profiting is the AI company, and they give nothing back. Their model just holds all the information in its weights. It doesn’t drive people to the source. Even the platform doesn’t benefit from bot scraping. The addition of high-quality data may improve the model on that topic and thus push people to engage in said topic more, but not much, because of how AIs are trained: while you need some high-quality data, what matters a lot more, especially for lesser-known topics, is the amount of data.
So as more of the world moves to AI models, I don’t really feel like posting on public forums as much, helping the AI companies get richer, even if I do benefit from AI myself.
Article: Billions of scraped Discord messages up for sale
Any server that has had its invite link posted online is guaranteed to be in the pockets of multiple such scrapers. If you’re talking about private servers with a handful or a few dozen members… yeah, sure…
deleted by creator
No, they were only in talks. MS does not currently own Discord.
I could have sworn that went through, thank you for correcting me. I’ll delete the comment so no one else gets misled.
The only issue with this is that the scrapers have no issue scraping the black hole, so it’s really only punishing the humans who want to find the info.
They’re invite-only places. Though Discord does have public servers too now, I think, generally you can’t just access them through the open web.
If it’s truly invite only and not on the open web, then it would be unscrapeable regardless?
Microsoft owns the Discord platform, has an active relationship with OpenAI, and can scrape and bundle the data without using a web scraper. If you want to escape AI scraping, keep it invite only, but maybe reconsider Discord?
Edit: not true, Discord is separate from MS, sorry.
Discord was just an example, I’m not attached to it. It could be Signal or WhatsApp or Matrix or whatever.
As long as it has a central authority and is not e2e encrypted, the data will be up for sale if there’s enough of it.
So the result is that you’re making data exclusive to those who have the money rather than a level playing field.
deleted by creator
You people really have no idea why AI is dangerous
And no, it has nothing to do with imitating your shittastic prose style
There hasn’t been a single post on lemmy that I’ve seen about AI’s ability to shape public discussion and frankly that is SIGNIFICANTLY more dangerous than using your shitty 11th grade love poem for content training
I’d like to subscribe to your newsletter.
I don’t know if this is a joke, but I do have a blog that I’ve been very demotivated to post on for similar reasons
Not a joke. It was a compliment. I like the way you think.