Bluesky may have said it won’t use user data to train generative AI, but someone else just published a dataset of one million Bluesky posts for “machine learning research”. The dataset is already very popular, and your data may have been scraped.
- The same can and will happen with the Fediverse, right?
  - It has probably already happened.
- (deleted by creator)
  - I see. Then mastodon.social probably gets scraped 🫣
- Is that a problem for a proper scraper? Give the machine a list of domains and some hints about the relevant protocols, and then the computer runs until the end of the list. 
 
 
- tbh this can happen with everything now so…
  - i’m not sure what would be the solution, sadly.



