Discord is a black hole for information

Traditional reasoning says you should prefer open forums like Lemmy that are available and searchable on the open web. After all, you’re posting to help people, and that helps the most people. The platform (like Reddit) may profit off of it, but that’s fine: they’re providing the platform for you to post on. Fair deal.

Plus, people coming for high-quality information gives back to the community and the topic. You attract other high-quality contributors, more people use or take part in the topic you are discussing, the platform often improves with the revenue, etc. It’s not perfect, but it worked.

AI scrapers break all that. The company profiting is the AI company, and they give nothing back. The model just holds all the information in its weights. It doesn’t drive people to the source. Even the platform doesn’t benefit from bot scraping. The addition of high-quality data may improve the model on that topic and thus push people to engage in said topic more, but not much, because of how AIs are trained: while you need some high-quality data, what matters a lot more, especially for lesser-known topics, is the amount of data.

So as more of the world moves to AI models, I don’t really feel like posting on public forums as much, helping the AI companies get richer, even if I do benefit from AI myself.

  • CameronDev@programming.dev
    9 upvotes · 3 days ago

    The only issue with this is that the scrapers have no trouble scraping the black hole, so it’s really only punishing the humans who want to find the info.

    • morrowind@lemmy.mlOP
      1 upvote · 3 days ago

      They’re invite-only places. Though Discord does have public servers too now, I think, generally you can’t just access them through the open web.

      • CameronDev@programming.dev
        1 upvote · edited · 3 days ago

        If it’s truly invite only and not on the open web, then it would be unscrapeable regardless?

        Microsoft owns the Discord platform, has an active relationship with OpenAI, and can scrape and bundle the data without using a web scraper. If you want to escape AI scraping, keep it invite only, but maybe reconsider Discord? Edit: not true, Discord is separate from MS, sorry.

        • morrowind@lemmy.mlOP
          2 upvotes · 3 days ago

          Discord was just an example; I’m not attached to it. It could be Signal or WhatsApp or Matrix or whatever.

          • Benaaasaaas@group.lt
            2 upvotes · edited · 2 days ago

            As long as it has a central authority and is not e2e encrypted, the data will be up for sale if there’s enough of it.

            So the result is that you’re making data exclusive to those who have the money rather than a level playing field.

  • Angry_Autist (he/him)@lemmy.world
    3 upvotes, 1 downvote · 2 days ago

    You people really have no idea why AI is dangerous

    And no, it has nothing to do with imitating your shittastic prose style

    There hasn’t been a single post on Lemmy that I’ve seen about AI’s ability to shape public discussion, and frankly that is SIGNIFICANTLY more dangerous than using your shitty 11th grade love poem for content training