• 9 Posts
  • 301 Comments
Joined 2 years ago
Cake day: June 23rd, 2023

  • Many people believe that the ToS was added to make Mozilla legally able to train AIs on the collected data.

    “Don’t attribute to malice what is easily explained by incompetence”

    So yea, Mozilla wrote some terms that were ambiguous and could be interpreted in different ways, and ‘many people believed’ that they did this intentionally, reading the worst possible intentions into the new ToS

    Then Mozilla rewrote that ToS after seeing how people were interpreting the original ToS:
    https://www.theverge.com/news/622080/mozilla-revising-firefox-terms-of-use-data

    And yea, now ‘many people will believe’ that ‘Mozilla revised their decision to do this after the backlash’ - OR, it was never their intention and they just phrased it better after the confusion

    People just want to get their pitchforks out and start drama at any possible opportunity without evidence of wrongdoing… Mozilla added stupid stuff to the ToS, ok yea fair enough - but if they actually did “steal user data” - this would be very easily detectable with Wireshark or something





  • It probably depends on the level of the criminals and organized crime groups. I saw a YouTube video a couple of weeks ago about the history of how organized crime groups have used encrypted communication https://www.youtube.com/watch?v=gigIOc_0PKo (And how they were honey-potted by the FBI into using an FBI-hosted service, lol)

    Organized crime groups that make 100s of millions should be capable enough to hire skilled developers and sysops to host self-managed services. At some point, if they make enough money, investing in self-managed communication becomes preferable to using Telegram or Signal.




  • Also some feedback, a bit more technical, since I was trying to see how it works, more of a suggestion I suppose

    It looks like you’re looping through the documents and asking the model to pick from the known tags for each one, right? ({str(db.current_library.tags)})

    I don’t know if I would do this through a chat completion and a chat response. There are dedicated tools for keyword-like matching, like embeddings. That’s a lot faster, and also probably way cheaper, since you’re paying barely anything for embeddings compared to chat tokens

    So the common way to do something like this in AI would be to use Vectors and embeddings: https://platform.openai.com/docs/guides/embeddings

    So - you’d ask for an embedding (a vector) for each of your tags first. Then you ask for an embedding of each document.

    Then you can do a Nearest Neighbor Search for the tags, and see how closely they match
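
    A rough sketch of that last step, assuming you already have the vectors. The 3-dimensional vectors and the tag names below are made up for illustration (real embeddings come back from the embeddings API and have far more dimensions), and nearest_tags is a hypothetical helper, not something from your code:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_tags(doc_vec, tag_vecs, top_k=2):
    """Brute-force nearest neighbor search: score every tag, keep the best top_k."""
    scored = [(tag, cosine_similarity(doc_vec, vec)) for tag, vec in tag_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up stand-ins for real embeddings (assumption: computed once per tag and cached).
tag_vecs = {
    "finance": [0.9, 0.1, 0.0],
    "cooking": [0.0, 0.9, 0.1],
    "travel": [0.1, 0.2, 0.9],
}

# Made-up stand-in for the embedding of one document.
doc_vec = [0.8, 0.2, 0.1]

best = nearest_tags(doc_vec, tag_vecs)
for tag, score in best:
    print(f"{tag}: {score:.3f}")
```

    With a few hundred tags a brute-force loop like this is probably fine; a vector index only starts to matter when the number of vectors gets large.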




  • I haven’t used json(b) in a Spring app, so I can’t say much about that.

    Json vs Jsonb depends on the use case. Inserting Json is faster than inserting Jsonb, because Json is stored as plain text while Jsonb gets converted to a binary format on write. For reads that search on specific json properties, Jsonb is faster, because it is already parsed into a more optimized tree.

    From my experience, I don’t really like doing selects based on json properties. If I know I’ll be selecting on a certain property, I usually add an additional column next to the json and insert that property there as well. (At least in C#/.NET with EF.) The frameworks don’t have that much support for selecting within json (you can do it, but proper columns are a lot more natively supported)
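
    To show what I mean by the extra column, here is a minimal sketch. It uses Python with an in-memory SQLite database instead of Postgres and EF, purely so it runs standalone; the events table, the customer_id property, and insert_event are all made up for the example:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# The full document goes in "payload"; the one property we filter on
# is duplicated into a regular, indexable column next to it.
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY,"
    " customer_id TEXT NOT NULL,"
    " payload TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_events_customer ON events (customer_id)")


def insert_event(conn, payload):
    """Store the JSON document and copy the searchable property into its own column."""
    conn.execute(
        "INSERT INTO events (customer_id, payload) VALUES (?, ?)",
        (payload["customer_id"], json.dumps(payload)),
    )


insert_event(conn, {"customer_id": "c-1", "total": 10})
insert_event(conn, {"customer_id": "c-2", "total": 25})
insert_event(conn, {"customer_id": "c-1", "total": 5})

# The select only touches the plain column; the JSON is parsed after fetching.
rows = conn.execute(
    "SELECT payload FROM events WHERE customer_id = ? ORDER BY id", ("c-1",)
).fetchall()
totals = [json.loads(row[0])["total"] for row in rows]
print(totals)
```

    The downside is that the property lives in two places, so the insert/update path has to keep the column and the json in sync.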



  • I’m not entirely sure what you hope to achieve: have a GPG-encrypted subject, and have Thunderbird automatically understand that it’s encrypted, so it can be automatically decrypted?

    Since you’re saying you’re building software to support this, what are you building? A Thunderbird plugin that can do this? Or just standalone software that you want to make compatible with Thunderbird’s default way of handling encryption?