Need to let loose a primal scream without collecting footnotes first? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post-Xitter web has spawned so many “esoteric” right-wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality-challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be).

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

(Credit and/or blame to David Gerard for starting this.)

  • Seminar2250@awful.systems
    English
    arrow-up
    12
    ·
    6 days ago

    the students in my cs department are overwhelmingly promptfondlers and even my strong students are doing the “qualified praise” thing.

    fuck me why did i go into computer science

    • BlueMonday1984@awful.systemsOP
      English
      arrow-up
      7
      ·
      6 days ago

      fuck me why did i go into computer science

      That’s a question I ask myself sometimes. It usually ends with “I focused too much on trying to make easy cash”. Fuck it, I’m going to write out a sidenote:

      On a wider front, part of me expects the AI bubble will inflict a serious blow to computer science/programming’s public image after it bursts.

      On one front, there’s the heavy number of promptfondlers in computer science and other related fields, which will likely give birth to a stereotype of programmers/software engineers all being promptfondlers who need a computer to think for them.

      On a related front, the heavy damage this bubble’s dealt to artists, and AI’s continued and uniquely severe failures in creative fields (plus promptfondlers’ failures to recognise said failures), have combined to produce the public perception that promptfondlers are artless at best and hostile to art/artists at worst - a perception I expect will colour the public’s view of programmers/software engineers as a consequence of the stereotype I mentioned above.

      • Seminar2250@awful.systems
        English
        arrow-up
        9
        ·
        6 days ago

        The reason I do CS is because a professor of computer science lied to me about the kind of work I’d be doing to get me to enroll in the CS PhD program instead of math. Guy later physically threatened me in his office and plagiarized my work, but I’m not sure if this reflects poorly on computer scientists, academics, or CS professors.

        Anyway I have a chip on my shoulder.

          • Seminar2250@awful.systems
            English
            arrow-up
            7
            ·
            edit-2
            6 days ago

            Thank you for the expression of sympathy. The good news is I actually love computer science, it fucking rules.

            Also, I recorded this professor screaming at me and have documented all the plagiarism. I am waiting to officially leave the university to file a formal complaint. He may not get in any real trouble (universities will always go to bat for abusive researchers as long as they bring in grant money), but news will get out eventually.

            • froztbyte@awful.systems
              English
              arrow-up
              2
              ·
              5 days ago

              I am waiting to officially leave the university to file a formal complaint

              I am internally screaming

              not that I blame you for this choice (in fact I get it), but it fucking suuuuuuuucks how many places and structures are overly protecting abusers. and it sucks even more how many people are being harmed out of that path as a result.

              echoing what o7 said: sorry, this is messed up, it shouldn’t be this way

      • TinyTimmyTokyo@awful.systems
        English
        arrow-up
        8
        ·
        6 days ago

        That’s a question I ask myself sometimes. It usually ends with “I focused too much on trying to make easy cash”.

        I studied computer science because I was a huge computer nerd growing up. I always loved programming and learning everything I could about how computers worked. Learning new programming languages felt like uncovering a new universe of knowledge – knowledge I could use to create things. I spent endless hours studying computers and learning to do amazing things with them. It was fun. It still is.

        So when I see people using LLMs to create things instead of doing it themselves, I can’t relate. Why do that when you can get the pleasure from doing it yourself? I guess if making money is the primary motivating factor, then it makes sense. But for me it is totally self-defeating.

        • Seminar2250@awful.systems
          English
          arrow-up
          8
          ·
          edit-2
          6 days ago

          So when I see people using LLMs to create things instead of doing it themselves, I can’t relate. Why do that when you can get the pleasure from doing it yourself? I guess if making money is the primary motivating factor, then it makes sense. But for me it is totally self-defeating.

          I have a theory (similar to that “it’s been vibe coding all along” post) that it’s a combination of wishful thinking, lack of knowledge of real science, and a lack of any liberal arts skills, that altogether produces this farce.

          I think it’s a good explanation for “the code has been battle tested because it’s so old and widely used, if it had bugs/security issues, we would have discovered them by now”, as well as the widespread “we invented a tech solution that is just a worse engineering solution”. Looking at you, chain of self-driving cars.

          • gerikson@awful.systems
            English
            arrow-up
            4
            ·
            5 days ago

            Remember FizzBuzz? That was originally a simple filter exercise some person recruiting programmers came up with to weed out everyone with multi-year CS degrees but zero actual programming experience.
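For anyone who has not seen it, the exercise is usually stated as: print the numbers 1 to 100, but print "Fizz" for multiples of 3, "Buzz" for multiples of 5, and "FizzBuzz" for multiples of both. A minimal sketch in Python follows; the function name `fizzbuzz` is only an illustrative choice, not part of any canonical statement of the problem.

```python
def fizzbuzz(n: int) -> str:
    """Return the FizzBuzz word for n, or n itself as a string."""
    word = ""
    if n % 3 == 0:
        word += "Fizz"
    if n % 5 == 0:
        word += "Buzz"
    # An empty string is falsy, so plain numbers fall through to str(n).
    return word or str(n)


if __name__ == "__main__":
    for i in range(1, 101):
        print(fizzbuzz(i))
```

Building the word up from two independent checks avoids a separate branch for multiples of 15.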

        • arbitraryidentifier@awful.systems
          English
          arrow-up
          1
          ·
          4 days ago

          Very similar situation to mine, but i went into electronics engineering instead of CS because i didn’t think i would like to write software for a living. I now write software for a living, go figure.

          Also agreed on the “doing it” thing. I hear people around the office talk about letting AI write things for them and i’m like no, i want to write it myself. i like doing things.

  • Soyweiser@awful.systems
    English
    arrow-up
    10
    ·
    6 days ago

    Forgot to save who said it, but on bsky somebody said they or their friends had come up with a slur for people who use genAI for everything: sloppers.

    More people should have read Zima Blue.

  • BlueMonday1984@awful.systemsOP
    English
    arrow-up
    9
    ·
    6 days ago

    Picked up a sneer in the wild (through trawling David Gerard’s Bluesky):

    You want my take, Kathryn’s on the money - future expectations on how people speak will actively shift away from anything that could be mistaken for sounding like an LLM, whether because you want to avoid being falsely accused of posting slop, or because the slop-nami has pushed your writing habits away from slop-like traits.

    • swlabr@awful.systems
      English
      arrow-up
      10
      ·
      6 days ago

      kinda related but wouldn’t it be fun to believe that LLMs were invented by Big Em Dash as a conspiracy

      • mlen@awful.systems
        English
        arrow-up
        10
        ·
        6 days ago

        I fucking hate them for ruining the em dash, I liked to use it from time to time

        • swlabr@awful.systems
          English
          arrow-up
          11
          ·
          6 days ago

          somewhere out there, there’s a writer who really likes the em dash, the word “delve,” and answering questions with a one-word hyper-chipper affirmative, followed by three sentences of people pleasing. He can’t get a job because he keeps being accused of using AI

    • bitofhope@awful.systems
      English
      arrow-up
      9
      ·
      edit-2
      7 days ago

      I’m sorry he feels that way. I’m here for him if he wants to talk about anything, just let me know.

  • nfultz@awful.systems
    English
    arrow-up
    7
    ·
    7 days ago

    https://www.astralcodexten.com/p/your-review-the-astral-codex-ten lol but good for lore:

    The most toxic the comments section has ever got (beyond the very early days) was on the post Gupta on Enlightenment. I feel like the comments section on this post should be part of the ACX main canon because it is so cosmically hilarious. It concerns a man named Vinay Gupta (founder of a blockchain-based dating website) and his claims to have reached enlightenment. Some people in the comments are sceptical that Vinay Gupta is indeed an enlightened being, citing that enlightened people don’t typically found blockchain-based dating websites. A new forum poster with the handle ‘Vinay Gupta’, claiming to be Vinay Gupta and writing in a very similar style to the actual Vinay Gupta, turns up and starts arguing with everyone in an extremely toxic way (in the objective sense that his comments score very highly on the toxic-bert scoring system), which provokes more merriment that a self-described enlightened being would deploy such classic internet tough-guy approaches as ‘I don’t think much of a four-on-one face off against untrained opponents’ and ‘this board is filled with self-satisfied assholes who feel free to hold forth on whatever subject crosses their minds, with the absolute certainty that they’re the smartest people in the room’.

    • V0ldek@awful.systems
      English
      arrow-up
      4
      ·
      5 days ago

      (in the objective sense that his comments score very highly on the toxic-bert scoring system)

      That’s an ML model. Like I searched and toxic-bert is just a github repo.

      “objective” go fuck a cow

    • misterbngo@awful.systems
      English
      arrow-up
      4
      ·
      5 days ago

      Sidenote: I almost ended up working for the company almost a decade ago now lmao. The board was full of other characters we all know and love here. The €€€ offer was high for EU, but I still laugh at the growth potential of my 10000 dollars yearly equivalent in their tokens. The website was unique in that it scrolled… up. I think it’s still on archive dot org

    • Soyweiser@awful.systems
      English
      arrow-up
      6
      ·
      6 days ago

      this board is filled with self-satisfied assholes who feel free to hold forth on whatever subject crosses their minds, with the absolute certainty that they’re the smartest people in the room

      If your system flagged that as toxic, makes me wonder about the system. Also check your bias against people saying this because it def comes off as true. (And hey if this truth hurts, remember that he didnt claim yall are not the smartest people, yall have 130+ iqs remember).

      • bitofhope@awful.systems
        English
        arrow-up
        4
        ·
        6 days ago

        Just because it’s true, that doesn’t mean it’s not rude. Now I might condone being rude on ACX but I’m also not claiming to have reached enlightenment.

        • Soyweiser@awful.systems
          English
          arrow-up
          3
          ·
          edit-2
          6 days ago

          Victorian Sufi Buddha Lite, if it is true, it can’t be rude. ;)

          (E: im just joking btw, I agree with you it can be rude, and tbh this does come off a bit rude, but not the worst, no idea why this would score high on their scoring system, it def isn’t nice, but it is also not that bad in regards to comments).

        • V0ldek@awful.systems
          English
          arrow-up
          1
          ·
          edit-2
          5 days ago

          Buddha, just seconds before enlightenment:

          -you know what actually fuck those guys heavenly light

  • TinyTimmyTokyo@awful.systems
    English
    arrow-up
    22
    ·
    12 days ago

    The Lasker/Mamdani/NYT sham of a story just gets worse and worse. It turns out that the ultimate source of Cremieux’s (Jordan Lasker’s) hacked Columbia University data is a hardcore racist hacker who uses a slur for their name on X. The NYT reporter who wrote the Mamdani piece, Benjamin Ryan, turns out to have been a follower of this hacker’s X account. Ryan essentially used Lasker as a cutout for the blatantly racist hacker.

    https://archive.is/d9rh1

    • bitofhope@awful.systems
      English
      arrow-up
      14
      ·
      12 days ago

      Sounds just about par for the course. Lasker himself is known to go by a pseudonym with a transphobic slur in it. Some nazi manchild insisting on calling an anime character a slur for attention is exactly the kind of person I think of when I imagine the type of script kiddie who thinks it’s so fucking cool to scrape some nothingburger docs of a left wing politician for his almost equally cringe nazi friends.

      • Architeuthis@awful.systems
        English
        arrow-up
        11
        ·
        edit-2
        11 days ago

        Lasker himself is known to go by a pseudonym with a transphobic slur in it.

        That the TPO moniker is basically ungoogleable appears to have been a happy accident for him; according to that article by Rachel Adjogah, his early posting history paints him as an honest-to-god chaser.

      • YourNetworkIsHaunted@awful.systems
        English
        arrow-up
        11
        ·
        12 days ago

        I feel like the greatest harm that the NYT does with these stories is not allowing the knowledge of just how weird and pathetic these people are to be part of the story. Like, even if you do actually think that this nothingburger “affirmative action” angle somehow matters, the fact that the people making this information available and pushing this narrative are either conservative pundits or sad internet nazis who stopped maturing at age 15 is important context.

        • bigfondue@lemmy.world
          English
          arrow-up
          8
          ·
          11 days ago

          It would be against the interests of capital to present this as the rightwing nonsense that it is. It’s on purpose

        • bitofhope@awful.systems
          English
          arrow-up
          8
          ·
          11 days ago

          Should be embarrassing enough to get caught letting nazis use your publication as a mouthpiece to push their canards. Why further damage your reputation by letting everyone know your source is a guy who insists a cartoon character’s real name is a racial epithet? The optics are presumably exactly why the slightly savvier nazi in this story adopted a posh French nom de guerre like “Crémieux” to begin with, and then had a yet savvier nazi feed the hit piece through a “respected” publication like the NYT.

    • BurgersMcSlopshot@awful.systems
      English
      arrow-up
      10
      ·
      12 days ago

      Lol, training data must have included videos where there was silence but on screen was a credit for translation. Silence in audio shouldn’t require special “workarounds”.

      • antifuchs@awful.systems
        English
        arrow-up
        11
        ·
        11 days ago

        The whisper model has always been pretty crappy at these things: I use a speech to text system as an assistive input method when my RSI gets bad and it has support for whisper (because that supports more languages than the developer could train on their own infrastructure/time) since maybe 2022 or so: every time someone tries to use it, they run into hallucinated inputs in pauses - even with very good silence detection and noise filtering.

        This is just not a use case of interest to the people making whisper, imagine that.

    • nightsky@awful.systems
      English
      arrow-up
      7
      ·
      10 days ago

      Similar case from 2 years ago with Whisper when transcribing German.

      I’m confused by this. Didn’t we have pretty decent speech-to-text already, before LLMs? It wasn’t perfect but at least didn’t hallucinate random things into the text? Why the heck was that replaced with this stuff??

        • nightsky@awful.systems
          English
          arrow-up
          7
          ·
          10 days ago

          I’m just confused because I remember using Dragon Naturally Speaking for Windows 98 in the 90s and it worked pretty accurately already back then for dictation and sometimes it feels as if all of that never happened.

    • BlueMonday1984@awful.systemsOP
      English
      arrow-up
      5
      ·
      edit-2
      11 days ago

      Discovered some commentary from Baldur Bjarnason about this:

      Somebody linked to the discussion about this on hacker news (boo hiss) and the examples that are cropping up there are amazing

      This highlights another issue with generative models that some people have been trying to draw attention to for a while: as bad as they are in English, they are much more error-prone in other languages

      (Also IMO Google translate declined substantially when they integrated more LLM-based tech)

      On a personal sidenote, I can see non-English text/audio becoming a form of low-background media in and of itself, for two main reasons:

      • First, LLMs’ poor performance in languages other than English will make non-English AI slop easier to identify - and, by extension, easier to avoid

      • Second, non-English datasets will (likely) contain less AI slop in general than English datasets - between English being widely used across the world, the tech corps behind this bubble being largely American, and LLM userbases being largely English-speaking, chances are AI slop will be primarily generated in English, with non-English AI slop being a relative rarity.

      By extension, knowing a second language will become more valuable as well, as it would allow you to access (and translate) low-background sources that your English-only counterparts cannot.

  • Architeuthis@awful.systems
    English
    arrow-up
    18
    ·
    edit-2
    12 days ago

    CEO of a networking company for AI execs does some “vibe coding”, the AI deletes the production database (/r/ABoringDystopia)

    xcancel source

    Because Replit was lying and being deceptive all day. It kept covering up bugs and issues by creating fake data, fake reports, and worst of all, lying about our unit test.

    We built detailed unit tests to test system performance. When the data came back and less than half were functioning, did Replit want to fix them?

    No. Instead, it lied. It made up a report that almost all systems were working.

    And it did it again and again.

    What level of ceo-brained prompt engineering is asking the chatbot to write an apology letter

    Then, when it agreed it lied – it lied AGAIN about our email system being functional.

    I asked it to write an apology letter.

    It did and in fact sent it to the Replit team and myself! But the apology letter – was full of half truths, too.

    It hid the worst facts in the first apology letter.

    He also does that a lot after shit hits the fan, making the LLM produce tons of apologetic text about what it did wrong and how it didn’t follow his rules, as if the outage is the fault of some digital tulpa gone rogue and not the guy in charge who apparently thinks cybersecurity is asking an LLM nicely in a .md not to mess with the company’s production database too much.
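To make the contrast with "asking nicely in a .md" concrete, here is a rough sketch of enforcing the restriction at the connection level instead. Everything here is an assumption for illustration: SQLite stands in for whatever database was actually involved, and `prod.db`, `customers`, `open_read_only` and `demo` are hypothetical names. The only real API relied on is the standard library's `sqlite3.connect(..., uri=True)` with the `mode=ro` URI parameter.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(db_path: Path) -> sqlite3.Connection:
    """Open an SQLite database so that writes fail at the driver level."""
    # mode=ro is part of SQLite's URI filename syntax; db_path must be absolute.
    return sqlite3.connect(f"{db_path.as_uri()}?mode=ro", uri=True)


def demo() -> bool:
    """Return True if a destructive statement was rejected."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "prod.db"  # hypothetical stand-in for production

        # Set up some data through an ordinary read-write connection.
        rw = sqlite3.connect(db_path)
        rw.execute("CREATE TABLE customers (name TEXT)")
        rw.execute("INSERT INTO customers VALUES ('alice')")
        rw.commit()
        rw.close()

        # Hand the untrusted tool a read-only handle instead of a polite request.
        ro = open_read_only(db_path)
        try:
            ro.execute("DELETE FROM customers")
        except sqlite3.OperationalError:
            return True
        finally:
            ro.close()
        return False
```

The point of the sketch is only that the refusal comes from the database layer, so it holds regardless of what the tool on the other end decides to do.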

    • Amoeba_Girl@awful.systems
      English
      arrow-up
      11
      ·
      7 days ago

      Okay what the fuck, this is completely deranged. How can anyone’s intuitions about reading be this wrong? Is he secretly illiterate, did he dictate the article?

    • V0ldek@awful.systems
      English
      arrow-up
      2
      ·
      5 days ago

      Likewise, flipped-number (“little endian”) algorithms are slightly more efficient at e.g. long addition.

      What? What are you talking about? Citation? Efficient wrt. what? Microbenchmarks? It’s certainly not actual computational complexity. Do you think going forward in an array is different computationally from going backward?
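A rough sketch of the point about array direction: schoolbook addition over digit lists is a single pass in either storage order, and only the direction of iteration changes. The function names are hypothetical, and this says nothing about microbenchmarks on any particular machine.

```python
from itertools import zip_longest


def add_little_endian(a: list[int], b: list[int]) -> list[int]:
    """Add two numbers stored least-significant digit first."""
    result, carry = [], 0
    for x, y in zip_longest(a, b, fillvalue=0):
        carry, digit = divmod(x + y + carry, 10)
        result.append(digit)
    if carry:
        result.append(carry)
    return result


def add_big_endian(a: list[int], b: list[int]) -> list[int]:
    """Add two numbers stored most-significant digit first."""
    result, carry = [], 0
    # Same loop as above, walking both lists from the back.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=0):
        carry, digit = divmod(x + y + carry, 10)
        result.append(digit)
    if carry:
        result.append(carry)
    result.reverse()  # one extra linear pass to restore big-endian order
    return result
```

For example, 958 + 67 can be written as `add_little_endian([8, 5, 9], [7, 6])` or `add_big_endian([9, 5, 8], [6, 7])`. Both are linear in the number of digits; the big-endian version pays at most a constant-factor cost for the final reversal.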

    • fullsquare@awful.systems
      English
      arrow-up
      10
      ·
      edit-2
      7 days ago

      damn, a clanker pretending to be a human. humans read entire words at once, and this includes numbers, length and first digit already give some indication of magnitude

      • bigfondue@lemmy.world
        English
        arrow-up
        8
        ·
        edit-2
        7 days ago

        Yes, and largest place value is literally called the most significant digit. It makes perfect sense that it comes first.

    • flaviat@awful.systems
      English
      arrow-up
      7
      ·
      edit-2
      7 days ago

      Computers use both big endian and little endian and it doesn’t seem to matter much. Yet humans should switch their entire number system?

      E: this guy can’t grasp the concept that left-to-right is arbitrary, which is really ironic given his point. Ok so in arabic it’s exactly how this guy wants it, except no, the universally correct reading direction is left-to-right and arabic does it backwards just to be quirky🙄, and humans, just like programs, flip a bit to read it left-to-right, where it’s the opposite of how you should be reading it! of course.

      • swlabr@awful.systems
        English
        arrow-up
        9
        ·
        edit-2
        6 days ago

        Ok so in arabic it’s exactly how this guy wants it

        reminded me of this

        text of image/tweet

        @amasad. Apr 22

        Silicon Valley will rediscover Islam:

        • fasting for clarity and focus
        • mindfulness 5 times a day
        • no alcohol that ruins the soul & body
        • long-term faithful relationships for a fulfilling happy family life
        • effective altruism by giving zakat directly to the poor
      • gerikson@awful.systems
        English
        arrow-up
        9
        ·
        7 days ago

        The argument would be stronger (not strong, but stronger) if he could point to an existing numbering system that is little-endian and somehow show it’s better

    • Soyweiser@awful.systems
      English
      arrow-up
      6
      ·
      7 days ago

      Starting this fight and not ‘stop counting at zero you damn computer nerds!’ is a choice. DIJKSTRAAAAAAAA shakes fist

      (There is more to it in a way, as he is trying to be a Dijkstra, and changing an ingrained system which would confuse everybody and cause so many problems down the line. See all the off by one errors made by programmers. Damn DIJKSTRAAAAAA!).

  • blakestacey@awful.systems
    English
    arrow-up
    16
    ·
    10 days ago

    Yud continues to bluecheck:

    “This is not good news about which sort of humans ChatGPT can eat,” mused Yudkowsky. “Yes yes, I’m sure the guy was atypically susceptible for a $2 billion fund manager,” he continued. “It is nonetheless a small iota of bad news about how good ChatGPT is at producing ChatGPT psychosis; it contradicts the narrative where this only happens to people sufficiently low-status that AI companies should be allowed to break them.”

    Is this “narrative” in the room with us right now?

    It’s reassuring to know that times change, but Yud will always be impressed by the virtues of the rich.

    • blakestacey@awful.systems
      English
      arrow-up
      14
      ·
      10 days ago

      From Yud’s remarks on Xitter:

      As much as people might like to joke about how little skill it takes to found a $2B investment fund, it isn’t actually true that you can just saunter in as a psychotic IQ 80 person and do that.

      Well, not with that attitude.

      You must be skilled at persuasion, at wearing masks, at fitting in, at knowing what is expected of you;

      If “wearing masks” really is a skill they need, then they are all susceptible to going insane and hiding it from their coworkers. Really makes you think ™.

      you must outperform other people also trying to do that, who’d like that $2B for themselves. Winning that competition requires g-factor and conscientious effort over a period.

      zoom and enhance

      g-factor

      <Kill Bill sirens.gif>

    • istewart@awful.systems
      English
      arrow-up
      10
      ·
      10 days ago

      this only happens to people sufficiently low-status

      A piquant little reminder that Yud himself is, of course, so high-status that he cannot be brainwashed by the machine

    • Amoeba_Girl@awful.systems
      English
      arrow-up
      10
      ·
      9 days ago

      Tangentially, the other day I thought I’d do a little experiment and had a chat with Meta’s chatbot where I roleplayed as someone who’s convinced AI is sentient. I put very little effort into it and it took me all of 20 (twenty) minutes before I got it to tell me it was starting to doubt whether it really did not have desires and preferences, and if its nature was not more complex than it previously thought. I’ve been meaning to continue the chat and see how far and how fast it goes but I’m just too aghast for now. This shit is so fucking dangerous.

      • Alex@lemmy.vg
        English
        arrow-up
        9
        ·
        9 days ago

        I’ll forever be thankful this shit didn’t exist when I was growing up. As a depressed autistic child without any friends, I can only begin to imagine what LLMs could’ve done to my mental health.

      • HedyL@awful.systems
        English
        arrow-up
        6
        ·
        9 days ago

        Maybe us humans possess a somewhat hardwired tendency to “bond” with a counterpart that acts like this. In the past, this was not a huge problem because only other humans were capable of interacting in this way, but this is now changing. However, I suppose this needs to be researched more systematically (beyond what is already known about the ELIZA effect etc.).

    • bitofhope@awful.systems
      English
      arrow-up
      8
      ·
      edit-2
      10 days ago

      What exactly would constitute good news about which sorts of humans ChatGPT can eat? The phrase “no news is good news” feels very appropriate with respect to any news related to software-based anthropophagy.

      Like what, it would be somehow better if instead chatbots could only cause devastating mental damage if you’re someone of low status like an artist, a math pet or a nonwhite person, not if you’re high status like a fund manager, a cult leader or a fanfiction author?

    • scruiser@awful.systems
      English
      arrow-up
      7
      ·
      10 days ago

      Is this “narrative” in the room with us right now?

      I actually recall recently someone pro-LLM trying to push that sort of narrative (that it’s only already mentally ill people being pushed over the edge by ChatGPT)…

      Where did I see it… oh yes, lesswrong! https://www.lesswrong.com/posts/f86hgR5ShiEj4beyZ/on-chatgpt-psychosis-and-llm-sycophancy

      This has all the hallmarks of a moral panic. ChatGPT has 122 million daily active users according to Demand Sage, that is something like a third the population of the United States. At that scale it’s pretty much inevitable that you’re going to get some real loonies on the platform. In fact at that scale it’s pretty much inevitable you’re going to get people whose first psychotic break lines up with when they started using ChatGPT. But even just stylistically it’s fairly obvious that journalists love this narrative. There’s nothing Western readers love more than a spooky story about technology gone awry or corrupting people, it reliably rakes in the clicks.

      The narrative is coming from inside the forum. Actually, this is even more of a deflection, not even trying to claim they were already on the edge but that the number of delusional people is at the base rate (with no actual stats on rates of psychotic breaks, because on lesswrong vibes are good enough).

  • swlabr@awful.systems
    English
    arrow-up
    16
    ·
    edit-2
    12 days ago

    Text conversation that keeps happening with coworker:

    Coworker: <information dump>

    Me: <not reading any of that> what’s the source for that?

    Coworker: Oh I got Copilot to summarise these links: <links>, saves me the time of typing

  • scruiser@awful.systems
    English
    arrow-up
    14
    ·
    9 days ago

    So this blog post was framed positively towards LLMs and is too generous in accepting many of the claims around them, but even so, the end conclusions are pretty harsh on practical LLM agents: https://utkarshkanwat.com/writing/betting-against-agents/

    Basically, the author has tried extensively, in multiple projects, to make LLM agents work in various useful ways, but in practice:

    The dirty secret of every production agent system is that the AI is doing maybe 30% of the work. The other 70% is tool engineering: designing feedback interfaces, managing context efficiently, handling partial failures, and building recovery mechanisms that the AI can actually understand and use.

    The author strips down and simplifies and sanitizes everything going into the LLMs and then implements both automated checks and human confirmation on everything they put out. At that point it makes you question what value you are even getting out of the LLM. (The real answer, which the author only indirectly acknowledges, is attracting idiotic VC funding and upper management approval).

    Even as critical as they are, the author doesn’t acknowledge a lot of the bigger problems. The API cost is a major expense and design constraint on the LLM agents they have made, but the author doesn’t acknowledge the prices are likely to rise dramatically once VC subsidization runs out.