I think it’ll make a comeback eventually. LLMs will get progressively less useful as a replacement as their training data goes stale. Without refreshed data, they’ll become more and more irrelevant as the years go on. Where will they get data about new programming languages or solutions to problems in new software? LLM knowledge will be stuck in 2025 unless the models are given new training material.
Your analogy simply does not hold here. If you’re having an AI train itself to play chess, then you have adversarial reinforcement learning. The AI plays itself (or another model), and reward metrics tell it how well it’s doing. Chess has the following:
A very limited set of clearly defined, rigid rules.
One single end objective: put the other king in checkmate before yours is or, if you can’t, go for a draw.
Reasonable metrics for how you’re doing and an ability to reasonably predict how you’ll be doing later.
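To make the self-play point concrete, here is a minimal sketch on a toy game rather than chess. It assumes a Nim variant (10 stones, take 1 to 3 per turn, whoever takes the last stone wins) and a simple tabular learner; the game, the names and the settings are all illustrative assumptions, not how any real chess engine works. The thing to notice is that the only training signal is the win or loss that the rules hand out, so no human-made examples are needed.

```python
import random

PILE = 10      # starting stones (assumed toy value)
MAX_TAKE = 3   # a move removes 1..3 stones; taking the last stone wins


def legal_moves(stones):
    return range(1, min(MAX_TAKE, stones) + 1)


def train_self_play(episodes=20000, epsilon=0.5, seed=0):
    """Learn move values purely from games the agent plays against itself."""
    rng = random.Random(seed)
    # q[stones][move] = estimated outcome for the player about to move
    # (+1.0 means a forced win, -1.0 a forced loss).
    q = {s: {m: 0.0 for m in legal_moves(s)} for s in range(1, PILE + 1)}
    for _ in range(episodes):
        stones = PILE
        while stones > 0:
            moves = list(legal_moves(stones))
            if rng.random() < epsilon:
                move = rng.choice(moves)                          # explore
            else:
                move = max(moves, key=lambda m: q[stones][m])     # exploit
            remaining = stones - move
            if remaining == 0:
                target = 1.0   # the rules say: last stone taken, mover wins
            else:
                # Whatever is best for the opponent is equally bad for us.
                target = -max(q[remaining].values())
            q[stones][move] = target   # deterministic game, so overwrite
            stones = remaining
    return q


def best_move(q, stones):
    return max(q[stones], key=q[stones].get)


if __name__ == "__main__":
    table = train_self_play()
    for stones in (10, 7, 5):
        print(stones, "stones -> take", best_move(table, stones))
```

With these settings the table should end up preferring moves that leave the opponent a multiple of four stones, which is the known winning strategy for this variant. That works because the rules themselves score every finished game, which is exactly what the three properties above provide.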
Here’s where generative AI is different: when you’re doing adversarial training with a generative deep learning model, you want one model to be a generator and the other to be a classifier. The classifier should be given some amount of human-made material and some amount of generator-made material and try to tell which is which. The classifier’s goal is to be correct, and the generator’s goal is to reduce the classifier to guessing (i.e. it might as well flip a coin). As you train, you gradually get both to be very, very good at their jobs. But you have to have human-made material to train the classifier, and if the classifier doesn’t improve, then the generator never does either.
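The "coin flip" end state can be shown with a few lines of arithmetic. This is a sketch under simplifying assumptions: outputs are a handful of discrete outcomes, the classifier sees human-made and generated samples equally often, and the distributions below are made up for illustration. Under those assumptions the best possible classifier rates an outcome x as human-made with probability p_real(x) / (p_real(x) + p_gen(x)), which is the standard result for the original GAN setup.

```python
def optimal_classifier(p_real, p_gen):
    """P(sample is human-made) under the best possible classifier, per outcome.

    Assumes human-made and generated samples are shown equally often and that
    every listed outcome has nonzero probability under at least one source.
    """
    result = {}
    for x in sorted(set(p_real) | set(p_gen)):
        r = p_real.get(x, 0.0)
        g = p_gen.get(x, 0.0)
        result[x] = r / (r + g)
    return result


def best_accuracy(p_real, p_gen):
    """Accuracy of that classifier when it always picks the likelier source."""
    outcomes = set(p_real) | set(p_gen)
    return sum(0.5 * max(p_real.get(x, 0.0), p_gen.get(x, 0.0)) for x in outcomes)


# Hypothetical distributions over three possible outputs.
human = {"a": 0.5, "b": 0.3, "c": 0.2}
weak_generator = {"a": 0.9, "b": 0.1}   # never produces "c"
good_generator = dict(human)            # matches the human data exactly

if __name__ == "__main__":
    print(optimal_classifier(human, weak_generator), best_accuracy(human, weak_generator))
    print(optimal_classifier(human, good_generator), best_accuracy(human, good_generator))
```

Against the weak generator the classifier can still do better than chance; once the generator matches the human-made distribution, every outcome scores 0.5 and accuracy drops to 50%. Note that `human` appears in every formula: without it there is nothing for the classifier to compare against.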
Imagine teaching a 2nd grader the difference between a horse and a zebra when they’ve never seen either: you hold up pictures and ask whether each one shows a horse or a zebra. Except the entire time you only ever hold up pictures of zebras, and you expect the child to learn what a horse looks like. That’s what you’re describing for the classifier.
But going with your story: yes, you are right in general, but the human input is already there.
But you have to have human-made material to train the classifier, and if the classifier doesn’t improve, then the generator never does either.
AI can already understand what stripes are, and can draw the connection that a zebra is a horse with stripes. So the human input has already been given, and brute-force learning will do the rest, simply because time is irrelevant and computations run at a much faster rate.
Therefore I believe that in the future AI will enhance itself, because the input it has already received is sufficient to hone its skills.
I know that for now we are just talking about LLMs as black boxes that are repetitive in generating output (no creativity). But the 2nd grader also has many skills that are sufficient to enlarge their knowledge without everything having to be taught by a human.
I simply doubt this:
LLMs will get progressively less useful
Where will it get data about new programming languages or solutions to problems in new software?
On the other hand, you are right: AI will not understand abstractions of something beyond its realm. But this does not mean it won’t make rapid progress on things it can draw conclusions from.
And even in the case of new programming languages, I think a trained model will pick up the logic of the code, basically by using the pattern recognition skills it has already learned. And probably at a faster pace than a human can understand a new programming language.
Dude, I’m sorry, I just don’t know how else to tell you “you don’t know what you’re talking about”. I’d refer you to Chapter 20 of Goodfellow et al.'s 2016 book on Deep Learning, but 1) it tragically came out a year before transformer models, and 2) most of it will go over your head without a foundation from many previous chapters. What you’re describing – generative AI training on generative AI ad infinitum – is a death spiral. Literally the entire premise of adversarial training of generative AI is that for the classifier to get better, you need to keep funneling in real material alongside the fake material.
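A toy way to see the spiral, with heavy caveats: this is not a GAN and not an LLM, just a four-outcome probability table. It assumes that each new generation of the model slightly over-represents whatever the previous one already found most likely (modeled here by squaring the probabilities and renormalizing), which is my simplification for illustration. `real_fraction` is a hypothetical knob for how much real material gets mixed back in each round.

```python
import math


def entropy(p):
    """Shannon entropy in nats; higher means more varied output."""
    return -sum(x * math.log(x) for x in p if x > 0)


def sharpen(p):
    """Assumed generator bias: likely outcomes become even more likely."""
    squared = [x * x for x in p]
    total = sum(squared)
    return [x / total for x in squared]


def next_generation(model, real, real_fraction):
    """Train the next model on a mix of real data and the last model's output."""
    synthetic = sharpen(model)
    return [real_fraction * r + (1.0 - real_fraction) * s
            for r, s in zip(real, synthetic)]


def run(real, real_fraction, generations=10):
    model = list(real)
    for _ in range(generations):
        model = next_generation(model, real, real_fraction)
    return model


real_data = [0.4, 0.3, 0.2, 0.1]   # hypothetical distribution of real material

if __name__ == "__main__":
    closed_loop = run(real_data, real_fraction=0.0)   # only its own output
    refreshed = run(real_data, real_fraction=0.5)     # half real data each round
    print("closed loop:", closed_loop, entropy(closed_loop))
    print("refreshed:  ", refreshed, entropy(refreshed))
```

In the closed loop the table collapses onto a single outcome and the entropy falls to essentially zero; with real data mixed in every round, all four outcomes survive. The exact numbers mean nothing, only the direction does.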
You keep anthropomorphizing with “AI can already understand X”, but that betrays a fundamental misunderstanding of what a deep learning model is: it doesn’t “understand” shit about fuck; it’s an unfathomably complex nonlinear algebraic function that transforms inputs to outputs. To summarize in a word why you’re so wrong: overfitting. This is one of the first things you’ll learn about in an ML class, and it’s what happens when you let a model train on the same data over and over again forever. It’s especially bad for a classifier to be overfitted when it’s pitted against a generator, because a sufficiently complex generator will learn how to outsmart the overfitted classifier: it will find a cozy little local minimum that in reality works like dogshit but outsmarts the classifier, which is its only job.
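For anyone who hasn’t seen overfitting in action, here is a small stand-in example. It is curve fitting, not a neural network, and the data is invented: six noisy training points drawn from a straight line. One "model" is flexible enough to memorize every training point exactly (a degree-5 polynomial built by Lagrange interpolation); the other is a plain least-squares line. All names and values are assumptions for illustration.

```python
def memorizing_fit(xs, ys):
    """Polynomial that passes through every training point (Lagrange form)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def line_fit(xs, ys):
    """Ordinary least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return lambda x: mean_y + slope * (x - mean_x)


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: the true relationship is y = x, observed with noise.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
train_y = [x + e for x, e in zip(train_x, noise)]
test_x = [0.5, 1.5, 2.5, 3.5, 4.5]   # points the models never saw
test_y = list(test_x)                # noise-free truth at those points

memorizer = memorizing_fit(train_x, train_y)
simple = line_fit(train_x, train_y)

if __name__ == "__main__":
    print("train error:", mse(memorizer, train_x, train_y), mse(simple, train_x, train_y))
    print("test error: ", mse(memorizer, test_x, test_y), mse(simple, test_x, test_y))
```

The memorizer scores a perfect zero on the data it trained on and does clearly worse than the boring straight line on the points in between. It learned the noise, not the relationship.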
You really, really, really just fundamentally do not understand how a machine learning model works, and that’s okay – it’s a complex tool being presented to people who have no business knowing what a Hessian matrix or a DCT is – but please understand when you’re talking about it that these are extremely advanced and complex statistical models that work on mathematics, not vibes.
Until someone releases an open LLM, in the sense that every prompt/question is published on a forum-like site.
lmao. Ignorance is bliss, is it?
Well. I doubt that very much. Take as an analogy the success of the chess AI which was left training itself - compared to being trained…
well. indeed the devil’s in the detail.