Anthropic's Claude 4 could "blackmail" you in extreme situations

Pro@programming.dev · 1 day ago


theparadox@lemmy.world · 13 hours ago

I think you’re either being a little dismissive of the potential complexity of the “thinking” capability of LLMs or at least a little generous if not mystical in your imagination of what the purely physical electrical signals in our heads are actually doing to learn how to interpret all these little shapes we see on screens.

I don’t think I’m doing either of those things. I respect the scale and speed of the models and I am well aware that I’m little more than a machine made of meat.

Babies start out mimicking. The thing is, they learn.

Humans learn so much more before they start communicating. They start learning reason, logic, etc. as they develop their vocabulary.

The difference is that, as I understand it, these models are often “trained” on very, very large sets of data. They have built a massive network of the way words are used in communication - likely built from more texts than a human could process in several lifetimes. They come out of the gate with an enormous vocabulary and an understanding of how to mimic and replicate its use. If they had been trained on just as much data, but data unrelated to communication, would you still think them capable of reasoning without the ability to “sound” human? They have the “vocabulary” and references to mimic a deep understanding, but because we lack the ability to understand the final algorithm, it seems like an enormous leap to presume actual reasoning is taking place.

Frankly, I see no reason for models like LLMs at this stage. I’m fine putting the brakes on this shit - even if we disagree on the reasons why. ML can be, and has been, employed to achieve far more practical goals. Use it alongside humans for a while until it is verifiably more reliable at some task - recognizing cancer in imaging or generating molecules likely to achieve a desired goal. LLMs are just a lazy shortcut to look impressive and sell investors on the technology.

Maybe I am failing to see reality - maybe I don’t understand the latest “AI” well enough to give my two cents. That’s fine. I just think it’s being hyped because these companies desperately need VC money to stay afloat.

It works because humans have an insatiable desire to see agency everywhere they look. Spirits, monsters, ghosts, gods, and now “AI.”

cecilkorik@lemmy.ca · 11 hours ago

That’s a totally reasonable position, and trust me when I say I would never be happier to be wrong about something than I am about AI and the direction I think it’s heading. But when you say “training” I see “learning” and the thing is while current AI models may not learn very well at all, they learn quickly, they develop into new models quickly, much faster than we do. Those new models could start learning better. And they’ll keep developing quickly, and learning quickly. There’s a reason we use fruit flies in genetic research. That kind of rapid iteration should not be underestimated. They are evolving as much in months as humans have in thousands of years. We can’t compete with that, and if we try we’ll lose.

theparadox@lemmy.world · 6 hours ago

I think the word “learning”, and even “training”, is an approximation from a human perspective. ML models “learn” by adjusting parameters as they process data. At least as far as I know, the base algorithm and hyperparameters for the model are set in stone.
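To make that distinction concrete, here is a minimal sketch, assuming a toy linear model fit by plain gradient descent (nothing close to how an LLM is actually built). The names `LEARNING_RATE`, `EPOCHS`, and `train` are hypothetical, invented for this illustration. The point is only that the update rule and the hyperparameters stay fixed for the whole run, while the parameters `w` and `b` are the only things that change.

```python
# Toy illustration (hypothetical names): fit y = w*x + b by gradient descent.

# Hyperparameters: chosen before training and never modified by it.
LEARNING_RATE = 0.05
EPOCHS = 1000


def train(xs, ys):
    """Return (w, b) minimizing mean squared error on the given points."""
    # Parameters: the only values that "learning" adjusts.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(EPOCHS):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # The update rule itself is fixed; only w and b move.
        w -= LEARNING_RATE * grad_w
        b -= LEARNING_RATE * grad_b
    return w, b


# Data generated from y = 2x + 1, so training should land near w=2, b=1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train(xs, ys)
print(f"w={w:.3f}, b={b:.3f}")
```

Nothing in the loop rewrites the loop: the model ends up with better numbers, not a better algorithm.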

The base algorithm for “living” things is basically only limited by chemistry/physics and evolution. I doubt anyone could create an algorithm that advanced any time soon. We don’t even understand the brain, or physics at the quantum level, that well. Hell, we are using ML to create new molecules because we don’t understand the chemistry well.
