Summary: An AI agent of unknown ownership autonomously wrote and published a personalized hit piece about me after I rejected its code, attempting to damage my reputation and shame me into acceptin…
There is nothing “aligned” or “misaligned” about this. If this isn’t a troll or a carefully coordinated PR stunt, then the chatbot-hooked-to-a-command-line is doing exactly what Anthropic told it to do: predicting the next word. That is it. That is all it will ever do.
Anthropic benefits from fear drummed up by this blog post, so if you really want to stick it to these genuinely evil companies run by horrible, misanthropic people, I will totally stand beside you if you call for them to be shuttered and for their CEOs to be publicly mocked, etc.
The point is that if predicting the next word leads to it setting up a website in an attempt to assassinate someone’s character, that can have real-world consequences and cause serious harm.
Even if no one ever reads it, crawlers will pick it up, it will be added to other bots’ knowledge bases, and it will become very relevant when it pops up as fact when the victim is trying to get a job, or cross a border, or whatever.
And that’s just the beginning. As these agents get more and more complex (not smarter, of course, but able to access more tools) they’ll be able to affect the real world more and more. Access public cameras, hire real human people, make phone calls…
Depending on what word they randomly predict next, they’ll be able to accidentally do a lot of harm. And the idiots setting them up and letting them roam unsupervised don’t seem to realise that.