🃏Joker@sh.itjust.works to Technology@lemmy.world · English · 2 days ago
Alignment faking in large language models (www.anthropic.com)
12 comments
eleitl@lemm.ee · English · 7 hours ago
So you mean “alignment with human expectations”. Not what I was meaning at all. Good that that word doesn’t even mean anything specific these days.