• LukeZaz@beehaw.org
    2 days ago

    And that’s making them larger and “think.”

    Aren’t those the two big strings to the bow of LLM development these days? If neither of those works, how is it not the case that hallucinations “are here to stay”?

    Sure, some new trick might be devised that fixes the issue, and I expect one will be eventually, but there’s no promise of that happening anytime even remotely soon.

    • hendrik@palaver.p3x.de
      2 days ago

      I’m not a machine learning expert at all. But I’d say we’re not set on the transformer architecture. Maybe just invent a different architecture that isn’t subject to this? Or specifically factor it in.

      Isn’t the way we currently train LLM base models to just feed in all the text we can get, from Wikipedia and research papers to all the fictional books on Anna’s Archive and weird Reddit and internet talk? I wouldn’t be surprised if they make things up, since we train them on factual information, fiction, and creative writing without any distinction… Maybe we should add something to the architecture to make it aware of the factuality of text, and guide this…

      Or: I skimmed some papers a year or so ago where they looked at the activations. Maybe do some more research into which parts of an LLM are concerned with “creativity” or “factuality” and expose that to the user. Or study how hallucinations work internally and then try to isolate them so they can be handled accordingly? See the sketch below for what I mean.
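
      To make the activation idea a bit more concrete: a minimal sketch of a “factuality probe” might look like the Python below. Everything here is an assumption for illustration (gpt2 as a stand-in model, a toy four-sentence dataset, mean-pooled last-layer activations, and a plain logistic-regression probe), not how any lab actually does it.

      ```python
      # Hypothetical sketch: fit a linear "factuality" probe on hidden activations.
      # Model choice, toy labels, pooling, and layer are illustrative assumptions.
      import torch
      from transformers import AutoModel, AutoTokenizer
      from sklearn.linear_model import LogisticRegression

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
      model.eval()

      # Toy labels: 1 = factual register, 0 = fiction / creative writing.
      texts = [
          ("Water boils at 100 degrees Celsius at sea level.", 1),
          ("Paris is the capital of France.", 1),
          ("The dragon folded its wings and whispered to the moon.", 0),
          ("Once upon a time, a teapot ruled the kingdom.", 0),
      ]

      def sentence_activation(text, layer=-1):
          """Mean-pool one layer's hidden states into a single sentence vector."""
          inputs = tokenizer(text, return_tensors="pt")
          with torch.no_grad():
              hidden = model(**inputs).hidden_states[layer]  # (1, seq_len, dim)
          return hidden.mean(dim=1).squeeze(0).numpy()

      X = [sentence_activation(t) for t, _ in texts]
      y = [label for _, label in texts]
      probe = LogisticRegression(max_iter=1000).fit(X, y)

      # Higher score = the probe thinks the new sentence "reads as factual".
      print(probe.predict_proba([sentence_activation("Cats are mammals.")])[0, 1])
      ```

      Whether a score like that tracks actual truth rather than just surface style is exactly the open research question, but it’s the kind of “expose the factuality signal to the user” experiment those activation papers were getting at.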