• morto@piefed.social
    8 hours ago

    One possibility:

    While many believe that LLMs can’t output the training data, recent work shows that substantial amounts of copyrighted text can be extracted from open-weight models…

    Note that this neutral language makes it more apparent that LLMs may well be able to output their training data, since that data is what the model’s network is built upon. By using personifying language, we’re biasing people into thinking about LLMs as if they were humans, and this will affect, for example, court decisions, like the ones related to copyright.